Don't push a zim file if the test failed / implement Quarantine #10

Open
jmontleon opened this issue Jun 1, 2021 · 4 comments
Labels
enhancement New feature or request stale

@jmontleon

Problem

The task generated a garbage zim file. The test results in the log back this up, as does the size of ~1MB instead of the typical ~45MB. If the test failed, the file probably shouldn't be published or made available.

However, this garbage file is what I get when I follow the link https://download.kiwix.org/zim/archlinux_en_all_maxi.zim. I guess kiwix is just serving the latest available copy.

[INFO] Checking zim file /data/archlinux_en_all_nopic_2021-05.zim
[INFO] Verifying ZIM-file structure integrity...
[INFO] Avoiding redundant checksum test (already performed by the integrity check).
[INFO] Searching for metadata entries...
[INFO] Searching for Favicon...
[INFO] Searching for main page...
[INFO] Verifying Articles' content...
[INFO] Searching for redundant articles...
  Verifying Similar Articles for redundancies...
[ERROR] Invalid internal links found:
  The following links:
- Arch_Linux
(A/Arch_Linux) were not found in article A/Main_page
[INFO] Overall Test Status: Fail
[INFO] Total time taken by zimcheck: 0 seconds.

Reproducing steps

This zim has broken occasionally before but it seems like a transient issue that usually gets resolved on the next build.
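
The check requested in this report could be sketched as follows. This is only an illustration, not the zimfarm or receiver code: it reads the `Overall Test Status` line from a zimcheck text log like the one quoted above and compares the file size with a typical size. The function names, the `min_ratio` parameter, and the assumption that a passing run prints `Pass` in the same position are all hypothetical.

```python
import re

# Matches the summary line seen in the zimcheck log quoted above.
# Assumption: a passing run prints "Pass" where this log prints "Fail".
STATUS_RE = re.compile(r"^\[INFO\] Overall Test Status: (\w+)\s*$", re.MULTILINE)


def zimcheck_passed(log_text: str) -> bool:
    """Return True only if the log has an 'Overall Test Status: Pass' line."""
    match = STATUS_RE.search(log_text)
    return match is not None and match.group(1).lower() == "pass"


def should_publish(log_text: str, size_bytes: int, typical_size_bytes: int,
                   min_ratio: float = 0.5) -> bool:
    """Hypothetical gate: the check must pass and the size must be plausible."""
    if not zimcheck_passed(log_text):
        return False
    # Reject files far smaller than usual (e.g. ~1MB instead of ~45MB).
    return size_bytes >= min_ratio * typical_size_bytes
```

With the log above, `should_publish` would return `False` because of the `Fail` status alone. The size comparison is a second, independent guard for builds that pass the check but are still suspiciously small.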

@rgaudin
Member

rgaudin commented Jun 2, 2021

Thank you @jmontleon for your report. We already have this in place in our receiver code (see here), but, as you can see here, it is currently disabled.
We disabled it at some point because it was creating a bottleneck, but maybe it's OK to bring it back… @kelson42 ?

The ultimate goal is to stop relying on this. Zims will only be published by the CMS (to come) if they satisfy defined criteria, with the zimcheck status from the zimfarm being one source.

@kelson42
Contributor

kelson42 commented Jun 2, 2021

@rgaudin @jmontleon The whole problem is indeed known and the plan is clear: we will develop the CMS (it would actually already be online if we were not running late!). The CMS should check the zimcheck JSON output (available in the next release) and, based on a threshold, decide whether to let the file through the quarantine or not.
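
A minimal sketch of such a threshold-based quarantine decision is shown below. It is not the CMS implementation and it does not assume any particular zimcheck JSON schema: the input is simply a mapping of check name to error count that a caller would have extracted from that output. The check names, the `Decision` type, and the function name are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Decision:
    publish: bool        # True: release from quarantine; False: keep quarantined
    reasons: list[str]   # one entry per check that exceeded its threshold


def quarantine_decision(error_counts: dict[str, int],
                        thresholds: dict[str, int]) -> Decision:
    """Compare per-check error counts against per-check thresholds."""
    reasons = []
    for check, count in sorted(error_counts.items()):
        # Checks without a configured threshold tolerate no errors at all.
        allowed = thresholds.get(check, 0)
        if count > allowed:
            reasons.append(f"{check}: {count} > {allowed}")
    return Decision(publish=not reasons, reasons=reasons)
```

For example, `quarantine_decision({"invalid_internal_links": 1}, {"invalid_internal_links": 0})` would keep the file in quarantine. Defaulting unknown checks to a threshold of zero is a deliberately strict choice, so that a new kind of failure blocks publication until someone decides how much of it is tolerable.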

kelson42 transferred this issue from openzim/zimfarm Jun 2, 2021
kelson42 added the enhancement (New feature or request) label Jun 2, 2021
kelson42 changed the title from "Don't push a zim file if the test failed" to "Don't push a zim file if the test failed / implement Quarantine" Jun 2, 2021
openzim deleted a comment Jun 2, 2021
@stale

stale bot commented Aug 2, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale bot added the stale label Aug 2, 2021
kelson42 added this to the M2 milestone Dec 21, 2021
stale bot removed the stale label Dec 21, 2021
@stale

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale bot added the stale label Mar 2, 2022
rgaudin modified the milestones: M2, M3 May 21, 2022

3 participants