Actions: huggingface/datatrove

Showing runs from all workflows
1,306 workflow runs

updated url filter blocklists
Secret Leaks #170: Commit 1d25288 pushed by guipenedo
December 3, 2024 15:48 39s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #342: Pull request #285 synchronize by guipenedo
November 29, 2024 22:25 2m 53s multilingual
remove dumb print
Secret Leaks #169: Commit 696e40f pushed by guipenedo
November 29, 2024 22:25 16s multilingual
Resolve issue 308
Test & Check Code Quality #341: Pull request #309 opened by habanoz
November 29, 2024 20:26 2m 5s habanoz:main
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #340: Pull request #285 synchronize by guipenedo
November 29, 2024 18:36 2m 28s multilingual
reuse word tokenizations between blocks
Secret Leaks #168: Commit b1ccdb8 pushed by guipenedo
November 29, 2024 18:36 21s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #339: Pull request #285 synchronize by guipenedo
November 29, 2024 13:50 2m 41s multilingual
fixes for empty folders
Secret Leaks #167: Commit 610560c pushed by guipenedo
November 29, 2024 13:50 15s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #338: Pull request #285 synchronize by guipenedo
November 28, 2024 14:50 2m 42s multilingual
fix missing language tokenizer
Secret Leaks #166: Commit ea3adf9 pushed by guipenedo
November 28, 2024 14:50 22s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #337: Pull request #285 synchronize by guipenedo
November 28, 2024 13:00 2m 32s multilingual
fix for no .remove file
Secret Leaks #165: Commit 42b1e10 pushed by guipenedo
November 28, 2024 13:00 19s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #336: Pull request #285 synchronize by guipenedo
November 27, 2024 23:45 3m 48s multilingual
remove progress message
Secret Leaks #164: Commit f7a0267 pushed by guipenedo
November 27, 2024 23:45 21s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #335: Pull request #285 synchronize by guipenedo
November 27, 2024 23:43 2m 49s multilingual
add local version
Secret Leaks #163: Commit 6412f31 pushed by guipenedo
November 27, 2024 23:43 22s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #334: Pull request #285 synchronize by guipenedo
November 27, 2024 15:15 2m 39s multilingual
add dependency
Secret Leaks #162: Commit 0b5591a pushed by guipenedo
November 27, 2024 15:15 22s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #333: Pull request #285 synchronize by guipenedo
November 27, 2024 15:12 2m 24s multilingual
updated work_tokenizer assignments and added burmese
Secret Leaks #161: Commit a2ceb48 pushed by guipenedo
November 27, 2024 15:11 22s multilingual
[fixbug]: Fixed the issue in MinhashBuildIndex where get_datafolder w…
Secret Leaks #160: Commit fe81883 pushed by guipenedo
November 27, 2024 14:55 24s main
[fixbug]: Fixed the issue in MinhashBuildIndex where get_datafolder w…
Test & Check Code Quality #332: Commit fe81883 pushed by guipenedo
November 27, 2024 14:55 1m 55s main
[fixbug]: Fixed the issue in MinhashBuildIndex where get_datafolder w…
Test & Check Code Quality #331: Pull request #307 opened by Youggls
November 27, 2024 14:53 1m 54s Youggls:main
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #330: Pull request #285 synchronize by guipenedo
November 27, 2024 09:30 2m 34s multilingual
network limiting
Secret Leaks #159: Commit cf4668a pushed by guipenedo
November 27, 2024 09:30 17s multilingual