v0.9.2
Features
- server: harden the choice of which weights to save on disk
- server: better errors for warmup and TP
- server: support setting GPTQ_BITS and GPTQ_GROUPSIZE through environment variables
- server: Implements sharding for non-divisible vocab_size
- launcher: add arg validation and drop subprocess
- router: explicit warning if revision is not set
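
To illustrate the non-divisible vocab_size sharding item above, here is a minimal sketch of one common way to split a dimension across ranks when it does not divide evenly. The function name `shard_bounds` and its parameters are hypothetical, and this is not necessarily how the server implements it: every rank takes a ceiling-sized block, and the last rank takes whatever remains.

```python
from typing import Tuple


def shard_bounds(vocab_size: int, world_size: int, rank: int) -> Tuple[int, int]:
    """Return the [start, stop) slice of the vocabulary owned by `rank`.

    Hypothetical helper for illustration only, not the server's actual code.
    """
    if world_size <= 0 or not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")
    # Ceiling division so that world_size blocks always cover vocab_size.
    block_size = (vocab_size + world_size - 1) // world_size
    # Clamp both ends so trailing ranks get a short (or empty) slice.
    start = min(rank * block_size, vocab_size)
    stop = min(start + block_size, vocab_size)
    return start, stop


if __name__ == "__main__":
    for r in range(4):
        print(r, shard_bounds(10, 4, r))
```

With this scheme the shards are contiguous and cover the whole vocabulary, but the last shard can be smaller than the others, so any code that gathers the per-rank outputs has to account for the uneven sizes.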
Fix
- server: Fixing RW code (it's remote code, so the Arch check can't determine which weights to keep)
- server: T5 weights names
- server: Adding logger import to t5_modeling.py by @akowalsk
- server: Bug fixes for GPTQ_BITS environment variable passthrough by @ssmi153
- server: GPTQ Env vars: catch correct type of error by @ssmi153
- server: blacklist local files
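
As a rough illustration of the GPTQ environment variable fixes above, the sketch below reads GPTQ_BITS and GPTQ_GROUPSIZE and handles a missing variable separately from a non-integer value. The helper name `read_gptq_params` and the error messages are assumptions made for this example, not the server's actual code.

```python
import os
from typing import Mapping, Optional, Tuple


def read_gptq_params(env: Optional[Mapping[str, str]] = None) -> Tuple[int, int]:
    """Read (bits, groupsize) from the environment.

    Hypothetical helper for illustration only, not the server's actual code.
    """
    env = os.environ if env is None else env
    try:
        bits = int(env["GPTQ_BITS"])
        groupsize = int(env["GPTQ_GROUPSIZE"])
    except KeyError as err:
        # A missing variable raises KeyError, not ValueError.
        raise RuntimeError(f"{err.args[0]} is not set") from err
    except ValueError as err:
        # A value such as "four" cannot be parsed by int().
        raise RuntimeError("GPTQ_BITS and GPTQ_GROUPSIZE must be integers") from err
    return bits, groupsize


if __name__ == "__main__":
    print(read_gptq_params({"GPTQ_BITS": "4", "GPTQ_GROUPSIZE": "128"}))
```

Catching the specific exception types matters here: an unset variable and a malformed value fail in different ways, and a handler that only expects one of them lets the other escape with a less helpful traceback.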
New Contributors
- @akowalsk made their first contribution in #585
- @ssmi153 made their first contribution in #590
- @gary149 made their first contribution in #611
Full Changelog: v0.9.1...v0.9.2