Releases: undreamai/LLMUnity
Release v2.4.1
🚀 Features
- Static library linking on mobile (fixes iOS signing) (PR: #289)
🐛 Fixes
- Fix support for extras (flash attention, iQ quants) (PR: #292)
Release v2.4.0
🚀 Features
- iOS deployment (PR: #267)
- Improve building process (PR: #282)
- Add structured output / function calling sample (PR: #281)
- Update LlamaLib to v1.2.0 (llama.cpp b4218) (PR: #283)
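The structured output / function calling sample (PR: #281) boils down to constraining the model to emit parseable JSON and dispatching it to a registered function. A minimal language-agnostic sketch in Python (the `dispatch` helper and `get_weather` tool are hypothetical illustrations, not part of LLMUnity's C# API):

```python
import json

# Hypothetical tool registry; in a game these would be real functions.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch(reply: str) -> str:
    """Parse a structured model reply like
    {"name": "get_weather", "arguments": {"city": "Paris"}}
    and invoke the matching registered tool."""
    call = json.loads(reply)
    return TOOLS[call["name"]](**call["arguments"])
```

For example, `dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')` returns `"Sunny in Paris"`.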
🐛 Fixes
- Clear temp build directory before building (PR: #278)
📦 General
- Remove support for extras (flash attention, iQ quants) (PR: #284)
- Remove support for LLM base prompt (PR: #285)
Release v2.3.0
🚀 Features
- Implement Retrieval Augmented Generation (RAG) in LLMUnity (PR: #246)
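Conceptually, RAG ranks stored text chunks by embedding similarity to the query and supplies the top hits to the LLM as context. A minimal sketch of the retrieval step in Python (illustrative only: `retrieve` is not LLMUnity's C# API, and real embeddings come from a model rather than hand-written vectors):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, docs, k=1):
    """Return the k document texts whose embeddings are closest to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```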
🐛 Fixes
- Fix build conflict and endless import of resources (PR: #266)
Release v2.2.4
🚀 Features
- Add Phi-3.5 and Llama 3.2 models (PR: #255)
- Speedup LLMCharacter warmup (PR: #257)
🐛 Fixes
- Fix handling of incomplete requests (PR: #251)
- Fix Unity locking of DLLs during cross-platform build (PR: #252)
- Allow spaces in lora paths (PR: #254)
📦 General
- Set default context size to 8192 and allow adjusting it with a UI slider (PR: #258)
Release v2.2.3
🚀 Features
- LlamaLib v1.1.12: SSL certificate & API key for server, Support more AMD GPUs (PR: #241)
- Server security with API key and SSL (PR: #238)
- Show server command for easier deployment (PR: #239)
🐛 Fixes
- Fix multiple LLM crash on Windows (PR: #242)
- Exclude system prompt from saving of chat history (PR: #240)
Release v2.2.2
🚀 Features
- Allow setting the LLMCharacter slot (PR: #231)
🐛 Fixes
- Fix adding grammar from StreamingAssets (PR: #229)
- Fix library setup restart when interrupted (PR: #232)
- Remove unnecessary Android linking in IL2CPP builds (PR: #233)
Release v2.2.1
🐛 Fixes
- Fix model name showing the full path when loading a model (PR: #224)
- Fix parallel prompts (PR: #226)
Release v2.2.0
🚀 Features
- Update to latest llama.cpp (b3617) (PR: #210)
- Integrate Llama 3.1 and Gemma2 models in model dropdown
- Implement embedding and lora adapter functionality (PR: #210)
- Read context length and warn if it is very large (PR: #211)
- Setup allowing use of extra features: flash attention and IQ quants (PR: #216)
- Allow HTTP request retries for remote server (PR: #217)
- Allow setting lora weights at startup, add unit test (PR: #219)
- Allow relative StreamingAssets paths for models (PR: #221)
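The HTTP request retries for the remote server (PR: #217) follow the usual retry-with-backoff pattern around a request. A generic sketch in Python (`retry` is a hypothetical helper for illustration, not LLMUnity's API):

```python
import time

def retry(fn, attempts=3, base_delay=0.0):
    """Call fn; on exception, retry up to `attempts` times with
    exponential backoff, re-raising the last error."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** i))
```

In practice `fn` would be the HTTP request to the remote LLM server and `base_delay` a non-zero interval.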
🐛 Fixes
- Fix set template for remote setup (PR: #208)
- Fix crash when stopping scene before LLM creation (PR: #214)
📦 General
- Documentation: point to GGUF format for lora (PR: #215)
Release v2.1.1
🐛 Fixes
- Resolve build directory creation
Release v2.1.0
🚀 Features
- Android deployment (PR: #194)
- Allow downloading models on startup with resumable download functionality (PR: #196)
- LLM model manager (PR: #196)
- Add Llama 3 7B and Qwen2 0.5B models (PR: #198)
- Start LLM always asynchronously (PR: #199)
- Add contributing guidelines (PR: #201)
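Resumable downloads (PR: #196) typically work by asking the server for only the bytes not yet on disk, via an HTTP `Range` header. A minimal sketch of that header logic in Python (hypothetical helper; the actual implementation is C# inside LLMUnity, and the server must support range requests):

```python
import os

def resume_headers(path):
    """Build request headers that resume a partial download at `path`.

    If a partial file exists, request only the remaining bytes;
    otherwise request the whole file with no Range header.
    """
    if os.path.exists(path):
        done = os.path.getsize(path)
        return {"Range": f"bytes={done}-"}
    return {}
```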