mistral support & meditron readme update
martinjaggi authored Nov 29, 2023
1 parent a8822f8 commit 162a0d7
Showing 1 changed file with 7 additions and 5 deletions.
12 changes: 7 additions & 5 deletions README.md
@@ -9,8 +9,8 @@ This library enables pre-training and fine-tuning of large language models (LLMs
 Our repository is a modification of the [original Megatron-LM codebase](https://github.com/NVIDIA/Megatron-LM) by Nvidia.

 Added key features include:
-- [Llama](https://arxiv.org/abs/2302.13971), [Llama 2](https://arxiv.org/abs/2307.09288), [Code Llama](https://arxiv.org/abs/2308.12950) and [Falcon](https://huggingface.co/tiiuae) support
-- support training of large models (70B Llama 2, 65B Llama 1, 34B Code Llama, and 40B Falcon) on commodity hardware on multiple nodes
+- architectures supported: [Llama](https://arxiv.org/abs/2302.13971), [Llama 2](https://arxiv.org/abs/2307.09288), [Code Llama](https://arxiv.org/abs/2308.12950), [Falcon](https://huggingface.co/tiiuae) and [Mistral](https://arxiv.org/abs/2310.06825)
+- support training of large models (70B Llama 2, 65B Llama 1, 34B Code Llama, 40B Falcon and Mistral) on commodity hardware on multiple nodes
 - 3-way parallelism: tensor parallel, pipeline parallel and data parallel training (inherited from Megatron)
 - full pretraining, finetuning and instruct tuning support
 - Support for special tokens & tokenizers
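Reviewer note on the "3-way parallelism" item in the hunk above: in Megatron-style training, the total GPU count is the product of the tensor-parallel, pipeline-parallel and data-parallel degrees. The snippet below is a minimal sketch of that arithmetic only. It is not code from this repository, and the function name and the example cluster shape are hypothetical.

```python
def data_parallel_size(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return how many data-parallel replicas fit on `world_size` GPUs.

    Assumes the usual Megatron-style layout:
        world_size = tensor_parallel * pipeline_parallel * data_parallel
    """
    # GPUs needed to hold one full copy of the model.
    model_parallel = tensor_parallel * pipeline_parallel
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel * pipeline_parallel = {model_parallel}"
        )
    return world_size // model_parallel


if __name__ == "__main__":
    # Hypothetical cluster: 4 nodes with 8 GPUs each, TP=8 within a node, PP=2 across nodes.
    print(data_parallel_size(world_size=32, tensor_parallel=8, pipeline_parallel=2))
```

With these illustrative numbers, each model copy occupies 16 GPUs, so the 32 GPUs hold two data-parallel replicas.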
@@ -35,9 +35,10 @@ make html
 ```

 # Example models trained with *Megatron-LLM*
-70B Llama 2 [1](https://huggingface.co/OpenAssistant/llama2-70b-oasst-sft-v10),
-40B Falcon [1](https://huggingface.co/OpenAssistant/falcon-40b-megacode2-oasst),
-13B Code Llama [1](https://huggingface.co/OpenAssistant/codellama-13b-oasst-sft-v10), ...
+70B Llama2: [meditron 70b](https://huggingface.co/epfl-llm/meditron-70b), [llama2-70b-oasst-sft-v10](https://huggingface.co/OpenAssistant/llama2-70b-oasst-sft-v10),
+40B Falcon: [falcon-40b-megacode2-oasst](https://huggingface.co/OpenAssistant/falcon-40b-megacode2-oasst),
+13B Code Llama: [codellama-13b-oasst-sft-v10](https://huggingface.co/OpenAssistant/codellama-13b-oasst-sft-v10),
+7B Llama2: [meditron 7b](https://huggingface.co/epfl-llm/meditron-7b), ...
 (Let us know about yours!)

 # Citation
@@ -50,6 +51,7 @@ If you use this software please cite it:
 Andreas Köpf and
 Kyle Matoba and
 Amirkeivan Mohtashami and
+Xingyao Wang and
 Olivia Simin Fan and
 Axel Marmet and
 Deniz Bayazit and
