LLMs from scratch - Rust

This project provides Rust code that follows the incredible text, Build An LLM From Scratch by Sebastian Raschka. The book provides arguably the clearest step-by-step walkthrough for building a GPT-style LLM. The titles of the book's seven chapters are listed below.

Understanding large language models
Working with text data
Coding attention mechanisms
Implementing a GPT model from scratch to generate text
Pretraining on unlabeled data
Fine-tuning for classification
Fine-tuning to follow instructions

The code provided in the book (see the associated GitHub repo) is written entirely in PyTorch (understandably so). In this project, we translate all of the PyTorch code into Rust using the Candle crate, a minimalist ML framework.

Usage

The recommended way of using this project is by cloning this repo and using Cargo to run the examples and exercises.

# SSH
git clone git@github.com:nerdai/llms-from-scratch-rs.git

# HTTPS
git clone https://github.com/nerdai/llms-from-scratch-rs.git

Note that we use the same datasets that Sebastian uses in his book. Use the commands below to download the data into a subfolder called data/, which is read by the examples and exercises.

mkdir -p 'data/'
wget 'https://raw.githubusercontent.com/rasbt/LLMs-from-scratch/main/ch02/01_main-chapter-code/the-verdict.txt' -O 'data/the-verdict.txt'
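To check that the download landed where the examples expect it, a quick sanity check can read the file and report its size. The sketch below uses only the Rust standard library; the names `TextStats`, `text_stats` and `file_stats` are hypothetical helpers for illustration and are not part of this project's API.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Simple summary of a text corpus (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct TextStats {
    pub chars: usize,
    pub words: usize,
}

/// Count Unicode characters and whitespace-separated words in `text`.
pub fn text_stats(text: &str) -> TextStats {
    TextStats {
        chars: text.chars().count(),
        words: text.split_whitespace().count(),
    }
}

/// Read the file at `path` (e.g. "data/the-verdict.txt") and summarize it.
/// Returns an I/O error if the file is missing or is not valid UTF-8.
pub fn file_stats<P: AsRef<Path>>(path: P) -> io::Result<TextStats> {
    let text = fs::read_to_string(path)?;
    Ok(text_stats(&text))
}
```

Calling `file_stats("data/the-verdict.txt")` from the project root should return `Ok(..)` with non-zero counts once the download has succeeded, and an `Err(..)` if the file is missing.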

Navigating the code

You can read the code in your IDE of choice from the cloned repo, or browse the project's docs.

Running `Examples` and `Exercises`

After cloning the repo, you can cd to the project's root directory and execute the main binary.

# Run code for Example 05.07
cargo run example 05.07

# Run code for Exercise 5.5
cargo run exercise 5.5
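The commands above follow a `<subcommand> <id>` shape. As a rough illustration of how such arguments could be dispatched, here is a std-only sketch. It is not this project's actual argument parser; the `Command` enum and `parse_command` function are hypothetical names introduced only for this example.

```rust
/// What the user asked the binary to do (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Command {
    Example(String),
    Exercise(String),
}

/// Parse the arguments that follow `cargo run`, e.g. ["example", "05.07"].
pub fn parse_command(args: &[&str]) -> Result<Command, String> {
    match args {
        // Exactly one id after a known subcommand.
        ["example", id] => Ok(Command::Example(id.to_string())),
        ["exercise", id] => Ok(Command::Exercise(id.to_string())),
        // Known subcommand, but the id is missing or there are extra arguments.
        ["example" | "exercise", ..] => Err("expected exactly one id".to_string()),
        // Anything else is an unrecognized subcommand.
        [other, ..] => Err(format!("unknown subcommand: {}", other)),
        [] => Err("expected `example <id>` or `exercise <id>`".to_string()),
    }
}
```

Under this sketch, `parse_command(&["example", "05.07"])` yields `Command::Example` holding the id as a string, which keeps the leading zero in `05.07` intact.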

If you are using a CUDA-enabled device, you can turn on the cuda feature via the --features cuda flag:

# Run code for Example 05.07
cargo run --features cuda example 05.07

# Run code for Exercise 5.5
cargo run --features cuda exercise 5.5
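The `--features cuda` flag uses Cargo's feature mechanism, which toggles code at compile time. How this project wires the feature internally is not shown here; the sketch below only illustrates the general pattern, assuming a feature named `cuda` is declared in Cargo.toml. The function `backend_name` is hypothetical.

```rust
/// Name of the compute backend selected at compile time (hypothetical helper).
///
/// `cfg!(feature = "cuda")` evaluates to `true` only when the crate is built
/// with `--features cuda`; otherwise the CPU branch is compiled in.
// The allow silences a lint when compiling with bare `rustc`, which cannot
// see the features declared in Cargo.toml.
#[allow(unexpected_cfgs)]
pub fn backend_name() -> &'static str {
    if cfg!(feature = "cuda") {
        "cuda"
    } else {
        "cpu"
    }
}
```

Built without the flag, `backend_name()` returns `"cpu"`; the same source built with `--features cuda` would return `"cuda"`.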

Listing `Examples`

To list the Examples, use the following command:

cargo run list --examples

A snippet of the output is pasted below.

EXAMPLES:
+-------+----------------------------------------------------------------------+
| Id    | Description                                                          |
+==============================================================================+
| 02.01 | Example usage of `listings::ch02::sample_read_text`                  |
|-------+----------------------------------------------------------------------|
| 02.02 | Use candle to generate an Embedding Layer.                           |
|-------+----------------------------------------------------------------------|
| 02.03 | Create absolute positional embeddings.                                |
|-------+----------------------------------------------------------------------|
| 03.01 | Computing attention scores as a dot product.                         |
...
|-------+----------------------------------------------------------------------|
| 06.13 | Example usage of `train_classifier_simple` and `plot_values`         |
|       | function.                                                            |
|-------+----------------------------------------------------------------------|
| 06.14 | Loading fine-tuned model and calculate performance on whole train,   |
|       | val and test sets.                                                   |
|-------+----------------------------------------------------------------------|
| 06.15 | Example usage of `classify_review`.                                  |
+-------+----------------------------------------------------------------------+

Listing `Exercises`

One can similarly list the Exercises using:

cargo run list --exercises

# first few lines of output
EXERCISES:
+-----+------------------------------------------------------------------------+
| Id  | Statement                                                              |
+==============================================================================+
| 2.1 | Byte pair encoding of unknown words                                    |
|     |                                                                        |
|     | Try the BPE tokenizer from the tiktoken library on the unknown words   |
|     | 'Akwirw ier' and print the individual token IDs. Then, call the decode |
|     | function on each of the resulting integers in this list to reproduce   |
|     | the mapping shown in figure 2.11. Lastly, call the decode method on    |
|     | the token IDs to check whether it can reconstruct the original input,  |
|     | 'Akwirw ier.'                                                          |
|-----+------------------------------------------------------------------------|
| 2.2 | Data loaders with different strides and context sizes                  |
|     |                                                                        |
|     | To develop more intuition for how the data loader works, try to run it |
|     | with different settings such as `max_length=2` and `stride=2`, and     |
|     | `max_length=8` and `stride=2`.                                         |
|-----+------------------------------------------------------------------------|
...
|-----+------------------------------------------------------------------------|
| 6.2 | Fine-tuning the whole model                                            |
|     |                                                                        |
|     | Instead of fine-tuning just the final transformer block, fine-tune the |
|     | entire model and assess the effect on predictive performance.          |
|-----+------------------------------------------------------------------------|
| 6.3 | Fine-tuning the first vs. last token                                   |
|     |                                                                        |
|     | Try fine-tuning the first output token. Notice the changes in          |
|     | predictive performance compared to fine-tuning the last output token.  |
+-----+------------------------------------------------------------------------+
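Exercise 2.2 in the listing above varies `max_length` and `stride` for the data loader. To build intuition for what those two settings do, here is a std-only sketch of a sliding window over token IDs, where each target is the input shifted by one token. It is an illustration under that assumption, not this project's Candle-based data loader; `sliding_windows` is a hypothetical name.

```rust
/// Build (input, target) pairs from `token_ids` with a sliding window.
///
/// Each input holds `max_length` tokens, and its target is the same window
/// shifted one position to the right. `stride` is the step between windows.
pub fn sliding_windows(
    token_ids: &[u32],
    max_length: usize,
    stride: usize,
) -> Vec<(Vec<u32>, Vec<u32>)> {
    assert!(max_length > 0 && stride > 0, "max_length and stride must be positive");
    let mut pairs = Vec::new();
    let mut start = 0;
    // A window needs `max_length` inputs plus one extra token for the target.
    while start + max_length < token_ids.len() {
        let input = token_ids[start..start + max_length].to_vec();
        let target = token_ids[start + 1..start + max_length + 1].to_vec();
        pairs.push((input, target));
        start += stride;
    }
    pairs
}
```

With nine tokens, `max_length=2` and `stride=2` produce four non-overlapping windows, while `max_length=8` and `stride=2` leave room for only a single window.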

[Alternative Usage] Installing from `crates.io`

Alternatively, you can install this crate directly via cargo install. Be sure to have Rust and Cargo installed first; see the official Rust installation instructions.

cargo install llms-from-scratch-rs

Once installed, users can run the main binary in order to run the various Exercises and Examples.

# Run code for Example 05.07
cargo run example 05.07

# Run code for Exercise 5.5
cargo run exercise 5.5
