Gemma-2 2B fine-tuned for Structured Data Extraction

This project is a collection of notebooks and a simple Flask web server that serves Gemma-2 using llama-cpp.

The goal of this project is to fine-tune a model so that it performs better at extracting data from text into a structured format (JSON).

You will need to provide the output schema in OpenAPI format and the text (context) to extract from.
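
As an illustration of these two inputs, the sketch below pairs a small OpenAPI-style schema with a text and serializes them into a single prompt string. The schema fields, the build_prompt helper and the prompt layout are assumptions made for this example; the prompt format the model actually expects is the one shown in the examples folder.

```python
import json

# Hypothetical output schema written in OpenAPI style (not taken from the project).
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name"],
}

# The text (context) to extract the data from.
context = "Alice is a 31 year old engineer living in Lyon."


def build_prompt(schema: dict, context: str) -> str:
    """Combine the schema and the context into one prompt string (assumed layout)."""
    return (
        "Schema:\n"
        f"{json.dumps(schema, indent=2)}\n\n"
        "Text:\n"
        f"{context}"
    )


prompt = build_prompt(schema, context)
print(prompt)
```

The resulting string is what you would send as the query to the server described below.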

⛩️ Project Architecture

The project is split between notebooks (fine-tuning, quantization and evaluation) and Python files (the web server).

Source	Description
➡️ Gemma-2 Finetuning	A notebook that shows how to fine-tune and quantize gemma2-2b-it using the unsloth and Hugging Face libraries.
➡️ Server	A simple Flask REST server using llama-cpp with a 4-bit quantized model.
➡️ CI/CD	A GitHub Actions workflow consisting of a formatting/linting step with ruff, testing with pytest, and building the Docker image.
➡️ Dockerfile	A multi-stage Dockerfile that builds the server with gunicorn.

📊 Details about the Dataset

The fine-tuned models are available in safetensors and GGUF formats (4-bit, 8-bit) on the Hugging Face Hub at bastienp/Gemma-2-2B-it-JSON-data-extration.

Note: The model page also gives more details on how to use the models with llama-cpp or unsloth.

💻 Installation

Dev setup

Recommended: Use uv, the fast Python package installer and resolver from Astral.

Alternatively, you can use pip instead of the uv commands below. The uv documentation explains how to install it.

Sync the dependencies with uv

uv venv .venv

source .venv/bin/activate

uv sync --all-extras --dev # --dev also installs pytest and ruff

Launch a Flask dev server

flask --app src.web.app run --debug

To reproduce the fine-tuning, the easiest way is to use Google Colab (the free version is sufficient).

Run the tests (API testing)

pytest

Note: An example of how to call the API, including the prompt format, can be found in examples/example_api_call.py.

👥 Deployment setup

The easiest way to deploy the model is to use the provided Docker image.

Pull the image from GitHub (built by the CI):

docker pull ghcr.io/bastienpo/unsloth_finetuning:main

Note: Alternatively, you can build the image yourself:

docker build --tag unsloth_finetuning:0.0.1 .

Run the Docker image

docker run -p 8000:8000 -d ghcr.io/bastienpo/unsloth_finetuning:main # or unsloth_finetuning:0.0.1 if you built it locally

Make a POST request

curl -i -H "Content-Type: application/json" -X POST -d '{"query": "How are you ?"}' http://localhost:8000/api/v1/chat/completions
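
The same request can be prepared from Python. This is a minimal sketch using only the standard library; build_request and send are helper names introduced here, and the URL and payload simply mirror the curl command above. The response format is not described in this README, so the sketch returns the raw body.

```python
import json
import urllib.request

# Same endpoint as in the curl command above.
API_URL = "http://localhost:8000/api/v1/chat/completions"


def build_request(query: str, url: str = API_URL) -> urllib.request.Request:
    """Build the same POST request as the curl command (hypothetical helper)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


request = build_request("How are you ?")
```

With the container running, calling send(request) returns the server's reply as a string.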
