Skip to content

awaisakram64/finetune-data-generator

Repository files navigation

README

Fine-Tune Data Generator

Introduction

This project provides tools and scripts for generating JSONL files for fine-tuning GPT models. The goal is to create high-quality datasets that can be used to improve model performance on specific tasks.

Installation

  1. Clone the repository:

    git clone https://github.com/awaisakram64/finetune-data-generator.git
    cd finetune-data-generator
  2. Install dependencies:

    pip install -r requirements.txt

Usage

  1. Preprocess raw data:

    python finetune/data_preprocessing.py --input data/raw --output data/processed
  2. Generate JSONL files:

    python finetune/data_generation.py --input data/processed --output data/generated

Contributing

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

No description, website, or topics provided.

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages