ModelTC/quant_horizon

quant_horizon

quant_horizon is a benchmarking framework designed to evaluate the performance of different GPU kernels.

Prerequisites

To run the benchmark, you need to have the following installed:

  • PyTorch (with CUDA support)
  • CUDA Toolkit
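
Both prerequisites can be checked from Python before installing. The snippet below is a minimal sketch and not part of quant_horizon: the helper name `check_environment` is hypothetical, and it assumes that finding `nvcc` on `PATH` is a reasonable proxy for an installed CUDA Toolkit.

```python
import shutil


def check_environment():
    """Report whether PyTorch, a usable CUDA device, and nvcc are visible.

    Hypothetical helper for illustration; not provided by quant_horizon.
    """
    report = {"torch": False, "torch_cuda": False, "nvcc": False}

    # Assumption: nvcc on PATH indicates that the CUDA Toolkit is installed.
    report["nvcc"] = shutil.which("nvcc") is not None

    try:
        import torch
    except (ImportError, OSError):
        # PyTorch is missing or its shared libraries failed to load.
        return report

    report["torch"] = True
    # True only if this PyTorch build has CUDA support and a GPU is usable.
    report["torch_cuda"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If `torch_cuda` is reported as missing while `torch` is present, the installed PyTorch build is probably CPU-only or no GPU is visible to the process.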

We also provide basic Docker images with these prerequisites preinstalled:

# docker-hub python3.11 torch2.5.1 cuda124
docker pull llmcompression/llmc:pure-24112502-cu124
# docker-hub python3.11 torch2.5.1 cuda121
docker pull llmcompression/llmc:pure-24112502-cu121
# aliyun-hub python3.11 torch2.5.1 cuda124
docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-24112502-cu124
# aliyun-hub python3.11 torch2.5.1 cuda121
docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-24112502-cu121

# Then create a container (replace [name], [path], and [image_id] with your own values)
docker run --gpus all -itd --ipc=host --name [name] -v [path]:[path] --entrypoint /bin/bash [image_id]

Install quant_horizon and its dependencies in editable mode:

cd quant_horizon
pip install -v -e .

Usage

Benchmark a single shape

cd examples
python bench_single_shape.py

Benchmark all shapes in the transformer model

cd examples
# Only the model's config.json needs to be present in the [model_path] folder.
python bench_model_shape.py --model [model_path] --tp 1 --bs 1 --seqlen 2048
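
To give a sense of what "all shapes" could mean here, the sketch below derives per-rank GEMM shapes for one decoder layer from a `config.json`. It is an illustration under stated assumptions, not the logic of `bench_model_shape.py`: it assumes a Llama-style config (keys `hidden_size`, `intermediate_size`, `num_attention_heads`, optional `num_key_value_heads`), fused QKV and gate/up projections, and tensor parallelism that splits the projection dimension evenly. The function names `linear_shapes` and `load_config` are hypothetical.

```python
import json
from pathlib import Path


def load_config(model_path):
    """Read config.json from the model folder (hypothetical helper)."""
    with open(Path(model_path) / "config.json") as f:
        return json.load(f)


def linear_shapes(config, tp=1, bs=1, seqlen=2048):
    """Return {layer_name: (M, N, K)} for one decoder layer on one TP rank.

    Assumes a Llama-style architecture; other model families may differ.
    """
    hidden = config["hidden_size"]
    inter = config["intermediate_size"]
    heads = config["num_attention_heads"]
    # Grouped-query attention models store fewer key/value heads.
    kv_heads = config.get("num_key_value_heads", heads)
    head_dim = hidden // heads
    m = bs * seqlen  # number of tokens processed per forward pass

    return {
        # Output dimension is split across ranks.
        "qkv_proj": (m, (heads + 2 * kv_heads) * head_dim // tp, hidden),
        # Input dimension is split across ranks.
        "o_proj": (m, hidden, heads * head_dim // tp),
        "gate_up_proj": (m, 2 * inter // tp, hidden),
        "down_proj": (m, hidden, inter // tp),
    }


if __name__ == "__main__":
    # Illustrative Llama-7B-like values, written inline instead of read from disk.
    sample = {
        "hidden_size": 4096,
        "intermediate_size": 11008,
        "num_attention_heads": 32,
    }
    for name, shape in linear_shapes(sample, tp=1, bs=1, seqlen=2048).items():
        print(name, shape)
```

Under these assumptions, raising `--tp` shrinks N for the QKV and gate/up projections and K for the output and down projections, while `--bs` and `--seqlen` only change M.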
