Tool | Description |
---|---|
Apache TVM | A compiler stack for deep learning systems, designed to close the gap between productivity-focused deep learning frameworks and performance- and efficiency-focused hardware backends. |
Hidet | An open-source deep learning compiler written in Python. It supports end-to-end compilation of DNN models from PyTorch and ONNX to efficient CUDA kernels. |
OpenVINO™ | An open-source toolkit for optimizing and deploying deep learning models. It boosts deep learning performance for vision, audio, and language models from popular frameworks like TensorFlow, PyTorch, and more. |
Speedster | Automatically applies the best set of state-of-the-art optimization techniques to achieve the maximum inference speed-up (latency, throughput, model size) possible on your hardware (single machine). |
Neural Magic SparseML | An open-source model optimization toolkit that enables you to create inference-optimized sparse models using pruning, quantization, and distillation algorithms. |
Nvidia TensorRT | An SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that deliver low latency and high throughput for inference applications. |
XLA | Takes models from popular ML frameworks such as PyTorch, TensorFlow, and JAX, and optimizes them for high-performance execution across different hardware platforms, including GPUs, CPUs, and ML accelerators. |
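Several of these toolkits (SparseML and TensorRT among them) rely on post-training quantization to shrink models and speed up inference. A minimal, library-free sketch of symmetric int8 quantization, illustrating the idea only (this is not any of these tools' actual APIs):

```python
def quantize_int8(weights):
    # Symmetric post-training quantization: map floats onto the
    # int8 range [-127, 127] using a single per-tensor scale.
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    # Recover approximate float weights from int8 values.
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.03, 0.9]
q, scale = quantize_int8(weights)   # q = [50, -127, 3, 90]
approx = dequantize(q, scale)       # close to the original weights
```

Real toolkits add per-channel scales, calibration over activation statistics, and hardware-specific kernel selection on top of this basic scheme.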
Zoo | Description |
---|---|
Hailo Model Zoo | Provides deep learning models for various computer vision tasks. The pre-trained models can be used to create fast prototypes on Hailo devices. |
Nvidia Pretrained AI Models | A collection of 600+ highly accurate models built by NVIDIA researchers and engineers using representative public and proprietary datasets for domain-specific tasks. |
OpenVINO Model Zoo | Browse through over 200 neural network models, both public and from Intel, and pick the right one for your solution. Types include object detection, classification, image segmentation, handwriting recognition, text to speech, pose estimation, and others. |
PyTorch Hub | A pre-trained model repository designed for research exploration, where you can discover and publish models. |
TorchServe Model Zoo | Pre-trained, pre-packaged models, ready to be served for inference with TorchServe. |
TorchVision Models | Contains model definitions for different tasks, including image classification, pixelwise semantic segmentation, object detection, instance segmentation, person keypoint detection, video classification, and optical flow. |
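Most of these zoos expose a one-line loading API; PyTorch Hub, for example, fetches a model definition and its pre-trained weights with `torch.hub.load`. A minimal sketch (the repo and entrypoint names are real Hub entries, but running this requires `torch`/`torchvision` installed plus network access, so the call is wrapped in a function rather than executed):

```python
def load_pretrained_resnet():
    # Downloads the resnet18 definition and pre-trained weights from
    # the pytorch/vision repository on PyTorch Hub. Recent torchvision
    # accepts weights="DEFAULT"; older releases use pretrained=True.
    import torch
    model = torch.hub.load("pytorch/vision", "resnet18", weights="DEFAULT")
    model.eval()  # switch to inference mode before serving
    return model
```

Other zoos follow the same pattern: pick an entry from the catalog, pull it with the framework's loader, then fine-tune or serve it directly.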
Technology | Description |
---|---|
WONNX | A GPU-accelerated ONNX inference runtime written entirely in Rust, ready for the web. |
Technology | Description |
---|---|
Axelera AI | The Metis AI Platform, a holistic hardware and software solution for AI inference at the edge, makes computer vision applications more accessible and powerful. At its core is the Metis AI Processing Unit (AIPU), which sets new price/performance and performance-per-watt standards. |
Blaize | Blaize Pathfinder and Xplorer AI edge platforms aim to be more efficient, flexible, accurate, and cost-effective, letting you deploy AI on the edge without sacrifice. |
DeGirum | DeGirum ORCA™ is a flexible, efficient, and affordable AI accelerator IC. ORCA gives application developers the ability to create rich, sophisticated, and highly functional products at a power and price point suitable for the edge. It is powered by a very efficient compute architecture, with support for pruned models. |
Hailo | The Hailo-8™ edge AI processor, featuring up to 26 tera-operations per second (TOPS), significantly outperforms all other edge processors. Its area and power efficiency surpass other leading solutions by an order of magnitude, at a size smaller than a penny, including the required memory. |
Sapeon | SAPEON's architecture targets low-latency, large-scale inference of deep neural networks. Its products are designed to process artificial intelligence tasks faster and with less power by processing large amounts of data simultaneously. |
Tool | Description |
---|---|
UpTrain | An open-source, data-secure tool for ML practitioners to observe and refine their ML models by monitoring performance, checking for (data) distribution shifts, and collecting edge cases on which to retrain them. |
MLC LLM | A universal solution that allows any language model to be deployed natively on a diverse set of hardware backends and native applications, plus a productive framework for everyone to further optimize model performance for their own use cases. |
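The distribution-shift checks that monitoring tools like UpTrain run boil down to comparing live inputs against a reference distribution. A minimal, library-free sketch of one such check, a z-test on the batch mean (the function name and threshold are illustrative, not UpTrain's API):

```python
import math

def detect_shift(reference, live, threshold=3.0):
    # Flags a shift when the live batch mean deviates from the
    # reference mean by more than `threshold` standard errors.
    n = len(reference)
    mean = sum(reference) / n
    var = sum((x - mean) ** 2 for x in reference) / n
    std_err = math.sqrt(var / len(live)) or 1e-12  # guard zero variance
    z = abs(sum(live) / len(live) - mean) / std_err
    return z > threshold

reference = [float(i % 10) for i in range(1000)]  # mean 4.5
in_dist = [4.0, 5.0, 4.5, 4.2, 4.8]               # close to the reference
shifted = [9.0, 9.5, 8.8, 9.2, 9.9]               # clearly shifted upward
```

Production monitors layer more robust statistics (e.g. population stability index, KS tests) and per-feature tracking over this same compare-to-reference idea.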