Commit
feat: add triton kernels to decrease latency of large batches (#2687)
* feat: add triton kernels to decrease latency of large batches
* cast to int32
* fix kernel
* fix kernel
* disable triton on rocm
* fix speculation
* add slots filtering kernel