Commit

feat: add triton kernels to decrease latency of large batches (#2687)
* feat: add triton kernels to decrease latency of large batches

* cast to int32

* fix kernel

* fix kernel

* disable triton on rocm

* fix speculation

* add slots filtering kernel
OlivierDehaene authored Oct 25, 2024
1 parent 0f346a3 commit 6f88bd9
Showing 4 changed files with 649 additions and 194 deletions.

0 comments on commit 6f88bd9
