High-throughput and memory-efficient inference and serving engine for LLMs
The package has no detailed description
unstable
0.16.0
python314Packages
Yes
No
Free
32