Tokasaurus: An LLM Inference Engine for High-Throughput Workloads (scalingintelligence.stanford.edu)
from yogthos@lemmy.ml to technology@lemmy.ml on 06 Jun 15:42
https://lemmy.ml/post/31262000

#technology
