Taalas’ silicon Llama achieves 17K tokens/sec per user, nearly 10X faster than the current state of the art, while costing 20X less to build and consuming 10X less power. (taalas.com)
from yogthos@lemmy.ml to technology@lemmy.ml on 21 Feb 01:11
https://lemmy.ml/post/43472978

#technology
