Every FLOP counts: scaling a 300B mixture-of-experts Ling LLM without premium GPUs (arxiv.org)
from yogthos@lemmy.ml to technology@lemmy.ml on 21 May 2025 23:42
https://lemmy.ml/post/30467866

#technology
