MoonshotAI unveils Kimi's large-scale LLM serving architecture arxiv.org 18 points by slothfulhamster 2 days ago
ervinxie 2 days ago I have been wondering how online generative AI services can serve so many requests. This really gives me an explanation.