In this talk, Charles Frye gives a guided tour through the components of a self-hosted LLM service, from hardware considerations to engineering tools like ‘evals,’ all the way up to the application layer. Along the way, he considers the open-weights models, open-source software, and infrastructure that power LLM applications, and heavily shills the open source vLLM project.
About the Speaker
Charles Frye builds applications of neural networks at Modal. He earned his PhD at Berkeley for work on neural network optimization, and previously worked at Weights & Biases and Full Stack Deep Learning.