Resource Optimization for ML Inference Serving
EB2 3211 Seminar Room, 890 Oval Dr., Raleigh

Title: Resource Optimization for ML Inference Serving

Abstract: My research focuses on job scheduling and resource management in Machine Learning (ML) and Large Language Model (LLM) systems. With the growing popularity of deep learning models, minimizing the monetary costs and maximizing the goodput of inference-serving systems have become critical challenges. Addressing these challenges requires efficient…