MLSYS ENGINEERING

1.1. An intuitive understanding

MLSys is an emerging field at the intersection of machine learning and computer systems. It focuses on how software and hardware work together to run machine learning models efficiently. When I think about MLSys, names like CUDA, Triton, PyTorch, and vLLM come to mind. These are the software tools that enable machine learning models to run on hardware accelerators such as GPUs and TPUs. If you have heard of any of them, you have already touched MLSys. If not, don't worry: we will explore each of them in this book, from an MLSys engineer's perspective.

From a practitioner's standpoint, we need MLSys knowledge to answer questions we face in real-world production:

  • How do I run my model faster on a GPU?
  • How do I reduce GPU memory consumption?
  • How can I use multiple GPUs across multiple machines together?
  • And many others.
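
To make the memory question above a little more concrete, here is a rough back-of-the-envelope sketch in Python. It estimates only the memory needed to hold a model's weights, which is the parameter count multiplied by the bytes used per parameter. The function name `weight_memory_gib` and the 7-billion-parameter model are hypothetical, chosen purely for illustration. Real GPU usage is higher, since activations, optimizer state, and framework overhead are not counted here.

```python
# Bytes used to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the memory needed to hold the weights alone, in GiB.

    This is an illustrative helper, not part of any library.
    """
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical 7-billion-parameter model
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gib(num_params, dtype):.1f} GiB")
```

Halving the bytes per parameter halves the weight footprint, which is one reason lower-precision formats are a common first answer to the memory question.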