This article introduces MLOps and the practices and principles it uses to standardize and streamline machine learning (ML) life cycles.
What is Machine Learning Operations (MLOps), and what are the leading industry standards for it? The question deserves an answer, but neither the literature nor the industry has yet supplied one. Seeking to fill this void, this article explains MLOps and its concepts for standardizing and streamlining ML life cycles. We'll begin by defining basic terminology and outlining the high-level picture of (1) what qualifies as a good MLOps framework and (2) what it should be able to do.
Many of the great technological strides of the last decade emerged from a desire to harness ever-increasing data to make predictions. Engineers envisioned a world where bank fraud could be foreseen and prevented, where retail prices could be dynamically optimized, or where image classification could further the robotics industry. Though these examples were part of yesterday's dreams, they have become today's reality thanks to symbiotic developments in data science and artificial intelligence. Strong algorithmic models are at the epicenter of progress when it comes to performing successful prediction tasks.
The last decade has been something of a golden age for Big Data and Artificial Intelligence. Behind the growing prosperity of information are squadrons of data scientists flexing their ingenuity by creating, as if out of thin air, machine learning models for every big and tiny problem in every sector. Companies are still hiring data scientists in droves to develop specialized models for their businesses. However, in their enthusiasm, they often fall prey to the same mistake. If a business focuses too much on developing ML models and too little on actually deploying and maintaining them, it may not achieve its desired results.
The goal of data scientists is to solve the minutiae of isolated business processes that contribute to the overall success of the business by developing an ML model. As their name implies, their work focuses on (1) managing data and (2) creating models. In terms of managing data, they are responsible for extraction, cleaning, augmentation, normalization, standardization, feature engineering, feature selection, and any other tasks associated with curating available and relevant statistical information. Creating models requires a much more normative, problem-solving perspective. It involves algorithm selection, model training, tuning, and evaluation in a way that is ideal for a specific information system. In the end, data scientists finish the job by producing a successful model, which tells the business what to do to achieve good performance, where "good performance" could be anything from fraud detection to revenue maximization.
But the story doesn't end here. Building a successful model takes a lot of work, but even after its eventual completion there's no 'happily ever after'. So, what comes next? How do we move from code in a Jupyter Notebook to a fully functional and automated solution deployed in production? Moreover, what can we do after we have deployed our solution? Here is where MLOps fits in.
MLOps is a relatively new term. It refers to a practice that aims to design production frameworks to make further integrations and maintenance of machine learning (ML) models seamless and efficient.
After deploying your ML model, a good MLOps framework should be able to constantly monitor how your model and system/application perform, instantly detect (and auto-correct) model- or system-specific issues, seamlessly integrate new features, and periodically version your artifacts (such as model configurations or the data the model was trained on).
We describe each of those pillars below.
Monitoring is embedded in all engineering practices. Machine learning should be no different. MLOps best practices encourage making expected behavior visible and setting standards that models should adhere to, rather than relying on a 'gut feeling'. That being said, it's essential to track model and app performance metrics, specifically:
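To make the idea of an explicit standard concrete, here is a minimal sketch of a monitor that tracks accuracy over the most recent predictions and compares it against a fixed threshold. The class name `RollingAccuracyMonitor`, the window size, and the 0.9 threshold are all hypothetical choices made for illustration; a real framework would likely track several model and application metrics at once.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent predictions (hypothetical helper)."""

    def __init__(self, window_size=100, min_accuracy=0.9):
        # Each entry is 1 for a correct prediction and 0 for an incorrect one.
        # The deque silently drops the oldest entry once the window is full.
        self.window = deque(maxlen=window_size)
        # The explicit standard the model is expected to meet (an assumption).
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual):
        """Store whether a single prediction matched the observed outcome."""
        self.window.append(1 if prediction == actual else 0)

    def accuracy(self):
        """Return accuracy over the window, or None if nothing was recorded."""
        if not self.window:
            return None
        return sum(self.window) / len(self.window)

    def is_healthy(self):
        """Return True while the model meets the standard (or has no data yet)."""
        current = self.accuracy()
        return current is None or current >= self.min_accuracy
```

Calling `record` after each labeled outcome arrives and checking `is_healthy` on a schedule replaces the 'gut feeling' with a number that can be alerted on.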
Detection engines are invaluable when it comes to building a secure and sustainable data infrastructure. The aim of a detection engine is to provide a fully automatic, self-healing system. Although the returns may diminish depending on one's use case, creating an automated engine that can detect a problem and act upon it is critical. An MLOps engineer defines the scope of "what-to-fix".
You probably don't want to give your algorithmic answer to this dilemma (i.e., your "solution") too much power to change components, but there are some good applications for which you might want to use an automated detection engine. A well-designed Automated Model Retraining feature can be among the most harmless inclusions in your solution,
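As a rough illustration of a narrowly scoped detection engine, the sketch below flags drift when the mean of recent feature values moves too far from the training baseline, and its only permitted action is to call a retraining hook. The function names, the mean-shift rule, and the threshold of three standard deviations are assumptions made for this example, not a prescribed method; production systems often rely on more robust statistical tests.

```python
import statistics


def mean_shift_detected(baseline, recent, max_shift_in_std=3.0):
    """Return True if the recent mean drifts too far from the baseline mean.

    The shift is measured in units of the baseline's standard deviation.
    """
    base_mean = statistics.mean(baseline)
    base_std = statistics.pstdev(baseline)
    recent_mean = statistics.mean(recent)
    if base_std == 0:
        # A constant baseline: any change in the mean counts as drift.
        return recent_mean != base_mean
    return abs(recent_mean - base_mean) / base_std > max_shift_in_std


def check_and_retrain(baseline, recent, retrain, max_shift_in_std=3.0):
    """Call the `retrain` hook only when drift is detected.

    Retraining is the single action this engine may take, which keeps
    the scope of "what-to-fix" deliberately small.
    Returns True if retraining was triggered.
    """
    if mean_shift_detected(baseline, recent, max_shift_in_std):
        retrain()
        return True
    return False
```

Passing the retraining step in as a callable keeps detection separate from the fix, so the engine cannot change any component it was not explicitly handed.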
Continuous Integration and Continuous Delivery (CI/CD) are among the core pillars of MLOps development. You should be able to add new features and upgrades to your solution while checking that those modifications are compatible with the running model, so that they do not disrupt it. For this, we use the following components:
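One kind of compatibility check that a CI pipeline could run before a change is released is sketched below. It replays a small set of known-good inputs through the modified prediction function and verifies that inference still succeeds and still returns one of the expected labels. The function name `passes_compatibility_check`, the dictionary-shaped inputs, and the label set are hypothetical and only serve the illustration.

```python
def passes_compatibility_check(predict, sample_inputs, allowed_labels):
    """Return True if `predict` handles every sample and returns an allowed label.

    `predict` is any callable that maps one input record to a label.
    """
    for features in sample_inputs:
        try:
            label = predict(features)
        except Exception:
            # The modification broke inference on a known-good input.
            return False
        if label not in allowed_labels:
            # The output contract changed, which could disrupt consumers.
            return False
    return True


# Hypothetical usage: a change that renames an input field fails the check.
samples = [{"amount": 10.0}, {"amount": 5000.0}]


def current_model(features):
    return "fraud" if features["amount"] > 1000 else "ok"


def modified_model(features):
    return "fraud" if features["amt"] > 1000 else "ok"  # renamed field


current_ok = passes_compatibility_check(current_model, samples, {"fraud", "ok"})
modified_ok = passes_compatibility_check(modified_model, samples, {"fraud", "ok"})
```

Here `current_ok` is `True` and `modified_ok` is `False`, so a pipeline gating on this result would block the second change before it reaches the running model.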
Though version control for code is standard procedure in software development, machine learning solutions require a more involved process. This entails versioning the following artifacts:
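For artifacts such as model configurations or training data, a simple starting point is to derive a version identifier from the content itself. The sketch below does this with SHA-256 hashes from Python's standard library. The function names are hypothetical, and content hashing is only one possible scheme; dedicated data and model versioning tools offer far more.

```python
import hashlib
import json


def fingerprint_config(config):
    """Return a SHA-256 hex digest of a JSON-serializable model configuration.

    Keys are sorted before hashing, so two configurations with the same
    content get the same fingerprint regardless of key order.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fingerprint_bytes(data):
    """Return a SHA-256 hex digest of raw bytes, e.g. a training data file."""
    return hashlib.sha256(data).hexdigest()
```

Storing these fingerprints next to each trained model makes it possible to tell later exactly which configuration and which data produced it.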
This article serves as an introduction to MLOps, an engineering discipline that aims to unify ML development (dev) and ML deployment (ops) to standardize and streamline the continuous delivery of high-performing models in production. I'll soon dive into the nuances of monitoring, detection, versioning, and integration beyond the fly-by overview provided here. The remaining articles in this series will not only elucidate the machine learning concepts vital to building frameworks that can successfully deploy and maintain ML models, but also show you, step by step, how those concepts are implemented in Python. Beyond providing good practices and principles for standardizing the ML life cycle, I hope this series of short guides becomes an empowering tool for data scientists and others seeking to make strong and steady advances in the industry.