MLOps Zoomcamp (Week 2)

Experiment Tracking and Model Management

Introduction

As we know, machine learning and data science are highly iterative processes. There is never a single approach that guarantees a solution to a problem using data; we need to try various things before we can reach any kind of conclusion. After the feature engineering step in the machine learning lifecycle comes model training. The model training (or model building) step is one of the most iterative steps in the entire lifecycle: we start with some baseline models using default hyperparameters and check their performance, then try a bunch of different hyperparameter values across different models. Obviously, we end up with different metrics for every combination of model and hyperparameters. Oftentimes this gets very messy and confusing, so it is very important to keep a record of all the experiments we try during model building, so that the work is reproducible, organized, and optimized.

To solve this problem, we could record every model, its hyperparameters, and the corresponding metrics on training and testing data in a spreadsheet. That is the most basic option, and the one that first comes to mind when we need to keep records. But tracking experiments in spreadsheets is prone to human error, since we enter each experiment manually, and it quickly becomes tedious to capture all the information needed to reproduce an experiment.

Experiment tracking is one of the key practices in MLOps. It is the process of keeping track of all the relevant information from an ML experiment, which includes:

  • Source Code
  • Environment
  • Data
  • Model
  • Hyperparameters
  • Metrics

To tackle this, we will be using MLflow, an open source platform for the machine learning lifecycle. In practice, it's just a Python package (`mlflow`) that can be installed with pip (see the quick-start sketch after the list), and it contains four main modules:

  • Tracking
  • Models
  • Model Registry
  • Projects
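Since MLflow is just a pip-installable package, getting started takes two commands. Here is a minimal quick-start sketch; the SQLite file name is an arbitrary choice of mine, not something the package mandates:

```bash
# Install the package
pip install mlflow

# Launch the tracking UI locally, backed by a SQLite database
# (mlflow.db is an arbitrary file name)
mlflow ui --backend-store-uri sqlite:///mlflow.db
```

By default, the UI is then served at http://localhost:5000.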

Experiment Tracking with MLflow

The MLflow Tracking module allows you to organize your experiments into runs; each run is a single trial in an ML experiment where we test model performance with a particular combination of hyperparameters. For every run, MLflow keeps track of (a minimal example follows this list):

  • Parameters
  • Metrics
  • Metadata
  • Artifacts (anything we would like to keep associated with the run, from a visualization to an entire dataset)
  • Models
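To make this concrete, here is a minimal sketch of a tracked run using scikit-learn. The experiment name, tag value, hyperparameter values, and the `X_train`/`y_train`/`X_val`/`y_val` variables are placeholders I am assuming were prepared earlier; the MLflow calls themselves are the actual Tracking API:

```python
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Point MLflow at a local backend and pick (or create) an experiment
mlflow.set_tracking_uri("sqlite:///mlflow.db")
mlflow.set_experiment("my-experiment")  # hypothetical name

with mlflow.start_run():
    # Free-form metadata about this run
    mlflow.set_tag("developer", "your-name")

    # Hyperparameters for this trial
    params = {"max_depth": 10, "n_estimators": 50}
    mlflow.log_params(params)

    # X_train, y_train, X_val, y_val are assumed to exist already
    model = RandomForestRegressor(**params)
    model.fit(X_train, y_train)

    rmse = mean_squared_error(y_val, model.predict(X_val)) ** 0.5
    mlflow.log_metric("rmse", rmse)

    # Store the fitted model as an artifact of this run
    mlflow.sklearn.log_model(model, artifact_path="model")
```

Every `with mlflow.start_run():` block becomes one run in the experiment, so comparing hyperparameter trials side by side in the UI comes for free.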

Model Management using Model Registry

At a certain point in an ML experiment, we may end up with a few models that reach the desired performance. We would likely want to push these models to some kind of pool or model store, from which the engineer responsible for deployment can take a model to production, weighing factors such as the complexity of the model, its metrics, and the time it took to train. MLflow helps here with the Model Registry, which stores the models registered from ML experiments. From the Model Registry, the deployment engineer (just a representative term; the role might differ in a real scenario) takes the latest version of the model and deploys it to production.
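A minimal sketch of that flow, assuming a run has already logged a model as in the Tracking example above; the run ID and registry name below are hypothetical placeholders:

```python
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("sqlite:///mlflow.db")

# Register the model logged by a finished run under a registry name
run_id = "abc123def456"  # hypothetical run ID
mlflow.register_model(model_uri=f"runs:/{run_id}/model", name="my-regressor")

# Promote a specific registered version so the deployment side knows
# which one is production-ready
client = MlflowClient()
client.transition_model_version_stage(
    name="my-regressor",
    version=1,
    stage="Production",
)
```

Note that recent MLflow releases steer users from this stage-based workflow toward model version aliases, but the core idea of a central registry that the deployment side pulls from stays the same.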

So these were my key takeaways from the second week of #mlops-zoomcamp. I had a really amazing time learning new concepts and tools that assist practitioners and professionals throughout the machine learning lifecycle. Happy learning!