How to detect drift with Evidently and MLflow

Learn how to detect and monitor data drift in machine learning models using Evidently and MLflow. This blog provides a step-by-step tutorial using a mobile price prediction dataset, ensuring consistent model performance by tracking and visualizing drift insights over time.


Data Drift

Data drift refers to a change in the patterns of data over time. In the context of machine learning, it happens when the statistical properties of the model's input features, or of the target variable it is trying to predict, change in the unseen data the model receives after training (the latter case is often called concept drift).
This change in data patterns can lead to a degradation of model performance because the assumptions that the model learned during training no longer hold. For instance, a model trained to predict customer churn based on historical data may start to perform poorly if the behavior of customers changes significantly due to new market conditions or changes in the company's policies.
There are several types of data drift:

  1. Sudden Drift: This is when the data distribution changes abruptly. This could be due to a change in data collection, a change in policy, or a sudden shift in user behavior.
  2. Incremental Drift: This is a slow and gradual change in data distribution over time. It can be challenging to detect because it happens slowly.
  3. Seasonal Drift: This type of drift is predictable and cyclical. It's often found in data related to fields like retail, finance, and weather where there are regular and predictable changes.

Detecting data drift can be challenging because it requires constant monitoring of the model's input and output data. Some indicators of data drift include a decrease in model performance, an increase in the number of errors, or a change in the distribution of predictions.
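
Before looking at Evidently itself, the underlying idea can be sketched with a plain two-sample Kolmogorov-Smirnov test, which is also among the tests Evidently applies by default to numeric features. The snippet below is only an illustration of the principle, assuming scipy is available; the distributions and numbers are made up.

import numpy as np
from scipy.stats import ks_2samp

# reference: the feature distribution the model was trained on
reference = np.random.normal(loc=1000, scale=200, size=500)
# current: the same feature observed later, with a shift introduced
current = np.random.normal(loc=1300, scale=200, size=500)

# a p-value below the significance level (e.g. 0.05) suggests drift
statistic, p_value = ks_2samp(reference, current)
print(f'KS statistic={statistic:.3f}, p-value={p_value:.4f}, drift={p_value < 0.05}')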
MLflow is an open-source platform that helps manage the end-to-end machine learning lifecycle. It includes tools for experiment tracking, model packaging, reproducibility, deployment, and a central model registry. MLflow is designed to work with any machine learning library and algorithm, simplifying the management of ML projects. You can find more about MLflow on their official website.

Evidently

To guard against data drift, monitoring model performance and taking corrective action has become a necessity. Evidently is an open-source Python library that helps with most of this.
Evidently works with tabular and text data and supports the whole model lifecycle with its reports, tests and monitoring.
For data-drift detection, Evidently ships a set of statistical tests with default thresholds chosen according to the feature type (numeric or categorical). It also allows users to define custom drift detection methods and thresholds. It produces reports that give feature-level as well as dataset-level drift insights; reports can be visualized as HTML or consumed as JSON. It also integrates with MLOps tools like Airflow, MLflow and Metaflow.
In this blog, we perform data-drift analysis on a sample dataset and integrate the Evidently output with MLflow in a custom way.

Installation and Setup

For this tutorial, we need numpy, pandas, evidently and mlflow installed, and we import them along with datetime from the standard library.
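
The packages can be installed from PyPI, for example as below (the version pin on evidently is an assumption — the Dashboard / DataDriftTab API used in this post comes from the older 0.1.x releases):

pip install numpy pandas mlflow "evidently<0.2"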


import numpy as np
import pandas as pd
from evidently.pipeline.column_mapping import ColumnMapping
from evidently.dashboard import Dashboard
from evidently.dashboard.tabs import DataDriftTab
import mlflow
from mlflow.tracking import MlflowClient
from datetime import datetime

Dataset

For this experiment, let’s pick the mobile price prediction dataset with a limited set of features. Features like battery_power, clock_speed, int_memory, mobile_wt, n_cores and ram are continuous and numerical, whereas dual_sim and four_g are categorical. The dataset is divided into two equal halves: reference data (df_ref) and current data (df_curr).


df = pd.read_csv('mobile_price.csv')
df_ref = df.loc[:500, :]
df_curr = df.loc[500:, :].reset_index(drop=True)

Drift is introduced in the numeric features battery_power and ram and the categorical feature dual_sim of the current dataset.


df_curr['battery_power'] = df_curr['battery_power'] * 1.3 + 100
df_curr['ram'] = df_curr['ram'] * 0.8 - 50
# use .loc with the column label so the assignment modifies df_curr in place
# (the index was reset, so labels 0-149 cover the first 150 rows)
df_curr.loc[:149, 'dual_sim'] = df_curr.loc[:149, 'dual_sim'].replace(0, 1)

Code

Dataset and date variables are defined and used in the naming of the drift reports and the MLflow experiment/runs.


dataset = 'mobile_price'
date = datetime.now().strftime('%y-%m-%d %H:%M:%S')

Drift analysis can be done only on the features that are common to the reference and current datasets. Column mapping is also necessary so that suitable statistical tests are applied when calculating drift; columns are mapped as numerical_features and categorical_features.
We are using Dashboard with DataDriftTab to calculate covariate drift (i.e. changes in the distribution of the independent features). It requires the reference data, the current data and the column mapping.


# drift can only be computed on features present in both datasets
common_features = [feature for feature in list(df_ref.columns) if feature in list(df_curr.columns)]
column_mapping = ColumnMapping()
column_mapping.categorical_features = ['dual_sim', 'four_g']
column_mapping.numerical_features = ['battery_power', 'clock_speed', 'int_memory', 'mobile_wt', 'n_cores', 'ram']
covariate_drift_report = Dashboard(tabs=[DataDriftTab()])
covariate_drift_report.calculate(df_ref[common_features], df_curr[common_features], column_mapping=column_mapping)
covariate_output = list(covariate_drift_report.analyzers_results.values())[0]
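
As mentioned earlier, the report can also be rendered for manual inspection. A minimal sketch, assuming the same 0.1.x Dashboard API and an illustrative file name:

# write an interactive HTML report next to the script / notebook
covariate_drift_report.save(f'{dataset}_covariate_drift_report.html')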

The output of each statistical test (the p_value) is compared with the significance level (0.05 in this case). If it is less than the significance level, the feature is considered drifted.


drifted_features = []
drift_p_value = {}
for key in list(covariate_output.metrics.features.keys()):
    p_val = covariate_output.metrics.features[key].p_value
    if p_val < 0.05:
        drifted_features.append(key)
        drift_p_value.update({key: round(p_val, 4)})

Output : 


drifted_features: ['battery_power', 'ram', 'dual_sim']
drift_p_value: {'battery_power': 0.0, 'ram': 0.0, 'dual_sim': 0.0}

Integration with MLflow

An MLflow experiment is set up with the dataset name. We can log different parameters of the experiment, like the date/time, dataset information, the features and their counts, and the results of the drift analysis, as well as metrics like the percentage of drifted features, all of which are easy to extract from Evidently reports.


client = MlflowClient()
mlflow.set_experiment(f'{dataset} Drift')
with mlflow.start_run(run_name=dataset + date) as run:
    mlflow.log_param('date', date)
    mlflow.log_param('reference_data', 'df_ref')
    mlflow.log_param('current_data', 'df_curr')
    mlflow.log_param('n_features', covariate_output.metrics.n_features)
    mlflow.log_param('features', list(covariate_output.metrics.features.keys()))
    mlflow.log_param('n_drifted_features', covariate_output.metrics.n_drifted_features)
    mlflow.log_param('drifted_features', drifted_features)
    mlflow.log_param('drifted_features_p_vals', drift_p_value)
    mlflow.log_param('dataset_drift', covariate_output.metrics.dataset_drift)
    mlflow.log_metric('drifted_features_percent', covariate_output.metrics.share_drifted_features * 100)

MLflow Dashboard 

A new experiment gets created in MLflow, and parameters and metrics are logged for each run. We can have different runs with different sets of data, and also for successive data cycles.

MLflow provides functionality to compare runs in tabular form as well as graphically using scatter, contour and parallel coordinate plots to keep track of data quality and drift.
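
Beyond the UI, the logged values can also be pulled back programmatically, for example to flag the model for retraining once drift crosses a threshold. A minimal sketch, assuming a recent MLflow version (experiment_names is only available in later releases) and an arbitrary 30% threshold:

# fetch all runs of the drift experiment, newest first
runs = mlflow.search_runs(experiment_names=[f'{dataset} Drift'], order_by=['start_time DESC'])
latest = runs.iloc[0]

# act on the logged metric from the most recent run
if latest['metrics.drifted_features_percent'] > 30:
    print('Significant covariate drift detected - consider retraining the model')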
