MLflow provides a convenient way to serve ML models with a single shell command:
mlflow models serve <path_to_the_model>
It installs the model's dependencies and starts an inference server that exposes a REST API endpoint called /invocations.
Let's build a simple XGBoost model and try to make a few predictions using MLflow.
We'll be using the Iris dataset from scikit-learn, following the XGBoost Python example.
First, the dependencies need to be installed:
pip install mlflow==2.14.2 xgboost==2.1.0 scikit-learn==1.5.1
Let's load and prepare the training and test datasets:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
# Prepare training dataset
dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
dataset["data"], dataset["target"], test_size=0.2, random_state=42
)
# Print dataset (Scikit Learn Bunch object) information
print("Shape of data:", dataset.data.shape)
print("Feature names:", list(dataset.feature_names))
print("Shape of target:", dataset.target.shape)
print("Target names:", list(dataset.target_names))
# Print training and test data shapes
for name in "X_train, X_test, y_train, y_test".split(", "):
    print(f"{name}.shape: {locals()[name].shape}")
This should produce the following output:
Shape of data: (150, 4)
Feature names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Shape of target: (150,)
Target names: ['setosa', 'versicolor', 'virginica']
X_train.shape: (120, 4)
X_test.shape: (30, 4)
y_train.shape: (120,)
y_test.shape: (30,)
The dataset has 4 features (sepal length, sepal width, petal length, petal width) and 3 classes of Iris flowers (setosa, versicolor, virginica).
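To get a feel for the data, we can inspect a single test sample together with its class name (the exact values depend on the random split; this is just an illustrative snippet):
# Inspect one test sample: four measurements and the corresponding class name
print("Sample:", X_test[0])
print("Class:", dataset.target_names[y_test[0]])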
Now let's train an XGBoost model:
from xgboost import XGBClassifier
xgb_model = XGBClassifier(n_estimators=2, max_depth=2, learning_rate=1, objective="binary:logistic")
xgb_model.fit(X_train, y_train)
The model can be used directly to predict the flower classes:
xgb_model.predict(X_test)
It outputs a 1D NumPy array with the predicted classes:
array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
0, 2, 2, 2, 2, 2, 0, 0])
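The class indices can be mapped back to the flower names, which is handy for human-readable output (a small illustrative snippet):
# Map predicted class indices back to flower names
[dataset.target_names[i] for i in xgb_model.predict(X_test)]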
However, sometimes it's essential to have the classification probabilities, and XGBoost provides the predict_proba method for that:
# predict probabilities of flower classes
xgb_model.predict_proba(X_test)
It outputs a 2D NumPy array of shape (30, 3) with the probability of each class:
array([[0.04561405, 0.9090471 , 0.04533876],
[0.92584455, 0.03829525, 0.03586026],
[0.03954182, 0.04928818, 0.91116995],
[0.04473383, 0.89150536, 0.06376083],
[0.04222472, 0.841501 , 0.11627424],
[0.92584455, 0.03829525, 0.03586026],
[0.04561405, 0.9090471 , 0.04533876],
[0.03954182, 0.04928818, 0.91116995],
[0.04473383, 0.89150536, 0.06376083],
[0.04561405, 0.9090471 , 0.04533876],
[0.03954182, 0.04928818, 0.91116995],
[0.92584455, 0.03829525, 0.03586026],
[0.92584455, 0.03829525, 0.03586026],
[0.92584455, 0.03829525, 0.03586026],
[0.92584455, 0.03829525, 0.03586026],
[0.04473383, 0.89150536, 0.06376083],
[0.03954182, 0.04928818, 0.91116995],
[0.04561405, 0.9090471 , 0.04533876],
[0.04561405, 0.9090471 , 0.04533876],
[0.03954182, 0.04928818, 0.91116995],
[0.92584455, 0.03829525, 0.03586026],
[0.04453262, 0.14168864, 0.81377876],
[0.92584455, 0.03829525, 0.03586026],
[0.03954182, 0.04928818, 0.91116995],
[0.03954182, 0.04928818, 0.91116995],
[0.03954182, 0.04928818, 0.91116995],
[0.03954182, 0.04928818, 0.91116995],
[0.03954182, 0.04928818, 0.91116995],
[0.92584455, 0.03829525, 0.03586026],
[0.92584455, 0.03829525, 0.03586026]], dtype=float32)
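As a sanity check, taking the most probable class in each row reproduces the output of predict:
# The index of the highest probability per row equals the predicted class
assert (xgb_model.predict_proba(X_test).argmax(axis=1) == xgb_model.predict(X_test)).all()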
Now let's save the XGBoost model with MLflow:
import mlflow.xgboost
mlflow.xgboost.save_model(xgb_model, 'mlflow_xgb_iris_model')
MLflow creates the mlflow_xgb_iris_model directory with these files:
- MLmodel - MLflow model manifest
- conda.yaml - Conda requirements for the model
- model.xgb - the model itself
- python_env.yaml - Python environment specification
- requirements.txt - Python requirements used in python_env.yaml
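Before serving, the saved model can be loaded back and queried locally through the generic pyfunc interface, which is also what the inference server uses under the hood (a quick sketch to verify the artifact):
import mlflow.pyfunc
loaded_model = mlflow.pyfunc.load_model("mlflow_xgb_iris_model")
loaded_model.predict(X_test)  # same 1D array of classes as xgb_model.predict(X_test)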
The model can be served using the MLflow CLI:
mlflow models serve -m mlflow_xgb_iris_model --env-manager local
2024/07/13 11:39:32 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2024/07/13 11:39:32 INFO mlflow.pyfunc.backend: === Running command 'exec gunicorn --timeout=60 -b 127.0.0.1:5000 -w 1 ${GUNICORN_CMD_ARGS} -- mlflow.pyfunc.scoring_server.wsgi:app'
[2024-07-13 11:39:32 +0200] [32726] [INFO] Starting gunicorn 22.0.0
[2024-07-13 11:39:32 +0200] [32726] [INFO] Listening at: http://127.0.0.1:5000 (32726)
[2024-07-13 11:39:32 +0200] [32726] [INFO] Using worker: sync
[2024-07-13 11:39:32 +0200] [32728] [INFO] Booting worker with pid: 32728
The --env-manager local argument prevents MLflow from installing the model dependencies, since they are already present in the current virtual environment.
Now let's use the X_test values to make a request to the MLflow inference API:
import numpy as np  # just for a nicer representation of the result
import requests
MODEL_ENDPOINT = "http://127.0.0.1:5000/invocations"
response = requests.post(MODEL_ENDPOINT, json={"inputs": X_test.tolist()})
np.array(response.json()["predictions"])
The returned data contains the predicted classes for the X_test values:
array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
0, 2, 2, 2, 2, 2, 0, 0])
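As a quick consistency check, the served predictions should match the local model's output:
# Predictions from the server should equal the local predict() results
assert (np.array(response.json()["predictions"]) == xgb_model.predict(X_test)).all()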
What if the probabilities need to be calculated?
The mlflow.xgboost model flavor calls only the predict method and does not support predict_proba.
That's why the mlflow-xgboost-proba Python package has been created.
It implements the mlflow_xgboost_proba MLflow flavor, which allows serving models with the predict_proba method:
pip install mlflow-xgboost-proba==0.3.1
import mlflow_xgboost_proba
mlflow_xgboost_proba.save_model(xgb_model, 'mlflow_xgb_proba_iris_model')
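Assuming the package mirrors the standard MLflow flavor API (an assumption worth verifying against its documentation), the saved model can also be loaded back directly:
# Load via the flavor API (assumed to exist, as in standard MLflow flavors)
model = mlflow_xgboost_proba.load_model("mlflow_xgb_proba_iris_model")
model.predict_proba(X_test)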
Let's serve the model saved with the new flavor on port 5001 (instead of the default 5000):
mlflow models serve -m mlflow_xgb_proba_iris_model --env-manager local --port 5001
and make the API request again:
import numpy as np
import requests
MODEL_ENDPOINT = "http://127.0.0.1:5001/invocations"
response = requests.post(MODEL_ENDPOINT, json={"inputs": X_test.tolist()})
np.array(response.json()["predictions"])
This time, the class probabilities are returned:
array([[0.04561405, 0.90904713, 0.04533876],
[0.92584455, 0.03829525, 0.03586026],
[0.03954182, 0.04928818, 0.91116995],
[0.04473383, 0.89150536, 0.06376083],
[0.04222472, 0.841501 , 0.11627424],
[0.92584455, 0.03829525, 0.03586026],
[0.04561405, 0.90904713, 0.04533876],
[0.03954182, 0.04928818, 0.91116995],
[0.04473383, 0.89150536, 0.06376083],
[0.04561405, 0.90904713, 0.04533876],
[0.03954182, 0.04928818, 0.91116995],
[0.92584455, 0.03829525, 0.03586026],
[0.92584455, 0.03829525, 0.03586026],
[0.92584455, 0.03829525, 0.03586026],
[0.92584455, 0.03829525, 0.03586026],
[0.04473383, 0.89150536, 0.06376083],
[0.03954182, 0.04928818, 0.91116995],
[0.04561405, 0.90904713, 0.04533876],
[0.04561405, 0.90904713, 0.04533876],
[0.03954182, 0.04928818, 0.91116995],
[0.92584455, 0.03829525, 0.03586026],
[0.04453262, 0.14168864, 0.81377876],
[0.92584455, 0.03829525, 0.03586026],
[0.03954182, 0.04928818, 0.91116995],
[0.03954182, 0.04928818, 0.91116995],
[0.03954182, 0.04928818, 0.91116995],
[0.03954182, 0.04928818, 0.91116995],
[0.03954182, 0.04928818, 0.91116995],
[0.92584455, 0.03829525, 0.03586026],
[0.92584455, 0.03829525, 0.03586026]])
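Finally, the served probabilities can be checked against the local model, allowing for float32 serialization round-off:
# Served probabilities should match local predict_proba up to precision
assert np.allclose(np.array(response.json()["predictions"]), xgb_model.predict_proba(X_test), atol=1e-6)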