From Training to Production
To briefly recap, we have written code that downloaded and prepared data, trained a model, then validated the performance of the model. However, the code that we wrote is still somewhat ephemeral - the model only existed as long as we had the Jupyter notebook running. How do we put that model into a production environment for others to use with their own data?
After going through this section, you should be able to:
Describe the general process of MLOps from training to inference in production
Export models to binary Python objects using pickle or TensorFlow’s model.save()
Import previously-exported models into new Python scripts
Model Persistence
Recall that, at a high level, the use of ML involves the following process:
Find or collect raw data about the process or function
Prepare the data for model training or fitting
Train the model using some of the prepared data
Validate the model using some of the prepared data
Deploy the model to analyze new data samples
We’ve looked at pretty much all of these steps except for the last one, which involves the topic of machine learning operations, or MLOps. In practice, we need a method for saving and deploying a model that has already been trained to an application where it can analyze new data. We certainly don’t want to have to retrain the model every time we start our application, for several reasons:
Model training requires data, which can be large and difficult to ship with our application
Training can be a time-consuming process
The training process might not be possible/reproducible on every device where we wish to deploy our application
All of those reasons motivate the need to be able to save and load models that have already been trained.
Here, we will look at a first method for saving and loading models to a file based on the Python
pickle module, which is part of the standard library. The method we mention has the advantage
that it is simple and can be used with many Python objects, not just models. However, it also comes
with security risks, which we will mention.
The pickle Module
The pickle module is part of the Python standard library and provides functions for serializing
and deserializing Python objects to and from a bytestream.
The process of converting a Python object to a bytestream is referred to as pickling the object, and the reverse process of taking a bytestream and converting it back to a Python object is called unpickling.
Once a Python object has been converted to a bytestream with pickle, the bytestream can then be written to a file. Later, we can read the bytes back out of the file and reconstitute the original Python object.
Many Python objects can be pickled, including the following:
builtin constants (True, False, None)
strings, bytes and bytearrays
some classes and class instances (specifically, the ones that implement __getstate__())
lists, dictionaries, and tuples of picklable objects
In general, the models we have looked at from sklearn can be pickled.
Using the pickle module is straightforward, and it provides a similar API to that of JSON. We use the following methods for serialization:
pickle.dumps(obj) converts the Python object, obj, to a bytestream.
pickle.dump(obj, file) converts the Python object, obj, to a bytestream and writes it to file.
And similarly, for deserializing:
pickle.loads(bytes) converts the bytes object to a Python object.
pickle.load(file) reads the contents of file and converts the bytes to a Python object.
Of course, the load() and loads() functions will fail if the bytes read in were not originally
created by the pickle module.
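The four functions above can be tried out with nothing but the standard library. The following is a minimal sketch, assuming a plain dictionary as a stand-in for a trained model (the settings variable and the file name settings.pkl are illustrative only):

```python
import pickle
import tempfile
from pathlib import Path

# A plain Python object standing in for a trained model (illustrative only).
settings = {"name": "demo", "weights": [0.1, 0.2, 0.3], "trained": True}

# In-memory round trip: object -> bytes -> object
blob = pickle.dumps(settings)
restored = pickle.loads(blob)

# File round trip: note the binary modes 'wb' and 'rb'
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "settings.pkl"
    with open(path, "wb") as f:
        pickle.dump(settings, f)
    with open(path, "rb") as f:
        restored_from_file = pickle.load(f)

print(type(blob).__name__)             # bytes
print(restored == settings)            # True: same contents
print(restored is settings)            # False: a new, equal object
print(restored_from_file == settings)  # True
```

Unpickling produces a new object that is equal to the original rather than the original itself, which is exactly what we want when moving a model between programs.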
Practical Example - Iris Classifier
Let’s see this in action. Suppose we have just trained a linear classifier for the Iris data:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=1)

clf = SGDClassifier(loss="perceptron", alpha=0.01, random_state=1)
clf.fit(X_train, y_train)
print(f'accuracy = {accuracy_score(y_test, clf.predict(X_test))}')
We can use pickle to save the model to a file:
import pickle
with open('my_sgdclf.pkl', 'wb') as f:
    pickle.dump(clf, f)
Tip
Use descriptive filenames when naming your pickle file. Consider naming it after the model, the source of the training data, the version (Git commit hash) of your code, etc.
Note the use of writing to the file in binary format (the 'wb' flag in the call to open).
This is important - the pickle output is a bytestream so without the b, the write will fail.
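To see the effect of the missing b for yourself, here is a small sketch that tries both modes against a throwaway file. The payload list and the demo.pkl file name are placeholders, not part of the Iris example:

```python
import pickle
import tempfile
from pathlib import Path

payload = [1, 2, 3]  # any picklable object behaves the same way

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.pkl"

    # Text mode ('w'): the file object expects str, but pickle writes bytes
    try:
        with open(path, "w") as f:
            pickle.dump(payload, f)
        text_mode_error = None
    except TypeError as exc:
        text_mode_error = exc

    # Binary mode ('wb' to write, 'rb' to read): works as intended
    with open(path, "wb") as f:
        pickle.dump(payload, f)
    with open(path, "rb") as f:
        loaded = pickle.load(f)

print(f"text mode raised: {type(text_mode_error).__name__}")  # TypeError
print(loaded)                                                 # [1, 2, 3]
```

The same rule applies when reading: open the file with 'rb' before passing it to pickle.load().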
Next, create a brand new Python script to read the model back in from the file. Note we don’t need to re-train the model, but in this case we are pulling the raw Iris data again so we have something to test:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
import pickle

iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=1)

with open('my_sgdclf.pkl', 'rb') as f:
    clf = pickle.load(f)
print(f'accuracy = {accuracy_score(y_test, clf.predict(X_test))}')
Note
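Note that not all Python callables can be pickled. Functions defined at the top level of a module are pickled by reference (only their name is stored), while lambdas and nested functions cannot be pickled at all. If you need to serialize
such a callable, consider using the third-party cloudpickle package instead, available from PyPI.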
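A quick way to check how pickle treats a given callable is simply to try it. The sketch below uses only the standard library: math.sqrt stands in for a function that can be imported by name, and the square lambda is a made-up example of one that cannot:

```python
import math
import pickle

# A function importable by name is pickled *by reference*:
# only the name "math.sqrt" is stored, not the function's code.
sqrt_again = pickle.loads(pickle.dumps(math.sqrt))
print(sqrt_again is math.sqrt)  # True

# A lambda has no importable name, so pickle refuses it.
square = lambda x: x * x
try:
    pickle.dumps(square)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError) as exc:
    lambda_pickled = False
    print(f"could not pickle lambda: {type(exc).__name__}")

print(lambda_pickled)  # False
```

Because named functions are stored by reference, the program that unpickles them must be able to import the same module. This also matters for models that hold references to custom functions.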
EXERCISE
Write a short Python script that loads in the pre-trained classifier, and classifies a sample
with sepal and petal measurements of [5.1, 3.5, 1.4, 0.2].
A Note on Security with pickle
We need to be very careful when using the pickle library to load Python objects. It is possible to
serialize code that could harm your machine when loaded. For that reason, it is recommended that you
only use pickle.load() and pickle.loads() on files and bytestreams that you know and trust
(i.e., that you wrote yourself). As a result, pickle is not a suitable solution for some cases;
for example, a web API or service that allows users to upload their own models and execute them in the
cloud.
Warning
Never use pickle to load a bytestream that you did not write yourself. You could do harm to your computer.
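To make the risk concrete, the sketch below builds a deliberately harmless payload whose __reduce__ method tells pickle to call a function while loading. The Payload class and the choice of int as the callable are illustrative assumptions. The second half follows the "restricting globals" pattern described in the Python documentation, here in a deny-everything form (RestrictedUnpickler and restricted_loads are names chosen for this example):

```python
import io
import pickle


class Payload:
    """Stand-in for a malicious object; __reduce__ tells pickle what to call on load."""

    def __reduce__(self):
        # Harmless here: call int("42"). An attacker could name any importable callable.
        return (int, ("42",))


blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # int("42") is executed during loading
print(result)                # 42, not a Payload instance


class RestrictedUnpickler(pickle.Unpickler):
    """Refuse every global lookup, so only plain data (lists, dicts, str, ...) loads."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads(), but using the restricted unpickler above."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps({"a": [1, 2, 3]})))  # plain data still loads

try:
    restricted_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(blocked)  # True: the payload's callable was never looked up
```

A deny-everything unpickler like this cannot load a sklearn model, since models are class instances, so it is not a way to make untrusted model files safe. It only illustrates why the advice above is to load pickles exclusively from sources you trust.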
Serializing and Deserializing TensorFlow Models
The Python pickle module is great for serializing a sklearn model. However, for serializing a
TensorFlow model we recommend using the built-in model.save() method. In general, attempting to
use pickle on TensorFlow models can lead to errors related to model objects not being
picklable.
Practical Example - Mushroom Classifier
We’ll illustrate the techniques in this section using a model trained against the Mushroom dataset. Recall that dataset consisted of 8,124 samples each with 22 features and a binary classification (poisonous or edible).
import random
import pandas as pd
from ucimlrepo import fetch_ucirepo
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, Dense

tf.random.set_seed(123)
random.seed(123)

# Fetch dataset
mushroom = fetch_ucirepo(id=73)
X = mushroom.data.features
y = mushroom.data.targets
X_clean = X.drop(columns=['stalk-root'])

# Encode data
X_encoded = pd.get_dummies(X_clean)
y_encoded = y['poisonous'].map({'p': 1, 'e': 0})

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y_encoded, test_size=0.3, stratify=y_encoded, random_state=123
)

# Create model with sequential API
model = Sequential([
    Input(shape=(112,)),
    Dense(10, activation='relu'),
    Dense(1, activation='sigmoid')
])

# Compile the model with appropriate settings for binary classification
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model with the specified parameters
model.fit(X_train, y_train, validation_split=0.2, epochs=5, batch_size=32, verbose=2)

# Make predictions on the test data
y_pred = model.predict(X_test)
y_pred_final = (y_pred > 0.5).astype(int)
print(classification_report(y_test, y_pred_final, digits=4))
Running the above code should produce output that looks similar to the following:
...
Epoch 1/5
143/143 - 1s - 6ms/step - accuracy: 0.8709 - loss: 0.3543 - val_accuracy: 0.9569 - val_loss: 0.1458
Epoch 2/5
143/143 - 0s - 1ms/step - accuracy: 0.9776 - loss: 0.0964 - val_accuracy: 0.9851 - val_loss: 0.0638
Epoch 3/5
143/143 - 0s - 2ms/step - accuracy: 0.9894 - loss: 0.0481 - val_accuracy: 0.9938 - val_loss: 0.0364
Epoch 4/5
143/143 - 0s - 2ms/step - accuracy: 0.9949 - loss: 0.0288 - val_accuracy: 0.9982 - val_loss: 0.0230
Epoch 5/5
143/143 - 0s - 2ms/step - accuracy: 0.9985 - loss: 0.0186 - val_accuracy: 0.9982 - val_loss: 0.0157
77/77 ━━━━━━━━━━━━━━━━━━━━ 0s 837us/step
              precision    recall  f1-score   support

           0     0.9968    0.9992    0.9980      1263
           1     0.9991    0.9966    0.9979      1175

    accuracy                         0.9979      2438
   macro avg     0.9980    0.9979    0.9979      2438
weighted avg     0.9980    0.9979    0.9979      2438
It’s unlikely that a few more epochs will improve performance. We’re over 99% accuracy on both the test and validation sets, and the validation accuracy has started to plateau, so this seems like a good time to save the model.
We use the model.save() function, passing in the file name under which to save the model. Here we use
the simple name mushroom_classifier.keras. It is a good habit to save models with a .keras extension.
model.save("mushroom_classifier.keras")
There should now be a file, mushroom_classifier.keras, in the same directory as the script you
are running. If we inspect this file, we will see that it is a zip archive and about 34 KB:
[mbs337-vm]$ file mushroom_classifier.keras
mushroom_classifier.keras: Zip archive data, at least v2.0 to extract, compression method=store
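Since the file is an ordinary zip archive, its contents can also be listed from Python with the standard zipfile module. The following is a sketch only: list_archive is a helper name made up for this example, and what the archive contains depends on your Keras version. For the v3 format we would expect entries such as a JSON config and a weights file, but treat that as an assumption to verify on your own file.

```python
import zipfile


def list_archive(path):
    """Return (member name, uncompressed size in bytes) for each file in a zip archive."""
    with zipfile.ZipFile(path) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


# File written by model.save() above; nothing is printed if it is not present.
MODEL_PATH = "mushroom_classifier.keras"
if zipfile.is_zipfile(MODEL_PATH):
    for name, size in list_archive(MODEL_PATH):
        print(f"{size:>10}  {name}")
```

Listing the members is read-only and does not execute anything stored in the archive, so it is a safe way to check what a saved model file contains.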
Note
Keras supports multiple file format versions for saving models. The latest version, v3, will automatically be used whenever the file name passed ends in the “.keras” extension. From the official docs:
“The new Keras v3 saving format, marked by the .keras extension, is a more simple, efficient format that implements name-based saving, ensuring what you load is exactly what you saved, from Python’s perspective. This makes debugging much easier, and it is the recommended format for Keras.”
At this point, we can easily load our model from the saved file into a new Python program. To illustrate, let’s start a brand new Python script containing the following code:
import tensorflow as tf
model = tf.keras.models.load_model('mushroom_classifier.keras')
Let’s evaluate our model on the test set to convince ourselves that this is indeed our pre-trained model. Again, we will load the original data so we have something to test:
import pandas as pd
from ucimlrepo import fetch_ucirepo
from sklearn.model_selection import train_test_split
import tensorflow as tf

# Fetch dataset
mushroom = fetch_ucirepo(id=73)
X = mushroom.data.features
y = mushroom.data.targets
X_clean = X.drop(columns=['stalk-root'])

# Encode data
X_encoded = pd.get_dummies(X_clean)
y_encoded = y['poisonous'].map({'p': 1, 'e': 0})

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y_encoded, test_size=0.3, stratify=y_encoded, random_state=123
)

# Load the pre-trained model and evaluate
model = tf.keras.models.load_model('mushroom_classifier.keras')
print(model.evaluate(X_test, y_test, batch_size=32))
The output from running this code should look similar to:
...
77/77 ━━━━━━━━━━━━━━━━━━━━ 0s 987us/step - accuracy: 0.9979 - loss: 0.0171
[0.017112771049141884, 0.9979491233825684]
Indeed, we get over 99% accuracy on the test set, matching the result from the training script. We’re ready to deploy this model publicly.
Warning
Be very careful about the version of TensorFlow you use to save the model and the version used to load it. Changing major versions (e.g., TensorFlow v1 to v2) can cause the model to fail to load, and so can some minor version changes: moving from 2.15 to 2.16 is one example, because 2.16 introduced a new major version of Keras (v3). The safest approach is always to use identical versions when saving and loading.
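One lightweight way to follow this advice is to record the library versions at save time and compare them at load time. The sketch below uses only the standard library. The helper names installed_versions and version_mismatches are made up for this example, and in practice the JSON text would be written to a sidecar file stored next to the model:

```python
import json
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def version_mismatches(saved, current):
    """Return {package: (saved_version, current_version)} for every difference."""
    return {
        name: (saved[name], current.get(name))
        for name in saved
        if saved[name] != current.get(name)
    }


# At save time: capture the versions (write this text to a file next to the model).
saved = installed_versions(["tensorflow", "keras"])
sidecar_text = json.dumps(saved, indent=2)

# At load time: read the text back and compare against the current environment.
problems = version_mismatches(
    json.loads(sidecar_text), installed_versions(["tensorflow", "keras"])
)
print(problems or "versions match")
```

Run in a single environment, as here, the comparison finds no differences. The check becomes useful when the loading script runs on a different machine or container than the one that trained the model.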
EXERCISE
Thought Experiment: Imagine you have built a dashboard for classifying mushrooms as edible or poisonous. What does the interface look like for a user to input data? How does the backend code capture that input?
Write a short Python script that loads in the pre-trained mushroom classifier, and classifies a sample with features:
{
"cap-shape": "x",
"cap-surface": "s",
"cap-color": "n",
"bruises": "t",
"odor": "p",
"gill-attachment": "f",
"gill-spacing": "c",
"gill-size": "n",
"gill-color": "k",
"stalk-shape": "e",
"stalk-root": "e",
"stalk-surface-above-ring": "s",
"stalk-surface-below-ring": "s",
"stalk-color-above-ring": "w",
"stalk-color-below-ring": "w",
"veil-type": "p",
"veil-color": "w",
"ring-number": "o",
"ring-type": "p",
"spore-print-color": "k",
"population": "s",
"habitat": "u"
}
What challenges exist in performing inference on one stand-alone sample?