From Training to Production =========================== To briefly recap, we have written code that downloaded and prepared data, trained a model, then validated the performance of model. However the code that we wrote is still somewhat ephemeral - the model only existed as long as we had the Jupyter notebook running. How do we put that model into a production environment for others to use with their own data? After going through this section, you should be able to: * Describe the general process of MLOps from training to inference in production * Export models to binary Python objects using ``pickle`` or Tensorflow's ``model.save()`` * Import previously-exported models into new Python scripts Model Persistence ----------------- Recall that, at a high level, the use of ML involves the following process: 1. Find or collect raw data about the process or function 2. Prepare the data for model training or fitting 3. Train the model using some of the prepared data 4. Validate the model using some of the prepared data 5. Deploy the model to analyze new data samples We've look at pretty much all of these steps except for the last one which involves the topic of machine learning operations, or MLOps. In practice, we need a method for saving and deploying a model that has already been trained to an application where it can analyze new data. We certainly don't want to have to retrain the model every time we start our application, for several reasons: 1. Model training requires data, which can be large and difficult to ship with our application 2. Training can be a time-consuming process 3. The training process might not be possible/reproducible on every device where we wish to deploy our application All of those reasons motivate the need to be able to save and load models that have already been trained. Here, we will look at a first method for saving and loading models to a file based on the Python ``pickle`` module, which is part of the standard library. The method we mention has the advantage that it is simple and can be used with many Python objects, not just models. However, it also comes with security risks, which we will mention. The ``pickle`` Module --------------------- The ``pickle`` module is part of the Python standard library and provides functions for serializing and deserializing Python objects to and from a bytestream. The process of converting a Python object to a bytestream is referred to as *pickling the object*, and the reverse process of taking a bytestream and converting it back to a Python object is called *unpickling*. Once a Python object has been converted to a bytestream with pickle, the bytestream can then be written to a file. Later, we can read the bytes back out of the file and reconstitute the original Python object. Many Python objects can be pickled, including the following: * builtin constants (True, False, None) * strings, bytes and bytearrays * *some* classes and class instances (specifically, the ones that implement ``__getstate__()``) * lists, dictionaries, and tuples of picklable objects. In general, the models we have looked at from sklearn can be pickled. Using the pickle module is straightforward, and it provides a similar API to that of JSON. We use the following methods for serialization: * ``pickle.dumps(obj)`` converts the Python object, ``obj``, to a bytestream. * ``pickle.dump(obj, file)`` converts the Python object, ``obj``, to a bytestream and writes it to ``file``. And similarly, for deserializing: * ``pickle.loads(bytes)`` converts the ``bytes`` object to a Python object. * ``pickle.load(file)`` reads the contents of ``file`` and converts the bytes to a Python. Of course, the ``load()`` and ``loads()`` functions will fail if the bytes read in were not originally created by the pickle module. Practical Example - Iris Classifier ----------------------------------- Let's see this in action. Suppose we have just trained a linear classifier for the Iris data: .. code-block:: python :linenos: from sklearn import datasets from sklearn.model_selection import train_test_split from sklearn.linear_model import SGDClassifier from sklearn.metrics import accuracy_score iris = datasets.load_iris() X = iris.data y = iris.target X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=1) clf = SGDClassifier(loss="perceptron", alpha=0.01, random_state=1) clf.fit(X_train, y_train) print(f'accuracy = {accuracy_score(y_test, clf.predict(X_test))}') We can use pickle to save the model to a file: .. code-block:: python import pickle with open('my_sgdclf.pkl', 'wb') as f: pickle.dump(clf, f) .. tip:: Use descriptive filenames when naming your pickle file. Consider naming it after the model, the source of the training data, the version (Git commit hash) of your code, etc. Note the use of writing to the file in **binary format** (the ``'wb'`` flag in the call to ``open``). This is important - the pickle output is a bytestream so without the ``b``, the write will fail. Next, create a brand new Python script to read the model back in from the file. Note we don't need to re-train the model, but in this case we are pulling the raw Iris data again so we have something to test: .. code-block:: python :linenos: :emphasize-lines: 5,12-13 from sklearn import datasets from sklearn.model_selection import train_test_split from sklearn.linear_model import SGDClassifier from sklearn.metrics import accuracy_score import pickle iris = datasets.load_iris() X = iris.data y = iris.target X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=1) with open('my_sgdclf.pkl', 'rb') as f: clf = pickle.load(f) print(f'accuracy = {accuracy_score(y_test, clf.predict(X_test))}') .. note:: Note that in general, Python callables (e.g., functions) *cannot* be pickled. If you need to serialize a callable, consider using the third-party ``cloudpickle`` package instead, available from pypi. EXERCISE ~~~~~~~~ Write a short Python script that loads in the pre-trained classifier, and classifies a sample with sepal and petal measurements of ``[5.1, 3.5, 1.4, 0.2]``. A Note on Security with ``pickle`` ----------------------------------- We need to be very careful when using the ``pickle`` library to load Python objects. It is possible to serialize code that could harm your machine when loaded. For that reason, it is recommended that you **only** use ``pickle.load()`` and ``pickle.loads()`` on files and bytestreams that you know and trust (i.e., that you wrote yourself). As a result, ``pickle`` is not a suitable solution for some cases; for example, a web API or service that allows users to upload their own model and execute them on the cloud. .. warning:: Never use pickle to load a bytestream that you did not write yourself. You could do harm to your computer. Serializing and Deserializing Tensorflow Models ----------------------------------------------- The Python ``pickle`` module is great for serializing a sklearn model. However, for serializing a Tensorflow model we recommend using the built in ``model.save()`` method. In general, attempting to use ``pickle`` on Tensorflow models can lead to errors related to model objects not being pickleable. Practical Example - Mushroom Classifier --------------------------------------- We'll illustrate the techniques in this section using a model trained against the Mushroom dataset. Recall that dataset consisted of 8,124 samples each with 22 features and a binary classification (poisonous or edible). .. code-block:: python :linenos: import random import pandas as pd from ucimlrepo import fetch_ucirepo from sklearn.model_selection import train_test_split from sklearn.metrics import classification_report import tensorflow as tf from tensorflow.keras import Sequential from tensorflow.keras.layers import Input, Dense tf.random.set_seed(123) random.seed(123) # Fetch dataset mushroom = fetch_ucirepo(id=73) X = mushroom.data.features y = mushroom.data.targets X_clean = X.drop(columns=['stalk-root']) # Encode data X_encoded = pd.get_dummies(X_clean) y_encoded = y['poisonous'].map({'p': 1, 'e': 0}) # Split the dataset into training and testing sets X_train, X_test, y_train, y_test = train_test_split( X_encoded, y_encoded, test_size=0.3, stratify=y_encoded, random_state=123 ) # Create model with sequential API model = Sequential([ Input(shape=(112,)), Dense(10, activation='relu'), Dense(1, activation='sigmoid') ]) # Compile the model with appropriate settings for binary classification model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy']) # Train the model with the specified parameters model.fit(X_train, y_train, validation_split=0.2, epochs=5, batch_size=32, verbose=2) # Make predictions on the test data y_pred = model.predict(X_test) y_pred_final = (y_pred > 0.5).astype(int) print(classification_report(y_test,y_pred_final, digits=4)) Running the above code should produce output that looks similar to the following: .. code-block:: console ... Epoch 1/5 143/143 - 1s - 6ms/step - accuracy: 0.8709 - loss: 0.3543 - val_accuracy: 0.9569 - val_loss: 0.1458 Epoch 2/5 143/143 - 0s - 1ms/step - accuracy: 0.9776 - loss: 0.0964 - val_accuracy: 0.9851 - val_loss: 0.0638 Epoch 3/5 143/143 - 0s - 2ms/step - accuracy: 0.9894 - loss: 0.0481 - val_accuracy: 0.9938 - val_loss: 0.0364 Epoch 4/5 143/143 - 0s - 2ms/step - accuracy: 0.9949 - loss: 0.0288 - val_accuracy: 0.9982 - val_loss: 0.0230 Epoch 5/5 143/143 - 0s - 2ms/step - accuracy: 0.9985 - loss: 0.0186 - val_accuracy: 0.9982 - val_loss: 0.0157 77/77 ━━━━━━━━━━━━━━━━━━━━ 0s 837us/step precision recall f1-score support 0 0.9968 0.9992 0.9980 1263 1 0.9991 0.9966 0.9979 1175 accuracy 0.9979 2438 macro avg 0.9980 0.9979 0.9979 2438 weighted avg 0.9980 0.9979 0.9979 2438 It's unlikely that a few more epochs will improve performance. We're over 99% accuracy on both the test and validation sets, and the validation accuracy has started to plateau, so this seems like a good time to save the model. We use the ``model.save()`` function, passing in a file name to use to save the model. I will use the simple name ``mushroom_classifier.keras``. It is a good habbit to save the models with a ``.keras`` extension. .. code-block:: python3 model.save("mushroom_classifier.keras") There should now be a file, ``mushroom_classifier.keras`` in the same directory as the script you are running. If we inspect this file, we will see that it is a zip archive and about 34KB: .. code-block:: console [mbs337-vm]$ file mushroom_classifier.keras mushroom_classifier.keras: Zip archive data, at least v2.0 to extract, compression method=store .. note:: Keras supports multiple file format versions for saving models. The latest version, v3, will automatically be used whenever the file name passed ends in the ".keras" extension. From the official docs: *"The new Keras v3 saving format, marked by the .keras extension, is a more simple, efficient format that implements name-based saving, ensuring what you load is exactly what you saved, from Python's perspective. This makes debugging much easier, and it is the recommended format for Keras."* At this point, we can load our model easily from the saved file into a new Python program. To illustrate, let's start work in a brand new Python where we will implement the following code: .. code-block:: python import tensorflow as tf model = tf.keras.models.load_model('mushroom_classifier.keras') Let's evaluate our model on the training set to convince ourselves that this is indeed our pre-trained model. Again, we will load the original data so we have something to test: .. code-block:: python :linenos: import pandas as pd from ucimlrepo import fetch_ucirepo from sklearn.model_selection import train_test_split import tensorflow as tf # Fetch dataset mushroom = fetch_ucirepo(id=73) X = mushroom.data.features y = mushroom.data.targets X_clean = X.drop(columns=['stalk-root']) # Encode data X_encoded = pd.get_dummies(X_clean) y_encoded = y['poisonous'].map({'p': 1, 'e': 0}) # Split the dataset into training and testing sets X_train, X_test, y_train, y_test = train_test_split( X_encoded, y_encoded, test_size=0.3, stratify=y_encoded, random_state=123 ) # Load the pre-trained model and evaluate model = tf.keras.models.load_model('mushroom_classifier.keras') print( model.evaluate(X_test, y_test, batch_size=32) ) The output from running this code should looks similar to: .. code-block:: console ... 77/77 ━━━━━━━━━━━━━━━━━━━━ 0s 987us/step - accuracy: 0.9979 - loss: 0.0171 [0.017112771049141884, 0.9979491233825684] Indeed, we get 99% accuracy on the training set. We're ready to deploy this model publicly. .. warning:: Be very careful about the version of tensorflow you use to save the model and the version used to load the model. Changing major versions (e.g., tensorflow v1 to v2) can cause the model to fail to load, and even changing from 2.15 to 2.16 because 2.16 introduced a new major version of Keras (v3). The safest approach is always to use identical versions when saving and loading. EXERCISE ~~~~~~~~ **Thought Experiment:** Imagine you have built a dashboard for classifying mushrooms as edible or poisonous. What does the interface look like for a user to input data? How does the backend code capture that input? Write a short Python script that loads in the pre-trained mushroom classifier, and classifies a sample with features: .. code-block:: text { "cap-shape": "x", "cap-surface": "s", "cap-color": "n", "bruises": "t", "odor": "p", "gill-attachment": "f", "gill-spacing": "c", "gill-size": "n", "gill-color": "k", "stalk-shape": "e", "stalk-root": "e", "stalk-surface-above-ring": "s", "stalk-surface-below-ring": "s", "stalk-color-above-ring": "w", "stalk-color-below-ring": "w", "veil-type": "p", "veil-color": "w", "ring-number": "o", "ring-type": "p", "spore-print-color": "k", "population": "s", "habitat": "u" } What challenges exist in performing inference on one stand-alone sample? Additional Resources -------------------- * Adapted from: `COE 379L: Software Design For Responsible Intelligent Systems `_ * `Python pickle Docs `_ * `Cloudpickle Python Package on Github `_ * `Keras model saving and loading `_