CHAPTER 19 Intermediate

Saving and Deploying Machine Learning Models

Updated: May 16, 2026

1. Introduction

You have trained an incredible model. It predicts house prices with 95% accuracy. But right now, the model is trapped inside your Jupyter Notebook. When you close the notebook or shut down its kernel, the model disappears from memory. To make your model useful to the world, you must save it to your hard drive as a file, and then load it on a web server where an app can interact with it. In this chapter, we will bridge the gap between Data Science and Software Engineering by learning how to save and deploy our models.

2. Learning Objectives

By the end of this chapter, you will be able to:

Explain what model serialization is.

Save and load Scikit-learn models using joblib.

Understand the basics of REST APIs.

Build a simple Python web server using Flask to host your model.

Make predictions via HTTP requests.

3. Model Serialization (Saving the Model)

Serialization is the process of converting a Python object (like your trained Pipeline) into a byte stream so it can be saved to a file on your hard drive. Python has a built-in library for this called pickle. However, Scikit-learn models often contain massive NumPy arrays. The joblib library is heavily optimized to save large NumPy arrays quickly and efficiently.
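To make the idea of a byte stream concrete, here is a minimal sketch using the built-in pickle module. The dictionary `toy_model` and its keys are illustrative assumptions standing in for a trained model; they are not a real Scikit-learn object.

```python
import pickle

# A stand-in for a trained model: any Python object can be serialized.
# (Hypothetical example object; a real pipeline would take its place.)
toy_model = {"coefficients": [150.0, 10000.0], "intercept": 5000.0}

# Serialize: Python object -> bytes
byte_stream = pickle.dumps(toy_model)

# Deserialize: bytes -> an equal copy of the original object
restored_model = pickle.loads(byte_stream)

print(type(byte_stream).__name__)   # bytes
print(restored_model == toy_model)  # True
```

Writing those bytes to a file is all that "saving a model" means; joblib follows the same idea but handles large NumPy arrays more efficiently.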

Saving the Model:

python

import joblib

# Assume 'my_pipeline' is a fully trained Scikit-learn Pipeline
# (e.g., a StandardScaler followed by a random forest).
# Save it to the hard drive
joblib.dump(my_pipeline, 'house_price_model.pkl')

print("Model successfully saved!")

*You will now see a file named house_price_model.pkl in your folder.*

4. Loading the Model

Imagine this is a completely different Python script running on a web server in the cloud. It doesn't need to import the data or train the model. It just loads the finished file.

python

import joblib
import numpy as np

# Load the model from the file
loaded_model = joblib.load('house_price_model.pkl')

# New raw data from a user (e.g., 2000 sq ft, 3 beds)
new_data = np.array([[2000, 3]])

# Predict! (The pipeline handles the scaling automatically)
prediction = loaded_model.predict(new_data)
print(f"Predicted Price: ${prediction[0]:,.2f}")
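The save and load steps can be checked end to end in a single script. The sketch below is one possible version, not the chapter's own pipeline: the four training rows are made-up numbers, and `RandomForestRegressor` is an assumption chosen because house prices are continuous values.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Made-up training data for illustration only: [sqft, beds] -> price
X = np.array([[1200, 2], [1500, 3], [2000, 3], [2600, 4]])
y = np.array([180000.0, 210000.0, 250000.0, 330000.0])

# Train the whole pipeline (scaler + model) so both are saved together
pipeline = make_pipeline(
    StandardScaler(),
    RandomForestRegressor(n_estimators=10, random_state=0),
)
pipeline.fit(X, y)

# Save to a temporary folder, then load it back as a separate script would
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "house_price_model.pkl")
    joblib.dump(pipeline, model_path)
    loaded_pipeline = joblib.load(model_path)

new_data = np.array([[2000, 3]])
original_pred = pipeline.predict(new_data)
loaded_pred = loaded_pipeline.predict(new_data)

# The loaded copy should reproduce the original predictions
print(np.allclose(original_pred, loaded_pred))
```

If the two predictions match, the file contains everything the server needs, including the fitted scaler.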

5. Deployment Basics: What is an API?

To allow a mobile app or a website to use your Python model, you build an API (Application Programming Interface). Think of the API as a waiter in a restaurant.

1. The website (customer) sends an HTTP Request to the API (waiter): *"Here is a 2000 sq ft house."*

2. The API gives the data to the Model (chef). The Model calculates the prediction: *$250,000*.

3. The API returns an HTTP Response to the website: *"The predicted price is $250,000."*
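The three steps above can be sketched in plain Python before any web framework is involved. Here `stand_in_model` is a hypothetical formula invented for this example (not a trained model), and `handle_request` plays the role of the waiter.

```python
import json


def stand_in_model(sqft: float, beds: int) -> float:
    """Hypothetical 'chef': a made-up formula, not a trained model."""
    return 100.0 * sqft + 15000.0 * beds


def handle_request(request_body: str) -> str:
    """The 'waiter': parse the JSON request, call the model, return JSON."""
    data = json.loads(request_body)                     # 1. receive the order
    price = stand_in_model(data["sqft"], data["beds"])  # 2. ask the chef
    return json.dumps({"predicted_price": price})       # 3. serve the response


request_body = json.dumps({"sqft": 2000, "beds": 3})
response_body = handle_request(request_body)
print(response_body)
```

A web framework adds the networking around this: it receives the request body over HTTP and sends the returned string back to the caller.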

6. Building an ML API with Flask

Flask is a micro web-framework for Python. It is incredibly easy to use. *(First, install it via terminal: pip install flask)*

Create a new file named app.py:

python

from flask import Flask, request, jsonify
import joblib
import numpy as np

# Initialize the Flask app
app = Flask(__name__)

# Load the trained model once, when the server starts
model = joblib.load('house_price_model.pkl')

# Create an API endpoint/route
@app.route('/predict', methods=['POST'])
def predict():
    # 1. Get the JSON data sent by the user
    data = request.get_json()

    # 2. Extract features (SqFt, Beds)
    features = np.array([[data['sqft'], data['beds']]])

    # 3. Make prediction
    pred = model.predict(features)

    # 4. Return the result as JSON (cast the NumPy value to a plain float)
    return jsonify({"predicted_price": float(pred[0])})

# Run the server (debug=True is for local development only)
if __name__ == '__main__':
    app.run(port=5000, debug=True)
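The endpoint above assumes the client always sends both fields as numbers. One possible way to guard against bad input is a small validation helper like the sketch below; `validate_payload` and `REQUIRED_FIELDS` are hypothetical names introduced for this example, and the rule that values must be positive is an assumption.

```python
from typing import Any, Optional

REQUIRED_FIELDS = ("sqft", "beds")  # the two features used by the endpoint


def validate_payload(data: Any) -> Optional[str]:
    """Return an error message if the payload is unusable, else None."""
    if not isinstance(data, dict):
        return "Request body must be a JSON object."
    for field in REQUIRED_FIELDS:
        if field not in data:
            return f"Missing field: {field}"
        value = data[field]
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return f"Field '{field}' must be a number."
        if value <= 0:
            return f"Field '{field}' must be positive."
    return None


print(validate_payload({"sqft": 2000, "beds": 3}))  # None
print(validate_payload({"sqft": 2000}))             # Missing field: beds
```

Inside predict(), you could call this right after request.get_json() and, when it returns a message, respond with jsonify({"error": message}), 400 instead of calling the model.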

7. Testing the API

When you run python app.py, your computer starts a local web server hosting your model. A client can now send a POST request to http://localhost:5000/predict with a JSON body such as {"sqft": 2000, "beds": 3}, and the server replies with the predicted price as JSON.
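A client for this endpoint can be written with nothing but the standard library. The sketch below is one way to do it; the helper names `build_predict_request` and `send_predict_request` are assumptions made for this example. Only the request is built when the script runs, so it works even if the server is not started yet.

```python
import json
import urllib.request


def build_predict_request(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs app.py running)."""
    with urllib.request.urlopen(req, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_predict_request(
    "http://localhost:5000/predict", {"sqft": 2000, "beds": 3}
)
print(req.get_method(), req.full_url)
```

With app.py running in another terminal, calling send_predict_request(req) should return a dictionary containing the predicted_price key.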

8. Cloud Deployment

Running the server on your laptop is great for testing, but for production, you upload app.py, house_price_model.pkl, and requirements.txt to a cloud provider like:

Heroku (Easiest for beginners)

AWS Elastic Beanstalk or Google Cloud Run

PythonAnywhere

9. Common Mistakes

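Version Mismatch: If you train and save the model using Scikit-learn version 1.2 on your laptop, but the cloud server installs Scikit-learn version 1.0, joblib.load() may fail outright, or it may load the model and return unexpected results. Scikit-learn does not support loading a model with a different version than the one that saved it. Always ensure your requirements.txt specifies exact versions (e.g., scikit-learn==1.2.0).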
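Beyond pinning versions, the server can check at startup that its installed packages match what was used for training. The sketch below is one possible approach; `find_version_mismatches` is a hypothetical helper, and the version numbers in the demo are placeholders supplied through a fake lookup so the example runs anywhere.

```python
from importlib import metadata
from typing import Callable, Dict, List


def find_version_mismatches(
    expected: Dict[str, str],
    get_installed: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Compare the versions recorded at training time with what is installed."""
    problems = []
    for package, wanted in expected.items():
        try:
            installed = get_installed(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package}: not installed (expected {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{package}: installed {installed}, expected {wanted}")
    return problems


# Demo with a fake lookup (placeholder versions, not a recommendation)
fake_installed = {"scikit-learn": "1.0.0", "joblib": "1.3.2"}
expected_versions = {"scikit-learn": "1.2.0", "joblib": "1.3.2"}
mismatches = find_version_mismatches(expected_versions, fake_installed.__getitem__)
print(mismatches)
```

On a real server you would omit the second argument so that importlib.metadata.version looks up the packages that are actually installed, and refuse to load the model if the list is not empty.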

Security: Never blindly load a .pkl file you downloaded from an untrusted source on the internet. Pickle files (including those written by joblib) can contain malicious code that executes as soon as you load them.
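For model files you produced yourself, one common safeguard is to record a checksum when saving and verify it before loading. The sketch below assumes SHA-256 and uses hypothetical helper names; note that a checksum only detects that a file has changed, it does not make a file from an unknown source safe to load.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted(path: str, expected_digest: str) -> bool:
    """True only if the file matches the digest recorded when it was saved."""
    return sha256_of_file(path) == expected_digest


# Demo with a throwaway file standing in for a model file
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")
    with open(model_path, "wb") as f:
        f.write(b"pretend these are model bytes")
    recorded_digest = sha256_of_file(model_path)  # store this at save time

    unchanged_ok = is_trusted(model_path, recorded_digest)
    with open(model_path, "ab") as f:
        f.write(b"tampered")
    tampered_ok = is_trusted(model_path, recorded_digest)

print(unchanged_ok, tampered_ok)  # True False
```

In a deployment, the server would call is_trusted() on the model file and only pass it to joblib.load() when the check succeeds.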

10. Best Practices

Save Pipelines, not just Models: As emphasized in Chapter 18, always save the entire Pipeline object. If you only save the Random Forest, your Flask app will have to manually run StandardScaler on the user's JSON data before predicting, which is a recipe for bugs.

11. Exercises

1. Write the two lines of code required to import joblib and save a model named svm_model to a file called my_svm.pkl.

2. What format is commonly used to send and receive data over a web API?

12. MCQ Quiz with Answers

Question 1

Which Python library is specifically optimized and recommended by Scikit-learn for saving models containing large NumPy arrays?

Question 2

What is the role of a framework like Flask in Machine Learning deployment?

13. Interview Questions

Q: Explain the concept of Model Serialization and name the library used for it in Scikit-learn.

Q: If an iOS app needs to use your Python machine learning model to make predictions, how do you bridge the gap between the two different languages?

14. FAQs

Q: I heard FastAPI is better than Flask. Should I use it?

A: FastAPI is a modern framework with built-in request validation, automatic API documentation, and async support, and it is widely used for ML APIs. Flask is simpler for absolute beginners who are still learning the concepts; once you are comfortable, FastAPI is worth considering for production workloads.

15. Summary

A machine learning model provides zero business value while trapped in a Jupyter Notebook. By serializing the model with joblib and wrapping it in a simple Flask REST API, you turn a trained model into a web service that any application able to send an HTTP request can use.

16. Next Chapter Recommendation

You have learned the entire theoretical and practical lifecycle of Machine Learning. It is time to put it to the test. In Chapter 20: Final Project, you will execute a complete, end-to-end Machine Learning project from raw data to a deployed application.
