Regression Models
CHAPTER 19 Intermediate

Saving, Deploying, and Using Regression Models

Updated: May 16, 2026

1. Introduction

Machine learning algorithms are completely useless if they remain trapped inside a Jupyter Notebook on your laptop. If you train a highly accurate real estate pricing model, the engineering team needs to integrate it into their website so customers can use it! In this chapter, we transition from Data Science to Software Engineering. We will learn how to serialize (save) a trained model to your hard drive, load it into a new script, and deploy it behind a web API.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Explain what model serialization means.
  • Save a trained scikit-learn Pipeline using joblib.
  • Load a saved model from the hard drive.
  • Understand the architecture of Model Deployment.
  • Build a basic Flask REST API to serve predictions.

3. Serialization (joblib vs pickle)

A trained model is just a Python object holding its learned parameters (weights, coefficients, and fitted preprocessing statistics) in RAM. Serialization is the process of translating that Python object into a binary file on your hard drive. Python has a built-in library called pickle for this, but scikit-learn models often carry large NumPy arrays. joblib, which is built on top of pickle, stores such arrays more efficiently, so it is the usual choice in the scikit-learn ecosystem. Both formats can execute arbitrary code when a file is loaded, so only load model files from sources you trust.
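To make the idea of serialization concrete, here is a minimal sketch that uses only the built-in pickle module. The `model_state` dictionary is a hypothetical stand-in for a trained model (it is not a scikit-learn object); it only illustrates the object-to-bytes-to-object round trip.

```python
import pickle

# Hypothetical "trained model": nothing more than learned numbers held in RAM.
model_state = {"coefficients": [200.0, 5000.0], "intercept": 10000.0}

# Serialize: Python object in RAM -> bytes that could be written to a file.
blob = pickle.dumps(model_state)

# Deserialize: bytes -> a new, equivalent Python object.
restored = pickle.loads(blob)

print(type(blob).__name__)       # the serialized form is a bytes object
print(restored == model_state)   # same content...
print(restored is model_state)   # ...but a separate object in memory
```

joblib follows the same pattern with file paths: `joblib.dump(obj, path)` writes the file and `joblib.load(path)` reads it back.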

4. Saving the Model (And the Pipeline!)

CRITICAL RULE: Do not just save the model! If your training script used a StandardScaler to standardize the data, the deployed model expects inputs on exactly the same scale. If you don't save the fitted scaler, the server will feed raw values to the model and return wildly wrong predictions, usually without raising any error. This is why we always save the entire Pipeline object.
```python
import joblib
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
import numpy as np

# 1. Mock Training Data
X_train = np.array([[1500, 3], [2000, 4], [2500, 4]])
y_train = np.array([300000, 400000, 500000])

# 2. Build and Train the Pipeline
final_pipeline = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
final_pipeline.fit(X_train, y_train)

# 3. SAVE THE PIPELINE!
# This creates a file named 'house_price_model.pkl' in your folder.
joblib.dump(final_pipeline, 'house_price_model.pkl')
print("Model successfully saved to disk!")
```
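To see what the rule above protects against, the following sketch compares the full pipeline with the Ridge step used on its own. It reuses the mock training data from this section; the names `new_house` and `ridge_only` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1500, 3], [2000, 4], [2500, 4]])
y_train = np.array([300000, 400000, 500000])

pipeline = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
pipeline.fit(X_train, y_train)

new_house = np.array([[2200, 3]])

# Correct: the pipeline scales the input before the Ridge step sees it.
with_scaler = pipeline.predict(new_house)[0]

# Incorrect: take only the Ridge step and feed it raw, unscaled input.
# make_pipeline names each step after its lowercased class name.
ridge_only = pipeline.named_steps["ridge"]
without_scaler = ridge_only.predict(new_house)[0]

print(f"Pipeline prediction:   {with_scaler:,.2f}")
print(f"Ridge-only prediction: {without_scaler:,.2f}")
```

No exception is raised in the second case, because the input still has two columns. The only symptom is a price that is far too large, which is why this mistake is easy to ship.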

5. Loading the Model (In a New File)

Imagine closing your laptop, opening a brand new Python file on a remote web server, and writing this code. You don't need to import the CSV or run fit() again!
```python
import joblib
import numpy as np

# 1. Load the binary file back into RAM
loaded_pipeline = joblib.load('house_price_model.pkl')

# 2. Get user input from the website (e.g., 2200 sqft, 3 beds)
user_input = np.array([[2200, 3]])

# 3. Make the prediction!
# The pipeline AUTOMATICALLY applies the StandardScaler to the user input!
prediction = loaded_pipeline.predict(user_input)

print(f"Predicted Price for User: ${prediction[0]:.2f}")
```
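Before shipping a saved file, it can be worth confirming that the reloaded pipeline predicts exactly what the in-memory one did. The sketch below is one way to do that, assuming the same mock data as above; it writes to a temporary directory so it does not touch your real model file, and `sample_inputs` is an illustrative name.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1500, 3], [2000, 4], [2500, 4]])
y_train = np.array([300000, 400000, 500000])

pipeline = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
pipeline.fit(X_train, y_train)

# Predictions from the object that is still in memory.
sample_inputs = np.array([[2200, 3], [1800, 4]])
expected = pipeline.predict(sample_inputs)

# Save to a throwaway folder, then load the file back as a new object.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = str(Path(tmp_dir) / "house_price_model.pkl")
    joblib.dump(pipeline, model_path)
    reloaded = joblib.load(model_path)

# Predictions from the reloaded object should match the originals.
actual = reloaded.predict(sample_inputs)
round_trip_ok = bool(np.allclose(expected, actual))
print("Round trip preserved predictions:", round_trip_ok)
```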

6. Deployment Architecture (REST API)

How does a React/Angular website talk to your Python model? Through a REST API.
  1. The web server (using Python's Flask or FastAPI framework) loads the joblib file into memory when it boots up.
  2. A user clicks "Predict Price" on the website. The site sends a JSON payload ({"sqft": 2200, "beds": 3}) via HTTP POST to your Python server.
  3. The server extracts the numbers, converts them to a NumPy array, passes them through the loaded pipeline, and gets the prediction.
  4. The server sends the prediction back as JSON (for example, {"price": 420000}).
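The request and response steps above can be sketched without any web framework, as a plain function from a JSON string to a JSON string. Everything here is illustrative: `StubModel` is a hypothetical stand-in for the loaded pipeline (its formula is made up), `handle_predict` and `FEATURE_ORDER` are assumed names, and the response keys mirror the Flask example in section 7. A real server would pass a NumPy array to the pipeline rather than a list.

```python
import json

# Must match the column order the model was trained on.
FEATURE_ORDER = ["sqft", "beds"]


class StubModel:
    """Hypothetical stand-in for the loaded pipeline; uses a made-up formula."""

    def predict(self, rows):
        return [200.0 * sqft + 5000.0 * beds for sqft, beds in rows]


def handle_predict(request_body, model):
    # Parse the JSON payload sent by the website.
    payload = json.loads(request_body)
    # Build one feature row in the training column order.
    row = [float(payload[name]) for name in FEATURE_ORDER]
    # Ask the model for a prediction (models expect a 2D batch of rows).
    prediction = model.predict([row])[0]
    # Send the result back as JSON.
    return json.dumps({
        "status": "success",
        "predicted_price": round(float(prediction), 2),
    })


response_body = handle_predict('{"sqft": 2200, "beds": 3}', StubModel())
print(response_body)  # {"status": "success", "predicted_price": 455000.0}
```

Keeping the translation logic in a plain function like this makes it easy to test separately from the HTTP layer.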

7. Mini Project: A Simple Flask API

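Here is a minimal Flask app that hosts your model on the web. It runs on Flask's built-in development server, which is meant for local testing; a real deployment would run the same app behind a production WSGI server such as Gunicorn.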
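app.py
```python
from flask import Flask, request, jsonify
import joblib
import numpy as np

app = Flask(__name__)

# Load the model ONCE when the server boots
model = joblib.load('house_price_model.pkl')

@app.route('/predict', methods=['POST'])
def predict():
    # 1. Receive JSON data from the website (None if the body is not valid JSON)
    data = request.get_json(silent=True)

    # 2. Extract features, rejecting requests with missing or non-numeric fields
    try:
        sqft = float(data['sqft'])
        beds = float(data['beds'])
    except (KeyError, TypeError, ValueError):
        return jsonify({
            'status': 'error',
            'message': "Request JSON must include numeric 'sqft' and 'beds'."
        }), 400

    # 3. Format into a 2D array for scikit-learn
    features = np.array([[sqft, beds]])

    # 4. Predict
    prediction = model.predict(features)

    # 5. Return JSON response to the user
    # float() converts the NumPy scalar into a plain Python float
    return jsonify({
        'status': 'success',
        'predicted_price': round(float(prediction[0]), 2)
    })

if __name__ == '__main__':
    # Run Flask's development server (local testing only;
    # never enable debug mode on a publicly reachable server)
    app.run(port=5000, debug=True)
```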
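One way to try an endpoint like this without starting a server is Flask's built-in test client. The sketch below is an assumption-laden variant of the app above: it wraps the route in a hypothetical `create_app(model)` factory so that a made-up `StubModel` can replace the real .pkl file, which keeps the example self-contained.

```python
from flask import Flask, jsonify, request


class StubModel:
    """Hypothetical stand-in for the loaded pipeline; uses a made-up formula."""

    def predict(self, rows):
        return [200.0 * sqft + 5000.0 * beds for sqft, beds in rows]


def create_app(model):
    # Hypothetical factory: builds the app around whichever model is passed in.
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        data = request.get_json()
        features = [[data["sqft"], data["beds"]]]
        prediction = model.predict(features)
        return jsonify({
            "status": "success",
            "predicted_price": round(float(prediction[0]), 2),
        })

    return app


app = create_app(StubModel())

# The test client sends requests directly to the app, with no network involved.
client = app.test_client()
response = client.post("/predict", json={"sqft": 2200, "beds": 3})
result = response.get_json()

print(response.status_code, result)
```

Against a running server, the website (or a tool such as curl) would send the same JSON body as an HTTP POST to the /predict route.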

8. Common Mistakes

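  • Version Mismatch: If you train and save the model using scikit-learn version 1.2 on your laptop, but the production web server has scikit-learn version 0.24, joblib.load() will likely fail, or worse, load a model that behaves incorrectly. scikit-learn does not guarantee that saved models work across versions, so pin the same library versions in both environments with a requirements.txt file or Docker.
  • Missing Columns: The user must provide the exact same number of columns, in the exact same order, that the model was trained on. With plain NumPy arrays there are no column names, so swapping sqft and beds raises no error and simply produces a wrong prediction.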
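As a lightweight guard against the version mismatch described above, one option is to record library versions next to the saved model and compare them before loading. This is only a sketch using the standard library; `TRACKED_PACKAGES`, `installed_versions`, and `find_mismatches` are hypothetical names, and the JSON "sidecar" is an assumed convention, not a joblib feature.

```python
import json
from importlib import metadata

TRACKED_PACKAGES = ["scikit-learn", "numpy", "joblib"]


def installed_versions(packages):
    """Return {package: version string}, or None for packages not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_mismatches(saved, current):
    """Return {package: (saved_version, current_version)} where they differ."""
    return {
        name: (saved_version, current.get(name))
        for name, saved_version in saved.items()
        if current.get(name) != saved_version
    }


# At training time: store this JSON text in a file next to the .pkl file.
sidecar_text = json.dumps(installed_versions(TRACKED_PACKAGES))

# On the server: compare before calling joblib.load().
mismatches = find_mismatches(
    json.loads(sidecar_text), installed_versions(TRACKED_PACKAGES)
)
print("Mismatches in this environment:", mismatches)

# The scenario from the text, with hand-written version strings.
example = find_mismatches({"scikit-learn": "1.2.0"}, {"scikit-learn": "0.24.2"})
print(example)  # {'scikit-learn': ('1.2.0', '0.24.2')}
```

A server could refuse to start when `mismatches` is non-empty, which turns a subtle prediction bug into an obvious deployment error.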

9. Best Practices

  • Never retrain in production: The web server should *only* execute model.predict(). It should never execute model.fit(). Model training happens offline. The resulting .pkl file is then uploaded to the server.

10. Exercises

  1. What is the fundamental difference in purpose between model.fit() and joblib.dump()?
  2. Write the line of code required to load a model named salesrf.pkl into a variable named mymodel.

11. MCQ Quiz with Answers

Question 1

Why is it highly recommended to save a Pipeline (containing both the Scaler and the Model) rather than just the model itself?

Question 2

Which Python library is officially recommended for serializing Scikit-Learn models with large NumPy arrays?

12. Interview Questions

  • Q: Describe the end-to-end architecture of how a user on a website receives a prediction from a Scikit-Learn model running on a remote server.
  • Q: Explain why a library version mismatch between the training environment and the production environment is catastrophic for a pickled/joblib model.

13. FAQs

Q: Can I deploy my model to the cloud? A: Yes! You can wrap the Flask app shown above inside a Docker container and deploy it to AWS Elastic Beanstalk, Google Cloud Run, or Heroku in minutes!

14. Summary

A Data Scientist's job is not done until the model is usable. By serializing the entire preprocessing and modeling pipeline into a robust .pkl file via joblib, and wrapping that file inside a web API like Flask, you transform abstract mathematics into a tangible, revenue-generating software product.

15. Next Chapter Recommendation

You have acquired every single skill required to be a professional Machine Learning Engineer. It is time to prove it. In Chapter 20: Final Project, you will embark on the ultimate challenge: building a complete, end-to-end predictive application from raw CSV data to final deployment.
