CHAPTER 10 Intermediate

Logistic Regression for Classification

Updated: May 16, 2026

1. Introduction

In the previous chapter, we used Linear Regression to predict a continuous number (House Price). But what if our target is a category? What if we want to predict whether an email is "Spam" or "Not Spam", or whether a patient has a disease ("Yes" or "No")? This kind of task is known as Classification. Despite its confusing name, Logistic Regression is a foundational classification algorithm in Machine Learning. In this chapter, we will build models that can make decisions.

2. Learning Objectives

By the end of this chapter, you will be able to:

Differentiate between Regression and Classification tasks.

Understand the concept of Binary Classification.

Explain how the Sigmoid function outputs probabilities.

Implement LogisticRegression in Scikit-learn.

Use predict_proba to analyze model confidence.

3. Classification Basics

Classification aims to draw a boundary (a decision boundary) between different classes of data.

Binary Classification: Predicting exactly two classes (0 or 1, Spam or Not Spam, Yes or No).

Multi-class Classification: Predicting three or more classes (Apple, Banana, or Orange).

4. Why Not Linear Regression?

If you fit a straight line (Linear Regression) to binary data (values of 0 and 1), the line is unbounded: it keeps rising or falling as the input grows. The model might predict a value of 3.5 or -1.2, which makes no sense if the only valid answers are 0 and 1. Logistic Regression solves this by passing the output of the linear equation through an S-shaped curve called the Sigmoid Function.
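The problem can be seen in a minimal sketch. The one-feature dataset and the two extreme inputs below are hypothetical values chosen for illustration, not data from this chapter.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: one feature (number of links) and a 0/1 label.
X = np.array([[1], [2], [3], [10], [12], [15]])
y = np.array([0, 0, 0, 1, 1, 1])

line = LinearRegression().fit(X, y)

# Query the fitted line far outside the range of the training data.
extreme = np.array([[-20], [60]])
linear_outputs = line.predict(extreme)

# The straight line has no floor or ceiling, so these are not valid probabilities.
print(linear_outputs)
```

The first value falls below 0 and the second rises above 1, so neither can be read as a probability or as a class label.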

5. The Sigmoid Function

The Sigmoid function takes any real number z (from negative infinity to positive infinity) and squashes it into a number strictly between 0 and 1, using sigmoid(z) = 1 / (1 + e^(-z)). Because the output always lies between 0 and 1, we can interpret it as a Probability.

If the output is 0.85, the model is 85% confident the email is Spam.

If the output is 0.10, the model is 10% confident the email is Spam (meaning it is likely Not Spam).
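As a rough illustration, the sigmoid can be written in a few lines of NumPy. The function name `sigmoid` and the sample scores are choices made for this sketch.

```python
import numpy as np

def sigmoid(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw scores, from strongly negative to strongly positive.
scores = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
probs = sigmoid(scores)

for z, p in zip(scores, probs):
    print(f"z = {z:+.1f} -> sigmoid(z) = {p:.3f}")
```

A score of 0 maps to exactly 0.5. Larger positive scores approach 1 and larger negative scores approach 0, without ever reaching either end.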

*Threshold:* By default, Scikit-learn's .predict() applies a threshold of 0.5. A probability above 0.5 is classified as Class 1; a probability below 0.5 is classified as Class 0.
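To connect the sigmoid and the threshold to an actual model, the sketch below fits LogisticRegression on the toy spam data from the mini project later in this chapter, then reproduces its predictions by hand. The three new emails are hypothetical, and the code assumes the default binary setup, in which the probability of Class 1 is the sigmoid of the model's linear score.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy spam data from this chapter's mini project: [links, spelling mistakes].
X = np.array([[1, 0], [2, 1], [10, 5], [12, 8], [3, 0], [15, 6]])
y = np.array([0, 0, 1, 1, 0, 1])
model = LogisticRegression().fit(X, y)

# Hypothetical new emails, invented for this illustration.
new_emails = np.array([[2, 0], [11, 7], [6, 3]])

z = model.decision_function(new_emails)       # raw linear score: w . x + b
manual_prob = 1.0 / (1.0 + np.exp(-z))        # sigmoid applied by hand
manual_class = (manual_prob > 0.5).astype(int)  # 0.5 threshold applied by hand

print("P(spam):", manual_prob.round(3))
print("manual :", manual_class)
print("predict:", model.predict(new_emails))
```

The hand-computed classes should agree with .predict(), which is one way to check that the 0.5 threshold is what separates the two classes.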

6. Mini Project: Spam Detection Model

Let's implement Logistic Regression to predict if an email is Spam based on two engineered features: "Number of Links" and "Number of Spelling Mistakes".

python

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# 1. Data (X = [Links, Mistakes], y = [0: Not Spam, 1: Spam])
X = np.array([[1, 0], [2, 1], [10, 5], [12, 8], [3, 0], [15, 6]])
y = np.array([0, 0, 1, 1, 0, 1])

# 2. Split Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# 3. Train Model
model = LogisticRegression()
model.fit(X_train, y_train)

# 4. Make Predictions (Outputs 0 or 1)
predictions = model.predict(X_test)

# 5. Evaluate
acc = accuracy_score(y_test, predictions)
print(f"Spam Detection Accuracy: {acc * 100}%")

7. Probability Predictions

The .predict() method forces the output into a strict 0 or 1. However, in business applications, knowing the *confidence* of the prediction is vital. We use .predict_proba() for this.

python

# Predict probabilities for the Test Set
probabilities = model.predict_proba(X_test)

# The output is an array showing [Probability of 0, Probability of 1]
for i in range(len(X_test)):
    print(f"Email {i} - Prob Not Spam: {probabilities[i][0]:.2f} | Prob Spam: {probabilities[i][1]:.2f}")

*Business Use Case:* If the model says an email is 51% likely to be spam, you might send it to the Inbox anyway because false positives annoy users. You might manually set a threshold so it only goes to the Spam folder if the probability is > 0.90.
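One way to apply such a custom threshold is to compare the spam probability against your own cutoff instead of calling .predict(). This is a sketch only: the constant name `SPAM_THRESHOLD` and the incoming emails are hypothetical, and the model is refit on the full toy dataset so the example runs on its own.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy spam data from the mini project: [links, spelling mistakes].
X = np.array([[1, 0], [2, 1], [10, 5], [12, 8], [3, 0], [15, 6]])
y = np.array([0, 0, 1, 1, 0, 1])
model = LogisticRegression().fit(X, y)

SPAM_THRESHOLD = 0.90  # stricter cutoff than the default 0.5

# Hypothetical incoming emails, invented for this illustration.
incoming = np.array([[2, 0], [6, 3], [14, 7]])

# Column 1 of predict_proba holds the probability of Class 1 (Spam).
spam_prob = model.predict_proba(incoming)[:, 1]
to_spam_folder = spam_prob > SPAM_THRESHOLD

for features, p, flagged in zip(incoming, spam_prob, to_spam_folder):
    folder = "Spam" if flagged else "Inbox"
    print(f"{features} -> P(spam) = {p:.2f} -> {folder}")
```

Raising the cutoff can only move emails from the Spam folder to the Inbox, never the other way, which is the trade-off you want when false positives are costly.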

8. Common Mistakes

Forgetting to Scale: Logistic Regression is fitted with an iterative, gradient-based optimizer. Unscaled data (like a column with values in the millions next to a column with 0s and 1s) can cause the optimizer to converge slowly or stop with a convergence warning. Scaling the features first, for example with StandardScaler, usually avoids this.
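A common way to apply scaling is to chain StandardScaler and LogisticRegression in a pipeline, so the same scaling is applied at training and prediction time. The data below (an income-like column next to a 0/1 flag) is hypothetical and exists only to mimic the mismatch in scale described above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: a column in the millions next to a 0/1 column.
X = np.array([
    [2_500_000, 0], [3_100_000, 1], [400_000, 0],
    [350_000, 1], [2_900_000, 0], [500_000, 1],
])
y = np.array([1, 1, 0, 0, 1, 0])

# The scaler learns mean and standard deviation during fit()
# and reuses them automatically whenever predict() is called.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression())
scaled_model.fit(X, y)

new_rows = np.array([[3_000_000, 0], [450_000, 1]])
preds = scaled_model.predict(new_rows)
print(preds)
```

Keeping the scaler inside the pipeline also means it is fitted only on the data passed to fit(), so statistics from a test set do not leak into training.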

Class Imbalance: If your dataset has 99,000 normal transactions and 1,000 fraudulent ones, the model can achieve 99% accuracy by simply predicting "Normal" every time. We will address evaluating imbalanced data in Chapter 16.
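The arithmetic behind this trap can be checked directly. The sketch below builds labels with the same 99,000 / 1,000 split and scores a "model" that always answers "Normal"; the variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# 99,000 normal transactions (0) and 1,000 fraudulent ones (1).
y_true = np.array([0] * 99_000 + [1] * 1_000)

# A useless "model" that predicts Normal for every transaction.
always_normal = np.zeros_like(y_true)

baseline_accuracy = accuracy_score(y_true, always_normal)
frauds_caught = int(np.sum((y_true == 1) & (always_normal == 1)))

print(f"Accuracy: {baseline_accuracy:.2%}")
print(f"Frauds caught: {frauds_caught} of 1000")
```

The accuracy looks excellent even though not a single fraudulent transaction is detected, which is why accuracy alone is a misleading score on imbalanced data.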

9. Best Practices

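Interpretability: Just like Linear Regression, you can inspect model.coef_ in Logistic Regression. The sign of each coefficient shows whether a feature pushes the prediction toward Class 1 (positive) or toward Class 0 (negative), and, when the features are on the same scale, the magnitude indicates which feature has the largest impact.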
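As a small sketch, the coefficients can be printed next to the feature names. The labels in `feature_names` are chosen here for readability, and the model is refit on the full toy spam dataset so the example is self-contained.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy spam data from the mini project: [links, spelling mistakes].
X = np.array([[1, 0], [2, 1], [10, 5], [12, 8], [3, 0], [15, 6]])
y = np.array([0, 0, 1, 1, 0, 1])
model = LogisticRegression().fit(X, y)

feature_names = ["links", "mistakes"]  # labels chosen for this sketch

# For a binary problem, coef_ has shape (1, n_features).
for name, weight in zip(feature_names, model.coef_[0]):
    print(f"{name}: {weight:+.3f}")
print(f"intercept: {model.intercept_[0]:+.3f}")
```

Because these two features are not on the same scale, compare the magnitudes with care; fitting on standardized features makes the comparison more meaningful.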

10. Exercises

1. If the Sigmoid function outputs 0.30, and the default threshold is 0.5, which class (0 or 1) will Scikit-learn predict?

2. Write the Scikit-learn method used to view the exact probabilities of a prediction rather than just the final class.

11. MCQ Quiz with Answers

Question 1

Despite its name, Logistic Regression is used for what type of Machine Learning task?

Question 2

What is the mathematical function that squashes the output of Logistic Regression into a probability between 0 and 1?

12. Interview Questions

Q: Explain why Linear Regression cannot be used effectively for binary classification problems.

Q: In a real-world scenario like medical diagnosis, why might you use .predict_proba() to check the raw probability instead of relying on .predict()?

13. FAQs

Q: Can Logistic Regression predict more than two classes?

A: Yes. While it is fundamentally binary, Scikit-learn automatically handles Multi-class classification using techniques like "One-vs-Rest" (OVR) or "Multinomial", allowing it to predict multiple categories.
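A short sketch of the multi-class case, assuming scikit-learn's bundled Iris dataset (three flower species), which this chapter does not otherwise use. The value `max_iter=1000` is a choice made here to give the optimizer room to converge, and which multi-class strategy is applied internally depends on the solver and library version.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Iris has three classes, labelled 0, 1 and 2.
X, y = load_iris(return_X_y=True)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# One probability column per class; each row sums to 1.
proba = clf.predict_proba(X[:3])
print(clf.classes_)
print(np.round(proba, 3))
```

No extra configuration is needed: the same fit, predict, and predict_proba calls work, and predict_proba simply returns one column per class.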

14. Summary

Logistic Regression is the gateway to Classification. By utilizing the Sigmoid function, it transforms raw mathematical outputs into interpretable probabilities. Whether you are predicting email spam, disease presence, or customer churn, this algorithm offers a fast, reliable, and highly interpretable baseline for many classification tasks.

15. Next Chapter Recommendation

Linear and Logistic regressions rely on mathematical equations drawing straight lines. But human logic often works like a flowchart: "If age > 18, and income > 50k, then approve loan." In Chapter 11: Decision Trees and Random Forests, we will explore algorithms that mimic human decision-making.
