CHAPTER 06
Intermediate
Logistic Regression for Classification
Updated: May 16, 2026
1. Introduction
Despite its confusing name, Logistic Regression is used for classification, not for predicting continuous values. It is one of the best-known and most foundational Classification algorithms, and a standard baseline for Binary Classification (Yes/No problems). In this chapter, we will open the black box of `scikit-learn`, understand the mathematical Sigmoid curve that powers it, and build a model to detect Spam emails.
2. Learning Objectives
By the end of this chapter, you will be able to:
- Explain why standard Linear Regression fails at classification.
- Understand the mathematics of the Sigmoid Function.
- Train a `LogisticRegression` model using `scikit-learn`.
- Output probability scores using `.predict_proba()`.
- Understand Decision Thresholds.
3. Why Linear Regression Fails
Imagine plotting tumor sizes. Small tumors are Benign (0), large tumors are Malignant (1). If you fit a straight Linear Regression line through these points, the line extends without bound in both directions. For a massive tumor it might predict a value of 3.5, and for a tiny one it might predict a negative value. But classes must be 0 or 1, so a prediction of 3.5 makes no sense!
Furthermore, a straight line is highly sensitive to extreme outliers: a single unusually large tumor tilts the fitted line, which moves the point where the line crosses 0.5 and can flip the predictions for ordinary cases near the boundary.
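To make the problem concrete, here is a minimal sketch that fits an ordinary least-squares line to 0/1 labels. The tumor sizes, labels, and variable names are invented for illustration; `np.polyfit` is used only as a convenient way to fit a straight line.

```python
import numpy as np

# Hypothetical tumor sizes (cm) and labels: 0 = benign, 1 = malignant.
sizes = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Fit a straight line: label ~ slope * size + intercept.
slope, intercept = np.polyfit(sizes, labels, deg=1)

# Nothing constrains the line to stay inside [0, 1].
tiny_tumor = slope * 0.2 + intercept
huge_tumor = slope * 12.0 + intercept

print(f"Prediction for a 0.2 cm tumor: {tiny_tumor:.2f}")
print(f"Prediction for a 12 cm tumor:  {huge_tumor:.2f}")
```

With this toy data the first prediction falls below 0 and the second lands well above 1, so neither can be read as a class or as a probability.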
4. The Math: The Sigmoid Function
To fix this, Logistic Regression takes the straight line ($y = mx + b$) and passes its output through a mathematical filter called the Sigmoid Function, $\sigma(y) = \frac{1}{1 + e^{-y}}$. The Sigmoid function squashes any number (from negative infinity to positive infinity) into the range between 0.0 and 1.0.
*The Result:* The algorithm no longer outputs nonsense like 3.5. It outputs a Probability.
If the model outputs 0.85, it means it is 85% confident the item belongs to Class 1.
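The squashing behaviour can be sketched in a few lines of plain Python using the standard sigmoid definition. The slope `m`, intercept `b`, and the sample inputs are arbitrary values chosen for illustration, not parameters learned from data.

```python
import math

def sigmoid(z: float) -> float:
    """Standard sigmoid, 1 / (1 + e^(-z)), mapping any real z into [0, 1]."""
    # Two algebraically equal branches keep exp() from overflowing
    # when z is a large negative or large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

# Hypothetical line parameters, for illustration only.
m, b = 1.2, -3.0

for x in (-10.0, 0.0, 2.5, 10.0):
    z = m * x + b  # raw, unbounded output of the straight line
    print(f"x={x:6.1f}  line={z:7.2f}  sigmoid={sigmoid(z):.4f}")
```

However far the line's raw output strays, the value after the sigmoid stays between 0 and 1, and a raw output of exactly 0 maps to 0.5.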
5. Decision Thresholds
Once the model calculates a probability (e.g., 0.85), how does it make a final hard decision? It uses a Threshold. By default, scikit-learn's `.predict()` uses a threshold of 0.50 (50%).
- If Probability $\ge 0.50 \rightarrow$ Predict Class 1.
- If Probability $< 0.50 \rightarrow$ Predict Class 0.
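The rule above can be written as a tiny helper. The function name `to_class` and the sample probabilities are made up for this sketch; this is not how scikit-learn is implemented internally, only the same decision rule spelled out.

```python
def to_class(probability: float, threshold: float = 0.50) -> int:
    """Convert a Class 1 probability into a hard 0/1 label."""
    return 1 if probability >= threshold else 0

probabilities = [0.85, 0.50, 0.49, 0.10]
labels = [to_class(p) for p in probabilities]
print(labels)  # [1, 1, 0, 0]
```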
6. Mini Project: Email Spam Detection
Let's build a Logistic Regression model to predict if an email is Spam (1) or Safe (0) based on the number of links and the length of the email.
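A minimal sketch of such a model, assuming a tiny hand-made dataset. The ten emails, their feature values, and the variable names are invented for illustration and do not come from a real corpus; `max_iter=1000` is just a generous iteration limit for unscaled features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: [number_of_links, email_length_in_words].
X_train = np.array([
    [0, 250], [1, 300], [0, 180], [2, 400], [1, 220],  # safe emails
    [8, 60], [10, 45], [7, 80], [12, 30], [9, 55],     # spam emails
])
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])     # 0 = Safe, 1 = Spam

# Train the classifier on the labelled emails.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Two unseen emails to classify.
new_emails = np.array([
    [11, 40],   # many links, very short
    [0, 320],   # no links, long
])
predictions = model.predict(new_emails)
print(predictions)
```

On this toy data the short, link-heavy email should be labelled Spam (1) and the long, link-free one Safe (0). A real spam filter would need far more emails and a held-out test set.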
7. Extracting Probabilities (predict_proba)
In business, you rarely want just a hard "1" or "0". You want to know *how confident* the model is before acting. We use `.predict_proba()` to see the probabilities produced by the Sigmoid.
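As a hedged sketch of how this looks in code, the snippet below trains on a small invented dataset and asks for probabilities instead of labels. The data and the borderline email are assumptions made for this example, so the printed numbers will depend on them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: [number_of_links, email_length_in_words]; 1 = Spam.
X = np.array([[0, 250], [1, 300], [2, 400], [1, 220],
              [8, 60], [10, 45], [7, 80], [9, 55]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression(max_iter=1000).fit(X, y)

email = np.array([[6, 120]])         # a borderline-looking email
proba = model.predict_proba(email)   # shape (1, 2): [P(Safe), P(Spam)]
p_safe, p_spam = proba[0]

print(f"P(Safe) = {p_safe:.3f}, P(Spam) = {p_spam:.3f}")
print("Hard prediction:", model.predict(email)[0])
```

The columns of `predict_proba` follow the order of `model.classes_`, so column 0 is Safe and column 1 is Spam here, and each row sums to 1.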
*For example, if the Spam probability is 87.5%, that is greater than the 50% threshold, so the model returns Class 1!*
8. Common Mistakes
- Assuming Logistic Regression draws curved boundaries: Despite using the curved Sigmoid function for probabilities, the actual Decision Boundary that Logistic Regression draws through the data is a perfectly straight line (a flat plane or hyperplane when there are more than two features). If your data cannot be separated by a straight line, Logistic Regression will underfit.
- Ignoring the Threshold: In a medical scenario (detecting cancer), you don't want to wait until the model is 50% sure. You might want to flag the patient if the model is even 15% sure! You can manually extract the probabilities using `predict_proba` and write your own custom `if prob > 0.15` logic to override the default 50% threshold.
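The custom-threshold idea can be sketched as follows. The tumor sizes, the four patients, and the 0.15 cutoff are illustrative assumptions, not clinical guidance.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical tumor sizes (cm); 1 = malignant.
X = np.array([[1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0], [4.5]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression().fit(X, y)

patients = np.array([[1.2], [2.3], [2.7], [4.2]])
p_malignant = model.predict_proba(patients)[:, 1]  # column 1 = P(class 1)

default_flags = model.predict(patients)            # default 0.50 cutoff
custom_flags = (p_malignant > 0.15).astype(int)    # flag at 15% risk

print("P(malignant):", np.round(p_malignant, 3))
print("Default:", default_flags)
print("Custom :", custom_flags)
```

Lowering the threshold can only add flags, never remove them: every patient flagged at 0.50 is also flagged at 0.15. The trade-off is more false alarms in exchange for fewer missed cases.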
9. Best Practices
- Feature Scaling: Logistic Regression uses an internal optimizer to find the best boundary. If your features are on vastly different scales (e.g., Links: 1-10, Words: 100-5000), the optimizer may converge slowly or stop before it reaches a good solution. Always use a `StandardScaler` (covered in Chapter 13).
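As a brief preview, one common way to combine the two steps is a scikit-learn pipeline. The dataset below is invented to mimic the Links/Words scales mentioned above, and the variable names are assumptions for this sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: [links (roughly 0-12), words (roughly 100-5000)].
X = np.array([[0, 2500], [1, 3200], [2, 4800], [1, 1900],
              [8, 300], [10, 150], [7, 600], [12, 120]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 1 = Spam

# The pipeline learns the scaling from the training data and
# re-applies the same scaling automatically at prediction time.
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X, y)

scaled_predictions = clf.predict(np.array([[9, 200], [0, 3000]]))
print(scaled_predictions)
```

Wrapping the scaler and the model together means new emails are never passed to the model unscaled by accident.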
10. Exercises
- 1. If `.predict_proba()` outputs `[0.30, 0.70]`, what hard class will `.predict()` output, assuming the default threshold?
- 2. Why is a straight line ($y = mx+b$) mathematically inappropriate for predicting binary classes like 0 or 1?
11. MCQ Quiz with Answers
Question 1
Despite its name, Logistic Regression is used for what type of Machine Learning task?
Question 2
What is the mathematical purpose of the Sigmoid function in Logistic Regression?
12. Interview Questions
- Q: Explain how a Logistic Regression model utilizes a Threshold to convert its raw mathematical output into a final class prediction.
- Q: In what specific business scenario would you manually lower the decision threshold of a Logistic Regression model from 0.50 to 0.10?
13. FAQs
Q: Can Logistic Regression handle Multiclass problems (e.g., Cat, Dog, Horse)?
A: Yes! scikit-learn's `LogisticRegression` handles multiclass classification directly. Depending on the version and solver, it uses either a "One-vs-Rest" strategy or a "Multinomial" (softmax) formulation, with recent versions defaulting to Multinomial. Either way, the output probabilities sum to 100% across all classes.
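A quick way to see this is to train on a three-class dataset. The sketch below uses scikit-learn's bundled Iris data as a stand-in for the Cat/Dog/Horse example, since it also has three classes; `max_iter=1000` is simply a generous iteration limit.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # 3 flower species as classes 0, 1, 2

model = LogisticRegression(max_iter=1000).fit(X, y)

proba = model.predict_proba(X[:3])  # one row per sample, one column per class
print(np.round(proba, 3))
print("Row sums:", proba.sum(axis=1))
```

Each row holds one probability per class, and `.predict()` returns the class whose column is largest.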