CHAPTER 04
Beginner
Understanding Large Language Models (LLMs)
Updated: May 14, 2026
20 min read
1. Introduction
The engine driving the modern Generative AI revolution is the Large Language Model (LLM). Systems like GPT-4, Claude, and Gemini are all LLMs. But what exactly makes them "Large," and how do they process human language? In this chapter, we will break open the black box of an LLM, exploring critical concepts like Tokens, Context Windows, and Parameters.
2. Learning Objectives
By the end of this chapter, you will be able to:
- Define what a Large Language Model is.
- Understand how AI breaks words into "Tokens".
- Explain the concept and limitations of the "Context Window".
- Comprehend what "Parameters" are and why model size matters.
3. Beginner-Friendly Explanation
Imagine a master linguist who has spent 5,000 years reading every book, article, and website ever published. If you give this linguist the sentence, *"The cat chased the..."*, they don't have to think very hard to guess that the next word is probably *"mouse"*. An LLM is a massive mathematical engine that does exactly this. It has consumed billions of pages of text, mapping the statistical probabilities of how humans use words. When you ask an LLM a question, it is essentially playing the world's most advanced, high-speed game of autocomplete: it predicts the most likely next word, adds it to the text, and repeats, drawing on the patterns in everything it has read.
4. What makes it "Large"?
LLMs are "Large" in two ways:
- 1. Training Data: They are trained on a substantial portion of the public internet (terabytes of text).
- 2. Parameters: Think of a parameter as a single mathematical connection (a synapse) in the AI's artificial brain. Early models had a few million parameters. Modern models have billions of parameters, and the largest are reported to exceed 1 trillion. More parameters generally mean the model is "smarter" and can handle more complex reasoning, although the quality of the training data matters as well.
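To get a feel for why model size matters in practice, the rough sketch below estimates how much memory is needed just to store a model's parameters. It assumes 2 bytes per parameter (16-bit numbers, a common but not universal choice), and the sizes are illustrative rather than figures for any specific product.

```python
def weight_memory_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate memory needed to store the parameters alone, in GB (10^9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Illustrative sizes only (assumptions, not measurements of a real model).
example_sizes = [
    ("100 million", 100_000_000),
    ("7 billion", 7_000_000_000),
    ("1 trillion", 1_000_000_000_000),
]

for label, size in example_sizes:
    print(f"{label} parameters -> about {weight_memory_gb(size):,.1f} GB")
```

Under these assumptions, a 7-billion-parameter model needs about 14 GB for its parameters alone, which is one reason larger models require more expensive hardware.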
5. Tokens: The Currency of AI
Computers do not understand letters or words; they only understand numbers. Before an LLM reads your prompt, it chops your text into pieces called Tokens, and each token is then represented by a numeric ID.
- A token can be an entire word (e.g., apple).
- A token can be a syllable or chunk of a word (e.g., Ham+bur+ger).
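The idea can be sketched with a toy tokenizer. This is a deliberate simplification: the vocabulary and its IDs are hypothetical, and the greedy longest-match rule used here is not how production tokenizers work (they learn their pieces from data). It only illustrates how text turns into pieces and pieces into numbers.

```python
# Hypothetical mini-vocabulary; real tokenizers have tens of thousands of entries.
TOY_VOCAB = {"apple": 0, "Ham": 1, "bur": 2, "ger": 3, " ": 4}


def toy_tokenize(text: str) -> list[str]:
    """Split text into the longest pieces found in TOY_VOCAB, scanning left to right."""
    pieces = []
    position = 0
    while position < len(text):
        # Try the longest possible piece first, then shrink it one character at a time.
        for end in range(len(text), position, -1):
            candidate = text[position:end]
            if candidate in TOY_VOCAB:
                pieces.append(candidate)
                position = end
                break
        else:
            raise ValueError(f"No vocabulary piece matches at position {position}")
    return pieces


def to_ids(pieces: list[str]) -> list[int]:
    """Map each text piece to its numeric ID."""
    return [TOY_VOCAB[piece] for piece in pieces]


pieces = toy_tokenize("Hamburger")
print(pieces)          # ['Ham', 'bur', 'ger']
print(to_ids(pieces))  # [1, 2, 3]
```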
6. The Context Window (Memory Limit)
An LLM does not have infinite memory. The Context Window is the maximum number of tokens the AI can "hold in its head" at one time.
- If a model has a Context Window of 8,000 tokens (approx. 6,000 words) and you paste a 10,000-word essay into the prompt, the essay will not fit. Depending on the application, the request is rejected or the text is cut down to size, so roughly 4,000 words never reach the model. From the user's point of view, the AI has "forgotten" them.
- Modern breakthroughs have pushed Context Windows to massive sizes. Google's Gemini 1.5 Pro offers a context window of up to 2 million tokens, allowing you to upload entire textbooks and hour-long videos in a single prompt.
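The arithmetic behind the 8,000-token example can be sketched as follows. The conversion rate is the rule of thumb implied above (8,000 tokens for about 6,000 words, or 4 tokens for every 3 words); real ratios vary by language and tokenizer, so treat the result as an estimate.

```python
def estimate_tokens(word_count: int) -> int:
    """Estimate tokens as 4 tokens per 3 words, rounded up (integer ceiling)."""
    return (word_count * 4 + 2) // 3


def fits_in_context(word_count: int, context_window_tokens: int) -> bool:
    """Return True if the estimated token count fits inside the context window."""
    return estimate_tokens(word_count) <= context_window_tokens


essay_words = 10_000
window_tokens = 8_000

print(estimate_tokens(essay_words))                 # 13334
print(fits_in_context(essay_words, window_tokens))  # False
print(fits_in_context(6_000, window_tokens))        # True
```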
7. JSON Example: How APIs Count Tokens
When developers send data to an LLM, the API often returns metadata showing exactly how many tokens were consumed in the transaction.
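As an illustration, the sketch below parses a sample JSON response and reads its token counts. The field names are modeled on the "usage" object returned by OpenAI-style chat APIs; other providers use different names, and the model name and numbers here are made up.

```python
import json

# Illustrative response body (assumption: OpenAI-style "usage" fields, invented values).
raw_response = """
{
  "model": "example-model",
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 30,
    "total_tokens": 42
  }
}
"""

response = json.loads(raw_response)
usage = response["usage"]

print(f"Prompt tokens:     {usage['prompt_tokens']}")
print(f"Completion tokens: {usage['completion_tokens']}")
print(f"Total tokens:      {usage['total_tokens']}")
```

The prompt tokens are what you sent, the completion tokens are what the model wrote back, and the total is what counts toward usage for the transaction.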
8. Python Example: Token Estimation
If you are building an AI app, you can use the tiktoken library (OpenAI's open-source tokenizer) to count tokens *before* sending them to the API, ensuring you don't exceed the Context Window.
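A minimal sketch of this check is shown below. It uses tiktoken's `get_encoding` and `encode` calls with the `cl100k_base` encoding; which encoding matches your model is something to confirm for your own setup. The limits are hypothetical, and the fallback estimate (4 tokens per 3 words) is an assumption used only when tiktoken is not installed or cannot load its data.

```python
def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Count tokens with tiktoken if possible; otherwise fall back to a rough estimate."""
    try:
        import tiktoken  # third-party package: pip install tiktoken

        encoding = tiktoken.get_encoding(encoding_name)
        return len(encoding.encode(text))
    except Exception:
        # Fallback (assumption): about 4 tokens for every 3 words, rounded up.
        word_count = len(text.split())
        return (word_count * 4 + 2) // 3


CONTEXT_WINDOW = 8_000      # hypothetical model limit for this example
RESERVED_FOR_REPLY = 500    # hypothetical room left for the model's answer

prompt = "Explain what a context window is in one paragraph."
used = count_tokens(prompt)

if used + RESERVED_FOR_REPLY <= CONTEXT_WINDOW:
    print(f"OK to send: {used} prompt tokens")
else:
    print(f"Too long: {used} prompt tokens")
```

Reserving some tokens for the reply matters because the model's answer has to fit in the same window as the prompt.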
9. Mini Project
Calculate the Limits: You are using an open-source LLM with a maximum Context Window of 4,000 tokens. You have a conversation history with the bot that is 3,500 words long. Can you ask the bot a new question and get a response without it forgetting the beginning of the conversation? *(Answer: No. At about 4 tokens for every 3 words, 3,500 words is roughly 4,700 tokens. You have already exceeded the 4,000 token Context Window limit. The AI will begin "forgetting" the earliest messages in the chat).*
10. Best Practices
- Mind the Context: When building chatbots, developers must write code that actively truncates or summarizes the conversation history as the chat gets longer, preventing the user from accidentally exceeding the Context Window limit and crashing the app.
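One simple way to truncate is to keep only the most recent messages that fit within a token budget. The sketch below is one possible approach, not a standard API: the function names are hypothetical, and token counts are estimated at 4 tokens per 3 words rather than measured with a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate (assumption): about 4 tokens for every 3 words, rounded up."""
    return (len(text.split()) * 4 + 2) // 3


def truncate_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined estimate fits within max_tokens."""
    kept = []
    total = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if total + cost > max_tokens:
            break  # this message and everything older is dropped
        kept.append(message)
        total += cost
    return list(reversed(kept))  # restore chronological order


history = [
    "User: Hi, can you help me plan a trip?",
    "Bot: Of course! Where would you like to go?",
    "User: Somewhere warm in December.",
]

# With a budget of 20 estimated tokens, the oldest message is dropped.
print(truncate_history(history, max_tokens=20))
```

Dropping old messages is the simplest strategy; summarizing them instead keeps more of the conversation's meaning at the cost of an extra model call.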
11. Common Mistakes
- Assuming LLMs read letter-by-letter: Because LLMs read "Tokens" (chunks of words), they are often unreliable at spelling tasks or counting specific letters. If you ask an LLM, "How many r's are in the word strawberry?", it may answer incorrectly because it doesn't see "s-t-r-a-w-b-e-r-r-y"; it sees one or more numeric token IDs, such as [49832].
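The contrast between the two views can be sketched in a few lines. Ordinary code works on characters, so counting letters is trivial. The token split and IDs below are hypothetical, chosen only to show that the letters are not directly visible in what the model receives.

```python
word = "strawberry"

# Character-level view: what ordinary code sees.
print(list(word))
print(word.count("r"))  # 3

# Token-level view (hypothetical split and IDs; a real tokenizer may differ).
hypothetical_tokens = [("str", 101), ("aw", 102), ("berry", 103)]
token_ids = [token_id for _, token_id in hypothetical_tokens]
print(token_ids)  # [101, 102, 103]

# The letters only reappear when the IDs are mapped back to their text pieces.
rebuilt = "".join(piece for piece, _ in hypothetical_tokens)
print(rebuilt == word)  # True
```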
12. Exercises
- 1. Explain the difference between an LLM's "Training Data" and its "Context Window".
13. MCQs with Answers
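Question 1
In the world of LLMs, what is a "Token"?
*(Answer: A piece of text, either a whole word or a chunk of a word, that the model converts into a number before processing it.)*
Question 2
If you paste a novel into an AI prompt that exceeds the model's Context Window limit, what will happen?
*(Answer: The model cannot hold the whole novel at once. Whatever falls outside the Context Window is not available to it, so that part is effectively "forgotten".)*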
14. Interview Questions
- Q: What is a Context Window, and what strategies would you use in software development to ensure a user's chatbot session doesn't exceed it?
- Q: Explain why LLMs sometimes struggle with character-level manipulation (like counting specific letters in a word) due to tokenization.