LLM 101: Part 1 — What Even Is This Thing?
By Ahmed M. Adly (@RealAhmedAdly)
An internal guide for humans who work with humans who work with LLMs
Welcome to the first in a series of posts that will attempt to explain Large Language Models without making you want to fake a Zoom connection issue. If you've ever wondered what your engineering team is actually doing when they say they're "fine-tuning the model," or why ChatGPT sometimes sounds confidently wrong about easily verifiable facts, you're in the right place.
Let's Start with the Obvious Question
What is a Large Language Model?
An LLM is, fundamentally, a very large mathematical function that predicts what word (or really, what token — hold that thought) should come next in a sequence. That's it. That's the whole trick.
I know, I know — it feels like these things are thinking. They write poetry, debug code, explain quantum physics, and occasionally roast you with unexpected eloquence. But at their core, they're pattern-matching machines on an absolutely massive scale. They've been trained on so much text from the internet, books, and various other sources that they've developed a genuinely uncanny ability to predict plausible continuations.
Think of it like the world's most overachieving autocomplete. Your phone's keyboard suggests "the" after you type "to" because it's seen that pattern a million times. An LLM does the same thing, just with a vast corpus of written human knowledge behind it and a few hundred billion parameters to help it along.
The Glossary (or: Words That Will Make You Sound Dangerous in Meetings)
Token: Not a coin you use at an arcade. A token is a chunk of text — could be a word, part of a word, or even a single character. LLMs don't read letter by letter like you do; they break text into these tokens. "Unbelievable" might be split into "un," "believ," and "able." Why? Efficiency. Computers gonna compute.
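To make that splitting concrete, here's a toy tokenizer. This is a sketch, not how production tokenizers work — real ones (byte-pair encoding and friends) learn their vocabularies from data, whereas this vocabulary is hand-picked purely for illustration:

```python
# Toy subword tokenizer: greedy longest-match against a fixed vocabulary.
# Real tokenizers (e.g. BPE) learn their vocabularies from data; this
# hand-picked vocab just shows how one word becomes several tokens.
VOCAB = {"un", "believ", "able", "data", "base"}

def tokenize(text: str) -> list[str]:
    tokens = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            # Try the longest vocabulary entry that matches at position i.
            for j in range(len(word), i, -1):
                if word[i:j] in VOCAB:
                    tokens.append(word[i:j])
                    i = j
                    break
            else:
                tokens.append(word[i])  # fall back to a single character
                i += 1
    return tokens

print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
```

The point isn't the algorithm; it's that the model never sees "unbelievable" as one unit — it sees three tokens and learns patterns over those.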
Parameters: These are the numbers — billions of them — that the model adjusts during training to get better at its job. Think of them as tiny dials. If you had 7 billion dials and knew exactly how to tune each one, you could build something that looks a lot like intelligence. Parameters are basically the "brain cells" of the model, though that analogy makes neuroscientists cry.
Context Window: How much text the model can "see" at once. If you're having a conversation with an LLM and it suddenly forgets what you said three messages ago, you've probably hit the edge of its context window. Current models can handle anywhere from a few thousand to a few million tokens of context. Yes, that's a wide range. Yes, it matters a lot.
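That "forgetting" is easy to sketch: when a conversation outgrows the window, the oldest messages simply fall off the edge. A minimal sketch — the 8-token window and the whitespace-split "tokenizer" are both made up for illustration:

```python
# Sketch: keep only as much recent conversation as fits in the window.
# Splitting on whitespace stands in for real tokenization here.
CONTEXT_WINDOW = 8  # real models use thousands to millions of tokens

def visible_context(messages: list[str]) -> list[str]:
    kept, used = [], 0
    for msg in reversed(messages):       # walk from newest to oldest
        cost = len(msg.split())          # "token" count of this message
        if used + cost > CONTEXT_WINDOW:
            break                        # older messages no longer fit
        kept.append(msg)
        used += cost
    return list(reversed(kept))

chat = ["my name is Alice", "write a haiku", "what is my name"]
print(visible_context(chat))  # ['write a haiku', 'what is my name']
```

Notice that "my name is Alice" didn't fit — so when the model answers "what is my name", it genuinely has no access to the answer. That's the context window edge in action.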
Prompt: Whatever you type into the model. Prompting is an art form now, apparently. Some people make six figures writing really good ones. I wish I were joking.
Inference: The act of actually using a trained model to generate text. Training is teaching it; inference is letting it loose. When you ask ChatGPT a question, that's inference. We'll get into training in Part 2.
Hallucination: When the model makes something up with complete confidence. It's not lying — it doesn't know truth from fiction. It's just predicting the next token that seems plausible based on its training. Sometimes that's a real fact. Sometimes it's a completely invented citation. Always verify anything important.
How Does It Actually Work? (The 30,000-Foot View)
Here's the basic flow:
- You give it text ("Write me a haiku about database migrations")
- It breaks that into tokens (probably something like "Write," "me," "a," "ha," "iku," "about," "database," "migr," "ations")
- It runs those tokens through its neural network — a massive mathematical structure with all those parameters we mentioned
- It predicts what token should come next based on patterns it learned during training
- It adds that token to the sequence and repeats until it produces a special stop token that says it's done (or it hits a length limit)
The output? Hopefully a haiku. Maybe about databases. Possibly accurate.
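The loop above can be sketched with a bigram model — a deliberately tiny stand-in for the billion-parameter network, "trained" on two sentences instead of the internet:

```python
import random
from collections import defaultdict

# Step 4 in miniature: predict the next token from patterns observed in
# training text. A bigram count table stands in for the neural network.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

follows = defaultdict(list)
for cur, nxt in zip(corpus, corpus[1:]):
    follows[cur].append(nxt)            # record every observed continuation

def generate(prompt: str, max_tokens: int = 10, seed: int = 0) -> str:
    random.seed(seed)
    tokens = prompt.split()
    for _ in range(max_tokens):
        nxt = random.choice(follows[tokens[-1]])  # sample a plausible next token
        tokens.append(nxt)
        if nxt == ".":                  # a crude "stop token"
            break
    return " ".join(tokens)

print(generate("the cat"))
```

Run it and you'll get something like "the cat sat on the mat ." — grammatical, plausible, and produced with zero understanding of cats. Scale the count table up to hundreds of billions of parameters and the corpus up to the internet, and you have the basic idea of an LLM.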
What It's Good At
LLMs excel at:
- Understanding and generating natural language (shocking, I know)
- Summarizing text without completely mangling the meaning
- Translating between languages with surprising nuance
- Writing code (though you should definitely review it)
- Explaining complex topics in different ways until one clicks
- Maintaining a consistent voice across a long piece of text
What It's Terrible At
LLMs struggle with:
- Math beyond basic arithmetic. They're predicting tokens, not calculating. Ask one to multiply 4,783 by 8,291 and watch it confidently give you something wildly wrong. (Though newer models can use tools like calculators to help.)
- Knowing what they don't know. They'll give you an answer even when they have no idea. Every time.
- Consistency across different phrasings. Ask the same question two different ways, get two different answers. It's not being capricious; it's just following different statistical paths through its training data.
- Current events or real-time information. Most models have a knowledge cutoff date. They know things up to that date and nothing after, unless they can search the web.
- Following precise instructions. They're fuzzy by nature. If you need exactly 500 words, it'll give you 487 or 523 and think it nailed it.
The Part Where I Set Expectations
LLMs are powerful tools. They're not magic, they're not sentient (despite what some venture capitalists would have you believe), and they're definitely not infallible. They're statistical models that happen to be good enough at predicting text that they can fake a lot of understanding.
That's not diminishing what they can do — it's remarkable. But it helps to know what's under the hood when you're deciding whether to trust one with a task.
In Part 2, we'll dive into how these things are actually trained (spoiler: it involves more text than any human could read in a thousand lifetimes). In Part 3, we'll go deeper into what "training" actually means under the hood. Part 4 will cover fine-tuning and why we don't just train one giant model to do everything. Part 5 will demystify RAG — which sounds like a cleaning cloth but is actually one of the most useful techniques for making LLMs work with your company's specific data.
Questions? Concerns? Existential dread about the future of human communication? Hold them. We're just getting started.
Next: Part 2 — Training: or, How We Teach Computers to Predict Words by Showing Them the Entire Internet