Temperature: The Creativity Dial
Lesson 1, Material 5: Temperature Parameter Summary
Temperature is a crucial setting (0 to 2) that controls whether AI selects the most probable token or samples from multiple options, directly impacting output predictability versus creativity.

At low temperatures like 0.2, the AI sharpens probability distributions, concentrating likelihood on top tokens—if "blue" has 45% probability at temperature 1.0, it jumps to 92% at 0.2, ensuring nearly identical responses every time. At high temperatures like 1.5, the distribution flattens, spreading probability across alternatives—"blue" drops to 28% while "clear" rises to 18%, enabling creative variety.

Low temperature (0.0-0.5) excels for factual tasks requiring consistency: translations, technical documentation, data extraction, and question-answering, where "What is the capital of France?" reliably yields "Paris." High temperature (0.8-2.0) suits creative endeavors: storytelling, brainstorming, and marketing copy, where "Write a fantasy opening" produces diverse results like "The dragon's shadow fell" or "Magic died the day Elara was born."

Different AI interfaces use preset defaults (typically 0.7-1.0 for conversation), but API access allows precise control—customer service bots use 0.3 for accuracy, creative assistants use 1.2 for variation. The trade-off: higher creativity risks hallucinations and factual errors.
You now understand how temperature transforms the same model from predictable assistant to creative collaborator, but this raises a critical question: does the AI remember your entire conversation history, or are there limits to what it can process at once?
Recap
You now understand that AI generates text by predicting one token at a time, always calculating probabilities and selecting the highest-probability option. But if AI always picks the most probable token, every response would be identical—so how does it produce different outputs?
The Temperature Parameter Controls Randomness
Temperature is a setting that controls whether AI picks the most likely token or samples from multiple probable options. It's a number typically ranging from 0 to 2 that you can adjust before the AI generates a response.
When you set temperature to a low value like 0.2, the AI behaves predictably—it almost always selects the highest-probability token at each step. When you set it to a high value like 1.5, the AI considers less probable tokens and might select them, creating varied and creative outputs.
Here's how it works: Remember that AI converts its raw scores into probabilities using softmax, creating a probability distribution in which the probabilities of all possible tokens sum to 100%. Temperature modifies these probabilities before the AI makes its selection. A low temperature sharpens the distribution, concentrating probability on the most likely tokens. A high temperature flattens it, spreading probability more evenly across multiple options.
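As a rough sketch of the mechanism, temperature scaling amounts to dividing each raw score by the temperature before applying softmax. The scores below are invented for illustration; a real model computes them over a vocabulary of tens of thousands of tokens.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores into probabilities, scaled by temperature.

    Dividing by a temperature below 1 sharpens the distribution;
    dividing by a temperature above 1 flattens it.
    """
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Invented raw scores for three candidate tokens
logits = [2.0, 1.0, 0.5]

p_low = softmax_with_temperature(logits, 0.2)   # sharply peaked on the first token
p_high = softmax_with_temperature(logits, 1.5)  # much flatter across all three
```

Note that both results are still valid probability distributions; only how the total probability is spread across tokens changes.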
How Temperature Changes Token Selection
Let's see temperature in action with a concrete example. After the prompt "The sky is," suppose the AI calculates these probabilities at temperature 1.0:
- "blue" → 45%
- "clear" → 12%
- "cloudy" → 10%
- "gray" → 8%
- "dark" → 7%
- Other tokens → 18%
At temperature 0.2 (low), the AI sharpens this distribution dramatically. "Blue" might jump to 92% probability, while "clear" drops to 3%, and other options become negligible. The AI will almost certainly select "blue" every single time.
At temperature 1.5 (high), the distribution flattens. "Blue" might drop to 28%, "clear" rises to 18%, "cloudy" to 15%, and even less common options like "endless" or "beautiful" gain meaningful probability. The AI now has real variety in what it might select.
The key insight: temperature doesn't change which token is most likely—"blue" remains the top choice in our example. Instead, it changes whether the AI will actually pick that top choice or consider alternatives.
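This reweighting can be sketched by rescaling log-probabilities and renormalizing. For simplicity the sketch tracks only the five listed tokens and drops the 18% "other" mass, so the exact percentages won't match the illustrative figures above, but the sharpening, flattening, and preserved ordering all show up.

```python
import math

def reweight(probs, temperature):
    """Apply a new temperature to an existing probability distribution
    by rescaling its log-probabilities and renormalizing."""
    logs = [math.log(p) / temperature for p in probs]
    peak = max(logs)
    exps = [math.exp(x - peak) for x in logs]
    total = sum(exps)
    return [e / total for e in exps]

# Probabilities from the "The sky is" example at temperature 1.0
# (order: blue, clear, cloudy, gray, dark)
probs = [0.45, 0.12, 0.10, 0.08, 0.07]

low = reweight(probs, 0.2)   # sharpened: "blue" dominates
high = reweight(probs, 1.5)  # flattened: alternatives gain ground
```

In both outputs "blue" is still the single most probable token; only the margin by which it leads changes.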
Low Temperature for Factual Tasks
You should use low temperature settings (0.0 to 0.5) when you need consistent, accurate, and predictable responses.
Low temperature works best for tasks like:
- Answering factual questions where there's one correct answer
- Translating text from one language to another
- Extracting specific information from documents
- Writing technical documentation or code
- Summarizing content where accuracy matters
For example, if you ask "What is the capital of France?" at temperature 0.2, the AI will reliably respond "Paris" because that's the overwhelmingly most probable answer. The low temperature ensures it won't randomly select creative but incorrect alternatives.
Even at temperature 0.0, responses aren't perfectly deterministic, because of factors like parallel computation and floating-point rounding in how AI systems process requests, but they're highly consistent.
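To make the consistency claim concrete, here is a small sampling simulation. The token scores are invented to roughly match the "The sky is" example; a real model samples from its full vocabulary.

```python
import math
import random
from collections import Counter

def sample_token(tokens, logits, temperature, rng):
    """Draw one token, weighting by temperature-scaled scores."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(tokens, weights=weights, k=1)[0]

tokens = ["blue", "clear", "cloudy", "gray", "dark"]
logits = [3.0, 1.7, 1.5, 1.3, 1.2]  # invented scores, roughly matching the example
rng = random.Random(0)  # fixed seed so the simulation is repeatable

# 1,000 draws each at low and high temperature
low = Counter(sample_token(tokens, logits, 0.2, rng) for _ in range(1000))
high = Counter(sample_token(tokens, logits, 1.5, rng) for _ in range(1000))
```

At temperature 0.2 nearly every draw is "blue"; at 1.5 the counts spread across several tokens, which is exactly the predictability-versus-variety trade-off described above.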
High Temperature for Creative Tasks
You should use high temperature settings (0.8 to 2.0) when you want diverse, creative, and varied outputs.
High temperature works best for tasks like:
- Creative writing and storytelling
- Brainstorming ideas or generating alternatives
- Writing marketing copy with personality
- Generating fictional content or poetry
- Exploring different perspectives on open-ended questions
For instance, if you ask "Write an opening line for a fantasy novel" at temperature 1.2, each generation might produce completely different results: "The dragon's shadow fell across the burning village," or "Magic died the day Elara was born," or "Three moons rose over the Shattered Isles." This variety is exactly what you want for creative work.
However, high temperature comes with a trade-off: as creativity increases, factual accuracy tends to decrease. The AI might generate plausible-sounding but incorrect information (called hallucinations) or even nonsensical combinations when temperature is too high.
The Temperature Trade-Off in Practice
Different AI models use different default temperature settings, which explains some of the personality differences you've noticed when using various tools.
When you use ChatGPT, Claude, or Gemini through their web interfaces, each has preset temperature values optimized for conversational interaction—typically around 0.7 to 1.0. This creates a balance between staying accurate and sounding natural.
But when developers use these same models through their APIs (the programming interfaces that let apps connect to AI), they can specify exact temperature values. A customer service chatbot might use temperature 0.3 for consistent, accurate responses. A creative writing assistant might use temperature 1.2 for varied suggestions.
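As a sketch of what that API-level control looks like, the payloads below are written in the style of common chat-completion APIs. The field names and model name are illustrative placeholders; exact request formats vary by provider.

```python
# Hypothetical request payloads in the style of common chat-completion APIs.
# "example-model" is a placeholder, not a real model name.
support_bot_request = {
    "model": "example-model",
    "temperature": 0.3,  # low: consistent, accurate replies
    "messages": [
        {"role": "user", "content": "What is your refund policy?"},
    ],
}

writing_assistant_request = {
    "model": "example-model",
    "temperature": 1.2,  # high: varied creative suggestions
    "messages": [
        {"role": "user", "content": "Write an opening line for a fantasy novel."},
    ],
}
```

The two requests could go to the very same model; only the temperature field changes its behavior from chatbot-consistent to brainstorm-varied.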
The same model can behave very differently depending solely on its temperature setting. Understanding this helps you recognize when an AI's creativity setting might be inappropriate for your task.
What's Next
You've learned how temperature controls whether AI chooses predictably or creatively, but this raises another question: does the AI remember everything from your entire conversation, or is there a limit to what it can process?