Day 7: OpenAI GPT Series
OpenAI’s GPT series is the model family that opened the LLM era. ChatGPT brought AI to the general public, and through the API, GPT models have become the most widely used among developers.
OpenAI Model Selection Guide (as of April 2026)
| Model Family | Recommended For | Notes |
|---|---|---|
| GPT-5 series | Complex reasoning, agentic workflows | Accuracy first, Responses API recommended |
| GPT-4.1 series | General text/coding, quality-cost balance | Default choice for general backend APIs |
| GPT-4o / GPT-4o mini | Multimodal (text+image+voice), low latency | Ideal for real-time/interactive scenarios |
Pricing and supported models change frequently, so always check the official pages first rather than memorizing a fixed table.
- API pricing: https://platform.openai.com/pricing
- Model documentation: https://developers.openai.com/api/docs/models
Basic Responses API Usage (Recommended)
```python
# pip install openai
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY")  # setting the OPENAI_API_KEY environment variable is recommended

# Basic text generation (Responses API)
response = client.responses.create(
    model="gpt-4.1-mini",
    input=[
        {"role": "system", "content": "You are a friendly AI tutor."},
        {"role": "user", "content": "What is a Transformer? Please explain briefly."},
    ],
    temperature=0.7,        # creativity control (0 = deterministic, 2 = very random)
    max_output_tokens=500,
)

print(response.output_text)
print(f"Tokens used: input={response.usage.input_tokens}, "
      f"output={response.usage.output_tokens}")
```
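Since the response object reports token usage, you can estimate what a call cost. A minimal sketch — the per-million-token prices below are placeholders, not current OpenAI rates, so substitute figures from the official pricing page:

```python
# Estimate the cost of one API call from its token usage.
# NOTE: the prices below are illustrative placeholders, NOT current
# OpenAI rates -- always take real values from the pricing page.
PLACEHOLDER_PRICES = {
    # model: (USD per 1M input tokens, USD per 1M output tokens)
    "gpt-4.1-mini": (0.40, 1.60),  # placeholder figures
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single call."""
    in_price, out_price = PLACEHOLDER_PRICES[model]
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price

# Example: feed in the usage numbers printed by the previous snippet.
cost = estimate_cost("gpt-4.1-mini", input_tokens=45, output_tokens=320)
print(f"Estimated cost: ${cost:.6f}")
```

The same helper is handy for the cost-estimation exercise at the end of this lesson.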
Streaming Response Handling (Chat Completions Compatible Example)
```python
from openai import OpenAI

client = OpenAI()

# Streaming: print tokens as they are generated
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Tell me 3 advantages of Python."},
    ],
    stream=True,  # enable streaming
)

full_response = ""
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
        full_response += content

print(f"\n\nTotal response length: {len(full_response)} characters")
```
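The accumulation pattern itself doesn't depend on the network, so you can rehearse it offline. In this sketch, `SimpleNamespace` objects merely mimic the shape of the SDK's stream chunks — they are stand-ins, not the real types:

```python
from types import SimpleNamespace

def make_chunk(content):
    """Build a stand-in object shaped like a Chat Completions stream chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

# A fake stream: text arrives in fragments; chunks without text
# (content=None) also occur, so they must be skipped.
fake_stream = [make_chunk("Py"), make_chunk("thon"), make_chunk(None)]

full_response = ""
for chunk in fake_stream:
    content = chunk.choices[0].delta.content
    if content:                      # guard against chunks with no text
        full_response += content

print(full_response)  # -> Python
```

The `if content:` guard is the important detail: concatenating `None` would raise a `TypeError`.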
Multi-turn Conversation and System Prompt (Chat Completions Compatible Example)
```python
from openai import OpenAI

client = OpenAI()

conversation = [
    {"role": "system", "content": "You are a Python expert. Include code examples in your answers."},
]

def chat(user_message):
    conversation.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=conversation,
        temperature=0.3,
    )
    assistant_message = response.choices[0].message.content
    conversation.append({"role": "assistant", "content": assistant_message})
    return assistant_message

# Multi-turn conversation
print(chat("What is list comprehension?"))
print("---")
print(chat("Can I use that with dictionaries too?"))
# Remembers previous conversation context and responds accordingly
```
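Note that the full `conversation` list is resent on every call, so input tokens grow with each turn. A common mitigation is to keep the system prompt plus only the most recent turns. A minimal sketch — `trim_conversation` and its `keep_turns` parameter are names invented for this example:

```python
def trim_conversation(conversation, keep_turns=5):
    """Keep the system message(s) plus the last `keep_turns` user/assistant turns."""
    system = [m for m in conversation if m["role"] == "system"]
    rest = [m for m in conversation if m["role"] != "system"]
    return system + rest[-keep_turns * 2:]  # each turn = one user + one assistant message

# Example: a 50-turn history shrinks to the system prompt + 5 recent turns.
history = [{"role": "system", "content": "You are a Python expert."}]
for i in range(50):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_conversation(history, keep_turns=5)
print(len(trimmed))  # 1 system message + 10 recent messages
```

Dropping old turns loses context, of course; summarizing them into a single message is a common refinement.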
Practical Tips
| Scenario | Recommended Model | Reason |
|---|---|---|
| Simple classification, extraction | Lightweight model (e.g., 4o mini class) | Fast and cheap |
| Complex reasoning, analysis | Top-tier reasoning model (e.g., GPT-5/4.1 upper) | Accuracy first |
| Prototyping | Lightweight model | Cost savings |
| Image analysis | Multimodal model | Image input processing |
| Large batch processing | Lightweight model + Batch API | Cost efficiency |
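For large batch processing, the Batch API takes a JSONL file in which each line is one self-contained request. A sketch of building such a file — the line format follows the Batch API documentation, while the model name and questions are just examples:

```python
import json

# Each line of a Batch API input file is one JSON request object with
# a custom_id (echoed back in the results), method, url, and body.
questions = ["What is a list?", "What is a dict?", "What is a tuple?"]

lines = []
for i, q in enumerate(questions):
    request = {
        "custom_id": f"req-{i}",            # your own ID for matching results
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",         # lightweight model suits batch work
            "messages": [{"role": "user", "content": q}],
        },
    }
    lines.append(json.dumps(request))

jsonl = "\n".join(lines)
print(jsonl.splitlines()[0])
# The file would then be uploaded with client.files.create(purpose="batch")
# and the job submitted with client.batches.create(...).
```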
Temperature setting guide: 0 for classification/extraction, 0.7 for general conversation, 1.0-1.5 for creative writing.
Today’s Exercises
- Get an OpenAI API key, choose a lightweight model, and send the question “What is the capital of South Korea?” Check the number of tokens used and estimate the cost.
- Send the same question 5 times each with temperature 0, 0.7, and 1.5, and compare the diversity of responses.
- Calculate how token costs increase as conversations grow longer in multi-turn dialogue. What problems arise when the conversation reaches 50 turns?