AI Beginner Crash Course
00Start 01Who Builds the Models 02Modalities 03The Models 04Where AI Runs 05Chat Interfaces 06Running AI Locally 07How It Fits Together 08AI Coding Agents 09Developer Toolkit 10Git & GitHub 11Database & Backend 12APIs & API Keys 13Hosting & Deployment
Course Overview

AI Beginner Crash Course

Cut through the buzzword confusion. Understand what all these names, tools, and concepts actually are — and how they relate to each other.

Scroll
01

Who Builds the Models

The companies at the top of the pyramid. They build the engines — everything else is built on top.

Frontier LLM Companies
Specialist Companies (Image, Video, Audio)
02

What AI Can Do

The types of content AI can work with. "Multimodal" means it handles more than one.

Aa
Text
Including code
Image
Photos, art, diagrams
Video
Generated clips & edits
Audio
Speech, music, sound
Input vs. Output: "Multimodal" doesn't tell you the full picture. A model might accept images as input but only output text. Always ask: what goes in, and what comes out?
03

The Models

The actual AI. The brain. Everything else is an interface sitting on top.

Model
Catch-all term for any AI brain.
LLM
Large Language Model. Text-focused.
Diffusion Model
Architecture behind image & video generation.
Foundation Model
Umbrella term for any large pre-trained model.
SLM
Small Language Model. Lighter, fewer resources.
Open-Weight
Model weights are public. You can download & run it yourself.
Closed
Only accessible through the company's apps or API.
Knowledge Cutoff
Models are frozen in time. They only know data up to their training date.
Company → Model Map
Company Model Input Output
Anthropic Claude Opus 4.5 TextImagesPDFs Text
Claude Sonnet 4.5 TextImagesPDFs Text
Claude Haiku 4.5 TextImagesPDFs Text
OpenAI GPT-5.2 TextImagesAudio TextAudio
GPT-5.2-Codex TextImages Text
DALL-E 3 Text Images
Sora 2 TextImages Video
Google Gemini 3 Pro TextImagesAudioVideo Text
Gemini 3 Flash TextImagesAudioVideo Text
Imagen 3 Text Images
Veo 2 TextImages Video
xAI Grok 4.1 TextImages Text
Aurora Text Images
Meta Llama 4 Scout TextImages Text
Llama 4 Maverick TextImages Text
DeepSeek DeepSeek-V3.2 Text Text
DeepSeek-R1 Text Text
Alibaba Cloud Qwen3-235B Text Text
Qwen3-Max Text Text
Qwen-Image Text Images
Mistral Mistral Large 3 Text Text
Devstral 2 Text Text
Cohere Command A Text Text
Command A Vision TextImages Text
AI21 Labs Jamba Large Text Text
Jamba2 Mini Text Text
Stability AI Stable Diffusion 3.5 TextImages Images
Stable Video 4D 2.0 TextImages Video
Stable Audio 2.5 Text Audio
Midjourney Midjourney V7 Text ImagesVideo
Runway Gen-4.5 TextImages Video
Black Forest Labs Flux 2 TextImages Images
04

Where AI Runs

Cloud, local, or a mix of both.

Cloud
Model runs on the company's servers. Access via the internet. No special hardware needed.
Trade-off: your data leaves your machine.
Local
Model runs on your own machine. Requires GPU + RAM. Only works with open-weight models.
Upside: complete privacy. Nothing leaves your computer.
Hybrid / Edge
Some processing on-device, complex tasks sent to cloud. Apple Intelligence is a key example.
Balance of privacy, speed, and capability.
05

Chat Interfaces

The apps you talk to. They're wrappers — not the AI itself.

First-Party (made by the model companies)
Third-Party (use other companies' models)
Interface Features (not model features)
Web Search
Interface searches the web, feeds results to the model as context.
Deep Research
Automated multi-search loops. Web search on steroids.
Memory
Interface stores preferences, injects them later. Model has zero memory.
Conversation History
Interface replays past messages to simulate continuity.
System Prompts
Hidden instructions that shape model behavior.
File Uploads
Interface converts files to text the model can read.
Image Generation
Routes to a separate image model (DALL-E, Imagen, etc.).
Voice Mode
Speech-to-text & text-to-speech. Model still processes text.
Artifacts / Canvas
Renders output in interactive views. Model just wrote text.
06

Running AI Locally

Your machine, your models. No internet, no subscriptions, complete privacy.

Where to Get Models
Local Chat Interfaces
Local Image Generation
07

How It All Fits Together

The model is simple. Everything else is layers on top.

Tools / Plugins Web search, file uploads, code execution, image generation (via separate model), memory, plugins, MCP, integrations
Interface (the app / wrapper) ChatGPT, Claude.ai, Gemini, Grok — conversation history, accounts, settings, system prompts, UI
Model (the engine) GPT, Claude, Gemini, Grok — text in, text out. No memory. No web. No files. Stateless.
08

AI Coding Agents

From chat to coworker. They live inside your project — reading, editing, running, iterating.

Browser-Based App Builders
First-Party Coding Agents (built by model companies)
How You Access Them
Agent Terminal (CLI) VS Code Desktop App Web Interface
Claude Code Claude Desktop claude.ai/code
Codex Standalone app chatgpt.com/codex
Gemini CLI Companion ext.
AI-Native IDEs (third-party wrappers)
IDE Extensions
Cloud Agents (fully autonomous)
09

The Developer Toolkit

Not AI products — the environment you work in. The workbench, the tools, the infrastructure.

Code Editors / IDEs
Terminal Apps
Languages
HTML CSS JavaScript TypeScript Python SQL Rust Go
Frameworks
React Next.js Vue Svelte Astro Django Flask FastAPI Node.js Express
Other Tools You'll See
10

Git & GitHub

Version control and the connective tissue of modern software development.

11

Database & Backend Services

Where your app stores data. Backend-as-a-Service platforms that bundle database, auth, storage, and APIs.

12

APIs & API Keys

How software talks to software.

API
Application Programming Interface. How your app talks to external services — Claude, Supabase, Stripe, anything. Think of it as a waiter: your app tells the API what it wants, the API goes to the kitchen (the service), and brings back the response.
API Key
A password that identifies your app and ties usage to your account. You get it from the service's dashboard, paste it into your project, and your app can talk to that service. The #1 thing that trips up beginners in vibe coding.
13

Hosting & Deployment

Putting it on the internet.

Static Site / Frontend Hosting
Full-Stack / Backend Hosting
The Big Cloud Providers