Docker & Containerization

INFO 153B/253B: Backend Web Architecture

Week 5


Kay Ashaolu - Instructor

Suk Min Hwang - GSI

Today's Agenda

  • Part 1: Prep Work Recap - Docker basics check-in
  • Part 2: Containers vs VMs - Why containers won
  • Part 3: Images & Layers - How Docker builds efficiently
  • Part 4: Dockerfile Deep Dive - Every command explained
  • Part 5: docker-compose - Multi-container development
  • Demo: Dockerize a Flask App (5 min)
  • In-Class Exploration: Containerize a Product API (45 min)

Part 1: Prep Work Recap

Quick check-in on O'Reilly Chapter 4

What You Learned in Prep

  • Containers vs VMs: Lighter weight, share OS kernel
  • Images: Blueprints for containers (read-only)
  • Containers: Running instances of images
  • Dockerfile: Recipe for building images
  • docker-compose: Run multiple containers together
Key insight: Docker packages your app WITH its environment. "It works on my machine" becomes "It works everywhere."

The Problem Docker Solves

  • Developer laptop: Python 3.11, Flask 3.0, macOS
  • Server: Python 3.9, Flask 2.0, Ubuntu
  • Result: "It works on my machine!" (but crashes in production)
The real world: Different team members have different OS versions, different Python versions, different installed packages. Docker makes this problem disappear.

Quick Check: Docker Vocabulary

Term       | Definition                       | Analogy
-----------|----------------------------------|------------------------
Image      | Read-only template with your app | Recipe / Blueprint
Container  | Running instance of an image     | Cooked meal / Building
Dockerfile | Instructions to build an image   | Recipe steps
Registry   | Storage for images (Docker Hub)  | Recipe book
  • You build images from Dockerfiles
  • You run containers from images
  • You push/pull images to/from registries
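
Pushing to a registry follows the same pattern; a minimal sketch, assuming a Docker Hub account named yourname (hypothetical — substitute your own namespace):

```shell
# Tag the local image with your registry namespace and a version
docker tag my-flask-app yourname/my-flask-app:1.0

# Authenticate, then upload the image to Docker Hub
docker login
docker push yourname/my-flask-app:1.0

# Any machine with Docker can now pull and run it
docker pull yourname/my-flask-app:1.0
```

The part after the colon is the version tag; pushing a new tag doesn't overwrite old ones, so teams can roll back by pulling an earlier version.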

Quick Check: Basic Commands

# Build an image from Dockerfile
docker build -t my-flask-app .

# Run a container from an image
docker run -p 5000:5000 my-flask-app

# List running containers
docker ps

# Stop a container
docker stop <container_id>

# List all images
docker images
  • -t tags (names) the image
  • -p maps ports (host:container)
  • . means use current directory's Dockerfile

Prep Recap: Key Takeaways

  • Docker solves: "Works on my machine" problem
  • Images are blueprints: Built once, run anywhere
  • Containers are isolated: Don't affect host system
  • Dockerfile is the recipe: FROM, COPY, RUN, CMD
The connection: Last week you built Flask APIs that run locally. This week: package them to run anywhere.

Part 2: Containers vs VMs

Why containers won

Virtual Machines: The Old Way

How VMs Work

  • Full OS for each application
  • Hypervisor manages VMs
  • Complete isolation
  • Heavy: GBs of overhead

VM Stack

+-------+ +-------+ +-------+
| App 1 | | App 2 | | App 3 |
+-------+ +-------+ +-------+
| Guest | | Guest | | Guest |
|  OS   | |  OS   | |  OS   |
+-------+-+-------+-+-------+
|       Hypervisor          |
+---------------------------+
|        Host OS            |
+---------------------------+
|        Hardware           |
+---------------------------+

Containers: The Modern Way

How Containers Work

  • Share host OS kernel
  • Docker daemon manages
  • Process-level isolation
  • Light: MBs of overhead

Container Stack

+-------+ +-------+ +-------+
| App 1 | | App 2 | | App 3 |
+-------+-+-------+-+-------+
|     Docker Engine         |
+---------------------------+
|        Host OS            |
+---------------------------+
|        Hardware           |
+---------------------------+
Key difference: No Guest OS layer! Apps share the host kernel.
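
You can observe the shared kernel directly; a quick check, assuming Docker is installed and running:

```shell
# Kernel version on the host
uname -r

# Kernel version inside a container: the same, because there is no guest OS
docker run --rm python:3.11-slim uname -r
```

One caveat: on macOS and Windows, Docker runs containers inside a lightweight Linux VM, so the container's kernel matches that VM's kernel rather than the host's.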

Containers vs VMs: Comparison

Aspect    | Virtual Machine        | Container
----------|------------------------|----------------
Size      | GBs (full OS)          | MBs (just app)
Startup   | Minutes                | Seconds
Memory    | Dedicated per VM       | Shared
Isolation | Complete (separate OS) | Process-level
Density   | ~10 per host           | ~100+ per host
  • Containers are 10-100x more efficient
  • Same hardware runs more applications
  • Faster deployments, faster scaling

When to Use What

Use VMs When:

  • Need different OS (Windows on Linux)
  • Complete kernel isolation required
  • Running untrusted code
  • Legacy applications

Use Containers When:

  • Deploying microservices
  • CI/CD pipelines
  • Development environments
  • Cloud-native applications
In practice: the vast majority of new applications ship in containers; VMs remain for the edge cases above.

The Docker Workflow

┌─────────────┐    docker build    ┌─────────────┐
│  Dockerfile │ ──────────────────▶│    Image    │
└─────────────┘                    └──────┬──────┘
                                          │
                                   docker run
                                          │
                                          ▼
                                   ┌─────────────┐
                                   │  Container  │
                                   └─────────────┘
  • Write a Dockerfile (recipe)
  • Build an image (docker build)
  • Run containers from the image (docker run)
  • Share the image via a registry (docker push/pull)

Part 3: Images & Layers

How Docker builds efficiently

Images Are Made of Layers

FROM python:3.11-slim      # Layer 1: Base Python image
WORKDIR /app               # Layer 2: Create directory
COPY requirements.txt .    # Layer 3: Copy requirements
RUN pip install -r ...     # Layer 4: Install dependencies
COPY app.py .              # Layer 5: Copy application
CMD ["python", "app.py"]   # Layer 6: Set startup command
  • Each Dockerfile instruction creates a layer
  • Layers are cached and reused
  • If a layer hasn't changed, Docker skips rebuilding it
  • This is why Docker builds are fast after the first time

Layer Caching in Action

First Build

Step 1/6: FROM python
 ---> Pulling...
Step 2/6: WORKDIR
 ---> Running...
Step 3/6: COPY reqs
 ---> Running...
Step 4/6: RUN pip
 ---> Running... (slow)
Step 5/6: COPY app
 ---> Running...
Step 6/6: CMD
 ---> Running...

After Editing app.py

Step 1/6: FROM python
 ---> Using cache
Step 2/6: WORKDIR
 ---> Using cache
Step 3/6: COPY reqs
 ---> Using cache
Step 4/6: RUN pip
 ---> Using cache (fast!)
Step 5/6: COPY app
 ---> Running...
Step 6/6: CMD
 ---> Running...
Key insight: pip install is cached! Only app.py is rebuilt.

Dockerfile Order Matters

Bad: Slow Rebuilds

# Copy everything first
COPY . .

# Then install
RUN pip install -r requirements.txt

# Every code change
# reinstalls ALL dependencies!

Good: Fast Rebuilds

# Copy requirements FIRST
COPY requirements.txt .

# Install dependencies
RUN pip install -r requirements.txt

# THEN copy code
COPY . .
  • Put things that change rarely at the top (dependencies)
  • Put things that change often at the bottom (code)
  • When a layer changes, all layers after it rebuild

Base Images

# Full Python image (~1 GB)
FROM python:3.11

# Slim image (~150 MB) - no build tools
FROM python:3.11-slim

# Alpine image (~50 MB) - minimal Linux
FROM python:3.11-alpine
Image              | Size    | Use Case
-------------------|---------|---------------------------------
python:3.11        | ~1 GB   | Need to compile packages (numpy)
python:3.11-slim   | ~150 MB | Most Flask apps (recommended)
python:3.11-alpine | ~50 MB  | Size critical, no C extensions

Inspecting Images

# List all images
docker images

# See image history (layers)
docker history flask-app

# Inspect image details
docker inspect flask-app

# Remove unused images
docker image prune
  • docker images shows all local images with sizes
  • docker history shows each layer and its size
  • Prune regularly to free disk space

Part 4: Dockerfile Deep Dive

Every command explained

Complete Dockerfile Example

# Base image with Python
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Copy requirements and install
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Document the port
EXPOSE 5000

# Default command
CMD ["python", "app.py"]
  • This is the standard pattern for Flask apps
  • Let's break down each instruction

FROM: Start with a Base

# REQUIRED: Every Dockerfile starts with FROM
FROM python:3.11-slim
  • Always first - defines the starting point
  • Format: image:tag
  • Where it comes from: Docker Hub (default registry)
  • Includes: OS, Python, pip, common tools
Pro tip: Always specify a version tag (3.11, not latest). "latest" can change and break your build.

WORKDIR: Set the Directory

# Create and switch to /app directory
WORKDIR /app

# All following commands run in /app
COPY requirements.txt .    # Copies to /app/requirements.txt
RUN pip install ...        # Runs in /app
COPY . .                   # Copies to /app
  • Creates the directory if it doesn't exist
  • Changes to that directory (like cd)
  • Persists for all following instructions
  • Convention: Use /app or /code

COPY: Add Your Files

# Copy single file
COPY requirements.txt .

# Copy multiple files
COPY app.py config.py ./

# Copy entire directory
COPY . .

# Copy with rename
COPY app.py /app/server.py
  • Source: Path relative to Dockerfile location
  • Destination: Path inside the container
  • . means current directory (WORKDIR)
  • Trailing slash on destination means "into this directory"
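
COPY . . also sweeps in files you don't want in the image (virtualenvs, git history, caches). A .dockerignore file next to the Dockerfile filters them out of the build context; a typical sketch:

```
# .dockerignore - excluded from the build context
venv/
__pycache__/
.git/
*.pyc
```

A smaller build context uploads faster and avoids cache-busting rebuilds when irrelevant files change.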

RUN: Execute Commands

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# System updates (less common for slim images)
RUN apt-get update && apt-get install -y curl

# Multiple commands in one layer
RUN pip install flask && \
    pip install requests && \
    rm -rf /root/.cache
  • Executes shell commands during build
  • Creates a new layer for each RUN
  • Combine related commands with && to reduce layers
  • --no-cache-dir reduces image size

EXPOSE: Document Ports

# Document that the container listens on 5000
EXPOSE 5000

# Multiple ports
EXPOSE 5000 8080
  • Documentation only - doesn't actually publish ports
  • Tells humans and tools which ports the app uses
  • Still need -p flag when running: docker run -p 5000:5000
Common confusion: EXPOSE doesn't make ports accessible. You still need -p 5000:5000 when running.

CMD: Default Startup Command

# Exec form (preferred) - runs directly
CMD ["python", "app.py"]

# Shell form - runs in /bin/sh -c
CMD python app.py

# With Flask CLI
CMD ["flask", "run", "--host=0.0.0.0"]
  • Runs when container starts (not during build)
  • Exec form (brackets): Runs directly, handles signals properly
  • Shell form: Runs in a shell, more flexible but less clean
  • Only one CMD takes effect: if a Dockerfile has several, the last one wins

ENV: Environment Variables

# Set environment variables
ENV FLASK_APP=app.py
ENV FLASK_ENV=production

# Multiple in one line
ENV FLASK_APP=app.py FLASK_ENV=production

# Use in later commands
RUN echo "App: $FLASK_APP"
  • Sets environment variables in the image
  • Persists when container runs
  • Can override with docker run -e VAR=value
Note: Flask 2.3+ no longer reads FLASK_ENV; use FLASK_DEBUG=1 to enable debug mode.

Part 5: docker-compose

Multi-container development

Why docker-compose?

  • Problem: Real apps have multiple services
  • Flask app + PostgreSQL + Redis + Celery
  • Running separately: docker run for each (painful)
  • Solution: Define all services in one file
# Without docker-compose
docker run -d --name postgres postgres:15
docker run -d --name redis redis:7
docker run -d -p 5000:5000 --link postgres --link redis flask-app

# With docker-compose
docker-compose up

docker-compose.yml Structure

version: '3.8'

services:
  web:                          # Service name
    build: .                    # Build from Dockerfile
    ports:
      - "5000:5000"            # Port mapping
    environment:
      - FLASK_ENV=development  # Environment variables
    volumes:
      - .:/app                 # Mount code for hot reload
  • version: Compose file format version (optional in current Compose releases)
  • services: Define each container
  • build: Path to Dockerfile
  • ports: Same as -p flag
  • volumes: Mount host directories into container

Volumes: Hot Reload Magic

services:
  web:
    build: .
    volumes:
      - .:/app    # Mount current directory to /app in container
  • Volumes mount host directories into container
  • Changes on host immediately appear in container
  • No need to rebuild when code changes!
  • Format: host_path:container_path
Development workflow: Edit code locally, Flask reloads automatically. No docker build needed during development!
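
For mounted code changes to actually trigger reloads, Flask's debug reloader has to be on. One way to wire that up is to override the image's startup command in the compose file; a sketch, assuming the app uses the Flask CLI:

```yaml
services:
  web:
    build: .
    ports:
      - "5000:5000"
    volumes:
      - .:/app
    # Override the image's CMD so Flask runs with the auto-reloader enabled
    command: flask run --debug --host=0.0.0.0
```

The --host=0.0.0.0 part matters in containers: Flask's default of 127.0.0.1 would only accept connections from inside the container itself.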

Multi-Service Example

version: '3.8'

services:
  web:
    build: .
    ports:
      - "5000:5000"
    depends_on:
      - db
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/app

  db:
    image: postgres:15
    environment:
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
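
Inside the web container, the app reads DATABASE_URL from the environment compose injected. A minimal sketch of that pattern (plain stdlib, no database driver; the localhost fallback is an assumption so the same code also runs outside Docker):

```python
import os

# Compose sets DATABASE_URL for the web service; fall back to localhost
# when running outside Docker (the fallback value is an assumption).
DATABASE_URL = os.environ.get(
    "DATABASE_URL",
    "postgresql://postgres:password@localhost:5432/app",
)

def db_host(url: str) -> str:
    """Pull the hostname out of a postgres URL (naive string split, no driver)."""
    # postgresql://user:pass@host:port/db -> host
    return url.split("@", 1)[1].split(":", 1)[0]
```

Note the hostname in the compose example is db — the service name, which Docker's internal DNS resolves to the database container.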

docker-compose Commands

# Start all services (detached)
docker-compose up -d

# Start with build
docker-compose up --build

# View logs
docker-compose logs -f web

# Stop all services
docker-compose down

# Stop and remove volumes
docker-compose down -v
  • up: Start services (builds if needed)
  • -d: Detached mode (background)
  • --build: Force rebuild
  • down: Stop and remove containers
  • -v: Also remove volumes (data loss!)

Development vs Production

Development

# docker-compose.yml
services:
  web:
    build: .
    volumes:
      - .:/app  # Hot reload
    environment:
      - FLASK_ENV=development

Production

# docker-compose.prod.yml
services:
  web:
    image: my-app:v1.0
    # No volumes!
    environment:
      - FLASK_ENV=production
  • Dev: Volumes for hot reload, debug mode
  • Prod: Pre-built image, no volumes, production mode
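
To run the production variant, point compose at the prod file explicitly with -f; a sketch:

```shell
# Use the production compose file instead of the default docker-compose.yml
docker-compose -f docker-compose.prod.yml up -d
```

Without -f, compose reads docker-compose.yml, so the development setup stays the default.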

Summary

Key takeaways from today

What You Learned Today

  • Containers vs VMs: Lighter, faster, more efficient
  • Images & Layers: Order matters for caching
  • Dockerfile commands: FROM, WORKDIR, COPY, RUN, EXPOSE, CMD
  • docker-compose: Define multi-container apps in YAML
  • Volumes: Hot reload during development
Key insight: Your Flask app from Week 3 + a Dockerfile = runs anywhere.

Looking Ahead

Week | Topic                   | Building On
-----|-------------------------|----------------------------
6    | Validation & Schemas    | Flask-Smorest, Marshmallow
7    | SQLAlchemy + Migrations | PostgreSQL in containers
8    | Async Task Queues       | Redis + rq in containers
Assignment 1: Due before Week 6 Friday lab.
Friday: Assignment 1 Help
Prep for Week 6: O'Reilly Chapter 5 - Flask-Smorest

Live Demo

Dockerize a Flask App

Demo: What We'll Build

  • Start with a simple Flask app (already written)
  • Create a Dockerfile from scratch
  • Build the image with docker build
  • Run the container with docker run
  • Test with curl - see it work!
  • Create docker-compose.yml for development
Watch along: I'll type everything live. You'll practice in the in-class exploration.

In-Class Exploration

Containerize a Product API

Project Overview

  • You receive a complete Flask Product API (already written)
  • Your job: containerize it!
  • Tasks:
    • Write a Dockerfile (15 min)
    • Build and run the container (10 min)
    • Write docker-compose.yml (15 min)
    • Test and submit (5 min)
Focus: This is about Docker, not Flask. The Flask code is provided and working.

Getting Started

  1. Click the GitHub Classroom link (in bCourses)
  2. Clone your repo:
    git clone <your-repo-url>
    cd in-class-exploration-week-5-<your-username>
  3. Verify Flask works locally first:
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    flask run
  4. Then containerize it!

Testing Your Container

# Build the image
docker build -t product-api .

# Run the container
docker run -p 5000:5000 product-api

# In another terminal, test it
curl http://localhost:5000/products
curl http://localhost:5000/health

# With docker-compose
docker-compose up

Submitting (at 10:45 AM)

  1. Commit everything you have (even if incomplete):
    git add .
    git commit -m "In-class exploration submission"
  2. Push to your repository:
    git push
  3. Submit on bCourses: Add your GitHub repo URL
Grading: Pass/No Pass based on engagement. Just show you worked on it!

Want to keep working? Continue after submitting - just push additional changes.

Questions?


Website: groups.ischool.berkeley.edu/i253/sp26

Email: kay@ischool.berkeley.edu