· Satrajit Sengupta · Containers · 8 min read
Docker basics — images, containers, and the commands you'll use every day
From installation to your first running container — including how Docker images actually work, every Dockerfile instruction, and how to keep image sizes under control.
The previous post explained what containers are conceptually. This one is practical: by the end, you’ll understand how Docker is structured internally and how images work, know every Dockerfile instruction worth knowing, and have a running container you built yourself. Concepts stick once you’ve seen them work.
How Docker is structured
Docker is a container management platform. It separates the application layer from the infrastructure layer so that development, testing, and deployment can happen in consistent, repeatable environments. Three components do most of the work:
Docker Daemon (dockerd) — the background process that manages all Docker objects: images, containers, networks, and volumes. It listens for API requests from clients and executes them. The daemon can also talk to other daemons to manage distributed Docker services.
Docker Client (docker) — the CLI you interact with. Every command you run (docker build, docker run, docker ps) is an API call that the client sends to the daemon. One client can talk to multiple daemons.
Docker Registry — where images are stored. Docker Hub is the default public registry. Private registries (self-hosted or cloud-managed) are common in production. When you run docker pull, the image downloads from the registry into local storage. When you run docker run, Docker checks local storage first — only going to the registry if the image isn’t already there.
Together these form a client-server architecture: your terminal talks to docker, which talks to dockerd, which pulls from registries and manages the containers on your host.
Installing Docker
The cleanest way to install Docker on any Linux system is via the official install script:
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in for the group change to take effect
On macOS and Windows, install Docker Desktop from docker.com. It bundles everything you need and runs a lightweight Linux VM in the background to host the containers.
Verify the installation:
docker --version
docker run hello-world
That second command pulls a small image from Docker Hub and runs it. If you see “Hello from Docker!”, you’re set up.
How Docker images are structured
A Docker image is composed of multiple read-only layers, stacked on top of each other. Each layer represents a set of filesystem changes — files added, modified, or deleted. When you start a container from an image, Docker adds a single writable layer on top. Everything in the image itself stays unchanged.
There are a few terms worth understanding:
Base image — built from scratch (FROM scratch). You’re responsible for everything: the filesystem, the runtime, the libraries. Used for building minimal images like alpine or busybox.
Parent image — starts from an existing image in a registry. This is what you do in almost every real Dockerfile: FROM node:20-alpine, FROM python:3.11-slim, etc.
Image layers — each instruction in a Dockerfile that modifies the filesystem (RUN, COPY, ADD) creates a new layer. Layers are cached individually, which is what makes builds fast.
Container layer — the writable layer Docker adds when a container starts. Changes here are ephemeral: they disappear when the container is removed unless you use a volume.
Docker manifests — metadata about an image: its layers, digest, and size. Manifest lists (also called multi-arch images) let a single image tag work across different CPU architectures and operating systems.
Layer data is stored on the host at:
/var/lib/docker/image/<storage-driver>/layerdb/sha256/
Two ways to create an image
Method 1: The interactive method (docker commit)
Run a container, make changes inside it, then commit the result as a new image. Useful for quick experiments, not for production.
# Start a detached container from a base image
docker run -itd --name my-container centos:latest bash
# Open a shell inside it
docker exec -it my-container bash
# Inside the container: make your changes
yum install -y epel-release && yum install -y nginx
exit
# Commit the container state as a new image
docker commit \
--change='CMD ["nginx", "-g", "daemon off;"]' \
-c "EXPOSE 80" \
my-container \
my-nginx:v1
The problem with this method: the steps are undocumented, not reproducible, and the images tend to be bloated. Don’t use it for anything that needs to run in CI or production.
Method 2: The Dockerfile method (the right way)
A Dockerfile is a plain text file of sequential instructions. Docker executes them top to bottom, and each instruction that changes the filesystem creates a new cached layer.
Build command:
docker build -t <image-name>:<tag> <path-to-build-context>
# Example:
docker build -t my-nginx:v1 .
The . at the end sets the build context to the current directory; Docker looks for the Dockerfile there unless you point elsewhere with -f.
Every Dockerfile instruction, explained
FROM centos:latest
FROM — usually the first instruction in a Dockerfile (only ARG may precede it). Specifies the base or parent image. Use specific tags (centos:7) rather than latest in production so builds are reproducible.
ARG BUILD_VERSION=1.0
ARG — defines a build-time variable. Can be overridden with --build-arg at build time. Unlike ENV, ARG values are not available inside running containers.
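The ARG vs ENV distinction is easiest to see side by side. A minimal sketch (the image tag and variable names here are illustrative, not from any real project):

```dockerfile
FROM alpine:3.19
# Build-time only: settable with --build-arg, gone once the build finishes
ARG BUILD_VERSION=1.0
# Runtime: baked into the image environment, visible to the running process
ENV APP_VERSION=$BUILD_VERSION
```

Building with docker build --build-arg BUILD_VERSION=2.0 -t demo . and then running docker run --rm demo env shows APP_VERSION=2.0 but no BUILD_VERSION: the ARG value survived only because ENV copied it into the image.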
LABEL maintainer="you@example.com" version="1.0"
LABEL — attaches metadata as key-value pairs. Useful for tooling, automated image management, and documentation.
ENV NODE_ENV=production
ENV — sets environment variables that persist both during the build and in the running container. Use for configuration that the application needs at runtime.
WORKDIR /app
WORKDIR — sets the working directory for all subsequent instructions (RUN, CMD, ENTRYPOINT, COPY, ADD). Creates the directory if it doesn’t exist. Prefer this over RUN cd /app.
RUN yum install -y epel-release && \
yum install -y nginx && \
yum clean all
RUN — executes a command during the build and commits the result as a new layer. Chain multiple commands with && in a single RUN to keep layers small (more on this below).
COPY package*.json ./
ADD https://example.com/file.tar.gz /tmp/
COPY — copies files from the build context (your local machine) into the image. ADD does the same but additionally handles remote URLs and automatically extracts local .tar archives (URL downloads are not extracted). Prefer COPY unless you specifically need the extra capabilities of ADD.
EXPOSE 80
EXPOSE — documents which port the application listens on. This is metadata only — it doesn’t actually publish the port. Publishing happens at runtime with -p.
VOLUME ["/data"]
VOLUME — declares a mount point for external storage. Any data written to this path inside the container is stored in a Docker-managed volume on the host, persisting beyond the container’s lifecycle.
USER nginx
USER — sets the user (and optionally group) that subsequent instructions and the container process run as. Always set this to a non-root user in production images.
CMD ["nginx", "-g", "daemon off;"]
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD — the default command when a container starts. Can be overridden by passing a command to docker run. ENTRYPOINT works similarly but is harder to override (it requires the --entrypoint flag). Use ENTRYPOINT when the container has a single well-defined purpose and CMD for the default arguments to that entrypoint.
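The two work best in combination: ENTRYPOINT fixes the program, CMD supplies overridable default arguments. A hypothetical sketch:

```dockerfile
FROM alpine:3.19
ENTRYPOINT ["ping"]
CMD ["-c", "3", "localhost"]
```

docker run <image> executes ping -c 3 localhost; docker run <image> -c 1 8.8.8.8 replaces only the CMD portion, so ping still runs. Swapping out the entrypoint itself requires docker run --entrypoint.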
A complete working example
FROM node:20-alpine
LABEL maintainer="you@thedigitaldrift.in"
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
USER node
EXPOSE 3000
CMD ["node", "index.js"]
Build and run:
docker build -t my-app:v1 .
docker run -d -p 3000:3000 --name my-app my-app:v1
Layer caching and image size optimisation
This is where most people leave significant performance on the table.
Only RUN, COPY, and ADD instructions create filesystem layers that affect image size. Instructions like ENV, LABEL, EXPOSE, and WORKDIR only update image metadata — they don’t add bulk.
The caching rule: Docker re-executes an instruction only if that instruction, the files it copies in, or anything before it has changed. Structure your Dockerfile so frequently-changing instructions come last.
Slow (cache breaks early):
FROM node:20-alpine
COPY . . # Changes on every code edit
RUN npm install  # Always re-runs, even if dependencies didn't change
Fast (cache breaks late):
FROM node:20-alpine
COPY package*.json ./ # Only changes when dependencies change
RUN npm install # Cached until package.json changes
COPY . .  # Changes often, but npm install is already cached
Combine RUN instructions to eliminate intermediate layers:
# This creates 3 layers, and yum clean all can't shrink the cache in layer 1:
RUN yum install -y epel-release
RUN yum install -y nginx
RUN yum clean all
# This creates 1 layer, 31MB smaller in practice:
RUN yum install -y epel-release && \
yum install -y nginx && \
yum clean all
Use .dockerignore (like .gitignore) to exclude files from the build context — node_modules/, .git/, local .env files. This speeds up builds and prevents secrets from accidentally entering images.
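A starting-point .dockerignore for the Node example above might look like this (the entries are illustrative; adjust them to your project):

```
# .dockerignore: these paths never reach the Docker daemon
node_modules/
.git/
.env
*.log
npm-debug.log*
```

The syntax mirrors .gitignore: one glob pattern per line, with a trailing / conventionally marking directories.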
Running and managing containers
docker run -d -p 3000:3000 --name my-app my-app:v1
Key flags:
-d — detached mode; runs in the background
-p 3000:3000 — publish port; format is host:container
--name — gives the container a readable name
Common runtime commands:
docker ps # List running containers
docker ps -a # Include stopped containers
docker logs my-app # View output
docker logs -f my-app # Follow logs live
docker exec -it my-app sh # Shell into running container
docker stop my-app # Stop gracefully
docker rm my-app # Remove stopped container
docker rm -f my-app # Stop and remove in one command
Inspecting images
docker images # List all local images
docker image inspect my-app:v1 # Full metadata in JSON
docker history my-app:v1 # Each layer, its size, and the instruction that created it
docker history is the most useful for debugging bloated images — it shows exactly which instruction is responsible for the most disk space.
The commands you’ll use daily
| Command | What it does |
|---|---|
| docker build | Build an image from a Dockerfile |
| docker run | Start a container from an image |
| docker ps | List running containers |
| docker ps -a | List all containers, including stopped |
| docker logs | View container output |
| docker exec -it | Run a command in a running container |
| docker stop / docker rm | Stop / remove containers |
| docker images | List local images |
| docker history | Show image layers and sizes |
| docker pull | Pull an image from a registry |
| docker push | Push an image to a registry |
| docker commit | Create an image from a running container |
Post 2 of the Docker series on The Digital Drift. Previous: What are containers? · Next: Docker on CentOS