Dockerfile reference
Docker can build images automatically by reading the instructions from a Dockerfile. A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. This page describes the commands you can use in a Dockerfile.
Overview
The Dockerfile supports the following instructions:
Instruction | Description |
---|---|
ADD | Add local or remote files and directories. |
ARG | Use build-time variables. |
CMD | Specify default commands. |
COPY | Copy files and directories. |
ENTRYPOINT | Specify default executable. |
ENV | Set environment variables. |
EXPOSE | Describe which ports your application is listening on. |
FROM | Create a new build stage from a base image. |
HEALTHCHECK | Check a container's health on startup. |
LABEL | Add metadata to an image. |
MAINTAINER | Specify the author of an image. |
ONBUILD | Specify instructions for when the image is used in a build. |
RUN | Execute build commands. |
SHELL | Set the default shell of an image. |
STOPSIGNAL | Specify the system call signal for exiting a container. |
USER | Set user and group ID. |
VOLUME | Create volume mounts. |
WORKDIR | Change working directory. |
ADD
The `ADD` instruction and `COPY` have basically the same format and nature, but `ADD` adds some functionality on top of `COPY`.

For example, the `<source path>` can be a URL, in which case the Docker engine will try to download the file from that link and place it at the `<target path>`. The downloaded file's permissions are automatically set to `600`; if that is not the desired permission, an additional `RUN` layer is needed to adjust it. Likewise, if the downloaded file is a compressed archive, an additional `RUN` layer is needed to extract it. It is therefore usually more sensible to use the `RUN` instruction directly, with the `wget` or `curl` tool, to download the file, handle permissions, extract it, and clean up unnecessary files. This URL functionality is not very practical, and its use is not recommended.
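As a sketch of the `RUN`-based alternative described above — the URL, paths, and file names here are illustrative placeholders, not a real endpoint:

```docker
FROM ubuntu:18.04
# Download, extract, and clean up in a single RUN layer, instead of
# relying on ADD's URL handling. (URL and paths are hypothetical.)
RUN apt-get update \
    && apt-get install -y curl \
    && curl -fsSL "https://example.com/app.tar.gz" -o /tmp/app.tar.gz \
    && mkdir -p /opt/app \
    && tar -xzf /tmp/app.tar.gz -C /opt/app \
    && rm /tmp/app.tar.gz \
    && rm -rf /var/lib/apt/lists/*
```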
If the `<source path>` is a `tar` archive compressed with `gzip`, `bzip2`, or `xz`, the `ADD` instruction will automatically decompress it into the `<target path>`.
In some cases, this automatic decompression feature is very useful, as in the official `ubuntu` image:
FROM scratch
ADD ubuntu-xenial-core-cloudimg-amd64-root.tar.gz /
...
But if we really want to copy a compressed file without extracting it, we cannot use the `ADD` instruction.
The official Docker Dockerfile best practices documentation requires using `COPY` whenever possible, because the semantics of `COPY` are very clear: it just copies files. `ADD` bundles in more complex functionality, and its behavior is not always obvious. The most appropriate use case for `ADD` is the automatic decompression mentioned above.
Additionally, note that the `ADD` instruction invalidates the image build cache, which may slow down the build.
Therefore, when choosing between `COPY` and `ADD`, you can follow this principle: use `COPY` for all file copying operations, and use `ADD` only when automatic decompression is needed.
When using this instruction, you can also add the `--chown=<user>:<group>` option to change the file's owner and group.
ADD --chown=55:mygroup files* /mydir/
ADD --chown=bin files* /mydir/
ADD --chown=1 files* /mydir/
ADD --chown=10:11 files* /mydir/
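To make the `COPY`-vs-`ADD` principle concrete, a minimal hypothetical sketch (file names are illustrative):

```docker
# COPY just copies: the archive lands in the image as-is.
COPY app.tar.gz /opt/archives/

# ADD auto-extracts a gzip/bzip2/xz tar archive into the target path.
ADD app.tar.gz /opt/app/
```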
ARG
Format: `ARG <parameter_name>[=<default_value>]`
Build arguments have the same effect as `ENV`: both set environment variables. The difference is that environment variables set by `ARG` exist only in the build environment and will not be present when the container is running. However, do not use `ARG` to store passwords or other sensitive information on that basis, because `docker history` can still show all the values.
The `ARG` instruction in the Dockerfile defines a parameter name and its default value. This default can be overridden in the `docker build` command with `--build-arg <parameter_name>=<value>`.
Using the `ARG` instruction flexibly allows you to build different images without modifying the Dockerfile.
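For example, a minimal sketch where the base image tag is parameterized; the variable name and tags are illustrative:

```docker
ARG UBUNTU_VERSION=18.04
FROM ubuntu:${UBUNTU_VERSION}
# Build a different image without editing the Dockerfile:
#   docker build --build-arg UBUNTU_VERSION=20.04 -t myapp:20.04 .
```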
The `ARG` instruction has a scope. If it is specified before the `FROM` instruction, it can only be used in the `FROM` instruction.
ARG DOCKER_USERNAME=library
FROM ${DOCKER_USERNAME}/alpine
RUN set -x ; echo ${DOCKER_USERNAME}
Using the above Dockerfile, you will find that the value of the `${DOCKER_USERNAME}` variable cannot be output. To output it correctly, you must specify `ARG` again after `FROM`:
# Only effective in FROM
ARG DOCKER_USERNAME=library
FROM ${DOCKER_USERNAME}/alpine
# To use it after FROM, you must specify it again
ARG DOCKER_USERNAME=library
RUN set -x ; echo ${DOCKER_USERNAME}
For multi-stage builds, pay special attention to this issue.
# This variable is effective in each FROM
ARG DOCKER_USERNAME=library
FROM ${DOCKER_USERNAME}/alpine
RUN set -x ; echo 1
FROM ${DOCKER_USERNAME}/alpine
RUN set -x ; echo 2
For the above Dockerfile, both `FROM` instructions can use `${DOCKER_USERNAME}`. For a variable to be used within a stage, it must be declared separately in each stage:
ARG DOCKER_USERNAME=library
FROM ${DOCKER_USERNAME}/alpine
# To use the variable after FROM, it must be specified separately in each stage
ARG DOCKER_USERNAME=library
RUN set -x ; echo ${DOCKER_USERNAME}
FROM ${DOCKER_USERNAME}/alpine
# To use the variable after FROM, it must be specified separately in each stage
ARG DOCKER_USERNAME=library
RUN set -x ; echo ${DOCKER_USERNAME}
CMD
The `CMD` instruction has a format similar to `RUN`, with two formats:

- `shell` format: `CMD <command>`
- `exec` format: `CMD ["executable", "param1", "param2"...]`
- Parameter list format: `CMD ["param1", "param2"...]`. Used to supply the actual parameters after the `ENTRYPOINT` instruction has been specified.
When introducing containers earlier, it was mentioned that Docker is not a virtual machine: containers are processes. Since they are processes, a program and its arguments must be specified when starting a container. The `CMD` instruction specifies the default startup command for the container's main process.
At runtime, a different command can be specified to replace this default set in the image. For example, the default `CMD` of the `ubuntu` image is `/bin/bash`. If we run `docker run -it ubuntu` directly, we land in `bash`. We can also specify a different command at runtime, such as `docker run -it ubuntu cat /etc/os-release`. This replaces the default `/bin/bash` command with `cat /etc/os-release`, which prints the system version information.
In terms of instruction format, the `exec` format is generally recommended. It is parsed as a JSON array, so double quotes `"` must be used, not single quotes.
If the `shell` format is used, the actual command is wrapped as an argument to `sh -c` for execution. For example:
CMD echo $HOME
During actual execution, it will be changed to:
CMD [ "sh", "-c", "echo $HOME" ]
This is why we can use environment variables, because these environment variables will be parsed and processed by the shell.
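Conversely, the `exec` format is not processed by a shell, so variable references are passed through literally — a sketch of the difference (only one `CMD` takes effect per Dockerfile; the two lines are shown side by side for comparison):

```docker
# Prints the literal string "$HOME": no shell is involved to expand it.
CMD ["echo", "$HOME"]

# Prints the value of HOME, because sh -c performs the expansion.
CMD ["sh", "-c", "echo $HOME"]
```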
Mentioning `CMD` inevitably raises the question of whether applications in a container should run in the foreground or the background. This is a common point of confusion for beginners.
Docker is not a virtual machine: applications in containers should run in the foreground, rather than being started as background services via `systemd` the way they are on virtual machines or physical machines. There is no concept of background services inside a container.
Some beginners write the `CMD` as:
CMD service nginx start
Then they find that the container exits immediately after starting. Even the `systemctl` command cannot be executed inside the container at all. The cause is that they have not understood foreground versus background, have not distinguished containers from virtual machines, and are still thinking of containers from the traditional virtual-machine perspective.
For a container, its startup program is the container application process. The container exists for the main process, and when the main process exits, the container loses its purpose and exits as well. Other auxiliary processes are not something it needs to be concerned about.
Using the `service nginx start` command is an attempt to have the init system start nginx as a background daemon. As mentioned above, `CMD service nginx start` is interpreted as `CMD [ "sh", "-c", "service nginx start"]`, so the main process is actually `sh`. When the `service nginx start` command finishes, `sh` exits too; the main process has exited, so the container naturally exits.
The correct approach is to execute the `nginx` binary directly and require it to run in the foreground. For example:
CMD ["nginx", "-g", "daemon off;"]
COPY
COPY [--chown=<user>:<group>] <source_path>... <destination_path>
COPY [--chown=<user>:<group>] ["<source_path1>",... "<destination_path>"]
Similar to the `RUN` instruction, there are two formats, one similar to the command line, and one similar to a function call.
The `COPY` instruction copies files/directories from the `<source_path>` in the build context to the `<destination_path>` location in the new layer of the image. For example:
```docker
COPY package.json /usr/src/app/
```

`<source_path>` can be multiple paths, and can even contain wildcards, where the wildcard rules must satisfy Go's `filepath.Match` rules, such as:
COPY hom* /mydir/
COPY hom?.txt /mydir/
`<destination_path>` can be an absolute path inside the container, or a path relative to the working directory (which can be set with the `WORKDIR` instruction). The destination path does not need to exist beforehand; if it does not, it will be created before the files are copied.
Additionally, note that with the `COPY` instruction, the source files' metadata is preserved — read, write, and execute permissions, file modification times, and so on. This is useful for image customization, especially when all the relevant files are managed with Git.

When using this instruction, you can also add the `--chown=<user>:<group>` option to change the owner and group of the files.
COPY --chown=55:mygroup files* /mydir/
COPY --chown=bin files* /mydir/
COPY --chown=1 files* /mydir/
COPY --chown=10:11 files* /mydir/
Note that if the source path is a directory, the directory itself is not copied; rather, its contents are copied into the destination path.
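A small sketch of this directory behavior (paths are illustrative):

```docker
# If ./config/ contains app.conf, the result is /etc/myapp/app.conf,
# not /etc/myapp/config/app.conf — the directory itself is not copied.
COPY config /etc/myapp/
```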
ENTRYPOINT
The format of `ENTRYPOINT` is the same as that of the `RUN` instruction, divided into `exec` format and `shell` format.

The purpose of `ENTRYPOINT` is the same as `CMD`: both specify the container's startup program and its arguments. `ENTRYPOINT` can also be replaced at runtime, though it is slightly more cumbersome than with `CMD`, requiring the `--entrypoint` option of `docker run`.
Once `ENTRYPOINT` is specified, the meaning of `CMD` changes. `CMD` is no longer run directly; instead, its contents are passed as arguments to the `ENTRYPOINT` instruction. In other words, when executed, it becomes:

<ENTRYPOINT> "<CMD>"

So, given `CMD`, why do we need `ENTRYPOINT`? What are the benefits of this `<ENTRYPOINT> "<CMD>"` arrangement? Let's look at a few scenarios.
Scenario 1: Make the image behave like a command
Suppose we need an image that reports our current public IP address. We can first implement it with `CMD`:
FROM ubuntu:18.04
RUN apt-get update \
&& apt-get install -y curl \
&& rm -rf /var/lib/apt/lists/*
CMD [ "curl", "-s", "http://myip.ipip.net" ]
If we build the image with `docker build -t myip .`, then to query the current public IP we only need to execute:
$ docker run myip
Current IP: 8.8.8.8 from California, USA
It seems we can now use the image like a command, but commands usually take arguments. What if we want to add one? From the `CMD` above we can see that the actual command is `curl`, so to display the HTTP header information we need to add the `-i` flag. Can we pass `-i` directly to `docker run myip`?
$ docker run myip -i
docker: Error response from daemon: invalid header field value "oci runtime error: container_linux.go:247: starting container process caused \"exec: \\\"-i\\\": executable file not found in $PATH\"\n".
We see an error: `executable file not found`. As mentioned earlier, what follows the image name is the `command`, which replaces the default value of `CMD` at runtime. Here `-i` replaces the original `CMD` rather than being appended to the original `curl -s http://myip.ipip.net`. But `-i` is not a command at all, so naturally it cannot be found.
So to add the `-i` flag, we have to type the complete command again:
$ docker run myip curl -s http://myip.ipip.net -i
This is obviously not a good solution, but `ENTRYPOINT` can solve the problem. Let's re-implement the image using `ENTRYPOINT`:
FROM ubuntu:18.04
RUN apt-get update \
&& apt-get install -y curl \
&& rm -rf /var/lib/apt/lists/*
ENTRYPOINT [ "curl", "-s", "http://myip.ipip.net" ]
Now let's try `docker run myip -i` again:
$ docker run myip
Current IP: 8.8.8.8 from: California, USA
$ docker run myip -i
HTTP/1.1 200 OK
Server: nginx/1.8.0
Date: Tue, 22 Nov 2016 05:12:40 GMT
Content-Type: text/html; charset=UTF-8
Vary: Accept-Encoding
X-Powered-By: PHP/5.6.24-1~dotdeb+7.1
X-Cache: MISS from cache-2
X-Cache-Lookup: MISS from cache-2:80
X-Cache: MISS from proxy-2_6
Transfer-Encoding: chunked
Via: 1.1 cache-2:80, 1.1 proxy-2_6:8006
Connection: keep-alive
Current IP: 8.8.8.8 from: California, USA
This time it works. When `ENTRYPOINT` exists, the contents of `CMD` are passed to `ENTRYPOINT` as arguments. Here `-i` is the new `CMD`, so it is passed as an argument to `curl`, achieving the desired effect.
Scenario 2: Preparation work before application startup
Starting a container is starting the main process, but sometimes, some preparation work is needed before starting the main process.
For example, databases such as `mysql` may require some configuration and initialization work that must be completed before the final mysql server process runs.
Additionally, you may want to avoid starting the service as the `root` user to improve security, while still performing some necessary preparation as `root` before startup, and finally switching to the service user to start the service. Commands other than the service itself can also still be executed as `root`, which is convenient for things like debugging.
These preparation tasks are unrelated to the container's `CMD`; whatever the `CMD` is, a preprocessing step is needed first. In this case, you can write a script and make it the `ENTRYPOINT`; the script receives the arguments (i.e., the `<CMD>`) and runs them as its final command. For example, this is what the official `redis` image does:
FROM alpine:3.4
...
RUN addgroup -S redis && adduser -S -G redis redis
...
ENTRYPOINT ["docker-entrypoint.sh"]
EXPOSE 6379
CMD [ "redis-server" ]
You can see that it creates the `redis` user for the `redis` service, and finally specifies the `docker-entrypoint.sh` script as the `ENTRYPOINT`:
#!/bin/sh
# allow the container to be started with `--user`
if [ "$1" = 'redis-server' -a "$(id -u)" = '0' ]; then
find . \! -user redis -exec chown redis '{}' +
exec gosu redis "$0" "$@"
fi
exec "$@"
The script makes a judgment based on the contents of `CMD`. If it is `redis-server`, it switches to the `redis` user identity to start the server; otherwise, it still runs as the `root` user. For example:
$ docker run -it redis id
uid=0(root) gid=0(root) groups=0(root)
ENV
There are two formats:
ENV <key> <value>
ENV <key1>=<value1> <key2>=<value2>...
This instruction is simple: it sets environment variables. Subsequent instructions such as `RUN`, as well as applications running at runtime, can use the environment variables defined here directly.
ENV VERSION=1.0 DEBUG=on \
NAME="Happy Feet"
This example demonstrates how to break lines, and how to use double quotes to enclose values containing spaces, which is consistent with the behavior in Shell.
Once defined, environment variables can be used in subsequent instructions. For example, the official `node` image `Dockerfile` contains code like this:
ENV NODE_VERSION 7.2.0
RUN curl -SLO "https://nodejs.org/dist/v$NODE_VERSION/node-v$NODE_VERSION-linux-x64.tar.xz" \
&& curl -SLO "https://nodejs.org/dist/v$NODE_VERSION/SHASUMS256.txt.asc" \
&& gpg --batch --decrypt --output SHASUMS256.txt SHASUMS256.txt.asc \
&& grep " node-v$NODE_VERSION-linux-x64.tar.xz\$" SHASUMS256.txt | sha256sum -c - \
&& tar -xJf "node-v$NODE_VERSION-linux-x64.tar.xz" -C /usr/local --strip-components=1 \
&& rm "node-v$NODE_VERSION-linux-x64.tar.xz" SHASUMS256.txt.asc SHASUMS256.txt \
&& ln -s /usr/local/bin/node /usr/local/bin/nodejs
The following instructions support environment variable expansion: ADD, COPY, ENV, EXPOSE, FROM, LABEL, USER, WORKDIR, VOLUME, STOPSIGNAL, ONBUILD, RUN.
You can feel from this list of instructions that environment variables can be used in many places, which is very powerful. With environment variables, we can make more images from one Dockerfile, just by using different environment variables.
EXPOSE
The format is `EXPOSE <port1> [<port2>...]`.
The `EXPOSE` instruction declares the ports on which the container will provide services at runtime. This is only a declaration; the application does not actually open these ports at runtime because of it. Writing this declaration in the Dockerfile has two benefits: it helps image users understand which ports the image's service daemon listens on, making port mapping easier to configure; and when random port mapping is used at runtime, i.e. `docker run -P`, the ports declared by `EXPOSE` are automatically mapped to random host ports.
The `EXPOSE` instruction should be distinguished from using `-p <host port>:<container port>` at runtime. `-p` maps a host port to a container port, exposing the container's service to outside access, while `EXPOSE` merely declares which ports the container intends to use and does not perform any port mapping on the host.
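A minimal sketch of the distinction, with illustrative ports and image name:

```docker
# Declaration only — no host port is opened by this line.
EXPOSE 80
# At runtime:
#   docker run -P myimage          # -P maps EXPOSEd port 80 to a random host port
#   docker run -p 8080:80 myimage  # -p explicitly maps host 8080 to container 80
```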
FROM
In a Dockerfile, the `FROM` instruction specifies the base image — the starting point for building a new image. Apart from comments and any leading `ARG` instructions, it is the first instruction in the Dockerfile, and it defines the base environment for the build process.
Common uses of the `FROM` instruction:
- Building from an official image:

  FROM image:tag

  This usage specifies an existing official image as the base image. `image` is the name of the image and `tag` is the version tag. For example, you can use `ubuntu:18.04` as the base image.

- Building from a custom image:

  FROM <username>/<imagename>:<tag>

  This usage specifies a custom image as the base image. `<username>` is the username on Docker Hub, `<imagename>` is the name of the image, and `<tag>` is the version tag.

- Multi-stage build:

  FROM <base-image> AS <stage-name>

  This usage allows defining multiple build stages in a single Dockerfile, with a different base image for each stage. `<base-image>` is the base image, and `<stage-name>` is the name of the stage. Multi-stage builds are typically used to employ different tools and dependencies during the build, then copy the required files or executables from one stage to the next, reducing the final image size.

- Building from an empty image:

  FROM scratch

  This usage starts the build from an empty image, without using any existing base image. In this case, you need to add all required files and configuration yourself.
In a single-stage build, the `FROM` instruction appears only once, at the start of the Dockerfile; in a multi-stage build, each stage begins with its own `FROM`. Either way, `FROM` defines the starting point of the build process, and subsequent instructions are based on that starting point.
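A hypothetical multi-stage sketch of this pattern; the stage name, paths, and images are illustrative:

```docker
# Build stage: use the full Go toolchain image.
FROM golang:1.21 AS builder
WORKDIR /src
COPY . .
RUN go build -o /out/app .

# Final stage: copy only the compiled binary into a minimal image.
FROM alpine:3.18
COPY --from=builder /out/app /usr/local/bin/app
CMD ["app"]
```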
HEALTHCHECK
Format:

- `HEALTHCHECK [options] CMD <command>`: set the command used to check the container's health.
- `HEALTHCHECK NONE`: if the base image has a health check instruction, this line overrides (disables) it.
The `HEALTHCHECK` instruction tells Docker how to determine whether the container's state is normal. It was introduced in Docker 1.12.
Before the `HEALTHCHECK` instruction existed, the Docker engine could only judge whether a container was abnormal by checking whether its main process had exited. In many cases that is fine, but if the program enters a deadlock or an infinite loop, the process does not exit even though the container can no longer provide service. Before 1.12, Docker would not detect this state and would not reschedule, so some containers that could no longer provide services were still accepting user requests.
Since 1.12, Docker has provided the `HEALTHCHECK` instruction, which specifies a command for determining whether the main process's service is still normal, thereby reflecting the container's actual state more accurately.
When an image specifies a `HEALTHCHECK` instruction and a container is started from it, the initial state is `starting`. After a health check passes, the state becomes `healthy`. If the check fails a certain number of consecutive times, the state becomes `unhealthy`.
`HEALTHCHECK` supports the following options:

- `--interval=<interval>`: the interval between two health checks; defaults to 30 seconds.
- `--timeout=<duration>`: the timeout for the health check command to run. If it exceeds this time, the current check is considered a failure; defaults to 30 seconds.
- `--retries=<number>`: after this many consecutive failures, the container's state is considered `unhealthy`; defaults to 3.
Like `CMD` and `ENTRYPOINT`, `HEALTHCHECK` can effectively appear only once. If several are written, only the last one takes effect.
The command after `HEALTHCHECK [options] CMD` has the same format as `ENTRYPOINT`, with both `shell` and `exec` forms. The command's return value determines the success or failure of the health check: `0`: success; `1`: failure; `2`: reserved — do not use this value.
Suppose we have an image containing a simple web service and we want to add a health check to determine whether the web service is working properly. We can use `curl` to help, so the `HEALTHCHECK` in the `Dockerfile` can be written like this:
FROM nginx
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
HEALTHCHECK --interval=5s --timeout=3s \
CMD curl -fs http://localhost/ || exit 1
Here, the check runs every 5 seconds (this interval is very short for demonstration; in practice it should be longer), and if the health check command does not respond within 3 seconds, it counts as a failure. We use `curl -fs http://localhost/ || exit 1` as the health check command.
Build this image with `docker build`:
$ docker build -t myweb:v1 .
After building, we start a container:
$ docker run -d --name web -p 80:80 myweb:v1
After running this image, `docker container ls` shows the initial state as `(health: starting)`:
$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
03e28eb00bd0 myweb:v1 "nginx -g 'daemon off" 3 seconds ago Up 2 seconds (health: starting) 80/tcp, 443/tcp web
After waiting a few seconds and running `docker container ls` again, the health status changes to `(healthy)`:
$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
03e28eb00bd0 myweb:v1 "nginx -g 'daemon off" 18 seconds ago Up 16 seconds (healthy) 80/tcp, 443/tcp web
If the health check fails consecutively more times than the retry count, the status changes to `(unhealthy)`.
To help with troubleshooting, the output of the health check command (both `stdout` and `stderr`) is stored in the health status and can be viewed using `docker inspect`:
$ docker inspect --format '{{json .State.Health}}' web | python -m json.tool
{
"FailingStreak": 0,
"Log": [
{
"End": "2016-11-25T14:35:37.940957051Z",
"ExitCode": 0,
"Output": "<!DOCTYPE html>\n<html>\n<head>\n<title>Welcome to nginx!</title>\n<style>\n body {\n width: 35em;\n margin: 0 auto;\n font-family: Tahoma, Verdana, Arial, sans-serif;\n }\n</style>\n</head>\n<body>\n<h1>Welcome to nginx!</h1>\n<p>If you see this page, the nginx web server is successfully installed and\nworking. Further configuration is required.</p>\n\n<p>For online documentation and support please refer to\n<a href=\"http://nginx.org/\">nginx.org</a>.<br/>\nCommercial support is available at\n<a href=\"http://nginx.com/\">nginx.com</a>.</p>\n\n<p><em>Thank you for using nginx.</em></p>\n</body>\n</html>\n",
"Start": "2016-11-25T14:35:37.780192565Z"
}
],
"Status": "healthy"
}
LABEL
The `LABEL` instruction adds key-value metadata to an image.
LABEL <key>=<value> <key>=<value> <key>=<value> ...
We can also use some labels to declare the author of the image, documentation address, etc.:
LABEL org.opencontainers.image.authors="yeasy"
LABEL org.opencontainers.image.documentation="https://www.ubitools.com"
RUN
In a Dockerfile, the `RUN` instruction executes commands inside the container during the build. It can execute any valid command or shell script.
Common uses of the `RUN` instruction:
- Execute a single command:

  RUN <command>

  In this usage, `<command>` is the single command to be executed inside the container. For example:

  RUN apt-get update
  RUN apt-get install -y package

  These update the package lists and install a package inside the container, respectively.

- Execute multiple commands:

  RUN <command1> && <command2>

  This usage executes multiple commands on one line, using the `&&` operator so that each command must succeed before the next one runs. For example:

  RUN apt-get update && apt-get install -y package

  This updates the package lists and then installs a package inside the container, with the install running only if the update succeeded.

- Execute a shell script:

  RUN /bin/bash -c "<script>"

  This usage executes more complex shell logic inside the container. The script is placed inside double quotes, and `/bin/bash -c` specifies that Bash should interpret and execute it. For example:

  RUN /bin/bash -c "source setup.sh && build.sh"

  This executes the `setup.sh` script and then the `build.sh` script inside the container.
The `RUN` instruction can be used multiple times, and each one executes its command inside the container. Each `RUN` instruction creates a new image layer on top of the previous one.
Notes:

- Commands executed by a `RUN` instruction permanently affect the resulting image, so cleanup commands should be included to avoid leaving unnecessary files and data in the image.
- If you need environment variables in a `RUN` instruction, define them in the Dockerfile with the `ENV` instruction.
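Following the cleanup note above, a common sketch is to chain install, use, and cleanup into a single `RUN` so temporary files never persist in any layer (the URL and package names are illustrative placeholders):

```docker
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl ca-certificates \
    && curl -fsSL "https://example.com/tool.sh" -o /usr/local/bin/tool.sh \
    && chmod +x /usr/local/bin/tool.sh \
    && apt-get purge -y curl \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*
```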
SHELL
Format: `SHELL ["executable", "parameters"]`
The `SHELL` instruction specifies the shell used by the shell-format `RUN`, `ENTRYPOINT`, and `CMD` instructions. The default on Linux is `["/bin/sh", "-c"]`.
SHELL ["/bin/sh", "-c"]
RUN ls ; ls
SHELL ["/bin/sh", "-cex"]
RUN ls ; ls
The two `RUN` instructions execute the same command, but the second one will print each command as it runs and exit on the first error.
When `ENTRYPOINT` and `CMD` are specified in shell format, the shell specified by the `SHELL` instruction also becomes the shell for these two instructions.
SHELL ["/bin/sh", "-cex"]
# /bin/sh -cex "nginx"
ENTRYPOINT nginx
SHELL ["/bin/sh", "-cex"]
# /bin/sh -cex "nginx"
CMD nginx
STOPSIGNAL
The `STOPSIGNAL` instruction sets the signal that will be sent to the container to stop it. It accepts a signal number or the corresponding signal name as its argument.
Syntax:

STOPSIGNAL signal

where `signal` can be the numeric value or the name of the signal, such as `SIGKILL`.
If the `STOPSIGNAL` instruction is not specified, the `SIGTERM` signal is sent when `docker stop` is used. If the container does not stop within the allotted time, the `SIGKILL` signal is sent to terminate it forcibly.
Example 1:
FROM ubuntu:18.04
STOPSIGNAL SIGTERM
CMD ["/bin/bash", "-c", "while true; do sleep 1; done"]
In the above example, `SIGTERM` is set as the stop signal. When `docker stop` runs, the container first receives `SIGTERM`; if it does not stop normally within the given time, `SIGKILL` is sent to terminate it.
Example 2:
FROM ubuntu:18.04
STOPSIGNAL 9
CMD ["/bin/bash", "-c", "while true; do sleep 1; done"]
In this example, the signal value `9` specifies `SIGKILL`, which terminates the container process directly without waiting for a graceful stop.
Note that even when `STOPSIGNAL` is set, Docker may still send `SIGKILL` to terminate the container in certain situations, such as when the container is in an unrecoverable state.
USER
Format: `USER <username>[:<group>]`
The `USER` instruction is similar to `WORKDIR`: both change environment state and affect subsequent layers. `WORKDIR` changes the working directory, while `USER` changes the identity under which subsequent `RUN`, `CMD`, and `ENTRYPOINT` commands execute.
Note that `USER` only switches to the specified user; the user must be created beforehand, otherwise it cannot be switched to.
RUN groupadd -r redis && useradd -r -g redis redis
USER redis
RUN [ "redis-server" ]
If a script is executed as `root` but needs to change identity partway through — for example, to run a service process as a pre-created user — do not use `su` or `sudo`, which require complicated configuration and often fail in environments without a TTY. `gosu` is recommended instead.
# Create the redis user, and use gosu to switch to another user to execute commands
RUN groupadd -r redis && useradd -r -g redis redis
# Download gosu
RUN wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/1.12/gosu-amd64" \
&& chmod +x /usr/local/bin/gosu \
&& gosu nobody true
# Set CMD, and execute it as another user
CMD [ "gosu", "redis", "redis-server" ]
VOLUME
Format:
VOLUME ["<path1>", "<path2>"...]
VOLUME <path>
As mentioned earlier, container storage layers should be kept read-only at runtime as much as possible. Applications that need to store dynamic data, such as databases, should keep those files in volumes; we will introduce the concept of Docker volumes further in later chapters. To prevent users from forgetting to mount directories holding dynamic data as volumes at runtime, the `Dockerfile` can specify certain directories to be mounted as anonymous volumes. This way, even if the user does not specify a mount, the application can still run normally without writing large amounts of data into the container storage layer.
VOLUME /data
Here, the `/data` directory is automatically mounted as an anonymous volume at container runtime, and any data written to `/data` is not recorded in the container storage layer, keeping the storage layer stateless. Of course, this mount setting can be overridden when running the container. For example:
$ docker run -d -v mydata:/data xxxx
In this command, the named volume `mydata` is mounted at the `/data` location, overriding the anonymous volume mount configuration defined in the `Dockerfile`.
WORKDIR
Format: `WORKDIR <working directory path>`
The `WORKDIR` instruction specifies the working directory (current directory): from then on, the current directory for subsequent layers is the specified one. If the directory does not exist, `WORKDIR` creates it for you.
Earlier, we mentioned a common beginner mistake: treating the `Dockerfile` like a shell script. That misunderstanding can also lead to the following error:
RUN cd /app
RUN echo "hello" > world.txt
If you build an image from this `Dockerfile` and run it, you will find there is no `/app/world.txt` file, or that its content is not `hello`. The reason is straightforward: in a shell, two consecutive lines share the same process execution environment, so memory state modified by one command directly affects the next. In a `Dockerfile`, however, these two `RUN` commands execute in completely different containers — two entirely separate environments. This error comes from not understanding the layered-storage concept behind `Dockerfile` builds.
As mentioned before, each `RUN` starts a new container, executes its command, and then commits the file changes. The first layer's `RUN cd /app` only changes the working directory of that process — an in-memory change that produces no file changes. By the second layer, a brand-new container is started, completely unrelated to the first, so it cannot inherit in-memory changes from earlier in the build.
Therefore, if you need to change the working directory for subsequent layers, use the `WORKDIR` instruction:
WORKDIR /app
RUN echo "hello" > world.txt
If you use a relative path in a `WORKDIR` instruction, the directory switched to is relative to the previous `WORKDIR`:
WORKDIR /a
WORKDIR b
WORKDIR c
RUN pwd
The working directory for `RUN pwd` will be `/a/b/c`.