Understanding Docker Layers and Caching

 2 years ago
source link: https://blog.geekyants.com/understanding-docker-layers-and-caching-11d79d072103

Let us understand the concepts of layers and caching in Docker using examples.

Table of contents

  • Docker overview
  • Basic Docker Terminology 🐳
  • Docker Layers
  • Using Multi-Stage Builds
  • Avoid Caching
  • Understanding R/W Layer
  • Advantages of using Docker Layers
  • Conclusion

Docker overview

Docker is an open platform for developing, shipping, and running applications. Read more about Docker here.

Basic Docker Terminology 🐳

Here are some basic Docker terms:

Docker Image

A Docker image is the blueprint of a Docker container. To ship an application, you first build an image of it. Images provide a convenient way to package applications and preconfigured server environments, which makes development much more streamlined.

Docker Container

A Docker Container is a running instance of a Docker Image. Simply put, the Docker Image is pulled from a registry and it is executed as a Container.

Docker Layers

When building an image, Docker creates layers to make successive builds and deployments efficient. Each layer is a diff (delta) against the layer built before it.

Let us try to understand it with the help of an example.

For this article we use this Docker Sample Application along with its Dockerfile.

FROM node:12-alpine
RUN apk add --no-cache python2 g++ make
WORKDIR /app
COPY . .
RUN yarn install --production
CMD ["node", "src/index.js"]
EXPOSE 3000

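Each instruction in this Dockerfile is a build step whose result Docker caches. Instructions that change the filesystem, such as RUN and COPY, produce a new layer, while instructions such as CMD and EXPOSE only record metadata. A step whose inputs have not changed is reused from the cache in later builds.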

Running the build for the 1st time:

docker build -t getting-started .

For the first time, every layer will be built from scratch so the entire build process will take a relatively long time.

In this run, the base image is downloaded from the registry, each instruction is executed on top of it to create the image, and the whole build takes about 175 seconds.

Now, let us try to rebuild it:

docker build -t getting-started .

The build time now goes down to 4s 🤯🤯

This is what layering and caching in Docker do. Subsequent builds reuse the cached layers created by previous builds, and because neither the Dockerfile nor the files it copies changed, every layer was taken from the cache.

Now, let us make changes in the Dockerfile and see how the cache behaves here.

We simply change the WORKDIR instruction in the Dockerfile.

FROM node:12-alpine
RUN apk add --no-cache python2 g++ make
WORKDIR /app_temp
COPY . .
RUN yarn install --production
CMD ["node", "src/index.js"]
EXPOSE 3000

Now, building it gives a different result:

Steps [1/5] and [2/5] are taken from the cache, whereas [3/5], [4/5], and [5/5] are rebuilt: once an instruction changes, that step and every step after it are invalidated. This is still better than building everything from scratch.
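
Because a changed step forces every later step to be rebuilt, the order of instructions matters. The following is a minimal sketch of a cache-friendlier ordering, assuming the sample app keeps package.json and yarn.lock at the project root (an assumption about the project layout, not something shown in this article). Copying only those two files before installing dependencies lets the yarn install layer stay cached when only application code changes.

```dockerfile
FROM node:12-alpine
RUN apk add --no-cache python2 g++ make
WORKDIR /app
# Assumption: package.json and yarn.lock sit at the project root.
# Copy only the dependency manifests first, so this layer and the
# install step below are invalidated only when dependencies change.
COPY package.json yarn.lock ./
RUN yarn install --production
# Copy the rest of the source last; editing application code now
# rebuilds only this layer.
COPY . .
EXPOSE 3000
CMD ["node", "src/index.js"]
```

If a local node_modules directory exists in the build context, excluding it through a .dockerignore file keeps the final COPY from overwriting the dependencies installed inside the image.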

Cached layers can also be reused by other images built on the same machine, for example images that start with the same instructions.

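Note that both adding and removing files result in a new layer: deleting a file in a later step does not remove it from the earlier layer that added it, so the image does not get smaller.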
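
One common way to keep temporary files out of an image is to create, use, and delete them within a single RUN instruction, so they never persist in any layer. The sketch below applies this to the build tools from the sample Dockerfile; the name .build-deps is an arbitrary label chosen for this example.

```dockerfile
FROM node:12-alpine
WORKDIR /app
COPY . .
# Install the build tools, install dependencies, and remove the tools
# in a single RUN instruction, so the tools never persist in any layer.
# ".build-deps" is an arbitrary name for the virtual package group.
RUN apk add --no-cache --virtual .build-deps python2 g++ make \
    && yarn install --production \
    && apk del .build-deps
EXPOSE 3000
CMD ["node", "src/index.js"]
```

The trade-off is that the tool installation can no longer be cached on its own: any change to the copied files re-runs the whole combined step.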

Using Multi-Stage Builds

One of the most challenging things about building images is keeping the image size down. Each instruction in the Dockerfile adds a layer to the image, and you need to remember to clean up any artifacts that you do not need before moving on to the next layer. This is where multi-stage builds help.

Updated Dockerfile:

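# syntax=docker/dockerfile:1

# Build stage: includes the toolchain needed to compile native dependencies
FROM node:12-alpine AS initial_builder
RUN apk add --no-cache python2 g++ make
WORKDIR /app
COPY . .
RUN yarn install --production

# Final build stage: starts from a Node.js base image, because the plain
# alpine image does not ship the node binary that CMD needs
FROM node:12-alpine
WORKDIR /app
COPY --from=initial_builder /app /app
CMD ["node", "src/index.js"]
EXPOSE 3000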

In the final build stage just the built artifacts are brought from the previous stage into this new stage.

docker build -t multi-stage .

Now, let us compare the size between the 1st image and the final image.

docker image ls

The size drastically reduces here. 😎😎

Avoid Caching

Passing the --no-cache flag to docker build (for example, docker build --no-cache -t getting-started .) always builds the image from scratch, even if cached layers are available.
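
When only the later part of a build needs to be redone, a narrower alternative is a build argument whose value changes between builds. The sketch below is illustrative only: CACHEBUST is a hypothetical argument name, not part of the sample application. RUN instructions that follow an ARG use its value implicitly, so a new value causes a cache miss from the next RUN onward, while earlier steps stay cached.

```dockerfile
FROM node:12-alpine
RUN apk add --no-cache python2 g++ make
WORKDIR /app
COPY . .
# CACHEBUST is a hypothetical build argument used only for this sketch.
# RUN instructions after an ARG use it implicitly, so a new value
# causes a cache miss from the next RUN onward.
ARG CACHEBUST=1
RUN yarn install --production
EXPOSE 3000
CMD ["node", "src/index.js"]
```

Passing a different value at build time, for example --build-arg CACHEBUST=2, re-runs yarn install, while the steps above the ARG line still come from the cache.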

Understanding R/W Layer

An image consists of many read-only layers. When a container starts, a single thin read-write layer is added on top of them.

All the changes a container makes are made to the editable R/W layer and not to the underlying image layers. Therefore, a number of containers can use the same image with each having its own R/W layer.

Docker uses a Copy-on-Write (CoW) mechanism in its storage drivers. This mechanism lets different containers share the same image. When a container modifies a file that belongs to an image layer, the file is first copied into the container's read-write layer and the change is applied to that copy; the image layers themselves stay untouched.

Advantages of using Docker Layers

  • Good storage management
  • Faster builds
  • Faster deployments
  • Sharing across multiple containers
  • Enhanced scalability

Conclusion

Docker layers and caching are important concepts for adopting good practices in any Docker-based infrastructure. Small tweaks here and there can make builds and deployments noticeably more efficient.

I have tried to explain the concepts in simple, easy-to-understand language so that readers feel encouraged to use them in their own Docker workflows.

Hope you enjoyed the article. Have a great day!! ✌🏻✌🏻
