Docker: Why and What?
Imagine that you are trying to install a piece of software on your local machine. First you download the installer, then you run it, and if you are lucky, it installs. If you are not, you get an error because some dependencies are missing from your machine. You troubleshoot by installing those dependencies and re-run the installer. If you are unlucky again, another error comes up, and so on.
This shows the need for a platform on which we can run any application without worrying about whether the required dependencies are present on our local machine. Docker does exactly that for us. Docker is an ecosystem of different projects for creating and running containers. What containers are, we will come to shortly.
Installation
Before proceeding, I would suggest that you create a free account on Docker Hub.
While signing up, you have to choose a Docker ID and password and enter your email. The Docker ID is a username of sorts and will be publicly visible to other people.
Then, download Docker (community edition) for Windows/Mac/Linux and run it.
Once it is running, you will see a logo of a whale with boxes on top at the top right of your screen. It might take a couple of minutes to get started. You can click on this icon and it will show you an option to log in.
Clicking on login will show you a pop-up where you can enter your Docker ID and password. Once you are done, you can verify that Docker is running by going to your terminal and entering docker version. If you get something like command not found, then Docker is not installed. However, if you get a verbose description of Docker's client and server components, then the installation was successful.
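The same check can be scripted. The following is a minimal sketch, assuming Python 3 is available; the helper names docker_on_path and docker_version_text are hypothetical and not part of Docker. It only looks for the docker executable on the PATH and, if found, captures whatever docker version prints.

```python
import shutil
import subprocess
from typing import Optional


def docker_on_path() -> bool:
    """Return True if a `docker` executable can be found on the PATH."""
    return shutil.which("docker") is not None


def docker_version_text(timeout_s: float = 20.0) -> Optional[str]:
    """Return the text printed by `docker version`, or None if docker is missing.

    The command can exit with an error when the daemon is not running,
    so the exit code is deliberately not checked here.
    """
    if not docker_on_path():
        return None
    result = subprocess.run(
        ["docker", "version"],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,
    )
    return result.stdout
```

If docker_on_path() returns False, that corresponds to the command not found case described above.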
Now that Docker is installed, let’s run docker run hello-world in the terminal. The first time you run it, Docker reports that it could not find the image locally, pulls it, and then prints a greeting.
Let’s analyse what just happened. When we ran docker run hello-world, something called the Docker Client passed the command on to something called the Docker Server, or Docker Daemon. The daemon checked the local image cache for the hello-world image and could not find it, so it pulled the image from Docker Hub, which is a free public repository of images. From that image, a container was created. A container is nothing but an instance of an image running a program.
Notice that when you run docker run hello-world again, the message "Unable to find image 'hello-world:latest' locally" no longer appears. This shows that the image downloaded from Docker Hub is now stored locally in the image cache.
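The cache-then-pull behaviour can be sketched as a toy model in Python. This is not how Docker is implemented; ToyRegistry and ToyDaemon are hypothetical names used only to illustrate the flow, and the image contents are made-up strings.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegistry:
    """Stand-in for Docker Hub: maps image names to image contents."""
    images: dict[str, str]

    def pull(self, name: str) -> str:
        return self.images[name]


@dataclass
class ToyDaemon:
    """Stand-in for the Docker daemon, which owns the local image cache."""
    registry: ToyRegistry
    cache: dict[str, str] = field(default_factory=dict)
    log: list[str] = field(default_factory=list)

    def run(self, name: str) -> str:
        # Only go to the registry when the image is not cached yet.
        if name not in self.cache:
            self.log.append(f"Unable to find image '{name}' locally")
            self.cache[name] = self.registry.pull(name)
        # Create a "container" from the cached image.
        return f"container running {self.cache[name]}"


hub = ToyRegistry({"hello-world:latest": "hello program"})
daemon = ToyDaemon(hub)
first = daemon.run("hello-world:latest")   # cache miss: logs, then pulls
second = daemon.run("hello-world:latest")  # cache hit: nothing new is logged
```

After both calls, daemon.log holds a single entry, which mirrors the message appearing only on the first run.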
What is a container and what is an image?
To understand a container, let’s get some context about how an operating system functions on our computer. At the core of an operating system is a continuously running program called the kernel. Let’s say that we are running some applications on our computer: the Chrome browser, a command line terminal, a text editor and the Slack app. Each of these needs memory, networking, I/O and other resources, which are provided by the computer hardware. How does each of these applications communicate with the hardware? It makes something called a system call: a request to the kernel to interact with the hardware on the application's behalf.
Now imagine that, in some hypothetical world, Chrome requires Python 2 as a dependency while Slack requires Python 3. For some reason, only one version of Python can be on the hard disk at any one time. So Chrome would work properly but Slack would not. Although this is a make-believe situation, it gives us the basic motivation for using a container.
How can we solve this problem? We use an operating system technique called namespacing to segment the hard disk, and control groups to limit the amount of resources used per process.
When an application, i.e. either Chrome or Slack, makes a system call that touches the hard disk, the kernel works out whether to direct it to the segment of the hard disk containing Python 2 or the one containing Python 3.
In this way, one is able to run both Chrome and Slack on one’s machine. Thus, namespacing is nothing but isolating resources per process or group of processes.
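The idea can be illustrated with a toy model. This sketch does not use real Linux namespaces or control groups; ToyKernel, its methods, the segment names and the memory limits are all hypothetical and exist only to show the routing and limiting described above.

```python
class ToyKernel:
    """Toy kernel that namespaces files and caps memory per process."""

    def __init__(self) -> None:
        self._segments: dict[str, dict[str, str]] = {}
        self._segment_of: dict[str, str] = {}
        self._memory_limit_mb: dict[str, int] = {}
        self._memory_used_mb: dict[str, int] = {}

    def add_segment(self, segment: str, files: dict[str, str]) -> None:
        """Create an isolated slice of the 'hard disk'."""
        self._segments[segment] = dict(files)

    def register(self, process: str, segment: str, memory_limit_mb: int) -> None:
        """Namespacing: tie a process to one segment.
        Control group: give it a memory ceiling."""
        self._segment_of[process] = segment
        self._memory_limit_mb[process] = memory_limit_mb
        self._memory_used_mb[process] = 0

    def read_file(self, process: str, path: str) -> str:
        """A 'system call': the kernel routes it to the caller's segment."""
        return self._segments[self._segment_of[process]][path]

    def allocate(self, process: str, amount_mb: int) -> bool:
        """Grant memory only while the process stays under its limit."""
        used = self._memory_used_mb[process] + amount_mb
        if used > self._memory_limit_mb[process]:
            return False
        self._memory_used_mb[process] = used
        return True


kernel = ToyKernel()
kernel.add_segment("segment_a", {"/usr/bin/python": "Python 2"})
kernel.add_segment("segment_b", {"/usr/bin/python": "Python 3"})
kernel.register("chrome", "segment_a", memory_limit_mb=512)
kernel.register("slack", "segment_b", memory_limit_mb=256)

chrome_python = kernel.read_file("chrome", "/usr/bin/python")
slack_python = kernel.read_file("slack", "/usr/bin/python")
```

Both processes ask for the same path, yet each one sees the Python version stored in its own segment.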
A container is what is enclosed by the red dotted rectangle in the diagram: a running application, program or process together with the segment of resources assigned to it. The process sends a system call to the kernel, and the kernel processes that request and directs it to a very specific portion of the hardware: RAM, CPU, network or hard drive.
If that is a container, then what is an image, and how can an image, i.e. a file, create a running container? Whenever we talk about an image, we mean a file system snapshot, a copy of a very specific set of directories and files, plus a startup command.
Thus, when a container is created from an image, the file system snapshot of the image is copied into the hard disk segment of the container, and the startup command is run in the container with the set of resources assigned to it.
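The relationship between an image and its containers can be sketched in a few lines of Python. ToyImage, ToyContainer and create_container are hypothetical names, and the sketch copies the snapshot eagerly for simplicity, whereas real Docker shares read-only image layers and copies files only when a container changes them.

```python
from dataclasses import dataclass


@dataclass
class ToyImage:
    """An image: a file system snapshot plus a startup command."""
    fs_snapshot: dict[str, str]
    startup_command: str


@dataclass
class ToyContainer:
    """A container: its own copy of the files and a process to run."""
    filesystem: dict[str, str]
    command: str
    running: bool = False

    def start(self) -> str:
        self.running = True
        return f"running: {self.command}"


def create_container(image: ToyImage) -> ToyContainer:
    # Copy the snapshot so that changes inside one container never
    # affect the image or any other container created from it.
    return ToyContainer(
        filesystem=dict(image.fs_snapshot),
        command=image.startup_command,
    )


image = ToyImage({"/hello": "hello binary"}, "/hello")
container_a = create_container(image)
container_b = create_container(image)
container_a.filesystem["/tmp/note"] = "only in container a"
status = container_a.start()
```

One image can produce any number of containers, and each of them starts from the same snapshot.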
The concepts of namespacing and control groups are specific to the Linux operating system. Thus, when one installs and runs Docker on Windows or macOS, one gets a Linux virtual machine running on the computer. Containers run inside this virtual machine, and its Linux kernel is what isolates and limits their resources. To verify that a Linux virtual machine is running, one can type the following in the terminal:
docker version
One can see that, in the Server section of the output, the value against the key OS/Arch starts with linux (for example linux/amd64), irrespective of the host operating system.
Summary
This post covers the basics of Docker, images and containers. We went through why one would use a Docker container, how an operating system manages the relationship between applications and hardware, the need for dependency management, the concepts of namespacing and control groups, how a container solves the overall problem, and finally what exactly a Docker image and a Docker container are.
Special thanks to Stephen Grider and his Udemy course on Docker and Kubernetes, which helped me clear up these concepts.