I’m looking for a job right now, so I’ve been talking to a lot of people in software engineering roles. I like to ask about projects they’re working on now and one thing that keeps coming up in these conversations is Docker.
I decided to try to learn how to use Docker. I installed it and started working through the tutorial, copying the commands and doing a lot of different things I did not totally understand.
Somewhere around the middle of this tutorial, I realized that, because I’ve never worked on a large-scale application, learning Docker (for me) means also learning about the alternative (i.e., virtual machines) and how to deploy big web applications in general.
I am not going to get into actual code in this post. I’m just going to discuss the components of the Docker ecosystem at a high level. If you’re trying to learn Docker, check out their free tutorials.
What is a container and why are they important?
Large applications have many pieces and parts with their own concerns. In order to host these pieces on the same machine, engineers have to devise ways to separate these pieces into their own controlled environments.
One way to do this is to use virtual machines, emulations of computer environments managed by a hypervisor.
Each virtual machine has its own guest operating system, which takes up a significant amount of disk space. This is where containers come in.
A container is an isolated environment within the host machine’s operating system. Docker allows you to set up a bunch of these containers on your host machine and share the host’s operating system kernel between them without polluting each container’s environment with unnecessary dependencies or configurations.
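As a concrete sketch of that isolation (assuming Docker is installed and running; the container name `web1` and the `nginx` image are just examples I picked):

```shell
# Start a detached container from the official nginx image.
docker run --detach --name web1 nginx

# The container shares the host's kernel, but has its own
# file system, process tree and network interface.
docker exec web1 ls /

# Clean up.
docker rm --force web1
```

The container sees only its own file system, even though there is no guest operating system underneath it.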
Containers trim the operating system fat created by virtual machines, and as a result, containers scale much more efficiently than virtual machines.
So, how do they work?
I’m still wrapping my brain around the mechanics of using containers, but I do understand how they work at a high level. Containers are instances of an image. An image is a read-only snapshot of a file system, plus some metadata about how to run it. Generally, that file system is built from the root directory of your application.
Multiple containers on your host machine can be instances of the same image, and Docker allows you to control what information, if any, is shared between these containers. In this way, images are similar to classes in object-oriented programming. Each instance of an image (container) can be completely isolated from every other instance if necessary.
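To sketch the class/instance analogy (container names here are hypothetical, and `nginx` is just a convenient public image):

```shell
# Two independent containers ("instances") from the same image ("class").
docker run --detach --name web1 nginx
docker run --detach --name web2 nginx

# A file created in one container does not appear in the other.
docker exec web1 touch /tmp/only-in-web1
docker exec web2 ls /tmp

# Clean up.
docker rm --force web1 web2
```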
Every time a container launches, Docker creates a fresh writable layer on top of the image’s file system. This means that data written inside a container is discarded when the container is removed. To persist data, Docker uses volumes.
There are a lot of different types of volumes, but the two main types are named volumes and bind mounts. The core difference between these two is that bind mounts allow you to designate where the data is stored — named volumes leave this to Docker to decide.
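A rough sketch of the difference (the volume name `app-data`, the container names, and the `./site` host directory are all examples of my own, not anything Docker requires):

```shell
# Named volume: Docker decides where on the host the data lives.
docker volume create app-data
docker run --detach --name db1 \
  --env POSTGRES_PASSWORD=example \
  --volume app-data:/var/lib/postgresql/data \
  postgres

# Bind mount: you choose the host path yourself.
docker run --detach --name web1 \
  --volume "$(pwd)/site":/usr/share/nginx/html \
  nginx
```

In both cases, the mounted data survives after the container is removed.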
Do containers have access to each other?
They do if you want them to. The Docker documentation recommends separating every process into its own container. For a typical (very) small application, the front end, back end and database would have their own separate containers.
For this application to be useful, the front end container will need to have access to the back end container. And the back end container will need to have access to the database container.
With Docker, you can create networks and wire containers into them so they can communicate with each other. You can set up multiple networks on your host machine, and a container can be connected to more than one network.
There are a lot of different kinds of networks. The default type in Docker is a bridge network, which allows standalone containers on the same host machine to communicate with each other.
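A minimal sketch of wiring two containers together (the network name `app-net` and container name `api` are examples; `curlimages/curl` is a small public image whose entrypoint is `curl`):

```shell
# Create a user-defined bridge network.
docker network create app-net

# Containers on the same network can reach each other by name.
docker run --detach --name api --network app-net nginx
docker run --rm --network app-net curlimages/curl http://api

# Clean up.
docker rm --force api
docker network rm app-net
```

On a user-defined bridge network, Docker provides DNS resolution between containers, which is why `http://api` works by name.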
I’m still learning the ins and outs of wiring up Docker and utilizing the different tools it provides. Next, I’m going to attempt to set up a basic application with a UI, API and database using Docker. I’ll talk about how that goes in a follow-up story.