There are tons of articles on the Internet about containers, VMs and Docker. While reading them, I found that each one mentioned one or more features of the technology that none of the others covered. So I decided to write a post that summarizes the articles I have read and the videos I have watched. This is part 1 of the blog post series that I’ll be writing. Besides these topics, I have been reading a lot about Kubernetes, Spark, and Hadoop. The next posts in this series will give a high-level overview of those topics.
If you are a programmer or techie, chances are you have at least heard of terms like virtual machines and containers. It’d be hard not to, with all the attention they’re getting these days. Even big players like Google, VMware and Amazon are building services to support them. So, let’s dive into these topics one by one …
Virtual Machines
Hardware Virtualization. A VM is essentially an emulation of a real computer that executes programs like a real computer. VMs run on top of a host machine using a hypervisor. They gained early popularity because they enabled higher levels of server utilization. That’s still true today.
As server processing power and memory capacity increased, applications running directly on a single standalone machine could no longer make full use of the available resources. That’s why VMs were born.
Multiple VMs can be created on a single hardware device. A hypervisor, which can be software, firmware or hardware, is responsible for creating these VMs. It sits between the hardware and the VMs and is necessary to virtualize the server.
Each VM has its own guest OS, so VMs with different operating systems can run on a single physical server. Each VM also has its own binaries, libraries and applications, and may occupy gigabytes of storage.
The main OS on which the VMs are created is known as the host OS, and the OS inside a VM is known as the guest OS. A VM doesn’t have direct access to the hardware, so its requests must go through the hypervisor and, where one is present, the host OS.
Since a VM has a virtual operating system of its own, the hypervisor plays an essential role in providing the VMs with a platform to manage and execute this guest OS. It allows the host computer to share its resources among the virtual machines that run as guests on top of it.
To read more about hypervisors, you can read my previous Article … LINK.
VM driven Workloads
The scalability of a VM server workload is achieved in much the same way as on bare metal, for example with a web server or a DB server. The programs responsible for delivering the service are distributed among multiple hosts, and load balancers are inserted in front of those hosts to spread traffic evenly among them.
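To make the load balancer idea concrete, here is a minimal sketch in Python of round-robin distribution, which is one simple way to spread requests evenly. The class name RoundRobinBalancer and the host names are made up for illustration; real load balancers also track host health and load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that hands each request to the next host in turn."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one host is required")
        # cycle() repeats the host list forever, in order.
        self._hosts = cycle(hosts)

    def route(self, request):
        """Return (host, request) for the host that should serve this request."""
        return next(self._hosts), request


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.route(f"req-{i}")[0] for i in range(6)]
print(assignments)
# ['web-1', 'web-2', 'web-3', 'web-1', 'web-2', 'web-3']
```

Each of the three hosts ends up with two of the six requests, which is the "equal" distribution described above.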
Disadvantages
- Each VM includes a separate OS image, which adds overhead in memory and storage.
- Increases complexity.
- Severely limits portability of apps between public clouds, private clouds and traditional data centers.
Containers – OS Virtualization
OS virtualization enables software to run reliably when moved from one server to another.
Containers provide a way to run isolated systems on a single host OS. They sit on top of a physical server and its host OS, e.g. Linux or Windows. Each container shares the host OS kernel, and usually the binaries and libraries too. Shared components are read-only. Hence, containers are lighter, only megabytes in size, and quicker to start.
Because they share a common OS, containers reduce management overhead, and they are portable. A container packages up just the user space, not the kernel or virtual hardware as a VM does.
The OS-level architecture is shared across containers. The only parts that are created from scratch are the bins and libs.
Container driven Workloads
The concept of containerization was originally developed not as an alternative to VMs but as a way to segregate namespaces in a Linux OS for security purposes.
The first Linux environments resembling modern container systems produced partitions within which applications could be executed without risk to the kernel. The kernel was still responsible for execution, though a layer of abstraction was inserted between the kernel and the workload.
Docker
Docker is an open source project based on Linux Containers. Docker enables you to separate your applications from your infrastructure, so you can deliver software quickly. With Docker, you can manage your infrastructure in the same ways you manage your applications. You can significantly reduce the delay between writing code and running it in production.
It uses Linux kernel features like namespaces and control groups to create containers on top of the OS. Docker is used to build a container image, which in turn is used to create a container. An orchestrator such as Kubernetes is responsible for managing multiple containers running in parallel.
Docker is easy to use. Its main mantra is “build once, run anywhere”. Anyone can take advantage of containers: you package an application on your laptop, and it can then run unmodified on any public cloud, private cloud or even bare metal.
Docker users benefit from the increasingly rich ecosystem of Docker Hub, which you can think of as an “App Store for Docker Images”. Docker Hub has tens of thousands of public images created by the community that are readily available to use. It’s incredibly easy to search for images that meet your needs, ready to pull and use with little-to-no modification.
It’s modular and scalable. An application or website can consist of many modules, and each module can be built in a different language, depending on the expertise of its developer. Because each module is packaged in its own container, the modules can still be integrated and deployed together even though they are not written in the same language. With Docker, it becomes easier to link these containers together to create your application, making it easy to scale or update.
Docker helps you manage the lifecycle of your containers:
- Develop your application and its supporting components using containers
- The container becomes the unit for distributing and testing your applications
- When you’re ready, deploy your application into the production environment, as a container.
To know more about the lifecycle of a Docker container, you can refer to link.
Situations where Docker is used:
- Your developers write code locally and share their work with their colleagues using Docker Containers
- They use Docker to push their applications into a test Environment and execute automated and manual tests
- When developers find bugs, they can fix them in the development environment and redeploy the application to the test environment for testing and validation
- When testing is complete, getting the fix to the customer is as simple as pushing the updated image to the production environment.
Components of Docker
Docker Engine
Docker Engine is the layer on which Docker runs. It is a lightweight runtime and set of tools that manages containers, images, builds and more. It is a client-server application with three major components:
- Docker Daemon (runs on the host computer),
- Docker Client (communicates with the Docker Daemon to execute commands), and
- REST API (for interacting with the Docker Daemon remotely).
The client that ships with Docker Engine is a command-line interface (CLI).
The Docker Client is the UI: you communicate with the Docker Client, which then communicates with the Docker Daemon. The client can run on the host machine as well, but it’s not required to.
The Docker Daemon is what actually executes the commands sent by the Docker Client, like building, running, and distributing your containers. The daemon runs on the host machine, but as a user, you never communicate with it directly.
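To give a feel for how a client talks to the daemon over the REST API, here is a rough sketch in Python that uses only the standard library. It assumes a Linux host where the daemon listens on its default Unix socket at /var/run/docker.sock and asks the Engine API for /version. The names UnixHTTPConnection and daemon_version are my own helpers for illustration, not part of Docker.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that goes over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        # The host name is only used for the Host header; no TCP connection is made.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        try:
            sock.connect(self.socket_path)
        except OSError:
            sock.close()
            raise
        self.sock = sock


def daemon_version(socket_path="/var/run/docker.sock"):
    """Ask the daemon for its version info and return the decoded JSON."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/version")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

Calling daemon_version() only works where a daemon is running and your user is allowed to read the socket; otherwise it raises an OSError. This is roughly the kind of request the Docker Client makes on your behalf whenever you type a command.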
A Dockerfile is a file where you write the instructions to build a Docker image. Once you’ve set up your Dockerfile, you can use the docker build command to build an image from it. Each instruction in the Dockerfile adds a new layer to the image, with each layer representing a portion of the image’s file system that either adds to or replaces the layer below it.
Docker Images are read-only templates that you build from the set of instructions written in your Dockerfile. Images define both what you want your packaged application and its dependencies to look like and what processes to run when it’s launched.
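To show what such a set of instructions looks like, here is a small Python sketch that holds a made-up Dockerfile in a string and splits it into its build steps. The Dockerfile contents (a tiny Python web app with app.py and requirements.txt) and the function parse_instructions are assumptions for illustration only; this is not how Docker itself parses the file.

```python
SAMPLE_DOCKERFILE = """\
# Hypothetical Dockerfile for a small Python web app
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
"""


def parse_instructions(dockerfile_text):
    """Return (INSTRUCTION, arguments) pairs, skipping comments and blank lines."""
    instructions = []
    pending = ""  # collects lines that end with a backslash continuation
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            pending += line[:-1].rstrip() + " "
            continue
        keyword, _, arguments = (pending + line).partition(" ")
        pending = ""
        instructions.append((keyword.upper(), arguments.strip()))
    return instructions


for number, (keyword, arguments) in enumerate(parse_instructions(SAMPLE_DOCKERFILE), start=1):
    print(f"step {number}: {keyword} {arguments}")
```

The sample has six instructions, so the loop prints six steps, starting with FROM (the base image) and ending with CMD (the process to run when the container starts).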
Docker Containers
Containers are lightweight because they don’t need the extra load of a hypervisor but run directly within the host machine’s kernel.
A Docker container wraps an application’s software into an invisible box with everything the application needs to run. That includes the OS user space (but not the kernel, which is shared with the host), application code, runtime, system tools and system libraries.
Docker containers are built from Docker images. Since images are read-only, Docker adds a read-write file system over the read-only file system of the image to create a container. Once you have successfully created a container, you can then run it in any environment without having to make changes.
Union File Systems
Docker uses union file systems to build up an image. In a union file system (UFS), the contents of directories that have the same path within the overlaid branches appear as a single merged directory, which avoids the need to create separate copies of each layer.
Instead, all containers can be given pointers to the same resource. When a layer needs to be modified, Docker creates a copy and modifies that local copy, leaving the original unchanged. That’s how the file system can appear writable without actually allowing writes to the image.
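The copy-on-write behaviour is easier to see with a toy model. The sketch below uses Python’s collections.ChainMap to stack a writable dictionary on top of shared read-only "layers". It is only an analogy for how lookups fall through the layers and writes stay local, not how Docker’s storage drivers are implemented, and the file names and the helper new_container are invented.

```python
from collections import ChainMap

# Shared image layers, lowest first (file path -> contents).
base_layer = {"/etc/os-release": "debian", "/app/config.ini": "debug=false"}
app_layer = {"/app/main.py": "print('hello')"}


def new_container(*image_layers):
    """Stack a fresh writable layer on top of the shared image layers."""
    writable = {}
    # ChainMap searches its maps left to right, so upper layers come first.
    return ChainMap(writable, *reversed(image_layers))


container_a = new_container(base_layer, app_layer)
container_b = new_container(base_layer, app_layer)

# Writes always land in the first (writable) map, so the image stays untouched.
container_a["/app/config.ini"] = "debug=true"

print(container_a["/app/config.ini"])  # debug=true
print(container_b["/app/config.ini"])  # debug=false
print(base_layer["/app/config.ini"])   # debug=false
```

Both containers point at the same two image layers, yet the change made in container_a is invisible to container_b and to the image itself.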
Layered systems offer two benefits:
- Duplication-free (layers are shared rather than copied, which makes instantiating Docker containers very fast and cheap),
- Layer segregation (making a change is faster, because when you change an image, Docker only propagates the updates to the layer that was changed)
Volumes
Volumes are the “data” part of a container, initialized when a container is created. Volumes allow you to persist and share a container’s data.
They are separate from the default union file system and exist as normal directories and files on the host file system. So, even if you destroy, update or rebuild your container, the data volume will remain untouched.
Docker Registries
A Docker registry stores Docker images. Docker Hub and Docker Cloud are public registries that anyone can use, and Docker is configured to look for images on Docker Hub by default. Docker Store allows you to buy and sell Docker images or distribute them for free.
Namespaces
Namespaces provide containers with their own view of the underlying Linux system, limiting what each container can see and access. Docker Engine uses namespaces such as the following on Linux:
- PID namespace: Process isolation (PID: Process ID).
- NET namespace: Managing network interfaces (NET: Networking).
- IPC namespace: Managing access to IPC resources (IPC: Inter-Process Communication).
- MNT namespace: Managing filesystem mount points (MNT: Mount).
- UTS namespace: Isolating kernel and version identifiers such as the hostname and domain name (UTS: Unix Timesharing System).
- USER namespace: Namespace to Isolate users within each container.
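On a Linux machine you can see which namespaces a process belongs to by looking under /proc. Here is a small sketch in Python that reads them for the current process. The function name list_namespaces is my own, and on systems without /proc (macOS or Windows, for example) it simply returns an empty dictionary.

```python
import os


def list_namespaces(pid="self"):
    """Map namespace name -> identifier for a process, read from /proc/<pid>/ns."""
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces  # not Linux, no such process, or /proc is unavailable
    for name in names:
        try:
            # Each entry is a symlink whose target names the namespace type and its ID.
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return namespaces


for name, identifier in list_namespaces().items():
    print(f"{name:20} {identifier}")
```

Two processes that print the same identifier for an entry share that namespace. If you run this once on the host and once inside a container, the identifiers for namespaces such as pid, net and mnt will normally differ.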
Kubernetes
Orchestrators such as Google’s Kubernetes or Mesosphere’s Marathon can determine, based on traffic patterns, when the number of containers needs to scale out. They can replicate container images automatically and later remove them from the system.
A Kubernetes system runs multiple copies of a container side by side. If one of the containers fails, it can be removed and replaced without noticeable impact on the service. If an experiment fails, the newer versions can be rolled back and replaced.
Kubernetes is a broad topic and can’t be covered in a single post. I will cover it in the upcoming posts.
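The "remove and replace" behaviour boils down to comparing the state you want with the state you have. Below is a toy Python sketch of that idea. The function reconcile, the dictionary fields and the image name are all invented for illustration; this is not the Kubernetes API.

```python
def reconcile(containers, desired_replicas, image):
    """Drop failed containers, then add or trim until the count matches the target."""
    healthy = [c for c in containers if c["healthy"]]
    next_id = max((c["id"] for c in containers), default=0) + 1
    while len(healthy) < desired_replicas:
        # Start a replacement (or an extra copy when scaling out).
        healthy.append({"id": next_id, "image": image, "healthy": True})
        next_id += 1
    # Scaling in: keep only as many containers as requested.
    return healthy[:desired_replicas]


running = [
    {"id": 1, "image": "shop:v2", "healthy": True},
    {"id": 2, "image": "shop:v2", "healthy": False},  # this one crashed
    {"id": 3, "image": "shop:v2", "healthy": True},
]

running = reconcile(running, desired_replicas=3, image="shop:v2")
print([c["id"] for c in running])  # [1, 3, 4]

running = reconcile(running, desired_replicas=5, image="shop:v2")
print([c["id"] for c in running])  # [1, 3, 4, 5, 6]
```

The failed container 2 is dropped and container 4 takes its place; raising the target to five then adds two more copies. A real orchestrator runs a check like this continuously rather than once.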
Doubts & Takeaways
Containers and VMs are similar in their goals: to isolate an application and its dependencies into a self-contained unit that can run anywhere.
VMs and containers differ in several ways, but the primary difference is their architectural approach. Containers virtualize the OS so that multiple workloads can run on a single OS instance, whereas VMs virtualize the hardware to run multiple OS instances.
Can VMs and Docker containers coexist?
Yes, they can. At the most basic level, any kind of VM is a great place for a Docker host to run, and all VMs serve equally well as Docker hosts. Depending on what you need to do, a VM might be the best place to land those containers. But the great thing about Docker is that it doesn’t matter where you run containers; it’s totally up to you.
While containers allow you to break your application into more functional, discrete parts to create a separation of concerns, this also means there is a growing number of parts to manage, which can get unwieldy. Security has also been an area of concern with Docker containers: since containers share the same kernel, the barrier between them is thinner.
Can a Docker container-based service interact with a VM based service?
Yes, it can. Running your app in a set of Docker containers doesn’t preclude it from talking to the services running in a VM.
I hope you now have a good grasp of the topics we discussed in this post. Don’t forget to hit like and pass it on to people who need help understanding these complicated topics. Stay tuned for more such posts 😊