Kubernetes is an orchestration tool for managing distributed, service-based applications across a cluster of nodes. Kubernetes itself follows a client-server architecture, with a master node composed of an etcd cluster, kube-apiserver, kube-controller-manager, cloud-controller-manager and kube-scheduler. Kubernetes is an open source container management platform designed to run enterprise-class, cloud-enabled and web-scale IT workloads. It is built upon the foundation laid by Google in running containerized applications.
Though their popularity is a mostly recent trend, the concept of containers has existed for over a decade. Mainstream Unix-based operating systems (OS), such as Solaris, FreeBSD and Linux, had built-in support for containers, but it was Docker that truly democratized containers by making them manageable and accessible to both the development and IT operations teams. Docker has demonstrated that containerization can drive the scalability and portability of applications. Developers and IT operations are turning to containers for packaging code and dependencies written in a variety of languages. Containers are also playing a crucial role in DevOps processes. They have become an integral part of build automation and continuous integration and continuous deployment pipelines.
The interest in containers led to the formation of the Open Container Initiative (OCI) to define the standards of container runtime and image formats. The industry is also witnessing various implementations of containers, such as LXD by Canonical, rkt by CoreOS, Windows Containers by Microsoft, CRI-O (being reviewed through the Kubernetes Incubator), and vSphere Integrated Containers by VMware.
While core implementations center around the life cycle of individual containers, production applications typically deal with workloads that have dozens of containers running across multiple hosts. The complex architecture dealing with multiple hosts and containers running in production environments demands a new set of management tools. Some of the popular solutions include Docker Datacenter, Kubernetes, and Mesosphere DC/OS.
Container orchestration has influenced traditional Platform as a Service (PaaS) architecture by providing an open and efficient model for packaging, deployment, isolation, service discovery, scaling and rolling upgrades. Most mainstream PaaS solutions have embraced containers, and there are new PaaS implementations that are built on top of container orchestration and management platforms. Customers have the choice of either deploying core container orchestration tools that are more aligned with IT operations, or a PaaS implementation that targets developers. The key takeaway is that container orchestration has impacted every aspect of modern software development and deployment. Kubernetes will play a crucial role in driving the adoption of containers in both enterprises and emerging startups.
Like most distributed computing platforms, a Kubernetes cluster consists of at least one master and multiple compute nodes. The master is responsible for exposing the application program interface (API), scheduling the deployments and managing the overall cluster. Each node runs a container runtime, such as Docker or rkt, along with an agent that communicates with the master. The node also runs additional components for logging, monitoring, service discovery and optional add-ons. Nodes are the workhorses of a Kubernetes cluster. They expose compute, networking and storage resources to applications. Nodes can be virtual machines (VMs) running in a cloud or bare metal servers running within the data center.
A pod is a collection of one or more containers. The pod serves as Kubernetes’ core unit of management. Pods act as the logical boundary for containers sharing the same context and resources. The grouping mechanism of pods makes up for the differences between containerization and virtualization by making it possible to run multiple dependent processes together. At runtime, pods can be scaled by creating replica sets, which ensure that the deployment always runs the desired number of pods.
Replica sets deliver the required scale and availability by maintaining a pre-defined set of pods at all times. A single pod or a replica set can be exposed to the internal or external consumers via services. Services enable the discovery of pods by associating a set of pods to a specific criterion. Pods are associated to services through key-value pairs called labels and selectors. Any new pod with labels that match the selector will automatically be discovered by the service. This architecture provides a flexible, loosely-coupled mechanism for service discovery.
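The label-and-selector matching described above can be sketched in a few lines of Python. This is a minimal illustration of equality-based matching only, not the Kubernetes implementation; the function name `selector_matches` and the sample pod names and labels are hypothetical.

```python
def selector_matches(selector, labels):
    """Return True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical pods, each carrying a set of labels.
pods = {
    "web-1": {"app": "web", "tier": "frontend"},
    "web-2": {"app": "web", "tier": "frontend"},
    "db-1": {"app": "db", "tier": "backend"},
}

# A service selects pods by label; any pod whose labels match is discovered.
service_selector = {"app": "web"}
endpoints = sorted(
    name for name, labels in pods.items()
    if selector_matches(service_selector, labels)
)
print(endpoints)
```

Adding a new entry to `pods` with the label `app: web` would make it appear in `endpoints` without any change to the service definition, which is the loose coupling the text refers to.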
The definitions of Kubernetes objects, such as pods, replica sets and services, are submitted to the master. Based on the defined requirements and availability of resources, the master schedules the pod on a specific node. The node pulls the images from the container image registry and coordinates with the local container runtime to launch the container.
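To make the scheduling step more concrete, here is a deliberately simplified Python sketch of a filter-then-score placement decision. It is not the real scheduler algorithm: the function `schedule`, the resource fields and the scoring rule (prefer the node with the most free CPU) are all assumptions chosen for illustration.

```python
def schedule(pod_request, nodes):
    """Pick a node for a pod: keep nodes that fit, then prefer the most free CPU.

    Returns the chosen node name, or None if no node has enough resources.
    """
    feasible = [
        name for name, free in nodes.items()
        if free["cpu"] >= pod_request["cpu"]
        and free["memory"] >= pod_request["memory"]
    ]
    if not feasible:
        return None
    # Hypothetical scoring rule: most free CPU wins; ties go to the first name.
    return max(sorted(feasible), key=lambda name: nodes[name]["cpu"])


# Hypothetical free resources per node (CPU cores, memory in MiB).
nodes = {
    "node-a": {"cpu": 2.0, "memory": 4096},
    "node-b": {"cpu": 0.5, "memory": 8192},
    "node-c": {"cpu": 4.0, "memory": 1024},
}

print(schedule({"cpu": 1.0, "memory": 2048}, nodes))
print(schedule({"cpu": 8.0, "memory": 1024}, nodes))
```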
etcd is an open source, distributed key-value database from CoreOS, which acts as the single source of truth for all components of the Kubernetes cluster. The master queries etcd to retrieve various parameters of the state of the nodes, pods and containers.
This architecture of Kubernetes makes it modular and scalable by creating an abstraction between the applications and the underlying infrastructure.
Key Design Principles
Kubernetes is designed on the principles of scalability, availability, security and portability. It optimizes the cost of infrastructure by efficiently distributing the workload across available resources. This section will highlight some of the key attributes of Kubernetes.
Applications deployed in Kubernetes are packaged as microservices. These microservices are composed of multiple containers grouped as pods. Each container is designed to perform only one task. Pods can be composed of stateless containers or stateful containers. Stateless pods can easily be scaled on-demand or through dynamic auto-scaling. Kubernetes 1.4 supports horizontal pod auto-scaling, which automatically scales the number of pods in a replication controller based on CPU utilization. Future versions will support custom metrics for defining the auto-scale rules and thresholds.
Hosted Kubernetes running on Google Cloud also supports cluster auto-scaling. When pods are scaled across all available nodes, Kubernetes coordinates with the underlying infrastructure to add additional nodes to the cluster.
An application that is architected on microservices, packaged as containers and deployed as pods can take advantage of the extreme scaling capabilities of Kubernetes. Though this is mostly applicable to stateless pods, Kubernetes is adding support for persistent workloads, such as NoSQL databases and relational database management systems (RDBMS). This will enable scaling stateful applications such as Cassandra clusters and MongoDB replica sets. This capability will bring elastic, stateless web tiers and persistent, stateful databases together to run on the same infrastructure.
Contemporary workloads demand availability at both the infrastructure and application levels. In clusters at scale, everything is prone to failure, which makes high availability for production workloads strictly necessary. While most container orchestration engines and PaaS offerings deliver application availability, Kubernetes is designed to tackle the availability of both infrastructure and applications.
On the application front, Kubernetes ensures high availability by means of replica sets, replication controllers and pet sets. Operators can declare the minimum number of pods that need to run at any given point of time. If a container or pod crashes due to an error, the declarative policy can bring back the deployment to the desired configuration. Stateful workloads can be configured for high availability through pet sets.
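The declarative behaviour described above amounts to a reconciliation loop: compare the desired number of pods with what is running and act on the difference. The following Python sketch shows one step of such a loop under simplifying assumptions; the function `reconcile` and the choice of which surplus pods to delete are hypothetical.

```python
def reconcile(desired_replicas, running_pods):
    """Return (pods_to_create, pods_to_delete) to move toward the desired count."""
    missing = desired_replicas - len(running_pods)
    if missing > 0:
        # Too few pods, for example after a crash: create the missing ones.
        return missing, []
    # Enough or too many pods: mark any surplus for deletion (here, the last ones).
    return 0, running_pods[desired_replicas:]


print(reconcile(3, ["pod-a"]))
print(reconcile(2, ["pod-a", "pod-b", "pod-c"]))
print(reconcile(2, ["pod-a", "pod-b"]))
```

Running this step repeatedly is what lets the cluster drift back to the declared state after a failure without operator intervention.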
For infrastructure availability, Kubernetes has support for a wide range of storage backends, coming from distributed file systems such as network file system (NFS) and GlusterFS, block storage devices such as Amazon Elastic Block Store (EBS) and Google Compute Engine persistent disk, and specialized container storage plugins such as Flocker. Adding a reliable, available storage layer to Kubernetes ensures high availability of stateful workloads.
Each component of a Kubernetes cluster (etcd, API server, nodes) can be configured for high availability. Applications can take advantage of load balancers and health checks to ensure availability.
Security in Kubernetes is configured at multiple levels. The API endpoints are secured through transport layer security (TLS), which encrypts traffic to the API server and allows clients to be authenticated, for example with client certificates. Kubernetes clusters have two categories of users: service accounts managed directly by Kubernetes, and normal users assumed to be managed by an independent service. Service accounts managed by the Kubernetes API are created automatically by the API server. Every operation that manages a process running within the cluster must be initiated by an authenticated user; this mechanism ensures the security of the cluster.
Applications deployed within a Kubernetes cluster can leverage the concept of secrets to securely access data. A secret is a Kubernetes object that contains a small amount of sensitive data, such as a password, token or key, which reduces the risk of accidental exposure of data. Usernames and passwords are encoded in base64 before storing them within a Kubernetes cluster. Pods can access the secret at runtime through the mounted volumes or environment variables. The caveat is that the secret is available to all the users of the same cluster namespace.
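It is worth stressing that base64 is an encoding, not encryption: anyone who can read the stored value can recover the original. The short Python sketch below illustrates this with the standard library; the sample password is a made-up value.

```python
import base64

# Hypothetical secret value, used only for illustration.
password = "s3cr3t-value"

# Encode the value the way it would appear in a secret's data field.
encoded = base64.b64encode(password.encode("utf-8")).decode("ascii")

# Decoding needs no key, so the encoded form offers no confidentiality.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded == password)
```

This is why access to secrets should be restricted in addition to relying on the encoding.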
To allow or restrict network traffic to pods, network policies can be applied to the deployment. A network policy in Kubernetes is a specification of how selections of pods are allowed to communicate with each other and with other network endpoints. This is useful to obscure pods in a multi-tier deployment that shouldn’t be exposed to other applications.
Kubernetes is designed to offer freedom of choice when choosing operating systems, container runtimes, processor architectures, cloud platforms and PaaS. A Kubernetes cluster can be configured on mainstream Linux distributions, including CentOS, CoreOS, Debian, Fedora, Red Hat Enterprise Linux and Ubuntu. It can be deployed to run on local development machines; cloud platforms such as AWS, Azure and Google Cloud; virtualization environments based on KVM, vSphere and libvirt; and bare metal. Users can launch containers that run on Docker or rkt runtimes, and new container runtimes can be added in the future.
It is possible to mix and match clusters running across multiple cloud providers and on-premises. This brings hybrid cloud capabilities to containerized workloads. Customers can seamlessly move workloads from one deployment target to another.
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.
It groups containers that make up an application into logical units for easy management and discovery. Kubernetes builds upon 15 years of experience of running production workloads at Google, combined with best-of-breed ideas and practices from the community.
Kubernetes is a container orchestration tool that streamlines the management of containers and, at the same time, makes it more efficient. The main goal of Kubernetes, like other orchestration systems, is to simplify the work of technical teams by automating many application and service deployment processes that were previously carried out manually.
Features of Kubernetes:
- Automates various manual processes: for instance, Kubernetes decides which server will host a container, how it will be launched, etc.
- Interacts with several groups of containers: Kubernetes is able to manage multiple clusters at the same time
- Provides additional services: as well as managing containers, Kubernetes offers security, networking and storage services
- Self-monitoring: Kubernetes constantly checks the health of nodes and containers
- Horizontal scaling: Kubernetes allows you to scale resources not only vertically but also horizontally, easily and quickly
- Storage orchestration: Kubernetes mounts and adds the storage system of your choice to run apps
- Automates rollouts and rollbacks: if something goes wrong after a change to your application, Kubernetes will roll back for you
- Container balancing: Kubernetes always knows where to place containers, by calculating the “best location” for them
- Run everywhere: Kubernetes is an open source tool and gives you the freedom to take advantage of on-premises, hybrid, or public cloud infrastructure, letting you move workloads anywhere you want
Kubernetes Key Features
- Pod — collection of containers
A pod is a deployment unit in Kubernetes (K8s) with a single IP address. Inside it, the Pause container handles networking by holding the network namespace, ports and IP address, which in turn are used by all containers within the pod.
- Replication Controller
A replication controller ensures that the desired number of pods are up and running at any given time. Pod templates are used to define the container image identifiers, ports, and labels. Together with liveness probes, which restart unhealthy containers, it auto-heals the application by replacing failed pods and maintaining the number of pods as per the desired state. It can also be manually controlled by changing the replica count using kubectl.
- Storage Management
Pods are ephemeral in nature: any information stored in a pod or container will be lost once the pod is killed or rescheduled. To avoid data loss, a persistent storage system is needed, such as Amazon Elastic Block Store (EBS) or Google Compute Engine Persistent Disk (GCE PD), or a distributed file system, such as the Network File System (NFS) or GlusterFS.
- Resource Monitoring
Monitoring is one of the key aspects of running infrastructure successfully; it forms the base of the reliability hierarchy. Heapster is an add-on used to collect metrics from the kubelet, which is integrated with cAdvisor. cAdvisor collects metrics related to CPU, memory, I/O, and network stats of the running containers. Data collected by Heapster is stored in InfluxDB and is displayed in the UI using Grafana. Other sinks, such as Kafka or Elasticsearch, can also be used for storing the data and displaying it in the UI.
- Health Checking
Health checking in Kubernetes is done by the kubelet agent. It is divided into two kinds of probes: liveness and readiness.
There are mainly three types of handlers:
ExecAction: Shell command is executed, and if the resulting exit code is 0, it means that the instance is healthy. Under any other circumstances, the instance is not healthy.
TCPSocketAction: Kubelet will try to connect to a specified port, and if it establishes a connection to the given socket, the diagnostic is successful.
HTTPGetAction: Based on the HTTP endpoint that the application exposes, kubelet performs an HTTP GET request against the container IP address on a specified path, and if it returns a response code of at least 200 and less than 400, the diagnostic is successful.
Each probe usually has three results:
Success: The Container has passed the diagnostic.
Failure: The Container has failed the diagnostic.
Unknown: The diagnostic itself could not be completed, so no action should be taken.
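The handler rules and the three results can be summarised in a small Python sketch. It is only a model of the decision logic, not kubelet code: the names `ProbeResult`, `exec_result`, `http_result` and `evaluate` are hypothetical, the HTTP rule assumes success means a status code of at least 200 and below 400 (as the Kubernetes documentation defines it), and mapping any exception to Unknown is a simplification.

```python
from enum import Enum


class ProbeResult(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    UNKNOWN = "Unknown"


def exec_result(exit_code):
    """Exec handler: exit code 0 means healthy, anything else unhealthy."""
    return ProbeResult.SUCCESS if exit_code == 0 else ProbeResult.FAILURE


def http_result(status_code):
    """HTTP GET handler: status codes from 200 up to 399 count as success."""
    return ProbeResult.SUCCESS if 200 <= status_code < 400 else ProbeResult.FAILURE


def evaluate(check):
    """Run a check callable; if the check itself breaks, report Unknown."""
    try:
        return check()
    except Exception:
        return ProbeResult.UNKNOWN


def broken_check():
    raise RuntimeError("probe could not run")


print(evaluate(lambda: exec_result(0)))
print(evaluate(lambda: http_result(503)))
print(evaluate(broken_check))
```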
- Horizontal Auto Scaling
Autoscaling adjusts computational resources based on the load. Kubernetes scales pods automatically using a HorizontalPodAutoscaler object, which gets metrics data from Heapster and increases or decreases the number of pods accordingly. For example, if auto-scaling is based on memory utilization, the controller observes memory usage in the pods and scales the replica count based on it.
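The scaling decision can be illustrated with the ratio formula given in the Kubernetes documentation for the horizontal pod autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The Python sketch below applies that formula; the function name, the replica bounds and the sample numbers are assumptions, and details such as the tolerance band used by the real controller are omitted.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count by the ratio of observed to target metric."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (hypothetical defaults).
    return max(min_replicas, min(max_replicas, raw))


# 4 pods at 90% average utilization with a 60% target: scale up.
print(desired_replicas(4, 90, 60))
# 4 pods at 30% average utilization with a 60% target: scale down.
print(desired_replicas(4, 30, 60))
```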
- Service Discovery
Kubernetes pods are ephemeral, and the Replication Controller creates them dynamically on any node, so it is a challenge to discover services in the cluster. A service needs to discover an IP address and ports dynamically related to each other to communicate within a cluster.
There are two primary ways of finding a service: environment variables and DNS.
DNS-based service discovery is preferable, and it is available as a cluster add-on. It keeps track of new services in the cluster and creates a set of DNS records for each.
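Both discovery mechanisms follow predictable naming conventions, which the Python sketch below reproduces. The helper names are hypothetical; the conventions themselves (upper-cased service name with dashes turned into underscores plus `_SERVICE_HOST` and `_SERVICE_PORT`, and DNS names of the form `<service>.<namespace>.svc.<cluster domain>`) follow the Kubernetes documentation, with `cluster.local` assumed as the cluster domain.

```python
def service_env_vars(service_name, cluster_ip, port):
    """Build the host and port environment variables injected for a service."""
    prefix = service_name.upper().replace("-", "_")
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
    }


def service_dns_name(service_name, namespace="default",
                     cluster_domain="cluster.local"):
    """Build the fully qualified DNS name of a service."""
    return f"{service_name}.{namespace}.svc.{cluster_domain}"


# Hypothetical service and address.
print(service_env_vars("redis-master", "10.0.0.11", 6379))
print(service_dns_name("redis-master"))
```

Environment variables are only set for services that existed when the pod started, which is one reason DNS is usually the more convenient option.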
To manage a cluster fully, a network has to be setup properly, and there are three distinct networking problems to solve:
1. Container-to-Container communications: pods solve this problem through localhost communications and by using the Pause container network namespace
2. Pod-to-Pod communications: this problem is solved by the software defined networking as shown in the Architecture diagram above
3. External-to-Pod communications: this is covered by services.
Kubernetes provides a wide range of networking options. Furthermore, there is now support for Container Network Interface (CNI) plugins, a common plugin architecture for containers. CNI is currently supported by several orchestration tools such as Kubernetes, Mesos, and Cloud Foundry.
There are various overlay plugins, some of which are discussed below:
- Flannel is a very simple etcd-backed overlay network that comes from CoreOS. It creates another virtual, routable IP-per-pod network, which runs above the underlay network; hence it is called an overlay network. Each pod is assigned one IP address in this overlay network, and pods communicate with each other using their IP addresses directly.
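The IP-per-pod idea can be sketched with Python's `ipaddress` module: the overlay range is carved into one subnet per node, and each pod receives an address from its node's subnet. The overlay range `10.244.0.0/16`, the `/24` per-node size and the helper names are assumptions for illustration, not a description of Flannel's actual allocation code.

```python
import ipaddress


def allocate_node_subnets(overlay_cidr, node_names, node_prefix=24):
    """Give each node its own subnet carved out of the overlay network."""
    overlay = ipaddress.ip_network(overlay_cidr)
    subnets = overlay.subnets(new_prefix=node_prefix)
    return {name: next(subnets) for name in node_names}


def pod_ip(node_subnet, index):
    """Return the index-th usable address (0-based) in a node's subnet."""
    return list(node_subnet.hosts())[index]


subnets = allocate_node_subnets("10.244.0.0/16", ["node-a", "node-b"])
print(subnets["node-a"], subnets["node-b"])

# Pods on different nodes get distinct, directly routable overlay addresses.
print(pod_ip(subnets["node-a"], 0), pod_ip(subnets["node-b"], 0))
```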
- Weave provides an overlay network that is compatible with Kubernetes through a CNI plugin.
Kubernetes services are abstractions that route traffic to a set of pods to provide a microservice. Kube-proxy runs on each node and manages services by setting up a set of iptables rules.
There are three modes of setting up services:
1. ClusterIP (only provides access internally)
2. NodePort (exposes the service on a port of every node, which requires opening that port in the firewall; not recommended for public access)
3. LoadBalancer (provisioned by public cloud providers like AWS or GKE)
- ConfigMap and Secret
The twelve-factor app methodology suggests keeping configuration separate from code, so that only the configuration changes between environments and the container stays the same.
ConfigMap makes it possible to inject a configuration based on an environment while keeping the container image identical across multiple environments. The values can be injected by mounting volumes or through environment variables, and a ConfigMap stores them in key/value format.
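From the application's point of view, configuration injected as environment variables is read like any other environment setting. The Python sketch below shows the same code producing different settings in different environments; the variable names `DATABASE_URL` and `LOG_LEVEL` and their defaults are hypothetical.

```python
import os


def load_config(environ=None):
    """Read settings from the environment, falling back to development defaults."""
    env = os.environ if environ is None else environ
    return {
        "database_url": env.get("DATABASE_URL", "postgres://localhost:5432/dev"),
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }


# The same image can run unchanged; only the injected values differ.
staging = load_config({"DATABASE_URL": "postgres://staging-db:5432/app"})
production = load_config({
    "DATABASE_URL": "postgres://prod-db:5432/app",
    "LOG_LEVEL": "WARNING",
})
print(staging)
print(production)
```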
Secrets are used to store sensitive data such as passwords, OAuth tokens, etc.
- Rolling Deployment and Rollback
A Deployment object holds one or more replica sets to support the rollback mechanism. In other words, it creates a new replica set every time the deployment configuration is changed and keeps the previous version in order to have the option of rollback. Once a rollout has completed, only one replica set is in the active state at any given time.
For a rolling deployment, the required strategy type is “RollingUpdate”. The “minReadySeconds” setting specifies how long a newly created pod must be ready before it is considered available to serve traffic. If it is left at its default, the application may be unavailable in the case that the new pods are not yet ready. A rolling update can be triggered by running the command below:
$ kubectl set image deployment <deploy> <container>=<image> --record
Alternatively, replace the content in the deployment YAML file and run the command below:
$ kubectl replace -f <yaml> --record
If the new version is not behaving as expected, then it is possible to rollback to the previous version by running the below command:
$ kubectl rollout undo deployment <deployment>
If the desired version is any revision other than the previous one, then run:
$ kubectl rollout undo deployment <deployment> --to-revision=<revision>
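The revision bookkeeping behind these commands can be modelled in a few lines of Python. This is a simplified, hypothetical model (the class `DeploymentHistory` is not part of any Kubernetes client): each change records a new revision, and rolling back re-applies an older revision as the newest one, which mirrors how a rolled-back revision is renumbered in the rollout history.

```python
class DeploymentHistory:
    """Toy model of a deployment's revision history."""

    def __init__(self, image):
        self.revisions = {1: image}
        self.current = 1

    @property
    def image(self):
        return self.revisions[self.current]

    def set_image(self, image):
        """Record a new revision for the changed configuration."""
        self.current = max(self.revisions) + 1
        self.revisions[self.current] = image

    def undo(self, to_revision=None):
        """Roll back to the previous revision, or to a specific one."""
        if to_revision is None:
            earlier = [rev for rev in self.revisions if rev < self.current]
            if not earlier:
                raise ValueError("no earlier revision to roll back to")
            to_revision = max(earlier)
        if to_revision not in self.revisions:
            raise ValueError(f"unknown revision {to_revision}")
        # The old revision is re-applied and becomes the newest revision.
        self.set_image(self.revisions.pop(to_revision))


history = DeploymentHistory("app:v1")
history.set_image("app:v2")
history.set_image("app:v3")
history.undo()                # back to the previous revision
print(history.image)
history.undo(to_revision=1)   # back to a specific revision
print(history.image)
```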
To monitor application performance, we should check logs, and multiple logs are generated in each pod. To search logs in the Dashboard UI, there has to be some mechanism that collects and aggregates them into one log viewer. For example, Fluentd, an open source tool and part of the Cloud Native Computing Foundation (CNCF), combines well with Elasticsearch and Kibana.
Major Features Of Kubernetes
1. Automatic binpacking
Automatically places containers based on their resource requirements and other constraints, while not sacrificing availability. Mix critical and best-effort workloads in order to drive up utilization and save even more resources.
2. Self-healing
Restarts containers that fail, replaces and reschedules containers when nodes die, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready to serve.
3. Horizontal scaling
Scale your application up and down with a simple command, with a UI, or automatically based on CPU usage.
4. Service discovery and load balancing
No need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives containers their own IP addresses and a single DNS name for a set of containers, and can load-balance across them.
5. Automated rollouts and rollbacks
Kubernetes progressively rolls out changes to your application or its configuration, while monitoring application health to ensure it doesn’t kill all your instances at the same time. If something goes wrong, Kubernetes will roll back the change for you. Take advantage of a growing ecosystem of deployment solutions.
6. Secret and configuration management
Deploy and update secrets and application configuration without rebuilding your image and without exposing secrets in your stack configuration.
7. Storage orchestration
Automatically mount the storage system of your choice, whether from local storage, a public cloud provider such as GCP or AWS, or a network storage system such as NFS, iSCSI, Gluster, Ceph, Cinder, or Flocker.
8. Batch execution
In addition to services, Kubernetes can manage your batch and CI workloads, replacing containers that fail, if desired.
Installation Instructions For Windows
A) Click the Windows “Start” button, select “All Programs” and then point to Kubernetes.
B) RDP connection: to connect to the operating system,
1) Connect to the virtual machine using the following RDP credentials:
- Hostname: public DNS name or IP address of the machine
- Port: 3389
- Username: Administrator
- Password: Please click here to know how to get the password.
C) Other information:
1. Default installation path: the root folder “C:\Kubernetes”
- Windows machines: RDP port 3389
- HTTP: 80
- HTTPS: 443
Configure custom inbound and outbound rules using this link.