The Unrecognized Platform
November 30, 2018
openshift kubernetes paas containers

It has been almost 3 years since I started working with OpenShift/k8s, and a bit longer with what we now call Linux Containers. During this period I’ve helped Customers from many different industries move their application workloads to OpenShift. I had the opportunity to work on both greenfield and brownfield projects. I migrated or created simple microservices to run in OpenShift and take advantage of the features the platform brings to accelerate the development process. I helped break old monoliths into small microservices and build a CI/CD ecosystem around them. These are the typical use cases deployed on OpenShift/k8s. I recently had the opportunity to work on different use cases where we successfully demonstrated that the platform's limits and possibilities have still not been fully explored.
This blog post tries to demonstrate the existing and upcoming OpenShift features that will expand the current capabilities of the Platform. We have already implemented some of these new capabilities in Customer environments where they are key for business development, and the feedback has been really positive. Customers are already thinking of using OpenShift/k8s as their base platform to run all their workloads, something similar to what happened not many years ago with RHEL. This is a massive opportunity for both Red Hat and the Community to continue leading the direction of Enterprise IT.
From the initial days of OpenShift/k8s, the main idea was to orchestrate ‘stateless’ applications. These applications, packaged as container images, are immutable, as this is the nature of containers. They did not keep any state, as the industry thought at the time that this was not necessary. As the Platform evolves, becomes more popular and gets adopted globally, companies are looking to move new kinds of application workloads: applications that record state and recover it after a restart, or applications that are tied to a particular identity that must be preserved across restarts. This resulted in a new Kubernetes object, initially called PetSets and eventually renamed to StatefulSets, which provides the ability to run clustered applications or applications that require stable network identities.
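As a minimal sketch of what StatefulSets provide, the manifest below pairs a headless Service with a StatefulSet: each replica gets a stable ordinal name (web-0, web-1, ...) resolvable through the Service, plus its own PersistentVolumeClaim created from the volumeClaimTemplate. The image name and storage size are illustrative placeholders, not from any specific Customer project.

```yaml
# Headless Service: provides the stable network identity for each replica.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  clusterIP: None
  selector:
    app: web
  ports:
  - port: 80
---
# StatefulSet: ordered, named replicas, each with its own persistent volume.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: registry.example.com/my-stateful-app:latest   # illustrative image
        ports:
        - containerPort: 80
        volumeMounts:
        - name: data
          mountPath: /var/lib/app
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```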
As the popularity of the k8s platform increased, new capabilities were required to support new types of workloads. Not only the community but also big software and hardware vendors started working on solutions to cover these new requirements. This resulted in a variety of new projects, initiatives and capabilities being added to the platform.
Using GPUs in OpenShift/k8s is one of the new features we are working on at the moment. This is implemented using the device plugins feature, which provides a way for vendors to advertise their resources to the Kubelet and monitor them without writing custom Kubernetes code. The idea is to extend the existing k8s API so that all these new capabilities can be added without modifying the Kubernetes project code. Read the Device Manager Proposal for further understanding and direction from the community on this. This feature allowed us to move complex workloads for a Customer with many different server racks isolated from each other, running Linux-based applications that use NVIDIA graphics cards for calculation and rendering. The Device Manager API was made generally available in OpenShift 3.10, so running applications in OpenShift that require access to GPU cards provided by device plugins is fully supported by Red Hat.
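From the application side, a GPU exposed by a device plugin is consumed like any other extended resource. The sketch below assumes the NVIDIA device plugin is deployed and advertising the nvidia.com/gpu resource; the image and command are illustrative.

```yaml
# Sketch: a Pod requesting one GPU advertised by the NVIDIA device plugin.
# The nvidia.com/gpu resource name is published by the plugin, not by Kubernetes itself.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-job
spec:
  restartPolicy: Never
  containers:
  - name: cuda-job
    image: nvidia/cuda:9.0-base        # illustrative CUDA base image
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1              # whole GPUs only; sharing a card between Pods is not supported
```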
This was a great advantage to the mentioned Customer, as this feature was not only going to help them save millions of Euros (every single rack for their use case was around 1M) but also accelerate their development process drastically: they can now use the same ecosystem to run different processes that require calculation (CUDA from NVIDIA) and GPU rendering, whereas until now they were tied to one particular rack per process type. Nowadays, with ML and AI becoming more and more popular, this capability will be key to migrating those workloads onto the OpenShift/k8s platform. There is still some work to do here; the main request we get from Customers is the ability to share a single GPU between multiple Pods, which is not supported yet.
Running Windows Containers alongside Linux Containers is a common requirement from some of our Customers. There is a joint effort between Red Hat and Microsoft to get this working and supported at an Enterprise level. It is currently available as a Dev Preview for Customers who want to try it, and it is planned to become a Tech Preview in the next major OpenShift release coming in March 2019. In the last 4 months, different drop versions of this Dev Preview technology have been tested across the globe with different Customers. We had the opportunity to run tests on-prem and in the cloud for a particular Customer. This feature was celebrated with enthusiasm by the Customer Team, as they have some proprietary Windows-based applications. Having the ability to containerize this software, run it on the same ecosystem where they plan to run all their applications, and orchestrate Windows containers alongside Linux Containers will make their lives much simpler. There is still a lot of work to get this GA in the OpenShift platform, but the initial results from this very early stage release completely satisfied Customer expectations on this topic. We were able to provision a hybrid OpenShift Cluster, with the Control Plane running on Linux hosts and the compute Nodes formed by a mix of RHEL and Windows Server Core hosts. Using OVN, each Kubelet can spin up Pods, add their interfaces to OVS and connect them to OVN, so Pods on different host types (Windows and RHEL) can talk to each other.
As mentioned previously, this is still at the Developer Preview stage, and many changes are happening on the Windows Server side. The initial version relied on Hyper-V to manage the networking part, but Microsoft is planning to integrate OVN into their networking stack. Right now Hyper-V is not required any more, but there is still work to do in this space. OVN will be the default virtual network for the next major OpenShift version coming in March 2019.
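In a hybrid cluster like the one described above, the scheduler still needs to be told which Pods belong on which operating system. A simple way to do that is a nodeSelector on the OS node label, sketched below; the label name (beta.kubernetes.io/os at the time, kubernetes.io/os in newer releases) and the IIS image are assumptions for illustration, not part of the Dev Preview itself.

```yaml
# Sketch: steering a Windows workload to a Windows Server Core compute Node.
apiVersion: v1
kind: Pod
metadata:
  name: iis-example
spec:
  nodeSelector:
    beta.kubernetes.io/os: windows     # kubernetes.io/os in newer Kubernetes releases
  containers:
  - name: iis
    image: mcr.microsoft.com/windows/servercore/iis   # illustrative Windows container image
    ports:
    - containerPort: 80
```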
Multiarch, the ability to run OpenShift on different CPU architectures like ARM or PowerPC, is another current hot topic. PowerPC support was added to OpenShift recently (OpenShift v3.10), but we have been working with Customers who need to orchestrate containers on ARM architectures. There are already examples out there running the OpenShift upstream project (OKD) on Fedora ARM-based hosts, and Amazon has recently announced new ARM-based AWS instances. Customers are telling us that a heterogeneous OpenShift Cluster with compute Nodes running containers built for different architectures will be required very soon (some industries like Automotive, Industrial Control and the Mobile market are already requesting it), and Red Hat is putting a lot of effort into supporting OpenShift on these architectures. This is a key milestone that needs to be achieved in order to consider OpenShift/k8s the “global platform” to run the majority of customer workloads for the coming years.
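In such a heterogeneous cluster, workloads built for a specific architecture can be pinned to matching compute Nodes through the architecture node label. The sketch below assumes nodes labelled kubernetes.io/arch=arm64 (beta.kubernetes.io/arch in older releases) and an illustrative arm64 image.

```yaml
# Sketch: pinning a workload to ARM compute Nodes in a mixed-architecture cluster.
apiVersion: v1
kind: Pod
metadata:
  name: arm-worker
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values: ["arm64"]
  containers:
  - name: worker
    image: registry.example.com/arm64/worker:latest   # illustrative arm64 (or multi-arch) image
```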
One of my last Customers was using Big Data intensively and wanted to move all those Big Data workloads, alongside other teams' projects, to OpenShift. There are some considerations, mostly storage-wise, when you plan to deploy Big Data workloads on OpenShift/k8s. This was not an easy task in the past, but thanks to OpenShift/k8s maturity it is now possible. Stateful containers are not an issue anymore, as I explained before, nor are the high memory or disk I/O requirements of these workloads. OpenShift supports enabling Huge Pages for Pods, and thanks to device plugins (the same implementation used for GPU access) and CSI (Container Storage Interface), the flexibility to use different storage types for different purposes in OpenShift/k8s is a substantial advantage over other platforms. Big Data developers can easily and transparently consume storage dedicated to this purpose. Different storage types backed by different providers can be presented to the Platform and consumed by different teams based on their specific application requirements. As mentioned previously, the idea behind both device plugins and CSI is to completely decouple vendor-specific code from Kubernetes code. This way, a hardware vendor's code is not tied to any specific k8s version, and the k8s community is not responsible for maintaining all these plugins and can focus on maintaining a stable API to be consumed by these third-party plugins. Using all these features, we were able to onboard different teams into OpenShift with a broad diversity of requirements: applications requiring >1Gb/s I/O performance, applications requiring “standard” volumes, applications with high memory requirements, and low-latency network applications that need to pin their workloads to specific CPUs, all of them working well together and orchestrated by the same platform.
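Two small sketches of how teams consume this: a StorageClass backed by a CSI driver exposing the high-throughput storage (the provisioner name here is a hypothetical placeholder, not a real driver), and a Pod requesting pre-allocated 2Mi huge pages next to its ordinary memory; images and sizes are illustrative, and the nodes are assumed to have huge pages reserved.

```yaml
# Sketch: a StorageClass that teams can reference from their PersistentVolumeClaims
# (storageClassName: fast-io) to get the high-throughput backend.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-io
provisioner: csi.fast-io.example.com   # hypothetical CSI driver name
reclaimPolicy: Delete
---
# Sketch: a Pod consuming 2Mi huge pages; hugepages requests must equal limits.
apiVersion: v1
kind: Pod
metadata:
  name: bigdata-worker
spec:
  containers:
  - name: worker
    image: registry.example.com/bigdata/worker:latest   # illustrative image
    resources:
      requests:
        memory: 4Gi
        hugepages-2Mi: 1Gi
      limits:
        memory: 4Gi
        hugepages-2Mi: 1Gi
    volumeMounts:
    - name: hugepage
      mountPath: /dev/hugepages
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages
```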
Last but not least, we had the opportunity to try out probably one of the most exciting products coming to OpenShift/k8s in the near future: the capability to orchestrate VMs by extending the Kubernetes API through CRDs (Custom Resource Definitions). This project is called KubeVirt, with ‘Container-Native Virtualization’ as the Enterprise product name. This particular Customer already had Virtual Machine-based workloads that cannot be easily containerized, which is one of the perfect use cases for this. There is heavy work in progress around KubeVirt, and hopefully we will soon get a Tech Preview version available for OpenShift. Right now it can be deployed using the documentation from the KubeVirt site, and it works really well. There is also a tooling ecosystem around KubeVirt to make it easier to bring existing VMs under OpenShift/k8s management. Tools like the Containerized Data Importer or v2v for KubeVirt make it easier to move complete VMs or import virtual machine disks to be used by these VM objects in OpenShift/k8s.
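To give an idea of what this looks like, below is a minimal VirtualMachine manifest adapted from the upstream KubeVirt examples; the API version (kubevirt.io/v1alpha3 at the time of writing) and field names may change between releases, and the containerDisk demo image is just the upstream Cirros example.

```yaml
# Sketch: a VM defined as a Kubernetes object through the KubeVirt CRDs.
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: testvm
spec:
  running: false                 # start it later with the virtctl client or by patching this field
  template:
    metadata:
      labels:
        kubevirt.io/domain: testvm
    spec:
      domain:
        devices:
          disks:
          - name: containerdisk
            disk:
              bus: virtio
        resources:
          requests:
            memory: 64M
      volumes:
      - name: containerdisk
        containerDisk:
          image: kubevirt/cirros-container-disk-demo   # upstream demo disk image
```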
A lot of things have changed in the past 2 years around Containers and OpenShift/k8s. The effort the Community has put into this is massive. I can’t remember a project growing this fast and being adopted globally in such a short period of time. There is still a lot to come, and the next few years look even more exciting. New projects and new ideas will make the Platform even more suitable for any type of workload; the idea of OpenShift/k8s becoming the ‘new RHEL’ now looks less hypothetical and more real. The next major OpenShift release promises to be a big milestone in terms of a “self-managed, self-healing” platform, which could change the game for everyone. Just having all the technologies mentioned here playing nicely together will definitely be a game changer, and I can’t wait to see all these workloads and projects we have already tested running together at scale.