Containers are everywhere, giving developers an environment to package applications and their dependencies into isolated processes. The proliferation of containers solves the mobility problem: getting software to execute and produce the same results no matter which host system it runs on. Breaking systems into individual microservices that can be modified and scaled without impacting the whole brought predictability and reliability to development environments.
However, AI applications were not an original focus of containerization because of their heavy resource requirements. That is changing. As containers grow in popularity and AI adoption increases, enterprises are starting to leverage containerization to gain flexibility, portability, and reliability across the AI and machine learning lifecycle.
The AI market is expanding
The AI market is booming, and this expansion is being driven by a number of factors. The widespread use of sensors (IoT and mobile devices) interacting with the physical world has made large-scale datasets broadly available. At the same time, organizations in both the public and private sectors, along with academia and research institutes, have recognized the potential value of extracting insights from big data.
Technology advancements have also fueled AI's expansion. Readily available and accessible AI tools, in tandem with cheaper and more powerful computing, allow a growing number of data scientists, analysts, and engineers to reap the benefits of AI. People generally recognize that investing in AI can pay off.
Challenges of building AI/machine learning systems
However, there are various challenges when building AI and machine learning (ML) systems. First, creating ML models is an iterative and often complex process that includes data exploration, model training, validation, and deployment. It is far from a one-off process: scientists use data and ML to train the application repeatedly until its outputs accurately reflect the real world.
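That iterative train-validate cycle can be sketched in miniature. The toy linear model and synthetic dataset below are stand-ins for a real pipeline, not a recommendation of any particular library or technique:

```python
# Toy illustration of the iterative ML cycle: explore data, train a model,
# validate it on held-out data, and repeat until the results are acceptable.
import random

random.seed(0)

# "Data exploration": a synthetic dataset following y = 3x + 2 plus noise.
data = [(i / 100, 3 * (i / 100) + 2 + random.gauss(0, 0.05))
        for i in range(100)]
train, valid = data[:80], data[80:]

def mse(samples, w, b):
    """Mean squared error of the linear model y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)

# "Model training": repeated gradient-descent passes over the training set.
w, b, lr = 0.0, 0.0, 0.5
for epoch in range(2000):
    gw = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
    gb = sum(2 * (w * x + b - y) for x, y in train) / len(train)
    w, b = w - lr * gw, b - lr * gb

# "Validation": check the fitted model against held-out data before
# deciding whether to deploy it or iterate again.
print(f"w={w:.2f} b={b:.2f} validation MSE={mse(valid, w, b):.4f}")
```

In a real system each of these stages is far heavier, which is exactly why the computing-power planning discussed next becomes the central challenge.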
The main challenge is planning for and managing the high computing power these ML systems need. Training ML models is computing-intensive, particularly during the data extraction and model training phases. Although model inferencing—using a trained model and new data to make a prediction—requires relatively less computing power, these systems need to be reliable, as they assist in decision making for critical business functions.
To accommodate these needs for reliability and computing power, enterprises are leveraging hybrid environments, combining on-premises and cloud platforms, to meet the requirements in an efficient and cost-effective manner.
Benefits of using containers for ML systems
This is where containers come in handy. Born and raised in the cloud, containers can accelerate the development of machine learning models, and deploying ML applications as containers comes with many advantages.
- Self-contained ML applications can run on any platform with virtually no further testing required. Because containerized ML applications can operate in distributed environments, they can be placed close to the data source they analyze, allowing for more efficient use of available resources.
- Containerized ML systems can expose their functionality as services or microservices, allowing external applications to consume them.
- Container orchestration provides the ability to scale containerized ML apps up or down to meet business demands, delivering the reliability and efficacy required of such deployments and improving utilization of expensive corporate resources.
- Containers have built-in mechanisms for distributed data access. It is therefore easier for ML apps to access the data fabric through data-oriented interfaces that support various complex models.
- Containers can also help to create a more effective application architecture. ML systems can be made up of containers which function as loosely coupled subsystems.
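As a concrete illustration of the self-contained packaging described above, a minimal Dockerfile can bundle a trained model and its serving code into one portable image. This is only a sketch; the file names (`requirements.txt`, `model.pkl`, `serve.py`) are hypothetical placeholders for a real project's artifacts:

```dockerfile
# Sketch of a container image for an ML inference service.
# File names below are placeholders for a real project's artifacts.
FROM python:3.11-slim

WORKDIR /app

# Install pinned dependencies first so this layer is cached
# across code-only changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the trained model artifact and the serving code.
COPY model.pkl serve.py ./

# Expose the prediction endpoint as a service.
EXPOSE 8080
CMD ["python", "serve.py"]
```

Built once with `docker build -t ml-service .`, the same image runs unchanged on a laptop, an on-premises cluster, or in the cloud.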
Requirements for successful use of containers in ML systems
Although containers are great at making ML applications flexible and portable, it is challenging to manage multiple containers within complex systems. That is where a container orchestration platform like Kubernetes comes in.
However, Kubernetes alone is not sufficient for enterprise-scale deployments of containerized applications. Organizations need rigorous management and continuous monitoring of the deployed containers to ensure the reliability of delivered services. In addition, security and access controls should be built around the containerized environments to remediate known vulnerabilities and safeguard containers during build, deployment, and runtime.
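To make the orchestration point concrete, a Kubernetes Deployment along these lines lets the cluster run and scale replicas of a containerized inference service. This is an illustrative sketch only; the image name and resource figures are assumptions, not recommendations:

```yaml
# Illustrative Deployment for a containerized ML inference service;
# the image name and resource figures are examples only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-inference
spec:
  replicas: 3                     # scale up or down to meet demand
  selector:
    matchLabels:
      app: ml-inference
  template:
    metadata:
      labels:
        app: ml-inference
    spec:
      containers:
      - name: ml-inference
        image: registry.example.com/ml-service:1.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: "500m"
            memory: 1Gi
          limits:
            cpu: "2"
            memory: 4Gi           # cap use of expensive resources
```

A command such as `kubectl autoscale deployment ml-inference --min=2 --max=10 --cpu-percent=80` would then adjust the replica count automatically as load changes.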
Further, expanding the use of containers in ML systems requires persistent data storage. Data analysis and ML applications need continuous access to reliable data. Therefore, an underlying data fabric is required to ensure the containerized applications have the same consistent view of the data no matter where they are deployed.
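One common pattern for this, sketched below, is to claim persistent storage through Kubernetes so that every replica sees the same data regardless of which node it lands on. The storage class name here is an assumption; it depends entirely on the data fabric a given cluster provides:

```yaml
# Illustrative PersistentVolumeClaim giving containerized ML apps
# durable, shared access to a dataset; storageClassName depends on
# the cluster's configured storage backend.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: training-data
spec:
  accessModes:
  - ReadWriteMany                 # many pods read the same dataset
  storageClassName: shared-data-fabric
  resources:
    requests:
      storage: 100Gi
```

Pods then mount the claim through a `volumes` entry referencing `persistentVolumeClaim: training-data`, decoupling the application's view of the data from wherever the container happens to run.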
The way ahead
Containers have the potential to provide a portable and consistent environment for the rapid deployment of ML and AI models. Like DevOps and software engineering, ML can benefit from the agility, portability, and flexibility that containers bring.
Because of this potential, there are many ongoing projects in this area. For example, KubeDirector is an open source project designed to run complex, distributed stateful applications on Kubernetes, while the open source Kubeflow project is designed to simplify production ML deployments.
As enterprises gather more and more data, they need to make sense of it and create valuable insights. The creation of machine learning systems will help them automate processes, increase productivity, and remove human error from the equation. Containers will be of great assistance.
Featured Image: Unsplash