Articles published on Container Orchestration Tools
20 Search results
- Research Article
- 10.11648/j.ajai.20250902.29
- Dec 11, 2025
- American Journal of Artificial Intelligence
- Trinh Minh + 4 more
The convergence of Artificial Intelligence (AI) with DevOps, DataOps, and MLOps has transformed the software development lifecycle, enabling scalable, automated, and intelligent systems. This paper explores the transition from traditional DevOps to MLOps, emphasizing the integration of machine learning workflows into continuous integration, deployment, and training pipelines. We present a practical framework for implementing MLOps using tools such as MLflow, Airflow, and Kubernetes, and address challenges like overfitting, underfitting, and model drift. The proposed architecture leverages Docker and ONNX for model packaging and deployment, ensuring reproducibility and cross-platform compatibility. Through real-world examples and pipeline automation strategies, we demonstrate how MLOps enhances model reliability, governance, and performance monitoring in dynamic environments. This study contributes to the growing body of knowledge on AI-driven DevOps by offering actionable insights for researchers and practitioners aiming to build robust ML systems. Build an Apache Airflow pipeline to load, train, and evaluate a ML model, store it, and use it for inferencing by deploying the model with a sleek Streamlit UI, Docker, and auto-scale it with Kubernetes as container orchestration tool. Techniques for implementing and automating continuous integration (CI), continuous delivery (CD), and continuous training (CT) for machine learning (ML) systems. This document applies primarily to predictive AI systems.
- Research Article
- 10.30871/jaic.v9i2.8972
- Mar 22, 2025
- Journal of Applied Informatics and Computing
- Mochamad Rizal Fachrudin + 1 more
Container orchestration has become a widely adopted standard for application deployment among medium to large-scale organizations. Docker Swarm is one of the popular container orchestration tools due to its relatively simple configuration. However, if the Docker Swarm cluster architecture is not properly designed, the goal of container orchestration, which is availability, cannot be achieved optimally. Challenges such as centralized traffic on a single node and service dependency on a single node are critical issues that need to be addressed. This study proposes solutions through an experimental approach involving the design, implementation, testing, and evaluation of a Docker Swarm cluster architecture to address these challenges. The results of this study demonstrate that the proposed architecture successfully resolves these issues. Traffic can be distributed more evenly across all nodes. When only one node is available, 5 out of 10 requests can be handled with a response latency of 197.4 ms. With two nodes available, the number of requests handled increases to 7 out of 10, with a response latency of 534.86 ms. The greater the number of available nodes, the more requests can be successfully processed. Services also become more flexible, and capable of running on any node, while offering additional benefits such as dual load balancing through DNS-based load balancing and the default load balancing provided by Docker Swarm's routing mesh. However, limitations such as the need for more complex adjustments and configurations should be considered, especially when implementing this architecture in on-premise environments, to ensure the best adoption and results.
- Research Article
1
- 10.3390/s24196244
- Sep 26, 2024
- Sensors (Basel, Switzerland)
- Joo-Young Roh + 2 more
In modern cloud environments, container orchestration tools are essential for effectively managing diverse workloads and services, and Kubernetes has become the de facto standard tool for automating the deployment, scaling, and operation of containerized applications. While Kubernetes plays an important role in optimizing and managing the deployment of diverse services and applications, its default scheduling approach, which is not optimized for all types of workloads, can often result in poor performance and wasted resources. This is particularly true in environments with complex interactions between services, such as microservice architectures. The traditional Kubernetes scheduler makes scheduling decisions based on CPU and memory usage, but the limitation of this arrangement is that it does not fully account for the performance and resource efficiency of the application. As a result, the communication latency between services increases, and the overall system performance suffers. Therefore, a more sophisticated and adaptive scheduling method is required. In this work, we propose an adaptive pod placement optimization technique using multi-tier inspection to address these issues. The proposed technique collects and analyzes multi-tier data to improve application performance and resource efficiency, which are overlooked by the default Kubernetes scheduler. It derives optimal placements based on the coupling and dependencies between pods, resulting in more efficient resource usage and better performance. To validate the performance of the proposed method, we configured a Kubernetes cluster in a virtualized environment and conducted experiments using a benchmark application with a microservice architecture. The experimental results show that the proposed method outperforms the existing Kubernetes scheduler, reducing the average response time by up to 11.5% and increasing the number of requests processed per second by up to 10.04%. 
This indicates that the proposed method minimizes the inter-pod communication delay and improves the system-wide resource utilization. This research aims to optimize application performance and increase resource efficiency in cloud-native environments, and the proposed technique can be applied to different cloud environments and workloads in the future to provide more generalized optimizations. This is expected to contribute to increasing the operational efficiency of cloud infrastructure and improving the quality of service.
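The idea of deriving placements from the coupling between pods can be illustrated with a small greedy sketch. This is not the authors' multi-tier inspection technique: the function `place_pods`, the `traffic` weights, and the per-node pod `capacity` are all hypothetical, chosen only to show how co-locating tightly coupled pods reduces cross-node communication.

```python
def place_pods(pods, traffic, nodes, capacity):
    """Greedy, coupling-aware placement (illustrative only).

    pods:     list of pod names
    traffic:  dict mapping (pod_a, pod_b) -> observed communication weight
    nodes:    list of node names
    capacity: maximum number of pods per node (hypothetical constraint)
    """
    placement = {}
    load = {n: 0 for n in nodes}

    def total_traffic(pod):
        # Sum of all communication weights that involve this pod.
        return sum(w for (a, b), w in traffic.items() if pod in (a, b))

    def affinity(pod, node):
        # Traffic between this pod and pods already placed on the node.
        total = 0
        for (a, b), w in traffic.items():
            other = b if a == pod else a if b == pod else None
            if other is not None and placement.get(other) == node:
                total += w
        return total

    # Place the most heavily coupled pods first.
    for pod in sorted(pods, key=total_traffic, reverse=True):
        free = [n for n in nodes if load[n] < capacity]
        if not free:
            raise ValueError("not enough capacity for all pods")
        # Prefer the node with the strongest affinity, then the emptier node.
        best = max(free, key=lambda n: (affinity(pod, n), -load[n]))
        placement[pod] = best
        load[best] += 1
    return placement


pods = ["web", "cart", "db", "cache"]
traffic = {("web", "cart"): 10, ("cart", "db"): 8, ("web", "cache"): 1}
placement = place_pods(pods, traffic, ["n1", "n2"], capacity=2)
```

In this toy input, `web` and `cart` exchange the most traffic and end up on the same node. A real scheduler would also weigh CPU and memory requests, which this sketch deliberately ignores.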
- Research Article
5
- 10.3390/info15030126
- Feb 23, 2024
- Information
- Swati Kumari + 2 more
Due to rising cyber threats, IoT devices’ security vulnerabilities are expanding. However, these devices cannot run complicated security algorithms locally due to hardware restrictions. Data must be transferred to cloud nodes for processing, giving attackers an entry point. This research investigates distributed computing on the edge, using AI-enabled IoT devices and container orchestration tools to process data in real time at the network edge. The purpose is to identify and mitigate DDoS assaults while minimizing CPU usage to improve security. It compares typical IoT devices with and without AI-enabled chips, container orchestration, and assesses their performance in running machine learning models with different cluster settings. The proposed architecture aims to empower IoT devices to process data locally, minimizing the reliance on cloud transmission and bolstering security in IoT environments. The results correlate with the update in the architecture. With the addition of AI-enabled IoT device and container orchestration, there is a difference of 60% between the new architecture and traditional architecture where only Raspberry Pi were being used.
- Research Article
21
- 10.1016/j.jpdc.2024.104837
- Jan 9, 2024
- Journal of Parallel and Distributed Computing
- Javad Dogani + 1 more
Proactive auto-scaling technique for web applications in container-based edge computing using federated learning model
- Research Article
2
- 10.59200/icarti.2023.021
- Nov 9, 2023
- International Conference on Artificial Intelligence and its Applications
- Sabelo Justice Mthembu + 2 more
Internet of Things (IoT) is the developing technology that enables devices to communicate without human interaction. IoT utilizes cloud computing services to collect and process data for IoT devices and to manage the device remotely. Cloud computing is not efficient enough to handle the fast stream of data produced by the IoT, therefore scaling up IoT applications to meet demands of high peak becomes easier and highly automated in fog computing. Containers are mostly used as virtualization solutions for IoT in fog computing. It enables the execution of small microservices to large applications. However, the rise of many lightweight containers has resulted in new application architectures and fundamentally changing how applications are deployed and visualized. Due to this change, container orchestration tools were proposed. These tools allow users to coordinate and manage containers. However, container orchestration tools need to meet the requirements of IoT applications and constraints imposed on the nodes in fog. This paper presents a systematic literature review on the selection of orchestration tools for the efficient deployment of IoT applications in fog computing. Moreover, the performance of IoT applications must be considered by applying different metrics. This paper aims to propose potential research directions to address identified gaps in the selection of orchestration tools.
- Research Article
4
- 10.30574/wjarr.2023.18.1.0629
- Apr 30, 2023
- World Journal of Advanced Research and Reviews
- Taiwo Joseph Akinbolaji + 4 more
This study examines strategies for enhancing fault tolerance and scalability in multi-region Kafka clusters, essential for supporting high-demand cloud environments. As cloud-based applications expand globally, achieving seamless data streaming across regions requires advanced configurations in Apache Kafka. This paper provides a thorough analysis of key approaches, including replication strategies, dynamic resource management, and real-time monitoring techniques tailored for multi-region deployments. Through a comprehensive literature review and real-world case studies, the study identifies critical challenges in managing latency, data consistency, and resilience within distributed Kafka clusters. Findings reveal that fault tolerance can be significantly improved through hybrid replication models that balance latency and data integrity, while advanced partitioning and load balancing techniques optimize Kafka’s scalability under fluctuating demands. The integration of container orchestration tools such as Kubernetes has also proven effective in automating resource scaling and failover across distributed environments. Furthermore, the paper highlights future research directions, including edge computing integration, predictive scaling, and enhanced security protocols to address evolving data privacy requirements. In conclusion, while multi-region Kafka deployments offer robust solutions for distributed data streaming, achieving optimal performance and resilience requires a combination of adaptive replication, proactive resource management, and secure, compliant data flows. Future research should focus on refining edge-compatible solutions and regulatory-compliant frameworks to sustain Kafka’s role in global, real-time data processing.
- Research Article
42
- 10.3390/s23084008
- Apr 15, 2023
- Sensors
- Ivan Čilić + 3 more
Edge computing is a viable approach to improve service delivery and performance parameters by extending the cloud with resources placed closer to a given service environment. Numerous research papers in the literature have already identified the key benefits of this architectural approach. However, most results are based on simulations performed in closed network environments. This paper aims to analyze the existing implementations of processing environments containing edge resources, taking into account the targeted quality of service (QoS) parameters and the utilized orchestration platforms. Based on this analysis, the most popular edge orchestration platforms are evaluated in terms of their workflow that allows the inclusion of remote devices in the processing environment and their ability to adapt the logic of the scheduling algorithms to improve the targeted QoS attributes. The experimental results compare the performance of the platforms and show the current state of their readiness for edge computing in real network and execution environments. These findings suggest that Kubernetes and its distributions have the potential to provide effective scheduling across the resources on the network's edge. However, some challenges still have to be addressed to completely adapt these tools for such a dynamic and distributed execution environment as edge computing implies.
- Research Article
3
- 10.1007/s10270-022-01027-8
- Sep 15, 2022
- Software and Systems Modeling
- Bruno Piedade + 2 more
Container orchestration tools supporting infrastructure-as-code allow new forms of collaboration between developers and operatives. Still, their text-based nature permits naive mistakes and is more difficult to read as complexity increases. We can find few examples of low-code approaches for defining the orchestration of containers, and there seems to be a lack of empirical studies showing the benefits and limitations of such approaches. We hypothesize that a complete visual notation for Docker-based orchestrations could reduce the effort, the error rate, and the development time. Therefore, we developed a tool featuring such a visual notation for Docker Compose configurations, and we empirically evaluated it in a controlled experiment with novice developers. The results show a significant reduction in development time and error-proneness when defining Docker Compose files, supporting our hypothesis. The participants also thought the prototype easier to use and useful, and wanted to use it in the future.
- Research Article
- 10.20884/1.jutif.2022.3.4.484
- Aug 20, 2022
- Jurnal Teknik Informatika (Jutif)
- Anita Rosdina Nasution + 2 more
Data storage media, often referred to as databases, are vital to technological development. As the amount of data increases, database services become more likely to experience downtime. For this reason, it is necessary to build an infrastructure that can replicate itself in order to avoid downtime. This infrastructure can be built using a container orchestration tool called Kubernetes, whose high availability and autoscaler features allow it to replicate workloads and guarantee service availability. This research builds a MongoDB NoSQL database service using micro Kubernetes clusters from several different data centers. The service also implements the horizontal pod autoscaler (HPA) feature, which replicates pods to increase availability and avoid downtime. The autoscaling process is tested by applying a request load to the service, and testing is repeated several times on each server. This study compares the MongoDB service built monolithically with the one built on a micro Kubernetes cluster, with and without HPA, based on Response Time, Response Code per Second, and CPU Usage. The results show that the service built on a micro Kubernetes cluster with HPA performs best, with a constant response time below 100 ms, Response Code per Second reaching 500 threads per second, and CPU Usage in the range of 30–55%.
- Research Article
42
- 10.3390/s22082869
- Apr 8, 2022
- Sensors
- Quang-Minh Nguyen + 2 more
Kubernetes (K8s) is expected to be a key container orchestration tool for edge computing infrastructures owing to its various features for supporting container deployment and dynamic resource management. For example, its horizontal pod autoscaling feature provides service availability and scalability by increasing the number of replicas. kube-proxy provides traffic load-balancing between replicas by distributing client requests equally to all pods (replicas) of an application in a K8s cluster. However, this approach can result in long delays when requests are forwarded to remote workers, especially in edge computing environments where worker nodes are geographically dispersed. Moreover, if the receiving worker is overloaded, the request-processing delay can increase significantly. To overcome these limitations, this paper proposes an enhanced load balancer called resource adaptive proxy (RAP). RAP periodically monitors the resource status of each pod and the network status among worker nodes to aid in load-balancing decisions. Furthermore, it preferentially handles requests locally to the maximum extent possible. If the local worker node is overloaded, RAP forwards its requests to the best node in the cluster while considering resource availability. Our experimental results demonstrated that RAP could significantly improve throughput and reduce request latency compared with the default load-balancing mechanism of K8s.
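The local-first forwarding rule described in this abstract can be sketched as follows. This is a minimal illustration, not the RAP implementation: the `Node` fields, the `overload_threshold` value, and the rule of picking the lowest-latency non-overloaded remote node are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_load: float    # fraction of CPU in use, 0.0 to 1.0 (hypothetical metric)
    latency_ms: float  # network latency from the receiving node (hypothetical metric)


def choose_node(local, remotes, overload_threshold=0.8):
    """Handle the request locally unless the local node is overloaded."""
    if local.cpu_load < overload_threshold:
        return local
    # Local node is overloaded: consider only remote nodes with spare capacity.
    candidates = [n for n in remotes if n.cpu_load < overload_threshold]
    if not candidates:
        # Nothing better is available, so fall back to the local node.
        return local
    # Assumed scoring: lowest latency first, then lowest CPU load.
    return min(candidates, key=lambda n: (n.latency_ms, n.cpu_load))


remotes = [Node("b", 0.3, 20.0), Node("c", 0.2, 5.0), Node("d", 0.95, 1.0)]
idle_choice = choose_node(Node("a", 0.5, 0.0), remotes)
busy_choice = choose_node(Node("a", 0.9, 0.0), remotes)
```

Here `idle_choice` stays on the local node `a`, while `busy_choice` is forwarded to `c`, the closest remote node that is not itself overloaded. Node `d` is closer but is skipped because its load exceeds the threshold.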
- Research Article
13
- 10.3390/app12010140
- Dec 23, 2021
- Applied Sciences
- Seunghwan Lee + 4 more
With the exponential growth of the Internet of Things (IoT), edge computing is in the limelight for its ability to quickly and efficiently process numerous data generated by IoT devices. EdgeX Foundry is a representative open-source-based IoT gateway platform, providing various IoT protocol services and interoperability between them. However, due to the absence of container orchestration technology, such as automated deployment and dynamic resource management for application services, EdgeX Foundry has fundamental limitations of a potential edge computing platform. In this paper, we propose EdgeX over Kubernetes, which enables remote service deployment and autoscaling to application services by running EdgeX Foundry over Kubernetes, which is a product-grade container orchestration tool. Experimental evaluation results prove that the proposed platform increases manageability through the remote deployment of application services and improves the throughput of the system and service quality with real-time monitoring and autoscaling.
- Research Article
5
- 10.1155/2021/6397786
- Nov 27, 2021
- Scientific Programming
- Chunmao Jiang + 1 more
The container scaling mechanism, or elastic scaling, means the cluster can be dynamically adjusted based on the workload. As a typical container orchestration tool in cloud computing, Horizontal Pod Autoscaler (HPA) automatically adjusts the number of pods in a replication controller, deployment, replication set, or stateful set based on observed CPU utilization. There are several concerns with the current HPA technology. The first concern is that it can easily lead to untimely scaling and insufficient scaling for burst traffic. The second is that the antijitter mechanism of HPA may cause an inadequate number of onetime scale-outs and, thus, the inability to satisfy subsequent service requests. The third concern is that the fixed data sampling time means that the time interval for data reporting is the same for average and high loads, leading to untimely and insufficient scaling at high load times. In this study, we propose a Double Threshold Horizontal Pod Autoscaler (DHPA) algorithm, which fine-grained divides the scale of events into three categories: scale-out, no scale, and scale-in. And then, on the scaling strength, we also employ two thresholds that are further subdivided into no scaling (antijitter), regular scaling, and fast scaling for each of the three cases. The DHPA algorithm determines the scaling strategy using the average of the growth rates of CPU utilization, and thus, different scheduling policies are adopted. We compare the DHPA with the HPA algorithm under different loads, including low, medium, and high. The experiments show that the DHPA algorithm has better antijitter and antiload characteristics in container increase and reduction while ensuring service and cluster security.
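For context, the replica calculation documented for the standard Kubernetes HPA, plus a rough sketch of the two-threshold idea, might look like the code below. The first function follows the documented rule `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)` with a tolerance band. The second function is only an illustration of classifying the average CPU growth rate into three scaling strengths; its thresholds `low` and `high` are hypothetical and are not taken from the DHPA paper.

```python
import math


def hpa_desired_replicas(current_replicas, current_cpu, target_cpu, tolerance=0.1):
    """Standard HPA rule: scale in proportion to observed / target utilization."""
    ratio = current_cpu / target_cpu
    # Within the tolerance band no scaling happens (this is the antijitter effect).
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return max(1, math.ceil(current_replicas * ratio))


def classify_growth(samples, low=0.05, high=0.20):
    """Map the average relative growth of CPU samples to a scaling strength.

    low and high are hypothetical thresholds used only for illustration.
    """
    rates = [(b - a) / a for a, b in zip(samples, samples[1:]) if a > 0]
    if not rates:
        return "no_scaling"
    magnitude = abs(sum(rates) / len(rates))
    if magnitude < low:
        return "no_scaling"
    if magnitude < high:
        return "regular_scaling"
    return "fast_scaling"


scaled_out = hpa_desired_replicas(4, 90, 60)
steady = hpa_desired_replicas(4, 63, 60)
mode = classify_growth([40, 60, 90])
```

With four replicas at 90% CPU against a 60% target, the standard rule asks for six replicas. A burst that grows 50% between samples falls into the hypothetical fast-scaling band, which is the case the abstract says plain HPA handles too slowly.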
- Research Article
18
- 10.3390/s21041378
- Feb 16, 2021
- Sensors (Basel, Switzerland)
- Syed M Raza + 4 more
Containers virtually package a piece of software and share the host Operating System (OS) upon deployment. This makes them notably lightweight and suitable for dynamic service deployment at the network edge and on Internet of Things (IoT) devices for reduced latency and energy consumption. Data collection, computation, and now intelligence are included in a variety of IoT devices, which have very tight latency and energy consumption constraints. Recent studies satisfy the latency constraint through containerized service deployment on IoT devices and gateways. However, they fail to account for the limited energy and computing resources of these devices, which limit scalability and concurrent service deployment. This paper aims to establish guidelines and identify critical factors for containerized service deployment on resource-constrained IoT devices. For this purpose, two container orchestration tools (i.e., Docker Swarm and Kubernetes) are tested and compared on a baseline IoT gateway testbed. Experiments use Deep Learning driven data analytics and Intrusion Detection System services, and evaluate the time it takes to prepare and deploy a container (creation time), Central Processing Unit (CPU) utilization for concurrent container deployment, memory usage under different traffic loads, and energy consumption. The results indicate that container creation time and memory usage are decisive factors for a containerized microservice architecture.
- Research Article
16
- 10.1007/s10586-020-03210-2
- Nov 24, 2020
- Cluster Computing
- Ron C Chiang
Containerization technology utilizes operating system level virtualization to package applications to run with required libraries and are isolated from other processes on the same host. Lightweight and quick deployment make containers popular in many data centers. Running distributed applications in data centers usually involves multiple clusters of machines. Docker Swarm is a container orchestration tool for managing a cluster of Docker containers and their hosts. However, Docker Swarm’s scheduler does not consider resource utilization when placing containers in a cluster. This paper first investigated performance interference in container clusters. Our experimental study showed that distributed applications’ performance can be degraded when co-located with other containers which aggressively consume resources. A new scheduler is proposed to improve performance while keeping high resource utilization. The experimental results demonstrated that the proposed prototype with machine learning based clustering algorithms could effectively improve distributed applications’ performance by up to 14.5% with an average at around 12%. This work also provides theoretical bounds for the container placement problem.
- Research Article
133
- 10.1016/j.comcom.2020.04.061
- May 7, 2020
- Computer Communications
- Fabiana Rossi + 3 more
Geo-distributed efficient deployment of containers with Kubernetes
- Research Article
- 10.1109/access.2020.3035619
- Jan 1, 2020
- IEEE Access
- Henrique Cesar Carvalho De Resende + 4 more
Areas branching from Cloud Computing (CC), such as Multi-access Edge Computing (MEC) and Fog computing, continue to attract growing research interest, and the creation of new tools to improve the quality and speed of experimentation in such areas is of general interest. In this article, we propose COPA, an experimenter-level container orchestration tool for networking testbeds. This tool provides a friendly interface for the experimenter to test container orchestration algorithms, which can start, stop, copy, and even migrate a container from one host to another. COPA also includes network and resource monitoring to feed the experimenter's orchestration algorithm so that it can make decisions based on real-time environment information. Furthermore, the experimenter can automate the experiment scenario setup and deployment by pre-configuring it in COPA. This tool helps the experimenter test different scenarios and quickly change experiment parameters. Considering these features, COPA aims to provide an experimentation architecture to deploy and test container orchestration algorithms. Furthermore, we provide a case study explaining how COPA can be a key tool in MEC and Network Function Virtualization (NFV) experimentation environments. This tool was already deployed in the Federated Union of Telecommunications Research Facilities for an EU-Brazil Open Laboratory (FUTEBOL) testbeds as part of the control framework and was well validated by the project reviewers and partners.
- Research Article
2
- 10.5121/ijcnc.2019.11507
- Sep 30, 2019
- International Journal of Computer Networks & Communications
- Felipe Rodriguez Yaguache + 1 more
As SD-WAN disrupts legacy WAN technologies and becomes the preferred WAN technology adopted by corporations, and Kubernetes becomes the de facto container orchestration tool, the opportunities for deploying edge-computing containerized applications running over SD-WAN are vast. Service orchestration in SD-WAN has not received enough attention, resulting in a lack of research focused on service discovery in these scenarios. In this article, an in-house service discovery solution is developed that works alongside Kubernetes’ master node to allow improved traffic handling and a better user experience when running micro-services. The service discovery solution was conceived following a design science research approach. Our research includes the implementation of a proof-of-concept SD-WAN topology alongside a Kubernetes cluster that allows us to deploy custom services and delimit the necessary characteristics of our in-house solution. Also, the implementation's performance is tested based on the time required to update the discovery solution in response to service updates. Finally, some conclusions and modifications are pointed out based on the results, while also discussing possible enhancements.
- Research Article
33
- 10.3390/app9010191
- Jan 7, 2019
- Applied Sciences
- Dongmin Kim + 4 more
Kubernetes, a container orchestration tool for automatically installing and managing Docker containers, has recently begun to support a federation function of multiple Docker container clusters. This technology, called Kubernetes Federation, allows developers to increase the responsiveness and reliability of their applications by distributing and federating container clusters to multiple service areas of cloud service providers. However, it is still a daunting task to manually manage federated container clusters across all the service areas or to maintain the entire topology of cloud applications at a glance. This research work proposes a method to automatically form and monitor Kubernetes Federation, given application topology descriptions in TOSCA (Topology and Orchestration Specification for Cloud Applications), by extending the orchestration tool that automatizes the modeling and instantiation of cloud applications. It also demonstrates the successful federation of the clusters according to the TOSCA specifications and verifies the auto-scaling capability of the configured system through a scenario in which the servers of a sample application are deployed and federated.
- Research Article
1
- 10.1051/epjconf/201921407019
- Jan 1, 2019
- EPJ Web of Conferences
- Mayank Sharma + 3 more
This article describes a new framework, called SIMPLE, for setting up and maintaining classic WLCG sites with minimal operational effort and minimal insight needed into the WLCG middleware. The framework provides a single common interface to install and configure any of its supported grid services, such as Compute Elements, Batch Systems, Worker Nodes and miscellaneous middleware packages. It leverages modern container orchestration tools like Kubernetes and Docker Swarm, and configuration management tools like Puppet and Ansible, to automate deployment of the WLCG services on behalf of a site admin. The framework is modular and extensible by design. Therefore, it is easy to add support for more grid services as well as infrastructure automation tools to accommodate diverse scenarios at different sites. We provide insight into the design of the framework and our efforts towards development, release and deployment of its first implementation featuring CREAM CE, TORQUE Batch System and TORQUE based Worker Nodes.