Efficient resource allocation for generative AI workloads in cloud-native infrastructures: A multi-tiered approach

Kiran Randhi, Srinivas Reddy Bandarapu

doi:10.30574/ijsra.2024.13.2.2208

Abstract

Effective resource management is essential for generative AI workloads in cloud-native infrastructures to deliver good results. The architecture described in this article targets such workloads because their resource usage fluctuates and they are difficult to scale. The proposed multi-tiered framework divides resources into groups so that each application receives support matched to its difficulty level. The methodology assesses how effectively resources are distributed, using metrics that include latency, throughput, and utilization rates. Examples are provided to illustrate the approach and its efficiency in real-world situations. Based on these, applying the multi-tiered approach to resource management improves an organization's operational performance and reduces the cost of resource provisioning. The study also emphasizes the importance of developing flexible and effective resource management tools, which are especially useful in modern generative AI development environments.
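The abstract does not specify how tiers are defined or how workloads are placed, so the following is only a minimal Python sketch of one way a tiered allocator could look. Every name here (Tier, Workload, assign, summarize), the 0-to-1 difficulty score, and the GPU-count capacity model are hypothetical assumptions for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tier:
    """A resource group. All fields are illustrative assumptions."""
    name: str
    max_difficulty: float  # highest difficulty score this tier accepts
    gpu_capacity: int      # total GPUs reserved for this tier
    gpu_in_use: int = 0

    def utilization(self) -> float:
        # Fraction of the tier's capacity currently allocated.
        if self.gpu_capacity == 0:
            return 0.0
        return self.gpu_in_use / self.gpu_capacity


@dataclass
class Workload:
    name: str
    difficulty: float  # hypothetical score in [0, 1]
    gpus_needed: int


def assign(workload: Workload, tiers: List[Tier]) -> Optional[Tier]:
    """Place a workload in the lowest tier that accepts its difficulty
    and still has room; spill to a higher tier if the preferred one is
    full. Returns None when no tier can host it."""
    for tier in sorted(tiers, key=lambda t: t.max_difficulty):
        fits_difficulty = workload.difficulty <= tier.max_difficulty
        fits_capacity = tier.gpu_in_use + workload.gpus_needed <= tier.gpu_capacity
        if fits_difficulty and fits_capacity:
            tier.gpu_in_use += workload.gpus_needed
            return tier
    return None


def summarize(latencies_ms: List[float], window_s: float) -> dict:
    """Mean latency and throughput for requests completed in a window."""
    return {
        "mean_latency_ms": sum(latencies_ms) / len(latencies_ms),
        "throughput_rps": len(latencies_ms) / window_s,
    }
```

In this sketch, a workload that does not fit its preferred tier spills upward rather than waiting, which trades higher provisioning cost for lower latency. A real scheduler could just as reasonably queue the workload instead. The utilization and summarize helpers correspond to the three metrics the abstract names (utilization rate, latency, throughput).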

Journal: International Journal of Science and Research Archive
