Abstract

Today's big-data analysis systems achieve performance and scalability by requiring end users to embrace a novel programming model. This approach is highly effective when the objective is to compute relatively simple functions on colossal amounts of data, but it is not a good match for a scientific computing environment that depends on complex applications written for the conventional POSIX environment. To address this gap, we introduce Conjugal, a scalable data-intensive computing system that is largely compatible with the POSIX environment. Conjugal brings together the workflow model of scientific computing with the storage architecture of other big-data systems. Conjugal accepts large workflows of standard POSIX applications arranged into graphs, and then executes them in a cluster, exploiting both parallelism and data locality. By making use of the workload structure, Conjugal is able to avoid the long-standing problems of metadata scalability and load instability found in many large-scale computing and storage systems. We show that Conjugal's approach to load control offers improvements of up to 228% in cluster network utilization and reductions of 23% in workflow execution time.
