Abstract

Cloud, Grid and P2P systems are becoming increasingly common and sophisticated. Because of their heterogeneous and distributed nature, Grids are especially vulnerable to faults. Trace files are a convenient way of collecting and storing fault and workload information from such systems. FTA (Fault Trace Archive) and GWA (Grid Workload Archive) are two such trace archives. Researchers have previously analyzed FTA and GWA individually; in this paper, for the first time, we analyze the combination of FTA and GWA as a single research problem. The trace files are joined on their event timestamp values and then analyzed together to establish a correlation-based model relating node failures, failed jobs, number of nodes and failure duration. We find that these factors are positively correlated with one another, though to differing extents. In addition to examining node failure frequency, failure resume time and node dedication factor, we find that interactive jobs have a higher failure probability than batch jobs.
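
To illustrate the kind of timestamp-based join and correlation analysis described above, the following is a minimal sketch only. The abstract does not specify the join procedure, so an interval-overlap join is assumed here. The record layouts, the function names join_by_timestamp and pearson, and the sample values are all hypothetical and do not reflect the actual FTA or GWA formats or data.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical record layouts (NOT the real FTA/GWA formats):
#   failure: (node_id, fail_start, fail_end)
#   job:     (job_id, node_id, run_start, run_end, failed)
failures = [("n1", 100, 150), ("n1", 300, 320), ("n2", 200, 260)]
jobs = [
    ("j1", "n1", 90, 120, True),
    ("j2", "n1", 160, 280, False),
    ("j3", "n2", 210, 230, True),
    ("j4", "n3", 0, 50, False),
]

def join_by_timestamp(failures, jobs):
    """Attach to each job the failures on its node that overlap its run time."""
    by_node = defaultdict(list)
    for node, start, end in failures:
        by_node[node].append((start, end))
    joined = []
    for job_id, node, start, end, failed in jobs:
        # Two intervals overlap when each one starts before the other ends.
        overlaps = [(fs, fe) for fs, fe in by_node.get(node, [])
                    if fs < end and fe > start]
        joined.append((job_id, node, failed, overlaps))
    return joined

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

joined = join_by_timestamp(failures, jobs)

# Per-node counts: number of node failures vs. number of failed jobs.
nodes = sorted({f[0] for f in failures} | {j[1] for j in jobs})
node_failures = [sum(1 for f in failures if f[0] == n) for n in nodes]
failed_jobs = [sum(1 for j in jobs if j[1] == n and j[4]) for n in nodes]

r = pearson(node_failures, failed_jobs)
```

With this toy data, r is positive, which only shows the direction of the computation; it says nothing about the magnitudes found in the real traces.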
