Abstract

Effort estimation is a key component of planning a software development project. Estimation methods for traditional applications have been researched extensively, but these methods do not carry over to Extract, Transform, Load (ETL) projects. Producing a systematic effort estimate for an ETL project is challenging because ETL development does not follow the traditional Software Development Life Cycle (SDLC): traditional application development is requirements-driven, whereas ETL development is data-driven. This paper describes the development of an effort estimation model for ETL projects and compares it with the most widely used algorithmic effort estimation model, COCOMO II. The model is built with Forward Stepwise Regression on a dataset of 220 industrial projects from five software houses. After 20 outliers are eliminated from this dataset, the adjusted R² (a goodness-of-fit measure) of our model is 0.87. Training and prediction accuracy are measured with the de facto standard accuracy measures MMRE and PRED(25). On the training dataset of 200 projects, PRED(25) is 81.16% and MMRE is 0.16. On a validation dataset of 58 projects, our model provides considerably better estimation accuracy than COCOMO II: PRED(25) is 49% for our model versus 21% for COCOMO II, and MMRE is 0.31 versus 0.99.
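The abstract reports MMRE and PRED(25) without defining them. As conventionally defined, the magnitude of relative error (MRE) of a project is |actual - estimated| / actual, MMRE is the mean of MRE over all projects, and PRED(25) is the fraction of projects whose MRE is at most 0.25. The sketch below computes both under those conventional definitions. The function names and the effort figures are hypothetical, chosen only for illustration; they are not taken from the paper's dataset.

```python
from typing import Sequence


def mre(actual: float, estimated: float) -> float:
    """Magnitude of relative error for one project (actual effort must be > 0)."""
    return abs(actual - estimated) / actual


def mmre(actuals: Sequence[float], estimates: Sequence[float]) -> float:
    """Mean magnitude of relative error over all projects."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def pred(actuals: Sequence[float], estimates: Sequence[float],
         level: float = 0.25) -> float:
    """Fraction of projects whose MRE is at most `level` (PRED(25) by default)."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(1 for err in errors if err <= level) / len(errors)


# Hypothetical effort values (e.g. person-hours), for illustration only.
actual_effort = [100.0, 200.0, 400.0, 50.0]
estimated_effort = [110.0, 160.0, 400.0, 100.0]

print(f"MMRE     = {mmre(actual_effort, estimated_effort):.3f}")
print(f"PRED(25) = {pred(actual_effort, estimated_effort):.0%}")
```

A lower MMRE and a higher PRED(25) both indicate better accuracy, which is the direction of the comparison with COCOMO II reported above.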
