The main objective of this study is to introduce machine learning-type extensions for the measurement of environmental inefficiency, based on regression trees under shape constraints. The new methods are implemented using a by-production approach that distinguishes two technologies: one related to the generation of pollution and the other to the production of good outputs. In particular, we define two alternative approaches to measuring environmental inefficiency: by-production Efficiency Analysis Trees (by-production EAT) and by-production Convexified Efficiency Analysis Trees (by-production CEAT). The main advantage of these methods is that they do not suffer from the overfitting problem typical of Free Disposal Hull (FDH) and Data Envelopment Analysis (DEA). The performance of the new models is evaluated through a simulation study, which shows that the new approaches outperform FDH and DEA in terms of mean squared error and bias. We also illustrate the practical usefulness of the new techniques through an empirical application to 43 developing and developed countries over the fifteen-year period from 2000 to 2014. Our empirical findings indicate that the by-production EAT and CEAT models have higher discriminating power than FDH and DEA, respectively.