The tailored clustering enabled regionalization (TCER) framework has shown that the prediction error of design parameters at a target site can be reduced by constructing a transformation model from a cluster that contains only sites with geotechnical properties similar to those of the target site. These sites are retrieved from a global or regional database. However, the similarity between any two sites decreases as the number of properties increases, so TCER becomes ineffective when the database includes too many properties (e.g., structural health monitoring and remote sensing databases). The current study presents a least absolute shrinkage and selection operator for databases with incomplete records (denoted HMLasso), a dimension reduction method, to improve the performance of TCER on high-dimensional databases. HMLasso is used to construct a dimensionally reduced database from a high-dimensional one, retaining only the properties relevant to the design parameters. TCER is then applied to retrieve a cluster from the dimensionally reduced database for inferring the design parameters at a target site. The capability of HMLasso to enhance the performance of TCER, in terms of identifying clusters similar to the target site and reducing the prediction error of design parameters, is demonstrated using real-world geotechnical databases.
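The two-stage idea (select the relevant properties first, then retrieve similar sites) can be illustrated with a minimal sketch. It rests on several assumptions and is not the authors' implementation: it uses a plain Lasso fitted by coordinate descent on complete synthetic records as a stand-in for HMLasso, and a simple nearest-neighbour lookup as a stand-in for TCER clustering. The function names `lasso_coordinate_descent` and `nearest_sites`, the penalty `lam=0.3`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Plain Lasso (stand-in for HMLasso, complete records only).

    Minimises (1 / 2n) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes j.
            rho = X[:, j] @ (residual + X[:, j] * beta[j]) / n
            # Soft-thresholding sets weak coefficients exactly to zero.
            new_bj = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            residual += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta


def nearest_sites(X, target_idx, k):
    """Indices of the k sites closest to the target (stand-in for TCER)."""
    dist = np.linalg.norm(X - X[target_idx], axis=1)
    dist[target_idx] = np.inf  # exclude the target itself
    return np.argsort(dist)[:k]


# Synthetic database: 200 sites, 30 properties, only the first two
# properties actually drive the design parameter y (an assumption).
rng = np.random.default_rng(0)
n_sites, n_props = 200, 30
X = rng.standard_normal((n_sites, n_props))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.standard_normal(n_sites)

# Stage 1: keep only properties with a nonzero Lasso coefficient.
beta = lasso_coordinate_descent(X, y, lam=0.3)
selected = np.flatnonzero(beta)

# Stage 2: retrieve sites similar to the target, with and without reduction.
target = 0
cluster_full = nearest_sites(X, target, k=10)
cluster_reduced = nearest_sites(X[:, selected], target, k=10)

# Spread of the design parameter within each retrieved cluster.
print("selected properties:", selected)
print("mean |y - y_target|, all properties:",
      np.abs(y[cluster_full] - y[target]).mean())
print("mean |y - y_target|, reduced:",
      np.abs(y[cluster_reduced] - y[target]).mean())
```

Comparing the two printed spreads gives a rough sense of why dropping irrelevant properties can help: distances computed over all 30 columns are dominated by properties that carry no information about the design parameter.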