Abstract

Long, homogeneous, and reliable observational records are the basis of climatological studies of tropical cyclones (TCs). However, such data are scarce for TC size in terms of the wind field, particularly over the western North Pacific (WNP). This study demonstrates that deep learning applied to satellite data offers a practical way to bridge this data gap. Using transfer learning, deep learning algorithms were developed to estimate TC sizes reliably from infrared imagery for WNP TCs. The algorithms were then applied to a homogeneous satellite database to reconstruct a new historical dataset of TC sizes, named DeepTCSize, which covers 37 years (1981–2017) over the WNP. DeepTCSize includes multiple TC size quantities: the radii of 17, 26, and 33 m s−1 winds and the radius of maximum wind (R17, R26, R33, and RMW). These quantities are highly correlated with postseason quality-controlled best track data (R = 0.85, 0.84, 0.79, and 0.76, respectively). Comparisons with ocean wind observations further show that DeepTCSize is of good quality and free from spurious error trends, which gives it an advantage over the historical “best estimates” of TC size currently available in the best track archives for the WNP. The reconstructed dataset shows significant expanding trends in the annual-mean outer circulation (2% decade−1 for R17 and 2% decade−1 for R26), which are mainly associated with weaker storms, as well as a weak contracting trend in the annual-mean inner-core size (RMW).

Significance Statement

Tropical cyclone (TC) size largely controls TC-induced hazard and risk. If TC size can be determined more efficiently from observations spanning a sufficiently long period, the climatology of TCs and its changes can be better modeled and understood.
This study applies deep learning methods to reconstruct a new dataset of multiple TC size metrics, from the inner core to the outer core, using satellite infrared imagery of western North Pacific TCs. The dataset spans 37 years. It is homogeneous and has accuracy comparable to that of the existing “best estimates.” Using the dataset, a significant expanding trend was identified in the outer-core size, while the inner-core size exhibits a weak contracting trend. The dataset can be employed in several applications.
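
To make the unit of the quoted trends (percent per decade) concrete, the following minimal sketch shows one common way to obtain such a rate from an annual-mean series. It assumes an ordinary least-squares slope normalized by the long-term mean; the text above does not state which trend estimator or normalization was used. The function name `percent_trend_per_decade` and the R17 values are hypothetical and synthetic, not taken from DeepTCSize.

```python
from statistics import mean


def percent_trend_per_decade(years, values):
    """Least-squares slope of values against years, expressed as a
    percentage of the long-term mean per decade.

    Assumption: normalization by the series mean; other baselines
    (e.g. the first-year value) are equally plausible.
    """
    if len(years) != len(values) or len(years) < 2:
        raise ValueError("need two or more paired (year, value) samples")
    year_mean = mean(years)
    value_mean = mean(values)
    # Closed-form ordinary least-squares slope (value units per year).
    sxx = sum((y - year_mean) ** 2 for y in years)
    sxy = sum((y - year_mean) * (v - value_mean) for y, v in zip(years, values))
    slope_per_year = sxy / sxx
    # Convert to per-decade and normalize by the mean.
    return 100.0 * (slope_per_year * 10.0) / value_mean


# Synthetic annual-mean R17 series (km) over 1981-2017, for illustration only.
years = list(range(1981, 2018))
r17_km = [200.0 + 0.4 * (year - 1981) for year in years]

trend = percent_trend_per_decade(years, r17_km)
print(f"{trend:.2f} % per decade")
```

With real data, the slope would normally be accompanied by a significance test before the trend is described as significant.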