Abstract

Electronic straightedges with a length of 1 m have been widely used to measure the longitudinal profiles of rail joints. However, owing to the lack of an efficient measurement device, rail joint profiles at wavelengths of up to 3 m have seldom been studied. In this study, a rail measurement trolley based on the chord‐reference method was developed with a measurement wavelength of up to 3 m. A field measurement was performed on a 53‐km metro line, and the waveforms of 4340 rail joints were obtained. First, to visualize the distribution of the dataset and identify common features, t‐distributed stochastic neighbor embedding (t‐SNE) dimensionality reduction was applied to the rail joint dataset, and each rail joint waveform was mapped to a point in a two‐dimensional space. Second, K‐means clustering was applied to the rail joint dataset, and six categories of rail joints were obtained. The results indicated that there are two types of rail joints, M‐type and W‐type, accounting for 18.41% and 76.08% of the total number of joints, respectively; the remainder are bolted rail joints. Third, to better evaluate rail joint status, the concept of the rail joint triangle (RJT) is proposed, and five shape‐based features of a rail joint at the 3‐m wavelength are defined. Finally, using RJT distribution analysis, we observed that the shape‐based features provide more essential information about a rail joint, such as whether it is symmetric or asymmetric and whether it is M‐type or W‐type, than conventional indexes such as the quality index. Notably, compared with the waveform of a rail joint at 1 m, a 3‐m waveform provides significantly more essential information, which can be meaningful for future research on the dynamic impact of rail joints and on profile grinding around rail joints. To help other researchers follow our research, our dataset is available on Mendeley Data (RWJ‐3 m dataset).
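
As a rough illustration of the chord-reference idea mentioned in the abstract, the following Python sketch computes a symmetric mid-chord offset: the height of the rail at the chord midpoint minus the mean height at the two chord ends. The symmetric three-point chord, the function name `mid_chord_offset`, the 0.25-m sampling step, and the V-shaped 1-mm dip are all assumptions made for illustration; the chord configuration of the trolley is not specified in this abstract, and the numbers are not taken from the dataset.

```python
def mid_chord_offset(profile, half_chord):
    """Return the symmetric mid-chord offset at each interior sample.

    profile:    rail heights sampled at a uniform spacing (any length unit).
    half_chord: number of samples between a chord end and the chord midpoint.
    """
    if half_chord < 1:
        raise ValueError("half_chord must be at least 1 sample")
    offsets = []
    for i in range(half_chord, len(profile) - half_chord):
        # Height of the straight chord at its midpoint.
        chord_mid = 0.5 * (profile[i - half_chord] + profile[i + half_chord])
        # Offset of the rail surface from the chord at the midpoint.
        offsets.append(profile[i] - chord_mid)
    return offsets


# Hypothetical toy profile: a V-shaped dip, 1 mm deep at x = 1.5 m,
# spanning 0 m to 3 m and sampled every 0.25 m (13 samples).
step_m = 0.25
xs = [i * step_m for i in range(13)]
profile_mm = [-(1.0 - abs(x - 1.5) / 1.5) for x in xs]

# 3-m chord: half chord of 1.5 m = 6 samples.
offsets_3m = mid_chord_offset(profile_mm, 6)

# 1-m chord: half chord of 0.5 m = 2 samples.
offsets_1m = mid_chord_offset(profile_mm, 2)

print("3-m chord offset at the joint centre (mm):", offsets_3m[0])
print("1-m chord offset at the joint centre (mm):", offsets_1m[4])
```

With these toy numbers, the end points of the shorter chord sit inside the dip, so it reports a shallower offset at the joint centre than the 3-m chord does. This is one simple way to see why a longer measurement wavelength can capture more of the joint shape.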