In this work, we explored the feasibility of predicting the structural drift from the first seconds of P-wave signals for On-site Earthquake Early Warning (EEW) applications. To this purpose, we investigated the performance of both linear least square regression (LSR) and four non-linear machine learning (ML) models: Random Forest, Gradient Boosting, Support Vector Machines and K-Nearest Neighbors. Furthermore, we also explore the applicability of the models calibrated for a region to another one. The LSR and ML models are calibrated and validated using a dataset of ∼6,000 waveforms recorded within 34 Japanese structures with three different type of construction (steel, reinforced concrete, and steel-reinforced concrete), and a smaller one of data recorded at US buildings (69 buildings, 240 waveforms). As EEW information, we considered three P-wave parameters (the peak displacement, Pd, the integral of squared velocity, IV2, and displacement, ID2) using three time-windows (i.e., 1, 2, and 3 s), for a total of nine features to predict the drift ratio as structural response. The Japanese dataset is used to calibrate the LSR and ML models and to study their capability to predict the structural drift. We explored different subsets of the Japanese dataset (i.e., one building, one single type of construction, the entire dataset. We found that the variability of both ground motion and buildings response can affect the drift predictions robustness. In particular, the predictions accuracy worsens with the complexity of the dataset in terms of building and event variability. Our results show that ML techniques perform always better than LSR models, likely due to the complex connections between features and the natural non-linearity of the data. Furthermore, we show that by implementing a residuals analysis, the main sources of drift variability can be identified. Finally, the models trained on the Japanese dataset are applied the US dataset. In our application, we found that the exporting EEW models worsen the prediction variability, but also that by including correction terms as function of the magnitude can strongly mitigate such problem. In other words, our results show that the drift for US buildings can be predicted by minor tweaks to models.