Heart disease remains a key area of global research because of its high prevalence and mortality. Traditional diagnostic methods are effective but often invasive and time-consuming, which highlights the need for non-invasive, AI-based approaches. A significant challenge in real-world applications is ensuring that models generalize across datasets, particularly when those datasets are small. In this study, four machine learning models, namely Decision Tree, Random Forest, Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP), were evaluated on two relatively small heart disease datasets with different distributions. The datasets were used for training and testing, with the primary focus on assessing model generalization in cross-dataset applications. The results show that the Decision Tree model performed best after hyperparameter tuning, whereas Random Forest and MLP showed only limited improvement and SVM performance declined after tuning in the cross-dataset task. Grid search tuning was found to have limitations in cross-dataset scenarios, especially with small datasets, where complex models are prone to overfitting. The study indicates that, with smaller datasets, simpler models such as Decision Trees often adapt better to different datasets. Transfer learning and domain adaptation techniques are suggested as crucial for improving model generalization, and future research should focus on employing them to enhance the robustness and accuracy of heart disease prediction models across diverse datasets.
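The evaluation protocol summarized above (tune hyperparameters by grid search on one dataset, then score the tuned model on a second dataset with a different distribution) can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption for illustration only: the data are synthetic placeholders rather than the heart disease datasets, the direction of evaluation (train on one dataset, test on the other) is assumed, the tree is a toy that splits on misclassification error, and names such as `make_dataset`, `fit_tree`, and `grid_search` are hypothetical. It is not the study's pipeline.

```python
import random

def make_dataset(n, shift, seed):
    """Synthetic placeholder data: two features, binary label.
    `shift` moves the feature distribution to mimic a second dataset."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(2.0 * y + shift, 1.0), rng.gauss(shift, 1.0)]
        rows.append((x, y))
    return rows

def majority(rows):
    ones = sum(y for _, y in rows)
    return 1 if 2 * ones >= len(rows) else 0

def split_errors(rows):
    label = majority(rows)
    return sum(y != label for _, y in rows)

def fit_tree(rows, max_depth):
    """Toy decision tree; splits minimize misclassification error
    (a simplification of the Gini/entropy criteria used in practice)."""
    if max_depth == 0 or len({y for _, y in rows}) == 1:
        return majority(rows)
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]
            left = [r for r in rows if r[0][f] <= t]
            right = [r for r in rows if r[0][f] > t]
            if not left or not right:
                continue
            err = split_errors(left) + split_errors(right)
            if best is None or err < best[0]:
                best = (err, f, t, left, right)
    if best is None:
        return majority(rows)
    _, f, t, left, right = best
    return (f, t, fit_tree(left, max_depth - 1), fit_tree(right, max_depth - 1))

def predict(tree, x):
    # Internal nodes are tuples; leaves are class labels.
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def accuracy(tree, rows):
    return sum(predict(tree, x) == y for x, y in rows) / len(rows)

def grid_search(rows, depths, k=5):
    """k-fold cross-validation over candidate depths, source data only."""
    folds = [rows[i::k] for i in range(k)]
    scores = {}
    for depth in depths:
        fold_acc = []
        for i in range(k):
            train = [r for j, fold in enumerate(folds) if j != i for r in fold]
            fold_acc.append(accuracy(fit_tree(train, depth), folds[i]))
        scores[depth] = sum(fold_acc) / k
    return max(scores, key=scores.get), scores

source = make_dataset(120, shift=0.0, seed=0)  # stands in for the training dataset
target = make_dataset(120, shift=1.0, seed=1)  # shifted stand-in for the second dataset
depths = [1, 2, 3, 5]

best_depth, cv_scores = grid_search(source, depths)
model = fit_tree(source, best_depth)
in_domain_cv = cv_scores[best_depth]
cross_dataset = accuracy(model, target)
print(f"best depth: {best_depth}")
print(f"in-domain CV accuracy: {in_domain_cv:.3f}")
print(f"cross-dataset accuracy: {cross_dataset:.3f}")
```

The point of the sketch is the separation of concerns: the hyperparameter is selected using only the source data, so the cross-validated score says nothing directly about the shifted target data. Comparing `in_domain_cv` with `cross_dataset` is one simple way to expose the kind of gap that in-domain grid search cannot see.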