The friction stir welding process has been studied extensively over the last decades. Most of the research to date relates to process development, including tool design, material weldability, post-weld mechanical behavior, and microstructural properties. More recently, in-line process monitoring and artificial intelligence algorithms have been introduced into this process, but mainly for specific material configurations and joint thicknesses. This study evaluates different machine learning approaches, including principal component analysis, K-nearest neighbor, multilayer perceptron, support vector machine, and random forest methods, in a friction stir welding cell environment. The input variables provided by this cell environment are divided into two groups: application variables and friction stir welding process variables. The application variables cover the aluminum alloys, joint configuration, sheet thicknesses, initial mechanical properties, and chemical composition. The friction stir welding process variables comprise the rotational speed, travel speed, forging force, longitudinal and transverse forces, torque, and specific energy. The output response modeled by these machine learning algorithms is the defect index, which has been quantified using high-resolution immersion-bath ultrasound inspection. This nondestructive evaluation technique, described previously, can detect defects ≥150 µm in thin sheets. The defect index is divided into five classes, distinguished by the nature of the defect (cold weld or hot weld) and by the width of the internal volumetric defect in the ultrasound C-scan result.
The dataset, composed of around 500 different process conditions, has been generated over the last few years; the variables were taken exclusively in the constant weld regime and in force control mode, using the averaged output values. This paper compares the two best-performing machine learning methods applied on a friction stir welding cell basis: the K-nearest neighbor and multilayer perceptron algorithms. The K-nearest neighbor model reaches a deviation of 0.55 on the defect index in comparison with the experimental values, slightly better than the multilayer perceptron model, which obtains a score of 0.69. Of the 59 initially available model parameters, 10 and 15 were retained in the final algorithms using these techniques. The main predictors include the material thickness, base material ultimate tensile stress, rotational speed, travel speed, weld forces, and specific energy. The K-nearest neighbor model was able to provide a map of defect indices as a function of rotational speed and travel speed, but only where a sufficiently high density of data was available within the prediction area. A data density score was also included in the model to inform the end user about prediction reliability. Machine learning models mainly differentiate between cases rather than represent the physical phenomena, as finite element analysis does. To improve both the prediction reliability and the machine learning models, the data twinning concept, which consists of generating simulated friction stir welding process conditions by finite element analysis, is briefly discussed.
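As an illustration of how a K-nearest neighbor prediction can be paired with a data density score, the following minimal Python sketch predicts a defect index from rotational speed and travel speed only. It is not the model used in this study. The training points are invented for illustration, the function names (`fit_scaler`, `scale`, `knn_predict`) are hypothetical, and the density score is assumed here to be 1 / (1 + mean distance to the k nearest neighbors), which may differ from the score defined in the paper.

```python
import math

# Hypothetical training points: (rotational speed [rpm], travel speed [mm/min], defect index 0-4).
# These values are invented for illustration and are NOT taken from the study's dataset.
TRAIN = [
    (800.0, 200.0, 3), (800.0, 400.0, 4), (1000.0, 300.0, 2),
    (1200.0, 300.0, 0), (1200.0, 500.0, 1), (1400.0, 400.0, 0),
    (1600.0, 600.0, 1), (1800.0, 300.0, 3),
]


def fit_scaler(rows):
    """Return per-feature (min, range) so each feature can be mapped to [0, 1]."""
    n_features = len(rows[0]) - 1  # last column is the defect index
    params = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        params.append((lo, (hi - lo) or 1.0))  # avoid division by zero
    return params


def scale(features, params):
    """Min-max scale one feature vector with the fitted parameters."""
    return [(v - lo) / rng for v, (lo, rng) in zip(features, params)]


def knn_predict(query, rows, params, k=3):
    """Predict the defect index as the mean label of the k nearest neighbors.

    Also returns an ASSUMED density score in (0, 1]:
    1 / (1 + mean distance to the k neighbors) in scaled feature space.
    """
    q = scale(query, params)
    nearest = sorted(
        (math.dist(q, scale(row[:-1], params)), row[-1]) for row in rows
    )[:k]
    prediction = sum(label for _, label in nearest) / k
    mean_distance = sum(d for d, _ in nearest) / k
    density = 1.0 / (1.0 + mean_distance)
    return prediction, density


PARAMS = fit_scaler(TRAIN)

if __name__ == "__main__":
    # One query inside the sampled window, one far outside it.
    for query in [(1250.0, 350.0), (2400.0, 900.0)]:
        pred, density = knn_predict(query, TRAIN, PARAMS)
        print(f"query={query} defect_index={pred:.2f} density={density:.2f}")
```

Under these assumptions, a query far from the sampled process window receives a lower density score than one surrounded by data, which mirrors the role of the reliability indicator described above. Evaluating `knn_predict` on a grid of rotational and travel speeds would give a map of predicted defect indices.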