Neural network models, owing to their flexibility and ability to approximate arbitrary functions, are gaining popularity across all research fields. In thermoacoustics, one such application is the modelling of a nonlinear flame response, learned from a single broadband forcing time series obtained from a computational fluid dynamics simulation. However, investigations of this flame modelling approach by Jaensch et al. [1] and Tathawadekar et al. [2] report contradictory results concerning the performance and uncertainty of the resulting multi-layer perceptron networks, despite using identical training data. This paper re-evaluates their findings and aims to reconcile the opposing conclusions. By reviewing the data split policies, it demonstrates the reason for the differing network performances and identifies shuffling of the time series as the detrimental factor. Additionally, different regularisation techniques, namely L1 and L2 regularisation as well as network size reduction, are considered and compared against the previously tested dropout implementation [2], which was believed to be responsible for the discrepancy between the two studies.
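To illustrate the data-split issue highlighted above, the following minimal Python sketch contrasts a chronological hold-out split with a shuffled split of a generic forcing/heat-release time series. The function names, the synthetic signals, and the 80/20 split ratio are illustrative assumptions and do not reproduce the exact setups of [1] or [2].

```python
import numpy as np

def chronological_split(u, q, val_fraction=0.2):
    """Split a forcing/heat-release time series into training and
    validation segments without shuffling, preserving temporal order."""
    n_train = int(len(u) * (1.0 - val_fraction))
    return (u[:n_train], q[:n_train]), (u[n_train:], q[n_train:])

def shuffled_split(u, q, val_fraction=0.2, seed=0):
    """Randomly shuffle samples before splitting. For strongly correlated
    time-series data this interleaves validation samples with training
    samples, so the validation error no longer reflects generalisation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(u))
    n_train = int(len(u) * (1.0 - val_fraction))
    train_idx, val_idx = idx[:n_train], idx[n_train:]
    return (u[train_idx], q[train_idx]), (u[val_idx], q[val_idx])

# Synthetic stand-in signals (not the CFD data of [1, 2]):
rng = np.random.default_rng(1)
u = rng.standard_normal(10_000)                         # broadband forcing
q = np.convolve(u, np.ones(50) / 50, mode="same") ** 2  # toy "flame response"

train_chrono, val_chrono = chronological_split(u, q)
train_shuffled, val_shuffled = shuffled_split(u, q)
```

With the shuffled split, validation points lie temporally adjacent to training points, which can make a network appear to generalise well even when it merely interpolates the training signal; the chronological split avoids this leakage.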