The design of neural networks typically involves trial-and-error, a time-consuming process for obtaining an optimal architecture even for experienced researchers. Additionally, it is widely accepted that the loss functions of deep neural networks are generally non-convex with respect to the parameters to be optimised. We propose the Layer-wise Convex Theorem, which ensures that the loss is convex with respect to the parameters of a given layer by constraining each layer to be an overdetermined system of non-linear equations. Based on this theorem, we develop an end-to-end algorithm (the AutoNet) to automatically generate layer-wise convex networks (LCNs) for any given training set. We then compare the performance of the AutoNet-generated LCNs (AutoNet-LCNs) with state-of-the-art models on three electrocardiogram (ECG) classification benchmark datasets, with further validation on two non-ECG benchmark datasets for more general tasks. The AutoNet-LCN found networks customised for each dataset, without manual fine-tuning, in under 2 GPU-hours, and the resulting networks outperformed the state-of-the-art models with fewer than 5% of their parameters on all five benchmark datasets. The efficiency and robustness of the AutoNet-LCN markedly reduce model discovery costs and enable efficient training of deep learning models in resource-constrained settings.
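As a rough illustration of the overdetermined-layer constraint described above, the sketch below counts equations versus unknowns for a single dense layer. The exact counting rule used by the AutoNet is not specified in the abstract, so the function name, the inclusion of bias terms, and the per-layer formulation here are assumptions for illustration only.

```python
# Hypothetical sketch, not the AutoNet's actual criterion:
# a dense layer mapping n_in -> n_out, fitted on N samples, gives
# N * n_out equations in (n_in + 1) * n_out unknowns (weights plus biases).
# "Overdetermined" here is taken to mean: equations >= unknowns.

def layer_is_overdetermined(n_samples: int, n_in: int, n_out: int) -> bool:
    """Return True if the layer has at least as many equations as unknowns."""
    n_equations = n_samples * n_out      # one output value per sample and unit
    n_unknowns = (n_in + 1) * n_out      # weight matrix plus bias vector
    return n_equations >= n_unknowns


if __name__ == "__main__":
    # Example: 10,000 training samples, a 256 -> 64 dense layer.
    print(layer_is_overdetermined(10_000, 256, 64))  # True: 640,000 >= 16,448
    # A tiny dataset of 100 samples fails the same check.
    print(layer_is_overdetermined(100, 256, 64))     # False: 6,400 < 16,448
```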