Abstract

In many machine learning applications, labeled samples are far scarcer than unlabeled samples because labels are difficult to obtain. For example, obtaining the oxygen ion conductivity (label) of a compound often requires a few days or weeks of molecular dynamics simulation, whereas unlabeled samples, such as the electronic states (features) of compounds, are obtained relatively easily (in one or two days). To address this issue, we develop a neural network model for transductive regression, in which both a label smoothness penalty and a locally estimated label penalty are incorporated into the objective function. In addition, we propose empirical excess risk bounds for this model. These bounds use local Rademacher complexity and are based on an eigenvalue analysis of the empirical Gram matrix. Experiments were conducted on five benchmark regression data sets and on a data screening task for oxygen ion conductivity as a real application. The proposed model compared favorably with state-of-the-art methods. In addition, the proposed method improved the generalization performance of a linear regression model on the data screening task. Finally, the results suggest that the empirical excess risk bounds are useful tools for choosing the values of the model's hyperparameters.
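
The abstract names the two penalties but does not give the objective itself, so the following is only a minimal sketch of how such an objective could be assembled, not the authors' formulation. It assumes a k-nearest-neighbour graph for the smoothness term, a mean-of-nearest-labeled-neighbours rule for the locally estimated labels, and two weighting hyperparameters; the names `knn_graph`, `local_label_estimates`, `transductive_objective`, `lam_smooth`, and `lam_local` are all hypothetical.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric binary k-nearest-neighbour adjacency matrix (assumed graph choice)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)  # symmetrize

def local_label_estimates(X_lab, y_lab, X_unl, k):
    """Assumed estimator: mean label of the k nearest labeled samples."""
    d2 = ((X_unl[:, None, :] - X_lab[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(d2, axis=1)[:, :k]
    return y_lab[idx].mean(axis=1)

def transductive_objective(pred, y_lab, W, y_local, lam_smooth, lam_local):
    """Squared loss on labeled samples plus the two penalties.

    pred holds model outputs for all samples, labeled samples first.
    """
    n_lab = len(y_lab)
    fit = np.mean((pred[:n_lab] - y_lab) ** 2)
    laplacian = np.diag(W.sum(axis=1)) - W
    # pred^T L pred equals the sum over graph edges of (pred_i - pred_j)^2
    smooth = pred @ laplacian @ pred / len(pred)
    local = np.mean((pred[n_lab:] - y_local) ** 2)
    return fit + lam_smooth * smooth + lam_local * local
```

In a neural network setting, `pred` would be the network's outputs on the labeled and unlabeled samples together, and the same expression would be minimized with an automatic differentiation library. The two weights are the kind of hyperparameters the abstract says the excess risk bounds can help choose.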
