Abstract
We analyze the generalization performance of a student in a model composed of linear perceptrons: a true teacher, ensemble teachers, and the student. We calculate the generalization error of the student analytically using statistical mechanics in the framework of on-line learning, and prove that when the learning rate satisfies $\eta < 1$, the generalization error decreases as the number $K$ of ensemble teachers and their variety increase. On the other hand, when $\eta > 1$, these properties are completely reversed. If the variety of the ensemble teachers is rich enough, the direction cosine between the true teacher and the student approaches unity in the limit $\eta \to 0$ and $K \to \infty$.
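As an illustration of the setup described in the abstract, the following is a minimal numerical sketch, assuming Gaussian inputs, linear-perceptron outputs of the form $u = \mathbf{J} \cdot \mathbf{x}$, and a gradient-descent (LMS) update of the student toward the output of one randomly chosen ensemble teacher per step. The overlap parameter `R_B`, the $\sqrt{N}$ normalization of the teacher vectors, and the exact form of the update rule are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 1000          # input dimension
K = 10            # number of ensemble teachers
eta = 0.3         # learning rate (the eta < 1 regime from the abstract)
steps = 20 * N    # number of on-line examples

# True teacher A, normalized to length sqrt(N) (assumed normalization).
A = rng.standard_normal(N)
A *= np.sqrt(N) / np.linalg.norm(A)

# Ensemble teachers B_k: each has direction cosine R_B with A.
# R_B < 1 models the "variety" of the ensemble mentioned in the abstract.
R_B = 0.7
B = np.empty((K, N))
for k in range(K):
    n = rng.standard_normal(N)
    n -= (n @ A) / (A @ A) * A                 # remove the component along A
    n *= np.sqrt(N) / np.linalg.norm(n)        # rescale to length sqrt(N)
    B[k] = R_B * A + np.sqrt(1.0 - R_B**2) * n

# Student J, trained on-line: each step a fresh input arrives and the
# student moves toward the output of one randomly chosen ensemble teacher.
J = np.zeros(N)
for _ in range(steps):
    x = rng.standard_normal(N) / np.sqrt(N)    # fresh random input
    k = rng.integers(K)                        # pick one ensemble teacher
    v = B[k] @ x                               # teacher output (linear perceptron)
    u = J @ x                                  # student output
    J += eta * (v - u) * x                     # LMS / gradient-descent update

# Direction cosine between the true teacher and the trained student.
cos_AJ = (A @ J) / (np.linalg.norm(A) * np.linalg.norm(J))
print(f"direction cosine between true teacher and student: {cos_AJ:.3f}")
```

With these parameters the measured direction cosine should exceed the individual teachers' overlap `R_B`, since the orthogonal parts of the $K$ teachers partially average out; this illustrates the abstract's claim that a larger, more varied ensemble reduces the generalization error when $\eta < 1$.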