We study the statistical mechanics of learning from examples for two-layered committee machines with different numbers of hidden units, using replica theory. The number M of hidden units in the student network is larger than the number M_T in the target network, called the teacher. We choose the networks to have binary synaptic weights, ±1, which makes it possible to compare the calculation with Monte Carlo simulation. We propose an effective teacher: a virtual target network that has the same number M of hidden units as the student and gives outputs identical to those of the original teacher. This amounts to making a conjecture for the ground state of a thermodynamic system, which in our study is given by the weights of the effective teacher. We suppose that the weights on M_T hidden units of the effective teacher are the same as those of the original teacher, while those on the M − M_T redundant hidden units are composed of anti-pairs, {1, −1}, with probability 1 − p, in the limit p → 0. For p = 0 exactly, no terms related to the effective teacher appear in the calculation, because the contributions of the anti-pairs to the outputs cancel exactly. In the limit p → 0, however, we find that the learnt weights of the student are equivalent to those of the proposed effective teacher, a result that cannot be obtained from the calculation at p = 0. The parameter p plays the role of a symmetry-breaking parameter for anti-pairing order, analogous to the magnetic field in the Ising model. A first-order phase transition is found, signalled by the breaking of the symmetry under permutation of hidden units. Above a critical number of examples, the student is shown to learn the effective teacher perfectly. Anti-pairing can be measured by a set of order parameters that are zero in the permutation-symmetric phase and nonzero in the permutation-symmetry-broken phase. Results from Monte Carlo simulation are shown to be in good agreement with those from the replica calculation.
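
The effective-teacher construction can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: a fully connected committee machine whose output is the sign of the sum of the hidden-unit signs, an odd input dimension and an odd number of teacher hidden units (so that no field is ever zero), and an anti-pair read as two redundant hidden units whose weights are opposite component by component with probability 1 − p. All function names are hypothetical.

```python
import random


def sign(v):
    """Sign with values in {-1, +1}; callers keep v nonzero by using odd sizes."""
    return 1 if v > 0 else -1


def random_binary_units(n_units, n_inputs, rng):
    """Hidden units with binary weights +1/-1 (assumed fully connected)."""
    return [[rng.choice((-1, 1)) for _ in range(n_inputs)] for _ in range(n_units)]


def committee_output(units, x):
    """Assumed committee rule: sign of the sum of the hidden-unit signs."""
    hidden = [sign(sum(w * xi for w, xi in zip(unit, x))) for unit in units]
    return sign(sum(hidden))


def make_effective_teacher(teacher, n_student_units, p, rng):
    """Copy the teacher units, then append redundant units in pairs.

    Each component of the second unit of a pair is the negative of the
    first with probability 1 - p (an anti-pair) and equal to it otherwise.
    """
    extra = n_student_units - len(teacher)
    if extra < 0 or extra % 2:
        raise ValueError("need an even, non-negative number of redundant units")
    n_inputs = len(teacher[0])
    effective = [list(unit) for unit in teacher]
    for _ in range(extra // 2):
        first = [rng.choice((-1, 1)) for _ in range(n_inputs)]
        second = [-w if rng.random() >= p else w for w in first]
        effective += [first, second]
    return effective


def agreement_fraction(net_a, net_b, n_inputs, n_samples, rng):
    """Fraction of random binary inputs on which the two networks agree."""
    hits = 0
    for _ in range(n_samples):
        x = [rng.choice((-1, 1)) for _ in range(n_inputs)]
        hits += committee_output(net_a, x) == committee_output(net_b, x)
    return hits / n_samples


rng = random.Random(0)
teacher = random_binary_units(3, 11, rng)                     # 3 teacher units, N = 11
effective = make_effective_teacher(teacher, 7, p=0.0, rng=rng)  # M = 7 student-sized
print(agreement_fraction(teacher, effective, 11, 200, rng))
```

Under these assumptions, at p = 0 every redundant pair contributes zero to the hidden-unit sum, so the effective teacher reproduces the teacher's output on every input. A small positive p breaks this exact cancellation, which is the sense in which p acts like an external field in this sketch.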