A general form of minimizing the Rayleigh ratio on discrete variables is shown here, for the first time, to be polynomial time solvable. This is significant because major problems in clustering, partitioning, and imaging can be posed as Rayleigh ratio minimization over discrete variables subject to an orthogonality constraint. These challenging problems are modeled as the normalized cut problem, the graph expander ratio problem, the Cheeger constant problem, or the conductance problem, all of which are NP-hard. These problems have traditionally been solved, heuristically, using the “spectral technique.” A unified framework is provided here whereby all these problems are formulated as the constrained minimization of a quadratic ratio, referred to here as the Rayleigh ratio. The quadratic ratio is to be minimized over discrete variables subject to a single sum constraint that we call the balance or orthogonality constraint. When the discreteness constraints on the variables are disregarded, the resulting continuous relaxation is solved by the spectral method. It is shown here that the Rayleigh ratio minimization subject to the discreteness constraints, which require each variable to assume one of the two values in {−b,1}, is solvable in strongly polynomial time for any nonnegative value of b, with complexity equivalent to that of a single minimum s,t-cut computation on a graph of the same size as the input graph. This discrete form of the Rayleigh ratio problem was often assumed to be NP-hard. Not only is it shown here that the discrete Rayleigh ratio problem is polynomial time solvable, but also that the algorithm is more efficient than the spectral algorithm.
Furthermore, an experimental study demonstrates that, in practice, the new algorithm improves, often dramatically, on the quality of the results of the spectral method, both in terms of approximating the true optimum of the Rayleigh ratio problem subject to the discreteness and balance constraints together, and in terms of subjective partition quality. A further contribution here is the introduction of a problem, the quantity-normalized cut, that generalizes all the Rayleigh ratio problems. The discrete version of that problem is also solved with the efficient algorithm presented. This problem is shown, in a companion paper, to enable the modeling of features essential to clustering that are valuable in practical applications.
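To make the objective concrete, the following is a sketch under assumed notation that the abstract itself does not define: an undirected graph on n nodes with edge weights w_ij ≥ 0 and Laplacian L = D − W, a diagonal matrix Q of nonnegative node weights q_i, the set S = {i : x_i = 1} with complement S̄, the cut weight C(S, S̄) summing w_ij over edges with one endpoint in each side, and q(S) summing q_i over S. The form of the balance constraint shown is likewise an assumption. Under those assumptions, the problem and the value of the ratio at a discrete point can be written as:

```latex
% Assumed notation (not from the abstract): L = D - W is the graph Laplacian,
% Q = diag(q_1, ..., q_n) holds node weights, \mathbf{1} is the all-ones vector.
% Discrete Rayleigh ratio problem with the (assumed) balance constraint:
\[
  \min_{x \in \{-b,1\}^n,\; x^{\top} Q \mathbf{1} = 0}
  \frac{x^{\top} L x}{x^{\top} Q x}
\]
% For x in {-b,1}^n, only edges crossing (S, \bar S) have x_i \neq x_j,
% and each such edge contributes w_{ij} (1+b)^2 to the numerator:
\[
  x^{\top} L x = \sum_{i<j} w_{ij}\,(x_i - x_j)^2 = (1+b)^2\, C(S,\bar S),
  \qquad
  x^{\top} Q x = q(S) + b^2\, q(\bar S),
\]
\[
  \frac{x^{\top} L x}{x^{\top} Q x}
  = \frac{(1+b)^2\, C(S,\bar S)}{q(S) + b^2\, q(\bar S)} .
\]
```

With b = 0 the ratio reduces to C(S, S̄)/q(S), a cut weight normalized by the weight of one side. Which choices of Q and b correspond to each of the named problems is left to the full paper.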