Canonical Basis Research Articles

Topic models have become popular for the analysis of data that consist of a collection of $n$ independent multinomial observations, with parameters $N_{i}\in\mathbb{N}$ and $\Pi_{i}\in[0,1]^{p}$ for $i=1,\ldots,n$. The model links all cell probabilities, collected in a $p\times n$ matrix $\Pi$, via the assumption that $\Pi$ can be factorized as the product of two nonnegative matrices $A\in[0,1]^{p\times K}$ and $W\in[0,1]^{K\times n}$. Topic models were originally developed in text mining, where one browses through $n$ documents, based on a dictionary of $p$ words, and covering $K$ topics. In this terminology, the matrix $A$ is called the word-topic matrix, and is the main target of estimation. It can be viewed as a matrix of conditional probabilities, and it is uniquely defined under appropriate separability assumptions, discussed in detail in this work. Notably, the unique $A$ is required to satisfy what is commonly known as the anchor word assumption, under which $A$ has an unknown number of rows, each proportional to a canonical basis vector in $\mathbb{R}^{K}$. The indices of such rows are referred to as anchor words. Recent computationally feasible algorithms with theoretical guarantees make constructive use of this assumption by linking the estimation of the set of anchor words with the estimation of the $K$ vertices of a simplex. This crucial step in the estimation of $A$ requires $K$ to be known, and cannot be easily extended to the more realistic set-up in which $K$ is unknown. This work takes a different view on anchor word estimation, and on the estimation of $A$. We propose a new method of estimation in topic models that is not a variation on the existing simplex-finding algorithms, and that estimates $K$ from the observed data. We derive new finite sample minimax lower bounds for the estimation of $A$, as well as new upper bounds for our proposed estimator. We describe the scenarios where our estimator is minimax adaptive. Our finite sample analysis is valid for any $n$, $N_{i}$, $p$ and $K$, and both $p$ and $K$ are allowed to increase with $n$, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than the current ones, even though we start out with the computational and theoretical disadvantage of not knowing the correct number of topics $K$, while we provide the competing methods with the correct value in our simulations.
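The factorization and the anchor word assumption described in this abstract can be made concrete with a small toy example. The sketch below is only an illustration, not the estimator proposed in the article: the matrices `A` and `W`, their sizes, and the helper names `matmul` and `anchor_rows` are all made up here, and the normalization (each column of `A` and of `W` sums to 1) is an assumption added so that every column of $\Pi = AW$ is a probability vector.

```python
# Toy illustration of Pi = A W with anchor words.
# All numbers and helper names are hypothetical; they do not come from the article.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def anchor_rows(A, tol=1e-12):
    """Map each row with exactly one nonzero entry to the topic it anchors.

    Such a row is proportional to a canonical basis vector of R^K.
    """
    found = {}
    for j, row in enumerate(A):
        support = [k for k, a in enumerate(row) if abs(a) > tol]
        if len(support) == 1:
            found[j] = support[0]
    return found

# Word-topic matrix A: p = 5 words, K = 2 topics (assumed: columns sum to 1).
A = [
    [0.4, 0.0],  # anchor word for topic 0
    [0.0, 0.3],  # anchor word for topic 1
    [0.2, 0.3],
    [0.3, 0.2],
    [0.1, 0.2],
]
# Topic-document matrix W: K = 2 topics, n = 3 documents (assumed: columns sum to 1).
W = [
    [1.0, 0.5, 0.2],
    [0.0, 0.5, 0.8],
]

Pi = matmul(A, W)                             # p x n matrix of cell probabilities
column_sums = [sum(col) for col in zip(*Pi)]  # one total per document
anchors = anchor_rows(A)                      # row index -> topic index

print(column_sums)
print(anchors)
```

In this toy setting the anchor rows are read off from the known `A`. In the estimation problem of the abstract, only multinomial counts generated from $\Pi$ are observed, and both the anchor words and $K$ have to be recovered from them.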

In this paper, we propose a method to construct unimodular tight frames (UMTFs), which are tight frames with the additional constraint that every entry of the matrix has the same magnitude. UMTFs are useful in many applications, since multiplication of a UMTF by a vector can be implemented in polar coordinates at very low computational cost. Since normalized UMTFs are unit norm tight frames (UNTFs), and since a UNTF is a minimizer of the frame potential, we propose an algorithm to find UMTFs by minimizing the frame potential. We show that minimizing the frame potential is equivalent to minimizing the total coherence when the frame is unimodular. We use the majorization-minimization approach to propose a low-complexity, iterative, fast-converging algorithm for minimizing the frame potential. We also extend our algorithm to the case where the phase angles of the sensing matrix are required to belong to a given finite set of feasible angles, and to the case where the signal being sampled is sparse in an arbitrary, possibly non-canonical basis. We illustrate the utility of our proposed construction in the context of sparse signal recovery. Partial DFT matrices, obtained by randomly selecting rows from the full DFT matrix, are UMTFs. However, they perform poorly when dealing with signals that admit a sparse representation in the wavelet, Fourier and discrete cosine transform domains. In such scenarios, we illustrate through simulations the superior performance of our construction compared to the partial DFT, complex Gaussian and Bernoulli random matrices. When the signal is sparse in the canonical basis, the proposed algorithm offers the same performance as the partial DFT matrix, and outperforms the complex Gaussian and Bernoulli random matrices.
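The statement that partial DFT matrices are UMTFs, and that unit norm tight frames minimize the frame potential, can be checked numerically on a small case. The sketch below is not the majorization-minimization algorithm of the article. It only evaluates the frame potential $\sum_{i,j}|\langle f_i, f_j\rangle|^2$ of a column-normalized partial DFT matrix and compares it with the value $N^2/m$, a standard lower bound for $N$ unit-norm vectors in dimension $m$ that is not stated in the abstract. The sizes, the fixed row selection (used instead of a random one, for reproducibility), and the function names are assumptions made for illustration.

```python
import cmath

def partial_dft(rows, N):
    """Keep the given rows of the N x N DFT matrix, scaled so columns have unit norm."""
    scale = 1 / len(rows) ** 0.5
    return [[scale * cmath.exp(-2j * cmath.pi * r * c / N) for c in range(N)]
            for r in rows]

def frame_potential(F):
    """Sum of squared magnitudes of all pairwise inner products of the columns of F."""
    cols = list(zip(*F))
    total = 0.0
    for f in cols:
        for g in cols:
            inner = sum(a * b.conjugate() for a, b in zip(f, g))
            total += abs(inner) ** 2
    return total

N = 8             # signal length (illustrative choice)
rows = [0, 3, 5]  # fixed, distinct row indices (illustrative choice)
m = len(rows)

F = partial_dft(rows, N)
fp = frame_potential(F)
bound = N * N / m  # value attained exactly by unit norm tight frames

# Every entry has modulus 1/sqrt(m), so the spread of magnitudes is numerically zero.
mags = [abs(x) for row in F for x in row]
spread = max(mags) - min(mags)

print(fp, bound, spread)
```

The frame potential matches the bound up to rounding because distinct rows of the DFT matrix are orthogonal, so the selected rows form a tight frame regardless of which subset is chosen.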

Articles published on Canonical Basis

Boundary emptiness formation probabilities in the six-vertex model at

A canonical basis of a pair of compatible Poisson brackets on a matrix algebra

A canonical basis of a pair of compatible Poisson brackets on a matrix algebra

Docking of Platinum Compounds on Cube Rhombellane Functionalized Homeomorphs

Canonical and 1-Deoxy(methyl) Sphingoid Bases: Tackling the Effect of the Lipid Structure on Membrane Biophysical Properties.

Polyhedral parametrizations of canonical bases & cluster duality

A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics

Affine flag varieties and quantum symmetric pairs

Breaking the Entanglement Barrier: Tensor Network Simulation of Quantum Transport.

Tri-partitions and Bases of an Ordered Complex

The DNA repair enzyme MUTYH potentiates cytotoxicity of the alkylating agent MNNG by interacting with abasic sites

Self-Testing of Symmetric Three-Qubit States

THE -SCHUR ALGEBRAS AND -SCHUR DUALITIES OF FINITE TYPE

Construction of unimodular tight frames for compressed sensing using majorization-minimization

Quantum mirrors of log Calabi–Yau surfaces and higher-genus curve counting

Extension of the Günter Derivatives to the Lipschitz Domains and Application to the Boundary Potentials of Elastic Waves

Hermitian Tensor Decompositions

Polymerase-tautomeric Model for Untargeted Delayed Base Substitution Mutations Formation during Error-prone and SOS Replication of Double-stranded DNA Containing Thymine and Adenine in Some Rare Tautomeric Forms

The Regime of the Synodality in the Eastern Church of the First Millennium and Its Canonical Basis

Pattern groups and a poset based Hopf monoid
