Tree approximation is a new form of nonlinear approximation that arises naturally in applications such as image processing and adaptive numerical methods. It is somewhat more restrictive than the usual n-term approximation. We show that the restrictions of tree approximation cost little in terms of rates of approximation. We then use this result to design encoders for compression. These encoders are universal (they apply to general functions) and progressive (increasing accuracy is obtained by sending bit stream increments). We show optimality of the encoders in the sense that they provide upper estimates for the Kolmogorov entropy of Besov balls.
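
To make the restriction concrete, the following minimal sketch contrasts a thresholded coefficient set, of the kind used in n-term approximation, with the smallest tree containing it. It assumes coefficients indexed by dyadic nodes (j, k) whose parent is (j-1, floor(k/2)); this indexing, the function names, and the sample values are hypothetical and chosen only for illustration, not taken from the constructions of the paper.

```python
# Illustrative sketch only: nodes are (level j, position k) with 0 <= k < 2**j.
# The indexing convention and the sample coefficients below are assumptions.

def parent(node):
    """Return the parent of a dyadic node, or None for the root (0, 0)."""
    j, k = node
    return None if j == 0 else (j - 1, k // 2)

def threshold_set(coeffs, eta):
    """Indices whose coefficient exceeds eta in absolute value (no tree structure required)."""
    return {node for node, c in coeffs.items() if abs(c) > eta}

def tree_completion(nodes):
    """Smallest set containing `nodes` that is closed under taking parents."""
    tree = set()
    for node in nodes:
        # Walk up toward the root, stopping once we reach a node already added.
        while node is not None and node not in tree:
            tree.add(node)
            node = parent(node)
    return tree

def is_tree(nodes):
    """True if every non-root node in the set has its parent in the set."""
    return all(parent(n) is None or parent(n) in nodes for n in nodes)

# Hypothetical coefficients: one large coefficient sits deep below small ancestors.
coeffs = {(0, 0): 1.0, (1, 0): 0.5, (1, 1): 0.01, (2, 2): 0.02, (3, 5): 0.3}

selected = threshold_set(coeffs, eta=0.1)
tree = tree_completion(selected)

print(sorted(selected))  # [(0, 0), (1, 0), (3, 5)]
print(sorted(tree))      # [(0, 0), (1, 0), (1, 1), (2, 2), (3, 5)]
print(is_tree(selected), is_tree(tree))  # False True
```

In this toy case the tree constraint forces two small ancestor coefficients into the index set, so the tree uses five terms where thresholding alone uses three. The claim that the restriction costs little concerns how such overhead behaves in terms of approximation rates, which this sketch does not attempt to demonstrate.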