We study monotone neural networks with threshold gates, in which all weights (other than the biases) are nonnegative. We focus on the expressive power and representational efficiency of such networks. Our first result establishes that every monotone function over $[0,1]^d$ can be approximated within arbitrarily small additive error by a depth-4 monotone network. When $d > 3$, this improves upon the previous best-known construction, which has depth $d+1$. Our proof proceeds by solving the monotone interpolation problem for monotone datasets using a depth-4 monotone threshold network. In our second main result, we compare size bounds between monotone and arbitrary neural networks with threshold gates. We find that there are monotone real functions that can be computed efficiently by networks with no restriction on the gates, whereas monotone networks approximating these functions need size exponential in the dimension.
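To make the object of study concrete, the following is a minimal sketch of a monotone threshold network, not the construction used in the paper. It assumes a gate outputs 1 when its weighted sum plus bias is at least 0 (the exact threshold convention is an assumption), and the names `threshold_layer`, `monotone_network`, `LAYERS`, and `is_monotone_on_grid` are hypothetical. The toy network has two inputs and depth 2; the grid check illustrates that nonnegative weights yield a coordinate-wise monotone function.

```python
from itertools import product


def threshold_layer(x, weights, biases):
    """Apply one layer of threshold gates.

    Gate j outputs 1.0 if sum_i weights[j][i] * x[i] + biases[j] >= 0, else 0.0.
    (Assumed convention.) Weights must be nonnegative; biases are unrestricted.
    """
    for row in weights:
        if any(w < 0 for w in row):
            raise ValueError("monotone networks require nonnegative weights")
    return [
        1.0 if sum(w * xi for w, xi in zip(row, x)) + b >= 0 else 0.0
        for row, b in zip(weights, biases)
    ]


def monotone_network(x, layers):
    """Feed x through a list of (weights, biases) layers."""
    for weights, biases in layers:
        x = threshold_layer(x, weights, biases)
    return x


# Hypothetical toy network on [0,1]^2:
# hidden gate 1 fires when x1 >= 0.5, hidden gate 2 fires when x2 >= 0.5,
# and the output gate fires only when both hidden gates fire (an AND).
LAYERS = [
    ([[1.0, 0.0], [0.0, 1.0]], [-0.5, -0.5]),
    ([[1.0, 1.0]], [-2.0]),
]


def is_monotone_on_grid(layers, steps=5):
    """Check f(p) <= f(q) for all grid points p <= q (coordinate-wise) in [0,1]^2."""
    grid = [i / (steps - 1) for i in range(steps)]
    points = list(product(grid, repeat=2))
    for p in points:
        for q in points:
            if all(a <= b for a, b in zip(p, q)):
                if monotone_network(list(p), layers)[0] > monotone_network(list(q), layers)[0]:
                    return False
    return True
```

Here `monotone_network([0.6, 0.7], LAYERS)` evaluates the toy network at a single point, and `is_monotone_on_grid(LAYERS)` performs the brute-force monotonicity check. The grid check is only a sanity test on finitely many points; monotonicity in general follows from composing nondecreasing gates with nonnegative weights.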