We consider the problem of recovering a signal $ \mathbf {x^{*}}\in \mathbb {R}^{n}$ , from magnitude-only measurements, $y_{i} = \left |{ \left \langle{ { \mathbf {a}_{i}}, \mathbf {x^{*}} }\right \rangle }\right | $ for $i=\{1,2,\ldots ,m\}$ . This is a stylized version of the classical phase retrieval problem , and is a fundamental challenge in nano- and bio-imaging systems, astronomical imaging, and speech processing. It is well-known that the above problem is ill-posed, and therefore some additional assumptions on the signal and/or the measurements are necessary. In this paper, we consider the case where the underlying signal $ \mathbf {x^{*}}$ is $s$ -sparse. For this case, we develop a novel recovery algorithm that we call Compressive Phase Retrieval with Alternating Minimization , or CoPRAM . Our algorithm is simple and is obtained via a natural combination of the classical alternating minimization approach for phase retrieval with the CoSaMP algorithm for sparse recovery. Despite its simplicity, we prove that our algorithm achieves a sample complexity of $ \mathcal {O}\left ({{s^{2} \log n}}\right )$ with Gaussian measurements $ { \mathbf {a}_{i}}$ , which matches the best known existing results; moreover, it also demonstrates linear convergence in theory and practice. An appealing feature of our algorithm is that it requires no extra tuning parameters other than the signal sparsity level $s$ . Moreover, we show that our algorithm is robust to noise. The quadratic dependence of sample complexity on the sparsity level is sub-optimal, and we demonstrate how to alleviate this via additional assumptions beyond sparsity. First, we study the (practically) relevant case where the sorted coefficients of the underlying sparse signal exhibit a power law decay. In this scenario, we show that the CoPRAM algorithm achieves a sample complexity of $ \mathcal {O}\left ({{s \log n}}\right )$ , which is close to the information-theoretic limit. We then consider the case where the underlying signal $ \mathbf {x^{*}}$ arises from structured sparsity models. We specifically examine the case of block-sparse signals with uniform block size of $b$ and block sparsity $k=s/b$ . For this problem, we design a recovery algorithm that we call Block CoPRAM that further reduces the sample complexity to $ \mathcal {O}\left ({{ks \log n}}\right )$ . For sufficiently large block lengths of $b=\Theta (s)$ , this bound equates to $ \mathcal {O}\left ({{s \log n}}\right )$ . To our knowledge, our approach constitutes the first family of linearly convergent algorithms for signal recovery from magnitude-only Gaussian measurements that exhibit a sub-quadratic dependence on the signal sparsity level.