Much of the statistical learning literature has focused on adjacent dependency learning, showing that learners can extract adjacent statistics from continuous language streams. In contrast, studies of non-adjacent dependency learning have yielded mixed results, with some showing success and others failure. We review the literature on non-adjacent dependency learning and examine the theories proposed to account for these results, including the proposal that pauses at dependency edges are necessary for learning, and accounts positing competition between adjacent and non-adjacent dependency learning such that high variability of the middle elements benefits learning. Here we challenge those accounts by demonstrating successful learning of non-adjacent dependencies under conditions inconsistent with the predictions of previous theories. We show that non-adjacent dependencies are learnable without pauses at dependency edges across a variety of artificial language designs. Moreover, we find no evidence of a relationship between non-adjacent dependency learning and the robustness of adjacent statistics. We demonstrate that our two-step statistical learning model accounts for all of our non-adjacent dependency learning results and provides a unified account of adjacent and non-adjacent dependency learning. Finally, we discuss the theoretical implications of our findings for natural language acquisition and argue that the dependency learning process can serve as a precursor to other tasks vital to natural language acquisition.