Language comprehension involves grouping words into larger multiword chunks. This chunking recodes information into sparser representations, mitigating memory limitations and counteracting forgetting. It has been suggested that electrophysiological processing time windows constrain the formation of these units; specifically, the period of rhythmic neural activity (i.e., low-frequency neural oscillations) may set an upper limit of 2–3 sec. Here, we assess whether the learning of new multiword chunks is also subject to this neural limit. We used an auditory statistical learning paradigm with an artificial language, manipulating the duration of the to-be-learnt chunks. Participants listened to isochronous sequences of disyllabic pseudowords from which they could learn hidden three-word chunks on the basis of transitional probabilities. We presented chunk durations of 1.95, 2.55, and 3.15 sec, created by varying the pause interval between pseudowords. In a first, behavioral experiment, we tested learning with an implicit target-detection task. Learning was better for 2.55-sec chunks than for longer durations, in line with the proposed upper temporal limit. In a second experiment, we recorded participants' electroencephalogram during the exposure phase and used frequency tagging as a neural index of statistical learning. Extending the behavioral findings, neural tracking declined significantly for chunks exceeding 3 sec compared with both shorter durations. Overall, we suggest that language learning is limited by endogenous temporal constraints, possibly reflecting electrophysiological processing windows.
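To make the frequency-tagging logic concrete, the presentation rates implied by the three chunk durations can be derived as their inverses. This is a back-of-the-envelope sketch under the assumption that the chunk rate equals the reciprocal of the chunk duration and the word rate is three times the chunk rate (three pseudowords per chunk); the exact rates below are computed from the stated durations, not reported values.

\[
f_{\text{chunk}} = \frac{1}{T_{\text{chunk}}} \approx 0.51,\ 0.39,\ 0.32\ \text{Hz}, \qquad
f_{\text{word}} = \frac{3}{T_{\text{chunk}}} \approx 1.54,\ 1.18,\ 0.95\ \text{Hz},
\]
for \(T_{\text{chunk}} = 1.95,\ 2.55,\ 3.15\) sec. Under this logic, a spectral peak at \(f_{\text{chunk}}\) over and above the stimulus-driven peak at \(f_{\text{word}}\) would index segmentation of the hidden three-word chunks.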