This paper implements and tests a label-setting algorithm for finding optimal hyperpaths in large transit networks with realistic headway distributions. It has been commonly assumed in the literature that headway is exponentially distributed. To validate this assumption, the empirical headway data archived by Chicago Transit Agency are fitted into various probabilistic distributions. The results suggest that the headway data fit much better with Loglogistic, Gamma and Erlang distributions than with the exponential distribution. Accordingly, we propose to model headway using the Erlang distribution in the proposed algorithm, because it best balances realism and tractability. When headway is not exponentially distributed, finding optimal hyperpaths may require enumerating all possible line combinations at each transfer stop, which is tractable only for a small number of alternative lines. To overcome this difficulty, a greedy method is implemented as a heuristic and compared to the brute-force enumeration method. The proposed algorithm is tested on a large scale CTA bus network that has over 10,000 stops. The results show that (1) the assumption of exponentially distributed headway may lead to sub-optimal route choices and (2) the heuristic greedy method provides near optimal solutions in all tested cases.