Stratified sampling is one of the powerful variance reduction methods for analyzing system performance, such as reliability, with stochastic simulation. It divides the input space into disjoint subsets, called strata, to draw samples from each stratum. Partitioning the input space properly and allocating greater computational effort to crucial strata can help accurately estimate system performance with a limited computational budget. How to create strata, however, has yet to be thoroughly examined. Strata design faces the curse of dimensionality and data scarcity as the input dimension increases. We analytically derive the optimal stratification structure that minimizes the estimation variance for univariate problems. Further, reconciling the optimal stratification into decision trees, we devise a robust algorithm for multi-dimensional problems. Numerical experiments and a wind turbine case study demonstrate the superiority of the proposed method in terms of variance reduction, leading to computational efficiency and scalability.