The cluster randomized stepped wedge design is a multi-period, unidirectional switch design in which all clusters start in the control condition and, at the beginning of each new period, a random sample of clusters crosses over to the intervention condition. Such designs often use uniform allocation, with an equal number of clusters at each treatment switch. However, uniform allocation is not necessarily the most efficient. This study derives the optimal allocation of clusters to treatment sequences in the cluster randomized stepped wedge design, for both cohort and cross-sectional designs. An exponential decay correlation structure is assumed, meaning that the correlation between two measurements decreases with the time lag between them. The optimal allocation is shown to depend on the intraclass correlation coefficient, the number of subjects per cluster-period, the cluster autocorrelation coefficient and (in the case of a cohort design) the individual autocorrelation coefficient. For small to medium values of these autocorrelations, sequences that have their treatment switch earlier or later in the study are allocated a larger proportion of clusters than sequences that have their treatment switch halfway through the study. When the autocorrelation coefficients increase, the clusters become more equally distributed across the treatment sequences. For the cohort design, the optimal allocation is almost equal to the uniform allocation when both autocorrelations approach the value 1. For almost all scenarios that were studied, the efficiency of the uniform allocation is 0.8 or higher. R code to derive the optimal allocation is available online.
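
To make the design and the correlation structure concrete, the following is a minimal sketch in Python. It is not the R code referred to above and it does not compute the optimal allocation. The function names `stepped_wedge_design` and `exponential_decay_corr` are hypothetical, as are the example allocations and the decay parameter `r`. The sketch only shows how an allocation of clusters to treatment sequences translates into 0/1 treatment indicators per period, and how a correlation of the form r^|s - t| shrinks as the lag between periods s and t grows.

```python
def stepped_wedge_design(clusters_per_sequence):
    """Return one row of 0/1 treatment indicators per cluster.

    clusters_per_sequence[s] is the number of clusters allocated to the
    sequence that switches at the start of period s + 2 (periods numbered
    from 1). S sequences therefore imply S + 1 periods: every cluster is
    in control in the first period and in intervention in the last.
    """
    n_periods = len(clusters_per_sequence) + 1
    rows = []
    for s, n_clusters in enumerate(clusters_per_sequence):
        # Control (0) up to the switch, intervention (1) afterwards.
        row = [0] * (s + 1) + [1] * (n_periods - s - 1)
        rows.extend(list(row) for _ in range(n_clusters))
    return rows


def exponential_decay_corr(n_periods, r):
    """Correlation matrix whose (s, t) entry is r ** |s - t|."""
    return [[r ** abs(s - t) for t in range(n_periods)]
            for s in range(n_periods)]


# Two illustrative allocations of 8 clusters to 4 sequences (5 periods).
# The second puts more clusters on the earliest and latest switches; the
# numbers are made up for illustration, not derived optimal values.
uniform = stepped_wedge_design([2, 2, 2, 2])
edge_heavy = stepped_wedge_design([3, 1, 1, 3])

# Assumed decay parameter r = 0.5: correlation halves with each extra lag.
corr = exponential_decay_corr(5, 0.5)

for row in uniform:
    print(row)
print(corr[0])
```

In this sketch an allocation is simply a list of cluster counts per sequence, so comparing a uniform allocation with one that favours early and late switches only requires changing that list.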