Everyday life often requires us to switch between states of cognitive flexibility and stability. However, little is known about what drives their regulation at the meta-control level. Based on current theories of cognitive control, we tested whether the regulation of cognitive flexibility and stability is guided by reinforcement learning. Using a task-switching paradigm with a double registration procedure (in which people choose which task to perform before seeing the target), we systematically reinforced either cued task switches or cued task repetitions to test whether this led to more voluntary task switching (flexibility) or repeating (stability), respectively, on interspersed unrewarded free-choice trials. While we did not find the hypothesized effect in an uninstructed version of the experiment (n=97), informing participants (n=58) about the reinforcement schedule on cued trials induced more switching or repeating on unrewarded free-choice trials. We speculate that people adapt their control strategies if they are aware of the respective benefits, even when those strategies are no longer rewarded. Interestingly, choices were predominantly driven by task preference in both experiments and strongly correlated with task performance costs. This finding suggests that, in an interspersed design with a double registration procedure, people mostly choose tasks based on task difficulty rather than switch avoidance. Together, these experiments help determine whether and when people can learn about the value of task selection strategies beyond the scope of a single task, and provide a self-regulating system to understand putative higher-order control processes.