This article studies the mean-standard deviation (mean-std) Canadian traveller problem (CTP). Whereas the canonical CTP minimizes the traveller's expected travel time under edge breakdown probabilities, we introduce a reliability version of the CTP that seeks a routing policy minimizing a linear combination of the travel time's mean and standard deviation. With the recent development of internet-of-things (IoT) technology, the travel-time statistics of a transportation network's edges, i.e., the mean and standard deviation, as well as the traversal probabilities, are available to end users. Using this information, we propose a dual dynamic programming (DDP) method that simultaneously estimates the first-order and second-order moments of the travel time under a given decision-list (DL) policy and thereby improves the policy toward the optimal one through the generalized policy iteration (GPI) scheme. We construct an open-source benchmark environment to evaluate the performance of different mean-std CTP solutions and show that the DDP method outperforms state-of-the-art methods on a range of transportation networks.
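
As a sketch of the objective described above, the mean-std criterion can be written as follows. The symbols are notational assumptions introduced here, not taken from the article: $T_\pi$ denotes the random travel time under a routing policy $\pi$, and $\lambda \ge 0$ is the weight placed on the standard deviation (with the weight on the mean normalized to one).

\[
J(\pi) = \mathbb{E}[T_\pi] + \lambda \sqrt{\mathbb{E}[T_\pi^2] - \left(\mathbb{E}[T_\pi]\right)^2}
\]

Written this way, the criterion depends only on the first-order and second-order moments of the travel time, so estimating both is enough to evaluate a policy, and setting $\lambda = 0$ recovers the expected-travel-time objective of the canonical CTP.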