The paper deals with a class of discounted discrete-time Markov control models with non-constant discount factors of the form \(\tilde{\alpha } (x_{n},a_{n},\xi _{n+1})\), where \(x_{n}\), \(a_{n}\), and \(\xi _{n+1}\) are the state, the action, and a random disturbance at time \(n\), respectively, taking values in Borel spaces. Assuming that the one-stage cost is possibly unbounded and that the distributions of \(\xi _{n}\) are unknown, we study the corresponding optimal control problem in two settings. In the first, the random disturbance process \(\left\{ \xi _{n}\right\} \) consists of observable independent and identically distributed random variables, and we introduce an estimation and control procedure to construct strategies. In the second, \(\left\{ \xi _{n}\right\} \) is assumed to be non-observable, with distributions that may change from stage to stage; in this case the problem is studied as a minimax control problem in which the controller has an opponent who selects the distribution of the random disturbance at each stage.