In this article, we propose a learning approach to analyzing dynamic systems with an asymmetric information structure. Instead of adopting a game-theoretic setting, we investigate an online quadratic optimization problem driven by system noises with unknown statistics. Owing to the information asymmetry, neither the classic Kalman filter nor the optimal control strategies can be applied to such systems. It is therefore necessary and beneficial to develop an admissible approach that learns the noise statistics as time goes forward. Motivated by online convex optimization (OCO) theory, we introduce the notion of regret, defined as the cumulative difference between the optimal cost attained with offline-known statistics and the cost incurred online with unknown statistics. By utilizing dynamic programming and the linear minimum mean square unbiased estimate (LMMSUE), we propose a new type of online state-feedback control policy and characterize the behavior of the regret in a finite-time regime. The regret is shown to be sublinear and bounded by O(ln T). Moreover, we address an online optimization problem with output-feedback control and propose a heuristic online control policy.
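The regret described above can be sketched in symbols as follows; the cost matrices Q and R, the state x_t, the input u_t, and the policies π and π* are illustrative assumptions for a standard quadratic setting, not notation fixed by this article:

```latex
% Illustrative sketch only: symbols below are assumed, not taken from the article.
% Let J_T(\pi) be the expected cumulative quadratic cost of a control policy \pi
% over the horizon T, for a system driven by noise with unknown statistics:
\[
  J_T(\pi) \;=\; \mathbb{E}\!\left[\sum_{t=1}^{T}
    \bigl(x_t^\top Q\, x_t + u_t^\top R\, u_t\bigr)\right].
\]
% The regret of an online policy \pi, measured against the optimal policy
% \pi^\ast computed with offline-known noise statistics, is then
\[
  \mathrm{Regret}(T) \;=\; J_T(\pi) - J_T(\pi^\ast),
\]
% and the article's bound states \mathrm{Regret}(T) = O(\ln T)
% for the proposed online state-feedback policy.
```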