ABSTRACT In this article, we study the contextual dynamic pricing problem in which the market value of a product is linear in its observed features plus some market noise. Products are sold one at a time, and only a binary response indicating the success or failure of a sale is observed. Our model setting is similar to the work by ? except that we expand the demand curve to a semiparametric model and dynamically learn both the parametric and nonparametric components. We propose a dynamic statistical learning and decision-making policy that minimizes regret (maximizes revenue) by combining semiparametric estimation for a generalized linear model with an unknown link and online decision making. Under mild conditions, for a market noise cdf $F(\cdot)$ with an $m$th-order derivative ($m \geq 2$), our policy achieves a regret upper bound of $\tilde{O}_d\big(T^{\frac{2m+1}{4m-1}}\big)$, where $T$ is the time horizon and $\tilde{O}_d$ is the order hiding logarithmic terms and the feature dimension $d$. The upper bound is further reduced to $\tilde{O}_d(\sqrt{T})$ if $F$ is super smooth. These upper bounds are close to $\Omega(\sqrt{T})$, the lower bound where $F$ belongs to a parametric class. We further generalize these results to the case of dynamically dependent product features under the strong mixing condition. Supplementary materials for this article are available online.
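
As a short worked calculation (an illustration added here, not a derivation from the article), the exponent of $T$ in the first regret bound can be split into a square-root part and a smoothness-dependent remainder, which shows how the rate improves as the number of derivatives $m$ of $F$ grows:

\[
\frac{2m+1}{4m-1} \;=\; \frac{1}{2} + \frac{3}{2(4m-1)},
\qquad
m = 2:\ \frac{5}{7}, \quad
m = 3:\ \frac{7}{11}, \quad
m \to \infty:\ \frac{1}{2}.
\]

The remainder $\frac{3}{2(4m-1)}$ is positive and decreasing in $m$, so the exponent stays strictly above $1/2$ for every finite $m$ and approaches the square-root rate only in the limit of increasing smoothness.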