We consider a joint pricing and inventory control problem in which neither the customer’s response to the selling price nor the demand distribution is known a priori. Unsatisfied demand is lost and unobserved, so the only information available for decision making is the observed sales data (also known as censored demand). Conventional approaches, such as stochastic approximation, online convex optimization, and continuum-armed bandit algorithms, cannot be employed because neither the realized values of the profit function nor its derivatives are known. A major challenge is that the estimated profit function constructed from observed sales data is multimodal in price. We develop a nonparametric spline approximation–based learning algorithm. The algorithm divides the planning horizon into two disjoint phases: exploration and exploitation. During the exploration phase, a spline approximation of the demand-price function is constructed from sales data, and the corresponding surrogate optimization problem is then solved on a sparse grid to obtain a recommended price and target inventory level. During the exploitation phase, the algorithm implements the recommended price and inventory level. We establish a (nearly) square-root regret rate, which (almost) matches the theoretical lower bound.
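The explore-then-exploit structure can be illustrated with a minimal sketch; it is not the algorithm analyzed in the paper. Everything in it is an assumption made for illustration: the linear demand curve `true_mean_demand`, additive Gaussian noise, a single-period profit of price times sales minus a per-unit cost, a piecewise-linear (first-order) spline in place of a higher-order one, and a dense grid search in place of the sparse grid. It also stocks a large inventory during exploration so that sales are rarely censored, a simplification that sidesteps most of the censoring difficulty.

```python
import random


def true_mean_demand(price):
    # Hypothetical ground truth, hidden from the learner (assumption for the demo).
    return 100.0 - 8.0 * price


def observe_sales(price, inventory, rng):
    # Censored demand: the learner sees min(demand, inventory), never lost sales.
    demand = max(0.0, true_mean_demand(price) + rng.gauss(0.0, 5.0))
    return min(demand, inventory)


def linear_spline(xs, ys):
    # First-order spline through the knots; constant outside [xs[0], xs[-1]].
    def f(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        for i in range(len(xs) - 1):
            if x <= xs[i + 1]:
                w = (x - xs[i]) / (xs[i + 1] - xs[i])
                return (1.0 - w) * ys[i] + w * ys[i + 1]
    return f


def explore(knots, periods_per_knot, big_inventory, rng):
    # Exploration: sample sales at each knot price, fit the demand-price spline,
    # and pool the centered sales as an empirical noise sample.
    means, residuals = [], []
    for p in knots:
        sales = [observe_sales(p, big_inventory, rng) for _ in range(periods_per_knot)]
        m = sum(sales) / len(sales)
        means.append(m)
        residuals.extend(s - m for s in sales)
    return linear_spline(knots, means), residuals


def surrogate_profit(price, inventory, demand_hat, residuals, unit_cost):
    # Sample-average estimate of price * E[min(D, y)] - unit_cost * y.
    mean = demand_hat(price)
    sold = sum(min(max(0.0, mean + e), inventory) for e in residuals) / len(residuals)
    return price * sold - unit_cost * inventory


def recommend(price_grid, inventory_grid, demand_hat, residuals, unit_cost):
    # Grid search over (price, inventory) pairs on the surrogate problem.
    pairs = ((p, y) for p in price_grid for y in inventory_grid)
    return max(pairs, key=lambda py: surrogate_profit(py[0], py[1], demand_hat,
                                                      residuals, unit_cost))


def run_demo(seed=0, periods_per_knot=100, unit_cost=2.0):
    rng = random.Random(seed)
    knots = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
    demand_hat, residuals = explore(knots, periods_per_knot, 1000.0, rng)
    price_grid = [2.0 + 0.25 * i for i in range(41)]
    inventory_grid = [5.0 * j for j in range(21)]
    # Exploitation would apply this pair in every remaining period.
    return recommend(price_grid, inventory_grid, demand_hat, residuals, unit_cost)
```

Calling `run_demo()` returns the recommended (price, inventory) pair that the exploitation phase would then hold fixed. In this toy setting the price lands near the maximizer of the assumed profit curve, but the sketch carries none of the paper's regret guarantees.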