As sensor calibration plays an important role in visual-inertial sensor fusion, this article performs an in-depth investigation of online self-calibration for robust and accurate visual-inertial state estimation. To this end, we first conduct a complete observability analysis for visual-inertial navigation systems (VINS) with full calibration of sensing parameters, including inertial measurement unit (IMU)/camera intrinsics and IMU-camera spatiotemporal extrinsic calibration, along with the readout time of rolling shutter (RS) cameras (if used). We study different inertial model variants containing intrinsic parameters that encompass the most commonly used models for low-cost inertial sensors. With these models, the observability analysis of linearized VINS with full sensor calibration is performed. Our analysis theoretically proves the intuition commonly assumed in the literature, namely that VINS with full sensor calibration has four unobservable directions, corresponding to the system's global yaw and position, while all sensor calibration parameters are observable given fully excited motions. Moreover, we, for the first time, identify degenerate motion primitives for IMU and camera intrinsic calibration, which, when combined, may produce complex degenerate motions. We compare the proposed online self-calibration on commonly used IMUs against the state-of-the-art offline calibration toolbox Kalibr, showing that the proposed system achieves better consistency and repeatability. Based on our analysis and experimental evaluations, we also offer practical guidelines for effectively performing online IMU-camera self-calibration.
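To make the stated result concrete, the following is a minimal sketch of the standard form such unobservable directions take in the VINS observability literature, assuming an MSCKF-style state ordering $\mathbf{x} = [{}^{I}_{G}\bar{q},\ \mathbf{b}_g,\ {}^{G}\mathbf{v}_I,\ \mathbf{b}_a,\ {}^{G}\mathbf{p}_I,\ {}^{G}\mathbf{p}_f]$; this ordering and the omission of the calibration-parameter blocks (which are zero when those parameters are observable) are assumptions for illustration, not necessarily the exact formulation used in the article:

\[
\mathbf{N} \;=\;
\begin{bmatrix}
{}^{I}_{G}\mathbf{R}\,{}^{G}\mathbf{g} & \mathbf{0}_{3} \\
\mathbf{0}_{3\times 1} & \mathbf{0}_{3} \\
-\lfloor {}^{G}\mathbf{v}_{I}\rfloor\,{}^{G}\mathbf{g} & \mathbf{0}_{3} \\
\mathbf{0}_{3\times 1} & \mathbf{0}_{3} \\
-\lfloor {}^{G}\mathbf{p}_{I}\rfloor\,{}^{G}\mathbf{g} & \mathbf{I}_{3} \\
-\lfloor {}^{G}\mathbf{p}_{f}\rfloor\,{}^{G}\mathbf{g} & \mathbf{I}_{3}
\end{bmatrix}
\]

Here the single left column corresponds to a rotation of the whole trajectory and features about the gravity direction ${}^{G}\mathbf{g}$ (global yaw), and the three right columns correspond to a common translation of the IMU and feature positions (global position), giving the four unobservable directions referenced above; $\lfloor\cdot\rfloor$ denotes the skew-symmetric matrix.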