Among indoor localization techniques, vision-based localization is a promising one, mainly owing to the ubiquity of rich visual features. Visual landmarks, which present distinguishing textures, play a fundamental role in visual indoor localization. However, little research has focused on visual landmark labeling. Existing approaches usually designate a surveyor to select and record visual landmarks, which is tedious and time-consuming. Furthermore, structural changes (e.g., renovation) may render the visual landmark database outdated, degrading localization accuracy. To overcome these limitations, we propose VILL, a user-friendly, efficient, and accurate approach for visual landmark labeling. VILL asks a user to sweep the camera to record a video clip of their surroundings. In the construction stage, VILL adaptively identifies unlabeled visual landmarks in the videos using a graph-based representation of visual correlation. It then accurately estimates the locations of the unlabeled landmarks on the floorplan from their spatial correlations with selected anchor landmarks. In the update stage, VILL formulates an alteration identification model that combines the judgments of different users to accurately identify altered landmarks. Extensive experiments at two different trial sites show that VILL substantially reduces the site survey (by at least 65.9%) while achieving comparable accuracy.