With the rapid development of the internet and the popularization of smart mobile devices, social media is evolving quickly and now contains rich spatial information, such as geolocated posts, tweets, photos, video, and audio. These location-based social media data offer new opportunities for hazard and disaster identification and tracking, recommendation of locations, friends, or tags, pay-per-click advertising, and so on. Meanwhile, massive amounts of remote sensing (RS) data can be easily acquired at both high temporal and high spatial resolution with multiple-satellite systems. If RS classification maps can be provided, these data could enable monitoring of our location-based living environments, much as devices such as charge-coupled device (CCD) cameras do, but on a much larger scale. To generate such classification maps, labeled RS image pixels usually must be provided by RS experts to train a classification system. Traditionally, labeled samples are obtained through ground surveys, image photo interpretation, or a combination of the two. All of these strategies must be carried out by domain experts, which is costly and time consuming, and sometimes yields low-quality labels, for example when photo interpretation is based on RS images alone. These practices and constraints make it challenging to classify land cover from big RS data. In this paper, a new methodology is proposed to classify urban RS images by exploiting the semantics of location-based social media photos (SMPs). To validate the effectiveness of this methodology, an automatic classification system is developed based on RS images as well as SMPs via big data analysis techniques including active learning, crowdsourcing, shallow machine learning, and deep learning. Because the labels of the RS training data are provided by ordinary people through crowdsourcing, the developed system is named Crowd4RS.
Quantitative and qualitative experiments confirm the effectiveness of the proposed Crowd4RS system, as well as of the proposed methodology for automatically generating RS image maps, in terms of classification results on big RS data consisting of high-spatial-resolution multispectral RS images and a large number of photos from social media sites such as Flickr and Panoramio.