The International Conference on Internet Multimedia Computing and Services (ICIMCS) is an annual conference sponsored by the ACM SIGMM China Chapter. The conference is especially interested in the latest technologies and applications that deal with the web-scale processing and management of heterogeneous Internet data for multimedia computing and services. ICIMCS 2011 was held in Chengdu, China, the ancient hometown of the lovely panda. The conference attracted around 80 participants, including researchers from academia and industry across ten countries and regions, who shared their recent work on topics ranging from visual information analysis and mining, through query processing and search, to multimedia privacy and security.

This special issue comprises the extended versions of five papers: two best papers and three papers from the regular and special sessions of ICIMCS 2011. These papers cover key issues in multimedia computing, including the leveraging of Internet resources for multi-modality fusion and visual classifier learning, the preprocessing and indexing of million-scale Internet data, and the sharing and streaming of Internet videos. Some of these techniques also demonstrate applications for emerging Internet services, such as video recommendation and product search, by processing and modeling the heterogeneous forms of resources associated with multimedia data.

The first two papers are about the web-scale processing of Internet data. The paper entitled “Video Recommendation over Multiple Information Sources”, presented by Meng Wang and his colleagues from the National University of Singapore, proposes a unified framework that explores heterogeneous information sources for video recommendation. The framework, based on multi-task SVM learning, aggregates multiple ranked lists generated from personal data, social networks, and video metadata into a single optimized list for recommendation.
The framework is evaluated on a large video dataset built from one month of social activity by 76 users on Facebook and YouTube. The second paper, entitled “Multi-label Multi-instance Learning with Missing Object Tags”, presented by Yi Shen and his colleagues from the University of North Carolina at Charlotte, proposes learning object classifiers for free, at web scale, from a collection of as many as 10 million user-tagged Flickr images. In particular, the paper addresses three important issues on the way to fully automatic learning: scalable filtering of spam tags by distributed image clustering; joint modeling of loose tags and missing tags by multiple instance learning, which is also capable of tag prediction; and structural learning that takes object relationships into account to train discriminant classifiers.

C.-W. Ngo (✉), City University of Hong Kong, Hong Kong, Hong Kong. E-mail: cscwngo@cityu.edu.hk

The next two papers address the search and mining of visual instances. The paper entitled “Combining Global and Local Matching of Multiple Features for Precise Item Image Retrieval”, co-authored by Haojie Li and his colleagues from Dalian University of Technology and Nanjing