Articles published on google-street-view
2198 Search results
- Research Article
- 10.3390/su18020810
- Jan 13, 2026
- Sustainability
- Ziyun Ye + 3 more
Against the backdrop of rising global carbon emissions, promoting active transportation modes such as walking and cycling has become a key strategy for countries worldwide to meet carbon reduction targets and advance the goals of sustainable development. In China, the concept of low-carbon mobility has gained rapid traction, leading to a significant increase in public demand for non-motorized travel options like walking and cycling. From the perspective of inclusive urban development, gender imbalances in sample representation during design and evaluation processes have contributed to homogenization and a lack of diversity in urban slow-traffic environments. To address this issue, this study adopts a problem-oriented approach. First, we collect street scene images of slow-traffic environments through self-conducted field surveys. Concurrently, we gather satisfaction survey responses from 511 urban residents regarding existing slow-traffic streets, identifying three key environmental evaluation indicators: safety, liveliness, and beauty. Second, an experimental analysis is conducted to compare machine-generated assessments based on self-collected street view data with manual evaluations performed by 27 female participants. The findings reveal significant perceptual differences between genders in the assessment of slow-moving environments, particularly regarding attention to environmental elements, challenges in utilizing non-motorized lanes, and overall environmental satisfaction. Moreover, notable discrepancies are observed between machine scores and manual assessments performed by women. Based on these findings, this study investigates the underlying causes of such perceptual disparities and the mechanisms influencing them. Finally, it proposes female-inclusive strategies aimed at enhancing the quality of slow-traffic environments, thereby addressing the current absence of gender considerations in their design. 
This research seeks to provide a robust female perspective and empirical evidence to support improvements in the quality of slow-moving environments and to inform strategic advancements in their design. The findings of this study can provide a theoretical and empirical basis for the optimization of gender-inclusive non-motorized transportation environment design, policy formulation, and subsequent interdisciplinary research.
- Research Article
- 10.1186/s12940-025-01257-5
- Jan 9, 2026
- Environmental health : a global access science source
- Lucas Shen + 8 more
Policy-relevant spatial determinants of human exposure to Perfluoroalkyl Substances (PFAS), a broad class of persistent environmental contaminants affecting pregnancy and child development, remain poorly understood because of the diversity of exposure sources. This is especially true for modern, dense urban settings, which contain less well-studied built environment-related sources, including transportation-related ground and airborne contamination. We link high-resolution spatiotemporal urban land use data to longitudinal residential histories to assess determinants of individual-level blood plasma PFAS exposures in two geographically and demographically diverse cohorts of pregnant women in urban Singapore (n = 784 in 2009-2011; n = 384 in 2015-2017). Longitudinal repeated measures allow us to rule out socio-behavioral factors (e.g., residential segregation) as alternative explanations. Actual land use occupancies were ground-truthed through automated extraction of Google Street View data. Adjusting for known predictors and within-neighborhood unobserved spatial heterogeneity, a standard deviation (SD) increase (∼10,000 m²) in transport facility exposure was linked to 0.11 (1.78 ng/mL), 0.16, 0.11 SD increases in residents' perfluorobutane sulfonic acid (PFBS), perfluorobutanoic acid (PFBA), and perfluorononanoic acid (PFNA) concentrations, respectively, in the 2009 cohort. Dose-response analyses suggested that associations strengthened when transport facilities exceeded 10,000 m², with residents living near ≥12,000 m² exhibiting 7.3 ng/mL higher plasma PFBS (p = 0.04), consistent with footprints from large bus depots rather than smaller petrol kiosks. Associations with different PFAS congeners were replicated in the 2015 cohort. No other land use type showed similarly consistent findings. 
Transport facilities are prevalent near residences in urban settings and may be potential sources of PFAS emissions from automotive-related lubricants, parts, and materials. Our findings that exposure was robustly associated with individual-level concentration, over and above behavioral and other factors, highlight the importance of monitoring these and other urban sources of exposure.
- Research Article
- 10.3390/app16010550
- Jan 5, 2026
- Applied Sciences
- Jiqiu Deng + 2 more
Building height is an important indicator for describing the three-dimensional structure of cities. However, monitoring its changes is still difficult due to high labor costs, low efficiency, and the limited resolution and viewing angles of remote sensing images. This study proposes an automatic framework for estimating building height changes using multi-temporal street view images. First, buildings are detected by the YOLO-v5 model, and their contours are extracted through edge detection and hole filling. To reduce false detections, greenness and depth information are combined to filter out pseudo changes. Then, a neighboring region resampling strategy is used to select visually similar images for better alignment, which helps to reduce the influence of sampling errors. In addition, the framework applies cylindrical projection correction and introduces a triangulation-based method (HCAOT) for building height estimation. Experimental results show that the proposed framework achieves an accuracy of 85.11% in detecting real changes and 91.23% in identifying unchanged areas. For height estimation, the HCAOT method reaches an RMSE of 0.65 m and an NRMSE of 0.04, which performs better than several comparison methods. Overall, the proposed framework provides an efficient and reliable approach for dynamically updating 3D urban information and supporting spatial monitoring in smart cities.
- Research Article
- 10.1016/j.mex.2026.103785
- Jan 2, 2026
- MethodsX
- Iuria Betco + 2 more
Street-level imagery (SLI) is increasingly used in urban analytics for tasks like estimating greenery, conducting transport audits, and assessing facades. However, inconsistent image quality, uneven spatial coverage, and non-standardized acquisition methods limit reproducibility. We introduce USE-SVI (Urban Sampling & Extraction of Street View Imagery), a reproducible process to sample, acquire, and stitch street-view images for city-wide analysis. The protocol ensures regular spatial coverage sampling points at fixed intervals, generates four viewing directions per point to capture main views, acquires images through official Street View APIs or open-licence platforms (e.g., Mapillary or KartaView) with detailed metadata recording, and creates panoramas using OpenCV (e.g., ORB keypoints, FLANN matching, Stitcher). This approach produces evenly spaced images, clear provenance, and ready-to-use outputs (CSV, PNG, XLSX), supporting machine learning and visual checks. By standardizing key steps, sampling, acquisition, and stitching, USE-SVI enhances transparency and scalability, adheres to platform terms, and enables replication across cities and periods. Limitations involve variable provider coverage and occasional stitching failures in scenes with few features.
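The sampling step described in this abstract (points at fixed intervals, four viewing directions per point) can be illustrated with a minimal sketch. It is not the USE-SVI implementation: the function names `sample_points` and `four_headings` are hypothetical, coordinates are assumed to be planar (for example, projected metres), and the four headings are assumed to be offsets of 0, 90, 180 and 270 degrees from the road bearing, which the abstract does not specify.

```python
import math

def sample_points(polyline, interval):
    """Return (x, y, bearing) samples spaced `interval` units apart along a planar polyline.

    Bearing is in degrees, measured clockwise from north (the +y axis).
    """
    points = []
    next_dist = 0.0  # distance along the line where the next sample falls
    walked = 0.0     # total length of the segments already processed
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len == 0:
            continue  # skip duplicate vertices
        bearing = math.degrees(math.atan2(x1 - x0, y1 - y0)) % 360
        # Small tolerance so a sample landing exactly on the segment end is kept.
        while next_dist <= walked + seg_len + 1e-9:
            t = (next_dist - walked) / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0), bearing))
            next_dist += interval
        walked += seg_len
    return points

def four_headings(bearing):
    """Assumed convention: forward, right, backward and left of the road bearing."""
    return [(bearing + offset) % 360 for offset in (0, 90, 180, 270)]

# Usage: a 100-unit north-running street sampled every 50 units.
for x, y, bearing in sample_points([(0, 0), (0, 100)], 50):
    print((x, y), four_headings(bearing))
```

Each (point, heading) pair would then be passed to whichever image provider is used; that request step is left out here because it depends on the platform's own API and terms.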
- Research Article
- 10.1002/eco.70175
- Jan 1, 2026
- Ecohydrology
- Pedro Aurélio Costa Lima Pequeno
Rainfall is a major dimension of species climate niches. It has been proposed that groundwater can change species responses to rainfall by creating ‘hydrological refugia’ under a drier climate. However, the available evidence is conflicting and biased towards plants. This study tested whether shallow water tables (< 5 m deep) provide hydrological refugia to the Amazonian mound‐building termite, Amitermes excellens. This termite builds mounds up to 4 m tall in the Lavrado, the largest continuous savanna in northern Amazonia. Google Street View was used to remotely survey A. excellens mounds in 131 sites along a road network covering the study region (~795 km). For each site, published products on environmental and land use variables were also obtained. Mound abundance was modelled as a function of an interaction between mean water table depth and mean annual rainfall, while accounting for other factors. The analysis showed that mound abundance was higher on sandier soils and inside Indigenous lands but decreased with fire frequency and the Normalized Difference Vegetation Index. After accounting for these effects, a climate–groundwater interaction was found: mound abundance increased with annual rainfall over deeper water tables but was consistently low over shallow water tables regardless of rainfall. Thus, groundwater can drastically change the realized climate niche of both plants and animals. However, shallow groundwater need not provide hydrological refugia under drier climates. Given that termites concentrate much of the terrestrial animal biomass and are major detritivores in the tropics, climate–groundwater interactions may have neglected impacts on global biogeochemical cycles.
- Research Article
- 10.1016/j.jtrangeo.2025.104485
- Jan 1, 2026
- Journal of Transport Geography
- Shuli Luo + 3 more
The integration of metro and bike systems has emerged as a promising climate change mitigation strategy, fostering a transition towards sustainable transportation modes within urban landscapes. Extensive research has probed the effects of neighborhood-level built environment factors (such as accessibility, urban density, land use mix, and proximity to cycling infrastructure) on cycling behavior. However, the impact of eye-level built environment features, those physical and visual characteristics of urban spaces as perceived by individuals at street level, remains insufficiently explored. Using Generalized Additive Mixed Models (GAMM), we examine the effects of both neighborhood-level and eye-level built environment variables on first-mile and last-mile metro-bike integration trips during weekdays and weekends, accounting for spatial and temporal autocorrelations. The results reveal that commercial establishments, job opportunities, population density, and spatial and temporal dynamics all significantly influence metro-bike integration. Surprisingly, the presence of cycle lanes shows a weak effect on integrated metro-bike usage. In addition, non-linear relationships are observed between metro-bike usage and eye-level built environment variables such as sky ratio, greenery, and building ratio, indicating the existence of optimal levels for improving metro-bike integration. The findings emphasize the importance of considering eye-level urban aesthetics when planning transport infrastructure and provide concrete threshold guidelines for urban planners and policymakers to better integrate cycling facilities with metro systems.
• Both station-level and eye-level built environment features significantly impact metro-bike integration.
• Proposed a framework integrating street view images and bike-sharing data.
• Non-linear relationships exist between eye-level variables and metro-bike usage.
• Diurnal asymmetry in metro-bicycle integration: AM first-mile, PM last-mile prevalence.
• The presence of cycle lanes shows a modest effect on integrated metro-bike usage.
- Research Article
- 10.1016/j.inffus.2025.103467
- Jan 1, 2026
- Information Fusion
- Chenbo Zhao + 4 more
• Fusion of diffusion model and human perception on street view imagery.
• Street space quality improvement by fusion of 8.8 million perception survey data.
• Improvements achieve an 86.36% success rate in improving perception scores.
• Generative AI methodology enables quick, resident-oriented street space upgrades.
The development of sustainable cities and communities aligns with the Sustainable Development Goals (SDGs) and smart city initiatives, emphasizing the integration of residents' subjective perceptions into urban street space planning. While previous research has quantitatively assessed streetscape quality, existing methods remain largely conceptual and lack actionable strategies for improvement. Recent advances in generative AI have enabled the generation of realistic and visually compelling images across various domains. However, most existing image generation frameworks lack a mechanism to directly incorporate residents' subjective perceptions when modifying street view imagery. This gap results in generated images that, while aesthetically impressive, may not fully align with the preferences and lived experiences of local communities. To address this issue, we propose a novel, data-driven approach that conditionally fuses subjective perception data into the transformation of original street view images. Our method integrates multidimensional perception cues (such as beauty, safety, and liveliness) and fuses 8.8 million perception survey records to generate street views that are more reflective of public sentiment. Experimental evaluations demonstrate an 86.36% success rate in enhancing 22 distinct subjective perception metrics based on initial street view inputs. This fusion-based methodology advances both image generation and smart city development by aligning generated landscapes with resident preferences. It also provides urban planners and community stakeholders with a robust framework for visualizing targeted street space improvements and designing more livable, human-centric urban environments.
- Research Article
- 10.1016/j.compenvurbsys.2025.102356
- Jan 1, 2026
- Computers, Environment and Urban Systems
- Yuhao Kang + 7 more
Decoding human safety perception with eye-tracking systems, street view images, and explainable AI
- Research Article
- 10.1016/j.cities.2025.106399
- Jan 1, 2026
- Cities
- Zhanjun He + 5 more
Measuring the causal effect of urban environments on crime with street view images and points of interest
- Research Article
- 10.1016/j.tbs.2025.101157
- Jan 1, 2026
- Travel Behaviour and Society
- Zidong Yu + 2 more
Characterizing walkability in Hong Kong’s 15-minute transit-oriented development (TOD): insights from street view imagery and local accessibility
- Research Article
- 10.1016/j.aap.2025.108305
- Jan 1, 2026
- Accident; analysis and prevention
- Ying Chen + 5 more
Application of the dual-model interpretability framework of XGBoost-SHAP and GAT-GNNExplainer to investigate the impact of the built environment on traffic accidents at intersections.
- Research Article
- 10.1109/tgrs.2025.3646611
- Jan 1, 2026
- IEEE Transactions on Geoscience and Remote Sensing
- Zheyao Yu + 2 more
Street view (SV) images provide valuable supplementary data for characterizing the functional attributes of land use types, improving urban land use classification based on very-high-resolution (VHR) remote sensing images. However, integrating SV and VHR images is challenging due to differing view modes: SV provides oblique views, while VHR offers overhead views. This discrepancy causes misalignment in both spatial and feature spaces, especially when directly mapping SV images to corresponding VHR images. To address this, we propose a novel matching method that combines spatial position constraints and viewpoint transformation. The spatial constraint establishes coarse-level correspondence, while the viewpoint transformation generates pseudo-oblique VHR images to bridge the view gap. To refine the alignment, we introduce a strip-based correlation calculation component (SCCC) that adaptively matches image features using convolutional neural networks. We then extract high-level semantic features from the matched pairs and develop an alignment aware fusion module (AAFM) using Sinkhorn algorithm to align their representations in feature space, enhancing semantic consistency across modalities. This, in turn, improves the classification of urban land use. We evaluated our method on two datasets, LU-FZ-VHR-SV and LU-ZZ-VHR-SV, created using GF-2 and WorldView-3 images. Results demonstrate that land use classification using the matched VHR-SV image pairs evidently outperforms classification only using remote sensing data. Comparative experiments show that our proposed SCCC and AAFM deliver superior performance over alternative correlation and fusion strategies. We conclude that integrating SV images using our approach effectively improves urban land use classification, particularly in scenes with high inter-class similarity and intra-class variability.
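The abstract names the Sinkhorn algorithm without giving details, so the following is only a generic sketch of Sinkhorn-Knopp normalization, not the paper's AAFM module. It assumes a square matrix of similarity scores between two sets of features and alternately normalizes rows and columns until the matrix is approximately doubly stochastic, which can be read as a soft one-to-one matching. The function name `sinkhorn` and the `temperature` and `n_iters` parameters are illustrative choices.

```python
import math

def sinkhorn(scores, n_iters=50, temperature=1.0):
    """Turn a square score matrix into an approximately doubly stochastic matrix."""
    # Exponentiate so every entry is strictly positive.
    K = [[math.exp(s / temperature) for s in row] for row in scores]
    for _ in range(n_iters):
        # Row step: make each row sum to 1.
        K = [[v / sum(row) for v in row] for row in K]
        # Column step: make each column sum to 1.
        col_sums = [sum(col) for col in zip(*K)]
        K = [[v / cs for v, cs in zip(row, col_sums)] for row in K]
    return K

# Usage: hypothetical similarities between two street-view and two overhead features.
plan = sinkhorn([[1.0, 0.2], [0.3, 0.9]])
for row in plan:
    print([round(v, 3) for v in row])
```

A lower `temperature` pushes the result closer to a hard assignment, while a higher one spreads the matching mass more evenly.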
- Research Article
- 10.1016/j.cities.2025.106424
- Jan 1, 2026
- Cities
- Ziyin Qi + 3 more
How streetscape shapes affect visitor emotions: An experimental analysis based on large-scale street view images in Xi'an, China
- Research Article
- 10.3390/sym18010068
- Dec 31, 2025
- Symmetry
- Sina Rezaei + 2 more
Semantic segmentation of crowdsourced street-level imagery plays a critical role in urban analytics by enabling pixel-wise understanding of urban scenes for applications such as walkability scoring, environmental comfort evaluation, and urban planning, where robustness to geometric transformations and projection-induced symmetry variations is essential. This study presents a comparative evaluation of two primary families of semantic segmentation models: transformer-based models (SegFormer and Mask2Former) and prompt-based models (CLIPSeg, LangSAM, and SAM+CLIP). The evaluation is conducted on images with varying geometric properties, including normal perspective, fisheye distortion, and panoramic format, representing different forms of projection symmetry and symmetry-breaking transformations, using data from Google Street View and Mapillary. Each model is evaluated on a unified benchmark with pixel-level annotations for key urban classes, including road, building, sky, vegetation, and additional elements grouped under the “Other” class. Segmentation performance is assessed through metric-based, statistical, and visual evaluations, with mean Intersection over Union (mIoU) and pixel accuracy serving as the primary metrics. Results show that LangSAM demonstrates strong robustness across different image formats, with mIoU scores of 64.48% on fisheye images, 85.78% on normal perspective images, and 96.07% on panoramic images, indicating strong semantic consistency under projection-induced symmetry variations. Among transformer-based models, SegFormer proves to be the most reliable and attains the highest accuracy of all models on fisheye and normal perspective images, with mIoU scores of 72.21%, 94.92%, and 75.13% on fisheye, normal, and panoramic imagery, respectively. LangSAM not only demonstrates robustness across different projection geometries but also delivers the lowest segmentation error, consistently identifying the correct class for corresponding objects. 
In contrast, CLIPSeg remains the weakest prompt-based model, with mIoU scores of 77.60% on normal images, 59.33% on panoramic images, and a substantial drop to 59.33% on fisheye imagery, reflecting sensitivity to projection-related symmetry distortions.
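For readers unfamiliar with the two primary metrics named in this abstract, the sketch below shows how pixel accuracy and mIoU are commonly computed from flattened label sequences. It is a generic illustration, not the study's evaluation code; the function names are hypothetical, and skipping classes that appear in neither prediction nor ground truth is one common convention among several.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted class equals the annotated class."""
    correct = sum(p == t for p, t in zip(pred, truth))
    return correct / len(truth)

def mean_iou(pred, truth, classes):
    """Mean Intersection over Union across the given classes."""
    ious = []
    for c in classes:
        inter = sum(p == c and t == c for p, t in zip(pred, truth))
        union = sum(p == c or t == c for p, t in zip(pred, truth))
        if union:  # assumed convention: ignore classes absent from both inputs
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")

# Usage: a four-pixel toy image with one misclassified pixel.
truth = ["road", "road", "sky", "sky"]
pred = ["road", "sky", "sky", "sky"]
print(pixel_accuracy(pred, truth))            # 0.75
print(mean_iou(pred, truth, ["road", "sky"]))  # (1/2 + 2/3) / 2
```

The toy case shows why the two metrics can diverge: one wrong pixel costs 25 points of accuracy but lowers the IoU of both affected classes.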
- Research Article
- 10.3390/ijgi15010018
- Dec 31, 2025
- ISPRS International Journal of Geo-Information
- Somang Kim + 2 more
Understanding the spatial distribution and determinants of perceived fear of crime is essential for enhancing urban safety and promoting equitable city development. This study models and explains perceived fear of crime from street view imagery using a GeoAI framework that integrates deep learning, semantic segmentation, and explainable AI techniques. Focusing on Yeongdeungpo-gu in Seoul, South Korea—a district characterized by diverse urban morphologies—we collected 171,942 pairwise comparison responses through a large-scale crowdsourced survey designed to capture visual perceptions of crime-related fear. A Vision Transformer-based Siamese network (RSS-Swin) was employed to predict continuous fear-of-crime scores, while semantic segmentation (SegFormer-B5) and AutoML regression were applied to identify built-environment features influencing these perceptions. SHAP-based interpretability analysis was then used to quantify the importance and interactions of key visual elements. The results reveal that open and accessible streetscape components, such as roads and sidewalks, consistently mitigate perceived fear, whereas enclosed or unmanaged features, including walls, poles, and narrow alleys, heighten it. Moreover, the effects of vegetation, fences, and buildings vary across spatial contexts, emphasizing the need for place-sensitive interpretation. By integrating predictive modeling and explainable analysis, this study advances a transparent and scalable GeoAI framework for understanding the visual and environmental determinants of crime-related fear and supporting perception-aware CPTED strategies.
- Research Article
- 10.1080/13467581.2025.2608446
- Dec 29, 2025
- Journal of Asian Architecture and Building Engineering
- Sifan Jia + 2 more
With the continuous expansion of the underground public space network, metro public spaces are increasingly confronted with challenges like extreme weather events and sudden surges in passenger flow. Existing research lacks a resilience evaluation model based on multisource data, making it difficult to systematically identify weaknesses in composite spaces. Based on multisource data, this paper develops an evaluation model containing three dimensions of resilience, adaptability, and changeability, and employs the TOPSIS-entropy weight-AHP method to standardize the assessment of metro public space data. The Guangji South Road Station on the Suzhou Metro in China serves as a case study to validate the proposed model. Through GIS analysis and a street view semantic segmentation method, the resilience weaknesses of the station in the dimensions of spatial resources, environmental comfort, and management mechanisms are visualized. To bolster resilience, the study advances a strategic framework encompassing spatial transformation, data monitoring, and collaborative management to enhance resilience performance by boosting spatial redundancy, implementing a data-based management mode, and upgrading facility service capabilities. The study confirms that the resilience evaluation model driven by multisource data can identify weak points in metro public space resilience, providing theoretical and practical references for establishing resilient metro public spaces.
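As a rough illustration of two of the three ingredients named in this abstract, the sketch below combines entropy weighting with TOPSIS ranking. It is a textbook-style outline, not the paper's model: the AHP component is omitted, all criteria are assumed to be benefit-type with positive values, and the function names and the sample matrix are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Entropy weights for a matrix with rows = alternatives, columns = criteria."""
    n = len(matrix)
    divergences = []
    for col in zip(*matrix):
        total = sum(col)
        shares = [v / total for v in col]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(1 - entropy)  # more spread in a criterion gives more weight
    total_div = sum(divergences)  # assumes at least one criterion varies
    return [d / total_div for d in divergences]

def topsis(matrix, weights):
    """Closeness of each alternative to the ideal solution (0 = worst, 1 = best)."""
    norms = [math.sqrt(sum(v * v for v in col)) for col in zip(*matrix)]
    weighted = [[w * v / nrm for v, w, nrm in zip(row, weights, norms)]
                for row in matrix]
    best = [max(col) for col in zip(*weighted)]
    worst = [min(col) for col in zip(*weighted)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))  # assumes alternatives differ
    return scores

# Usage: three hypothetical station areas scored on two benefit criteria.
data = [[0.2, 0.9], [0.5, 0.5], [0.9, 0.8]]
w = entropy_weights(data)
print(w, topsis(data, w))
```

In a combined scheme, subjective AHP weights would typically be blended with these objective entropy weights before the TOPSIS step; how the paper does so is not stated in the abstract.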
- Research Article
- 10.1080/17489725.2025.2599302
- Dec 29, 2025
- Journal of Location Based Services
- Meiliu Wu + 2 more
Although Vision-Language Models (VLMs) have demonstrated great potential, limited integration of spatial context via textual input has constrained their performance in geospatial analysis, particularly for location-based services (LBS). To this end, we propose Spatial-context Prompt Tuning, tailored to global image geo-localization tasks. Six spatial-context dimensions (i.e., geospatial image types, geo-localization clues, spatial patterns, land use/land cover, urban perception, and urban development) are designed to create Visual Question Answering (VQA) prompts with GPT-4, based on three imagery types (street view images, satellite images, map tiles) sampled from 790 populous cities. Next, we evaluate the efficacy of these dimensions based on a leading open-source VLM, Contrastive Language-Image Pre-training (CLIP). Results demonstrate consistent, task-dependent accuracy improvements in image geo-localization: land-use prompts improve city-level accuracy by 4.5%, and image-type prompts increase country-level accuracy by 6.7% and continent-level accuracy by 3.0%. Our key contributions include: (1) revealing how spatial versus non-spatial context affects prompt-tuned VLM performance, (2) designing six reusable dimensions of spatial context that support explainable, context-aware VLMs for geospatial applications, and (3) enhancing geo-localization accuracy across heterogeneous geospatial imagery. This work charts a positive direction for Geospatial Artificial Intelligence (GeoAI)-empowered research, enabling more effective and interpretable VLM applications in geospatial domains.
- Research Article
- 10.3390/s26010197
- Dec 27, 2025
- Sensors (Basel, Switzerland)
- Po-Chyi Su + 5 more
Scene text detection in multilingual environments poses significant challenges. Traditional detection methods often struggle with language-specific features and require extensive annotated training data for each language, making them less practical for multilingual contexts. The diversity of character shapes, sizes, and orientations in natural scenes, along with text deformation and partial occlusions, further complicates the task of detection. This paper introduces LICS (Locating Inter-Character Spaces), a method that detects inter-character gaps as language-agnostic structural cues, enabling more feasible multilingual text detection. A two-stage approach is employed: first, we train on synthetic data with precise character gap annotations, and then apply weakly supervised learning to real-world datasets with word-level labels. The weakly supervised learning framework eliminates the need for character-level annotations in target languages, substantially reducing the annotation burden while maintaining robust performance. Experimental results on the ICDAR and Total-Text benchmarks demonstrate the strong performance of LICS, particularly on Asian scripts. We also introduce CSVT (Character-Labeled Street View Text), a new scene-text dataset comprising approximately 20,000 carefully annotated streetscape images. A set of standardized labeling principles is established to ensure consistent annotation of text locations, content, and language types. CSVT is expected to facilitate more advanced research and development in multilingual scene-text analysis.
- Research Article
- 10.3390/f17010032
- Dec 26, 2025
- Forests
- Yu-Xiang Sun + 6 more
Rapid urbanization has intensified the demand for street designs that reconcile ecological quality with positive human experiences, particularly in high-density cities such as Tianjin, China. Streets function as key interfaces where ecological processes, social activities and human perception intersect. However, existing research tends to emphasize the amount of greenery while overlooking its structural characteristics, to treat perception as a psychological response decoupled from spatial context, and to make limited use of fine-grained functional data to examine how ecology and perception interact. This study develops an integrated analytical framework that combines the DeepLabV3+ model to extract the Urban Street Greenery Generalized Structure (USGGS) from Baidu Street View imagery with a vision transformer model trained on the Place Pulse 2.0 dataset to derive multidimensional perceptual metrics. Functional diversity is represented using point-of-interest (POI) data, and an enhanced Light Gradient Boosting Machine (LightGBM) model is employed to explore associations among greenery structure, perceived qualities and functional characteristics. Analyses of six urban districts in Tianjin indicate that ecological and perceived street qualities are closely related to the degree of coupling between vegetation structure and functional diversity. Streets characterized by multi-layered greenery and diverse, active functions tend to exhibit higher perceived aesthetics, safety and vitality, whereas streets with single-layer vegetation or functionally monotonous environments generally do not perform as well. Functional patterns appear to mediate relationships between greening and perception by shaping how ecological form is experienced through everyday social activities. 
Overall, the results suggest that closer coordination between ecological design and functional organization is important for fostering urban streets that combine environmental resilience with strong perceived appeal.
- Research Article
- 10.18038/estubtda.1721167
- Dec 25, 2025
- Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering
- Ali Ekincek + 1 more
Urban perception is a multidimensional phenomenon reflecting individuals’ evaluations of the urban environment and playing a critical role in planning and design processes aimed at improving quality of life. This study aims to predict six different themes of urban perception (beautiful, boring, depressing, lively, safe, wealthy) from street view images using regression-based deep learning methods. Three different deep learning architectures (ResNet18, VGG19, and EfficientNet-B1) were employed. The Place Pulse 2.0 dataset was utilized in the modeling process, with approximately 110,000 labeled street images processed through necessary preprocessing steps (resizing, cropping, tensor conversion, and normalization). Models were trained with an 80% training and 20% validation split. Performance evaluation was conducted using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), R², and validation loss graphs. Findings indicate that the EfficientNet-B1 model achieved the lowest error values, particularly in the “safe” and “lively” themes, while the ResNet18 model offered more balanced and stable performance in terms of validation loss. The VGG19 model generally yielded higher error rates and exhibited a clear tendency toward overfitting. It was observed that theme-specific visual complexity directly affected model performance. In conclusion, while deep learning architectures prove effective in modeling urban perception through visual data, both the choice of architecture and the inherent nature of the theme play decisive roles in model performance. This study highlights the importance of architecture- and theme-sensitive model design in AI-supported analysis of urban perception.
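The three regression metrics named in this abstract can be computed as in the sketch below. It is a generic illustration using the standard definitions, not the study's evaluation code; the function name `regression_metrics` and the sample scores are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired true and predicted scores."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)       # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # assumes y_true is not constant
    r2 = 1 - ss_res / ss_tot
    return mae, rmse, r2

# Usage: four hypothetical perception scores with one large miss.
mae, rmse, r2 = regression_metrics([1, 2, 3, 4], [1, 2, 3, 6])
print(mae, rmse, r2)
```

Because RMSE squares the errors, a single large miss raises it more than it raises MAE, which is one reason studies often report both.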