Analysis of the Effectiveness of Filtering and Intersection Methods in Building Surface Data Analysis with QGIS

Abstract

This study evaluates the efficiency of two methods for processing geospatial building-surface data, Filtering and Intersection, using a case study in Blitar Regency. The data were obtained by comparing two sources: OpenStreetMap (OSM), with a data completeness rate of 60%, and Google Open Building, with a completeness rate of 90%. The more complete source, Google Open Building, was selected for further analysis. Processing was carried out in QGIS, chosen for its support for a wide range of geospatial analysis methods. The two methods were compared on three main criteria: processing time, resource efficiency, and scalability. The results show that Filtering outperforms Intersection on all three. Filtering completes processing in an average of 1.6 seconds, significantly faster than Intersection's average of 7 minutes and 50 seconds. Filtering is also more economical with resources, averaging 18.85% CPU usage and 121.4 MB of memory against Intersection's 34.05% CPU usage and 236.4 MB. Finally, Filtering demonstrated better scalability, handling larger datasets with fewer resources in less time. The Filtering method is therefore recommended for geospatial data processing that prioritizes speed, efficiency, and the ability to handle large and complex datasets.
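The cost gap between the two approaches can be illustrated with a minimal, hypothetical sketch (the study itself uses QGIS's built-in tools; the function names and toy geometry here are assumptions for illustration): filtering keeps or drops whole footprints after a cheap bounding-box test, while intersection must compute a new clipped geometry for every feature.

```python
def bbox(poly):
    """Axis-aligned bounding box of a polygon given as [(x, y), ...]."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)

def bbox_filter(buildings, aoi):
    """Filtering: keep whole footprints whose bbox overlaps the AOI rectangle."""
    ax0, ay0, ax1, ay1 = aoi
    kept = []
    for poly in buildings:
        x0, y0, x1, y1 = bbox(poly)
        if x1 >= ax0 and x0 <= ax1 and y1 >= ay0 and y0 <= ay1:
            kept.append(poly)
    return kept

def clip_to_rect(poly, aoi):
    """Intersection: Sutherland-Hodgman clip of a polygon against the AOI."""
    x0, y0, x1, y1 = aoi

    def clip_edge(pts, inside, cross):
        out = []
        for i, b in enumerate(pts):
            a = pts[i - 1]
            if inside(b):
                if not inside(a):
                    out.append(cross(a, b))
                out.append(b)
            elif inside(a):
                out.append(cross(a, b))
        return out

    def xcross(c):  # intersection with a vertical clip edge at x = c
        return lambda a, b: (c, a[1] + (b[1] - a[1]) * (c - a[0]) / (b[0] - a[0]))

    def ycross(c):  # intersection with a horizontal clip edge at y = c
        return lambda a, b: (a[0] + (b[0] - a[0]) * (c - a[1]) / (b[1] - a[1]), c)

    for inside, cross in [
        (lambda p: p[0] >= x0, xcross(x0)),
        (lambda p: p[0] <= x1, xcross(x1)),
        (lambda p: p[1] >= y0, ycross(y0)),
        (lambda p: p[1] <= y1, ycross(y1)),
    ]:
        poly = clip_edge(poly, inside, cross)
        if not poly:
            break
    return poly
```

The per-feature gap mirrors the measured results: the bbox test is a few comparisons, while clipping visits every vertex of every feature and allocates new geometry.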

Similar Papers
  • Research Article
  • 10.69803/3083-6034-2025-2-246
Modern information technologies for the analysis of geospatial BIG DATA
  • Nov 19, 2025
  • Journal of management economics and technology
  • Yu.V Synyavina + 2 more

The article is devoted to the study of modern information technologies for processing and analyzing geospatial Big Data, which are becoming an integral component of scientific research, business processes, and public administration. The relevance of this work is determined by the rapid growth of data volumes generated by satellites, drones, IoT devices, cadastral systems, and geographic information systems. This creates new challenges for data collection, storage, processing, and analysis, which cannot be effectively addressed by traditional GIS alone. The study examines the transition from classical geographic information systems to modern cloud platforms, distributed computing technologies, and artificial intelligence methods that provide high performance and scalability in geospatial data analysis. Three main classes of technological solutions are systematized and characterized: distributed computing frameworks (Apache Spark and its extensions), specialized spatio-temporal databases (GeoMesa), and cloud platforms (Google Earth Engine). Special attention is paid to Big Data platforms such as Google Earth Engine, AWS, and Microsoft Planetary Computer, as well as to the integration of specialized tools (PostGIS, Hadoop, Spark + GeoSpark) that enable efficient spatial data processing. The advantages and limitations of each approach are justified, and their optimal application areas are identified. It is shown that the choice of the most appropriate technology is a non-trivial task and depends on the specificity of the problem: Spark for customized analytics, GeoMesa for operational monitoring, and GEE for global raster data analysis. It is emphasized that the further development of this field is associated with the improvement of Big Data processing algorithms, the integration of geospatial analytics with artificial intelligence technologies, and the creation of more effective visualization interfaces. 
Thus, modern information technologies for analyzing geospatial Big Data are a strategic tool for addressing pressing scientific and practical challenges, playing a key role in ensuring sustainable development, environmental security, and efficient resource management. The perspectives of further research are formulated in the direction of integrating artificial intelligence methods (GeoAI) for detecting hidden patterns, and advancing real-time geospatial data processing technologies.

  • Research Article
  • Cited by 4
  • 10.55779/ng2242
Evaluating the pedestrian accessibility to public services using open-source geospatial data and QGIS software
  • Jun 23, 2022
  • Nova Geodesia
  • Alba Kucukali + 3 more

This study presents a rapid method for using available open-source geospatial data to assess pedestrian accessibility to key public services and facilities. At this stage, we test the method on Tirana, the capital city of Albania, but it is reproducible for other metropolitan areas around the world. OpenStreetMap (OSM) data and reference layers from the Albanian National Authority for Geospatial Information (ASIG geoportal) serve as the raw material of the study, while geospatial visualization, refinement, and analysis rely on QGIS software and related plugins. The QNEAT plugin was used to generate isochrones, which indicate the spatial coverage of a given service over the existing urban transportation/circulation network; the plugin allows different distance ranges to be defined. Our results show that public services cover varying shares of the building stock across a gradient of walking distances. For example, more than 25% of the existing building stock has pedestrian access to cafés and pharmacies within a walking distance of 250 m, and the same services reach almost 90% of that building stock within 1 km. Banks, by contrast, are accessible from only 12.6% of existing buildings within 250 m, and 67% within 1 km. The accuracy of the available geospatial data proved vital to the reliability of the results. We conclude by highlighting the importance and utility of GIS-based methods of urban analysis when planning new public services in the city.
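The headline percentages are shares of the building stock falling inside each isochrone band. A minimal sketch of that final tally, assuming network walking distances have already been computed (e.g., by QNEAT; the function name is illustrative):

```python
def coverage(distances_m, bands=(250, 500, 1000)):
    """Share (%) of buildings whose nearest service lies within each band."""
    n = len(distances_m)
    return {b: round(100.0 * sum(d <= b for d in distances_m) / n, 1)
            for b in bands}
```

For instance, distances of [100, 300, 900, 1200] metres yield 25% coverage within 250 m and 75% within 1 km.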

  • Preprint Article
  • Cited by 2
  • 10.7287/peerj.preprints.2226v1
Geospatial Big Data processing in an open source distributed computing environment
  • Jul 4, 2016
  • Angéla Olasz + 1 more

In recent years, distributed computing has reached many areas of computer science, including geographic and remote sensing information systems. However, distributed data processing solutions have primarily focused on simple structured documents rather than complex geospatial data, so migrating current algorithms and data management to a distributed processing environment may require considerable effort. In data processing, different aspects must be considered, such as speed, precision, or timeliness, all depending on the data types and processing methods. Data volume and variety are evolving as never before, quickly exceeding the capabilities of traditional algorithms and hardware environments for data management and computation; greater efficiency is required to exploit the information that can be derived from Geospatial Big Data. Most current distributed computing frameworks impose important limitations on transparent and flexible control over processing (and/or storage) nodes. Hence, this paper presents a prototype for the distribution ("tiling"), aggregation ("stitching"), and processing of Big Geospatial Data, focusing on the distribution and processing of raster data. Furthermore, we introduce our own data and metadata catalogue, which stores the "lifecycle" of datasets and is accessible to users and processes. The data distribution framework places no limitations on the programming environment and can execute scripts (and workflows) written in different languages (e.g., Python, R, or C#). It is capable of processing raster, vector, and point cloud data, allowing full control over data distribution and processing. In this paper, the IQLib concept (https://github.com/posseidon/IQLib/) and the background of its practical realization as a prototype are presented, formulated within the IQmulus EU FP7 research and development project (http://www.iqmulus.eu). Further investigations of algorithmic and implementation details are the focus of the oral presentation.
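The "tiling"/"stitching" pair can be sketched for a raster held as a 2-D list. This is an illustrative toy under simplifying assumptions (the prototype distributes tiles across nodes and tracks metadata; here tile counts are assumed to divide the raster dimensions evenly):

```python
def tile(raster, n, m):
    """Split an H x W raster into an n x m grid of tiles (H % n == W % m == 0)."""
    th, tw = len(raster) // n, len(raster[0]) // m
    return [[[row[j * tw:(j + 1) * tw] for row in raster[i * th:(i + 1) * th]]
             for j in range(m)] for i in range(n)]

def stitch(tiles):
    """Inverse of tile(): reassemble the full raster from the tile grid."""
    out = []
    for tile_row in tiles:
        for r in range(len(tile_row[0])):
            out.append([v for t in tile_row for v in t[r]])
    return out
```

Round-tripping tile() and stitch() returns the original raster, which is the invariant a distribution framework relies on when tiles are processed independently.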

  • Book Chapter
  • 10.4018/978-1-5225-8054-6.ch011
Semantic Web and Geospatial Unique Features Based Geospatial Data Integration
  • Jan 1, 2019
  • Ying Zhang + 6 more

Since large amounts of geospatial data are produced by various sources and stored in incompatible formats, geospatial data integration is difficult because of the shortage of semantics. Although standardised data formats and data access protocols, such as the Web Feature Service (WFS), enable end-users to access heterogeneous data stored in different formats from various sources, integration remains time-consuming and ineffective owing to the lack of semantics. To solve this problem, a prototype for geospatial data integration is proposed that addresses four problems: geospatial data retrieving, modeling, linking, and integrating. First, we provide a uniform integration paradigm for users to retrieve geospatial data. Then, we align the retrieved geospatial data in the modeling process to eliminate heterogeneity with the help of Karma. Our main contribution focuses on the third problem. Previous work defined a set of semantic rules for performing the linking process; however, geospatial data exhibit specific spatial relationships that are significant for linking but cannot be handled by Semantic Web techniques directly. We take advantage of these unique features of geospatial data to implement the linking process. In addition, previous work encounters a complication when the geospatial data sources are in different languages; in contrast, our proposed linking algorithms include a translation function, which saves translation cost across geospatial sources in different languages. Finally, the geospatial data are integrated by eliminating data redundancy and combining the complementary properties of the linked records. We adopt four geospatial data sources, namely OpenStreetMap (OSM), Wikimapia, USGS, and EPA, to evaluate the performance of the proposed approach. The experimental results illustrate that the proposed linking method achieves high performance in generating matched candidate record pairs in terms of Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ), and F-score. The integration results show that each data source gains substantial Complementary Completeness (CC) and Increased Completeness (IC).
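The evaluation measures named at the end (RR, PC, PQ, F-score) are standard blocking-quality metrics for record linkage. A minimal sketch of how they are computed from a set of candidate pairs (the function name is illustrative):

```python
def linkage_metrics(candidates, true_matches, total_pairs):
    """Blocking quality: candidates and true_matches are sets of id pairs."""
    found = len(candidates & true_matches)
    rr = 1.0 - len(candidates) / total_pairs         # Reduction Ratio
    pc = found / len(true_matches)                   # Pairs Completeness
    pq = found / len(candidates)                     # Pairs Quality
    f = 2 * pc * pq / (pc + pq) if pc + pq else 0.0  # F-score over PC and PQ
    return rr, pc, pq, f
```

A high RR with high PC means the linking step discards most of the pair space without losing true matches.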

  • Research Article
  • Cited by 3
  • 10.4018/ijswis.2016010101
Semantic Web and Geospatial Unique Features Based Geospatial Data Integration
  • Jan 1, 2016
  • International Journal on Semantic Web and Information Systems
  • Ying Zhang + 6 more

Since large amounts of geospatial data are produced by various sources and stored in incompatible formats, geospatial data integration is difficult because of the shortage of semantics. Although standardised data formats and data access protocols, such as the Web Feature Service (WFS), enable end-users to access heterogeneous data stored in different formats from various sources, integration remains time-consuming and ineffective owing to the lack of semantics. To solve this problem, a prototype for geospatial data integration is proposed that addresses four problems: geospatial data retrieving, modeling, linking, and integrating. First, we provide a uniform integration paradigm for users to retrieve geospatial data. Then, we align the retrieved geospatial data in the modeling process to eliminate heterogeneity with the help of Karma. Our main contribution focuses on the third problem. Previous work defined a set of semantic rules for performing the linking process; however, geospatial data exhibit specific spatial relationships that are significant for linking but cannot be handled by Semantic Web techniques directly. We take advantage of these unique features of geospatial data to implement the linking process. In addition, previous work encounters a complication when the geospatial data sources are in different languages; in contrast, our proposed linking algorithms include a translation function, which saves translation cost across geospatial sources in different languages. Finally, the geospatial data are integrated by eliminating data redundancy and combining the complementary properties of the linked records. We adopt four geospatial data sources, namely OpenStreetMap (OSM), Wikimapia, USGS, and EPA, to evaluate the performance of the proposed approach. The experimental results illustrate that the proposed linking method achieves high performance in generating matched candidate record pairs in terms of Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ), and F-score. The integration results show that each data source gains substantial Complementary Completeness (CC) and Increased Completeness (IC).

  • Research Article
  • 10.31891/2307-5732-2026-361-8
Optimization of Semi-Structured IoT Data Processing at the Preprocessing Stage in High-Volume Systems
  • Jan 29, 2026
  • Herald of Khmelnytskyi National University. Technical sciences
  • Володимир Мельник

In the era of digital transformation and the rapid proliferation of IoT devices, organizations are increasingly faced with the challenge of efficiently processing massive volumes of semi-structured data in real time. Such data—originating from sensors, smart devices, and distributed systems—often lack consistent structure, making their processing computationally expensive and resource-intensive. This paper presents a practical approach to optimizing resource utilization during the stream processing of semi-structured IoT data using a combination of Apache Spark Structured Streaming and Kubernetes-based orchestration. A synthetic dataset simulating 10,000 sensor readings of various types (temperature, humidity, pressure) was generated to replicate a real-world industrial IoT environment. Apache Spark was employed for the real-time aggregation and analysis of the data stream, while Kubernetes was utilized to dynamically allocate computing resources via the Horizontal Pod Autoscaler (HPA). The proposed method was evaluated using key performance metrics, including average CPU and memory usage, system latency, and processing time per iteration. The results demonstrate a significant improvement in performance and efficiency. After applying Kubernetes HPA, average CPU usage decreased from 85% to 55%, memory usage dropped from 80% to 50%, and processing latency was reduced by 25%. A comparative table and performance graphs are included to visualize the effectiveness of the optimization approach. This work highlights the value of integrating cloud-native orchestration tools with big data streaming engines to enhance system scalability and responsiveness. The findings underscore that even relatively simple infrastructure configurations—when combined strategically—can yield substantial improvements without resorting to overly complex architectures. 
Future directions include applying predictive scaling based on machine learning models and further optimizing system configurations for different types and volumes of semi-structured data.
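The autoscaling behaviour reported above follows Kubernetes' documented HPA rule, desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue). A small sketch of that rule (the replica bounds are illustrative defaults, not values from the paper):

```python
import math

def desired_replicas(current, metric, target, min_replicas=1, max_replicas=10):
    """Kubernetes HPA scaling rule, clamped to the configured replica bounds."""
    want = math.ceil(current * metric / target)
    return max(min_replicas, min(max_replicas, want))
```

At the reported 85% CPU against a hypothetical 55% target, two pods would scale to four; once usage falls back below target, the same rule scales the deployment down again.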

  • Research Article
  • 10.29040/ijebar.v5i3.2891
Innovation Capabilities, Work Discipline, Leader Supervision and Their Effect On Employee Performance Pande Iron Crafts Center In Blitar Regency
  • Aug 28, 2021
  • International Journal of Economics, Business and Accounting Research (IJEBAR)
  • Sonang Sitohang

This study aims to determine the effect of innovation ability, work discipline, and leadership supervision on the performance of Blacksmith Cluster employees in Blitar Regency. This is explanatory research using a quantitative method. The population comprises the 50 permanent employees of the Blacksmith Cluster, sampled by complete enumeration. A questionnaire was used to collect primary data on innovation ability, work discipline, leadership supervision, and employee performance. Data were processed using Multiple Linear Regression Analysis with the SPSS (Statistical Package for the Social Sciences) application. The results show that innovation ability has a significant positive effect on the performance of Blacksmith employees in Blitar Regency: the higher the employee's innovation ability, the higher the performance, through the use of technology and information to develop, produce, and market new products for the industry. Work discipline has a positive and significant effect on performance; higher work discipline has a good impact on employee performance. Leadership supervision also has a positive and significant effect on performance; supervision is carried out to evaluate each employee's performance regularly and to provide input so that employee performance develops.

  • Research Article
  • Cited by 2
  • 10.5194/isprsarchives-xl-3-w3-543-2015
RASTER DATA PARTITIONING FOR SUPPORTING DISTRIBUTED GIS PROCESSING
  • Aug 20, 2015
  • The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
  • B Nguyen Thai + 1 more

Abstract. The big data concept has already had an impact in the geospatial sector. Several studies apply techniques originally from computer science to GIS processing of huge amounts of geospatial data; in other studies, geospatial data is considered to have always been big data (Lee and Kang, 2015). Nevertheless, data acquisition methods have improved substantially, not only in the amount of raw data but also in its spectral, spatial, and temporal resolution. A significant portion of big data is geospatial, and the size of such data is growing rapidly, by at least 20% every year (Dasgupta, 2013). From the increasing volume of raw data, in different formats and representations and for different purposes, only the information derived from these data sets represents valuable results; yet computing capability and processing speed face limitations, even when semi-automatic or automatic procedures are applied to complex geospatial data (Kristóf et al., 2014). In recent times, distributed computing has reached many interdisciplinary areas of computer science, including remote sensing and geographic information processing. Cloud computing further requires appropriate distributed processing algorithms to handle geospatial big data. The Map-Reduce programming model and distributed file systems have proven their capability to process non-GIS big data, but it is sometimes inconvenient or inefficient to rewrite existing algorithms for the Map-Reduce model, and GIS data cannot be partitioned like text-based data by line or by bytes. Hence, we seek an alternative solution for data partitioning, data distribution, and execution of existing algorithms without rewriting them, or with only minor modifications. This paper gives a technical overview of currently available distributed computing environments, as well as GIS (raster) data partitioning, distribution, and distributed processing of GIS algorithms. A proof-of-concept implementation has been made for raster data partitioning, distribution, and processing, and its first performance results have been compared against the commercial software ERDAS IMAGINE 2011 and 2014. Partitioning methods depend heavily on the application area, so data partitioning can be considered a preprocessing step before applying processing services to the data. As a proof of concept, we implemented a simple tile-based partitioning method that splits an image into smaller grids (N x M tiles) and compared the processing time of an NDVI calculation against existing methods. The concept is demonstrated using our own open-source processing framework.
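The benchmark computation, NDVI = (NIR - Red) / (NIR + Red), is independent per pixel, which is what makes it suit tile-based partitioning: each tile can be processed on a different node. A toy per-tile version (a real implementation would use NumPy/GDAL arrays, not nested lists):

```python
def ndvi(nir, red):
    """Per-pixel NDVI over two equally sized 2-D band tiles."""
    return [[(n - r) / (n + r) if n + r else 0.0
             for n, r in zip(nrow, rrow)]
            for nrow, rrow in zip(nir, red)]
```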

  • Research Article
  • Cited by 32
  • 10.1016/j.future.2018.09.061
A novel method for parallel indexing of real time geospatial big data generated by IoT devices
  • Oct 16, 2018
  • Future Generation Computer Systems
  • Suresh V Limkar + 1 more


  • Research Article
  • 10.55606/jupti.v4i2.4216
Literature Review: A Comparative Analysis of OS Virtualization Integration with Respect to Computing Resource Efficiency
  • May 31, 2025
  • Jurnal Publikasi Teknik Informatika
  • Aurylia Taffana + 4 more

The development of operating system virtualization technology has become a crucial need in the computing world to optimize the utilization of resources such as CPU, memory, and application response time. However, resource efficiency in each virtualization technology, including bare-metal, hypervisor, and container, remains a debated topic regarding optimal performance and resource usage. This study aims to analyze the computational resource efficiency of various virtualization technologies, focusing on CPU usage, memory consumption, and application response time. The method used is a descriptive literature review based on 23 relevant accredited national journals and articles. Data were analyzed through classification and comparison of previous research results on virtualization technology performance. The findings indicate that containers have the highest efficiency in CPU and memory usage as well as fast application response times, while hypervisors provide good isolation and security but with higher resource consumption. Bare-metal offers stable performance but with the largest resource usage. The implications of this study provide practical references for system developers and institutions in selecting virtualization technologies that suit resource efficiency needs and serve as a basis for further research in optimizing virtualization technologies.

  • Conference Article
  • Cited by 2
  • 10.1109/indiacom51348.2021.00008
GeoBD2: Geospatial Big Data Deduplication Scheme in Fog Assisted Cloud Computing Environment
  • Mar 17, 2021
  • Rabindra K Barik + 4 more

With the speedy expansion of the Internet of Spatial Things, an enormous volume of geospatial big data is produced by IoT devices. This gives rise to new challenges for real-time geospatial data processing and for storing reliable data in cloud systems. The traditional geospatial cloud computing system is not efficient enough to process large volumes of concurrent geospatial data; consequently, fog-assisted cloud computing environments have emerged for achieving secure geospatial big data deduplication. In this paper, we introduce a novel scheme, GeoBD2, which defines a geo-deduplication structure to build an efficient geospatial big data deduplication scheme on a fog-assisted cloud computing framework. It also determines which fog node needs to be traversed to check for duplicate geospatial data, rather than traversing all fog nodes, which substantially enhances the efficiency of deduplication in the fog-assisted cloud environment. A performance analysis of the proposed scheme is carried out, and the experimental results show that it has lower overhead cost than existing big data deduplication schemes.
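The routing idea, sending each record to a single fog node chosen by content hash so that duplicates always collide on the same node, can be sketched as follows. This is a simplification, not the GeoBD2 scheme itself; the hash choice and node count are assumptions for illustration:

```python
import hashlib

def node_for(record, n_nodes):
    """Pick the one fog node responsible for this record's content hash."""
    digest = hashlib.sha256(record.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_nodes

def deduplicate(records, n_nodes=4):
    """Each node checks only records routed to it, never the whole stream."""
    seen = [set() for _ in range(n_nodes)]
    unique = []
    for rec in records:
        node = node_for(rec, n_nodes)
        if rec not in seen[node]:
            seen[node].add(rec)
            unique.append(rec)
    return unique
```

Because identical records always hash to the same node, a duplicate check never needs to consult any other node, which is the source of the efficiency gain the paper describes.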

  • Conference Article
  • Cited by 6
  • 10.1109/mdm.2016.53
An Unsupervised Collaborative Approach to Identifying Home and Work Locations
  • Jun 1, 2016
  • Rong Liu + 3 more

There is a growing interest in leveraging geo-spatial data to provide location-aware services. With a large amount of collected geo-spatial data, a crucial step is to identify important locations (e.g., home or work) and understand users' behavior at these locations. In this paper, we propose an unsupervised collaborative learning approach to identifying home and work locations of individuals from geo-spatial trajectory data. Our approach transforms user trajectory records into intuitive and insightful user-location signatures, clusters these signatures, and then identifies location types based on cluster characteristics. This clustering model can be used to identify base locations for new users. We validate this approach using Open Street Map and Foursquare location tags and obtain an accuracy of 80%.

  • Research Article
  • 10.51489/tuzal.854252
How do Sentinel-1 SAR images match with the existing maps
  • Jun 15, 2021
  • Turkish Journal of Remote Sensing
  • Ali Kilçik + 2 more

Synthetic Aperture Radar (SAR) images are used in several different remote sensing applications. SAR is an imaging sensor that can produce high-resolution ground images under a wide variety of imaging conditions, and since SAR is an active system, the data are acquired with geo-position information. To verify the spatial accuracy of the imagery, part of the Antalya region of Turkey was selected as the test site, and the OpenStreetMap (OSM) and Photogrammetric Digital Map (PDM) data covering the Antalya O25 map sheet were used for comparison. First, characteristic common points were selected on both the OSM data and the SAR satellite image, and the projected coordinates of these points were calculated with the QGIS software. The coordinate differences between these data sets were plotted and confirmed to be normally distributed, and the standard deviation and 2 * standard deviation values were calculated; the bounds of the 95% confidence interval were determined from the standard deviation limit values. X and Y coordinate differences were calculated for 49 selected points from both image pairs, SAR&OSM and SAR&PDM. The maximum differences show that the SAR positional accuracy with respect to OSM and PDM is below 1 pixel of azimuthal resolution.
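The statistical check described, the standard deviation of the coordinate differences and an approximately 95% band at 2 sigma, can be sketched as (function name illustrative):

```python
import statistics

def accuracy_band(diffs):
    """Mean and ~95% interval (mean +/- 2 * sample std dev) of coordinate diffs."""
    mu = statistics.mean(diffs)
    sigma = statistics.stdev(diffs)  # sample standard deviation
    return mu - 2 * sigma, mu + 2 * sigma
```

For normally distributed differences, about 95% of matched points fall inside this band, which is the confidence interval the paper uses to judge positional accuracy.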

  • Research Article
  • 10.35784/jcsi.7110
The impact of using eBPF technology on the performance of networking solutions in a Kubernetes cluster
  • Jun 30, 2025
  • Journal of Computer Sciences Institute
  • Konrad Miziński + 1 more

The aim of this study was to investigate the impact of eBPF technology on the performance of networking solutions in Kubernetes clusters. Two configurations were compared: a traditional iptables-based setup and an eBPF-based solution using the Cilium networking plugin. Performance tests measured throughput, latency, CPU usage, and memory consumption under unloaded and loaded conditions. The results indicate that the traditional configuration achieved higher throughput and lower latency in unloaded scenarios; under load, however, the eBPF-enabled cluster demonstrated advantages, including reduced CPU and memory usage and slightly improved latency. This study highlights the potential of eBPF as an efficient technology for Kubernetes environments, particularly in scenarios demanding high performance and resource efficiency.

  • Conference Article
  • Cited by 6
  • 10.1145/3220228.3220236
A new data science framework for analysing and mining geospatial big data
  • Apr 20, 2018
  • Mo Saraee + 1 more

Geospatial Big Data analytics is changing the way businesses operate in many industries. Although a good number of research works have been reported in the literature on geospatial data analytics and real-time processing of large spatial data streams, only a few have addressed the full geospatial big data analytics project lifecycle and the geospatial data science project lifecycle. Big data analysis differs from traditional data analysis primarily in the volume, velocity, and variety of the data being processed, and one motivation for introducing a new framework is to address these challenges. Geospatial data science projects also differ from most traditional data analysis projects because they can be complex and require more advanced technologies; for this reason, it is essential to have a process that governs the project and ensures that the participants are competent enough to carry it out. To this end, this paper presents a new geospatial big data mining and machine learning framework covering geospatial data acquisition, fusion, storage, management, processing, analysis, visualization, modelling, and evaluation. A good process for data analysis and clear guidelines for comprehensive analysis are always a plus for any data science project; they also help to predict the required time and resources early in the process and to form a clear idea of the business problem to be solved.
