HierGP: Hierarchical Grid Partitioning for Scalable Geospatial Data Analytics

Abstract

Application domains such as environmental health science, climate science, and geosciences—where the relationship between humans and the environment is studied—are constantly evolving and require innovative approaches in geospatial data analysis. Recent technological advancements have led to the proliferation of high-granularity geospatial data, enabling such domains but posing major challenges in managing vast datasets that have high spatiotemporal similarities. We introduce the Hierarchical Grid Partitioning (HierGP) framework to address this issue. Unlike conventional discrete global grid systems, HierGP dynamically adapts to the data’s inherent characteristics. At the core of our framework is the Map Point Reduction algorithm, designed to aggregate and then collapse data points based on user-defined similarity criteria. This effectively reduces data volume while preserving essential information. The reduction process is particularly effective in handling environmental data from extensive geographical regions. We structure the data into a multilevel hierarchy from which a reduced representative dataset can be extracted. We compare the performance of HierGP against several state-of-the-art geospatial indexing algorithms and demonstrate that HierGP outperforms the existing approaches in terms of runtime, memory footprint, and scalability. We illustrate the benefits of the HierGP approach using two representative applications: analysis of over 289 million location samples from a registry of participants and efficient extraction of environmental data from large polygons. While the application demonstration in this work has focused on environmental health, the methodology of the HierGP framework can be extended to explore diverse geospatial analytics domains.
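The aggregate-then-collapse idea described in the abstract can be illustrated with a small sketch. This is not the authors' Map Point Reduction algorithm, whose details the abstract does not give. It assumes a quadtree-style hierarchy, points of the form (x, y, value), and a hypothetical similarity criterion in which a cell collapses when the spread of its values is within a user-chosen `tolerance`. The names `Cell`, `build`, and `reduce_points` are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (x, y, value); an assumed layout


@dataclass
class Cell:
    """One square cell of a hypothetical multilevel grid."""
    x0: float
    y0: float
    size: float
    level: int
    points: List[Point] = field(default_factory=list)
    children: List["Cell"] = field(default_factory=list)


def build(cell: Cell, tolerance: float, max_level: int) -> Cell:
    """Subdivide a cell until its values are similar (assumed criterion)."""
    if not cell.points:
        return cell
    values = [p[2] for p in cell.points]
    similar = max(values) - min(values) <= tolerance
    if similar or cell.level >= max_level or len(cell.points) == 1:
        return cell  # leaf: its points will collapse to one representative
    half = cell.size / 2
    buckets: Dict[Tuple[int, int], List[Point]] = {}
    for p in cell.points:
        # Quadrant index, clamped so points on the far edge stay inside.
        ix = min(int((p[0] - cell.x0) // half), 1)
        iy = min(int((p[1] - cell.y0) // half), 1)
        buckets.setdefault((ix, iy), []).append(p)
    for (ix, iy), pts in sorted(buckets.items()):
        child = Cell(cell.x0 + ix * half, cell.y0 + iy * half,
                     half, cell.level + 1, pts)
        cell.children.append(build(child, tolerance, max_level))
    return cell


def reduce_points(cell: Cell) -> List[Point]:
    """Extract one averaged representative point per leaf cell."""
    if not cell.children:
        if not cell.points:
            return []
        n = len(cell.points)
        return [(sum(p[0] for p in cell.points) / n,
                 sum(p[1] for p in cell.points) / n,
                 sum(p[2] for p in cell.points) / n)]
    reduced: List[Point] = []
    for child in cell.children:
        reduced.extend(reduce_points(child))
    return reduced
```

In this sketch, a region with nearly uniform values stays a single coarse cell, while a region with varied values is split further, so the size of the reduced dataset tracks how much the data actually vary rather than a fixed grid resolution.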

Similar Papers
  • Conference Article
  • Cite Count: 6
  • 10.1145/3220228.3220236
A new data science framework for analysing and mining geospatial big data
  • Apr 20, 2018
  • Mo Saraee + 1 more

Geospatial Big Data analytics are changing the way that businesses operate in many industries. Although a good number of research works have been reported in the literature on geospatial data analytics and real-time data processing of large spatial data streams, only a few have addressed the full geospatial big data analytics project lifecycle and geospatial data science project lifecycle. Big data analysis differs from traditional data analysis primarily due to the volume, velocity and variety characteristics of the data being processed. One motivation for introducing a new framework is to address these big data analysis challenges. Geospatial data science projects differ from most traditional data analysis projects because they can be complex and need more advanced technologies than traditional data analysis projects. For this reason, it is essential to have a process to govern the project and ensure that the project participants are competent enough to carry on the process. To this end, this paper presents a new geospatial big data mining and machine learning framework for geospatial data acquisition, data fusion, data storing, managing, processing, analysing, visualising, modelling, and evaluation. Having a good process for data analysis and clear guidelines for comprehensive analysis is always a plus point for any data science project. It also helps to predict required time and resources early in the process to get a clear idea of the business problem to be solved.

  • Research Article
  • Cite Count: 5
  • 10.1007/s41060-017-0075-9
A data mining framework for environmental and geo-spatial data analysis
  • Sep 30, 2017
  • International Journal of Data Science and Analytics
  • Sujing Wang + 1 more

Mining geo-spatial data is an important task in many application domains, such as environmental science, geographic information science, and social networks. In this paper, we introduce a data mining framework, which includes pre-processing of environmental and geo-spatial data, geo-spatial data mining techniques, and visual analysis of environmental and geo-spatial data. In particular, we propose new density-based clustering algorithms to identify interesting distribution patterns from geo-spatial data, a change pattern discovery technique to detect dynamic change patterns within spatial clusters, and a post-processing technique to extract interesting patterns and useful knowledge from geo-spatial data. Our density-based clustering algorithms are based on the well-established density-based shared nearest neighbor clustering algorithm, which can find clusters of different shape, size, and densities in high-dimensional data. The post-processing analysis technique allows automatic screening of interesting spatial clusters. The change pattern discovery algorithm is able to detect and analyze dynamic patterns of changes within spatial clusters. This paper focuses on developing a framework integrating a sequence of data mining process including clustering algorithm, analysis technique and pattern changing discovery algorithm. In contrast to previous works in this area, our approaches can cluster and analyze dynamically evolved complex objects, i.e., polygons. We evaluate the effectiveness of our techniques through a challenging real case study involving ozone pollution events in the Houston–Galveston–Brazoria area. The experimental results show that our approaches can discover interesting patterns and useful information from geo-spatial air-quality data.
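The shared nearest neighbor idea that this abstract builds on can be sketched briefly. The code below is not the authors' clustering algorithm. It assumes one common formulation in which two points are similar only if each is in the other's k-nearest-neighbor list, with similarity equal to the number of neighbors they share, and a point's density equal to the number of points whose similarity to it reaches a threshold `eps`. The function names and parameters are illustrative.

```python
from math import dist
from typing import List, Sequence, Set


def knn(points: Sequence[Sequence[float]], k: int) -> List[Set[int]]:
    """Return the index set of the k nearest neighbors of every point."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(neighbors: List[Set[int]], i: int, j: int) -> int:
    """Count shared neighbors, but only for mutual nearest neighbors."""
    if i in neighbors[j] and j in neighbors[i]:
        return len(neighbors[i] & neighbors[j])
    return 0


def snn_density(points: Sequence[Sequence[float]], k: int, eps: int) -> List[int]:
    """For each point, count the points whose SNN similarity is at least eps."""
    nbrs = knn(points, k)
    n = len(points)
    return [
        sum(1 for j in range(n) if j != i and snn_similarity(nbrs, i, j) >= eps)
        for i in range(n)
    ]
```

Because similarity is counted in shared neighbors rather than raw distance, a point far from any cluster gets a density of zero regardless of the distance scale, which is what lets this family of methods handle clusters of differing density.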

  • Research Article
  • Cite Count: 55
  • 10.1006/enrs.1995.1069
The Application of GIS in Environmental Health Sciences: Opportunities and Limitations
  • Nov 1, 1995
  • Environmental Research
  • U.S Tim


  • News Article
  • Cite Count: 1
  • 10.1289/ehp.114-a524
RTP Leaders Unite to Advance Environmental Health
  • Sep 1, 2006
  • Environmental Health Perspectives
  • Luz Claudio

When North Carolina’s Research Triangle Foundation provided 509 acres of land to the U.S. Surgeon General’s Office in 1967 as the site for the newly established Division of Environmental Health Sciences, the area was probably not foreseen as a hub for companies, institutions, and government agencies working on issues related to environmental health. Then, just two years later, the Division of Environmental Health Sciences was elevated to institute status to form the NIEHS. Since that time, the area now known as Research Triangle Park (RTP) has expanded into a nucleus of intellectual activity in environmental health sciences that includes the National Toxicology Program, the laboratories of the U.S. EPA, the CIIT Centers for Health Research, and environmental research programs at Duke University, the University of North Carolina–Chapel Hill, and North Carolina State University, among other institutions and nonprofit organizations. These organizations are now taking advantage of a unique opportunity to solidify RTP’s reputation as the epicenter for environmental health science research in the United States by creating a forum for discussion and debate of the important public health issues related to environment and health. Prominent individuals in the RTP community—including former North Carolina governor James Hunt, former NIEHS director Kenneth Olden, and William Roper, chief executive officer of the North Carolina Health Care System and former head of the CDC—have been working to bring thought leaders together on these issues in a new initiative that has been dubbed the Research Triangle Environmental Health Collaborative. The mission of the collaborative is to connect organizations and institutions; link research and policy; and join government, academia, industry, and public interest groups for the purpose of mutually considering, discussing, and debating the grand challenges in environmental health at the regional, national, and international levels. 
Says Olden, “When I came to the NIEHS many years ago, I realized the talent base we have here in RTP. The major environmental health research institutions are all here, the intellectual resources of the major research universities, and also the companies that have evolved around this. No place else in the world can boast this concentration of minds working on environmental public health issues. So we thought that it follows that if you can help to focus these talents in the areas where perhaps the most change can be effected, real progress might be made.”

  • News Article
  • 10.1289/ehp.112-a806
Mission: Educational
  • Oct 1, 2004
  • Environmental Health Perspectives
  • John Manuel


  • News Article
  • Cite Count: 1
  • 10.1289/ehp.113-a596
Beyond the Bench: Cultivating Environmental Leadership in the Midwest
  • Sep 1, 2005
  • Environmental Health Perspectives
  • Tanya Tillett

Today’s youth are the environmental health leaders of tomorrow. The Environmental Health Sciences Research Center (EHSRC) at The University of Iowa, in conjunction with its partner, the Belin–Blank International Center for Gifted Education and Talent Development (also a component of The University of Iowa), is helping some of these future leaders understand the environment and their role in it, with the goal of inspiring the next generation of environmental health advocates. Each summer since 1997, the two partners have joined forces to conduct the Environmental Health Sciences Institute for Rural Youth (EHSI), an intensive, full-scholarship, one-week residential experience for rising tenth-graders from small, rural Midwest communities. By giving high school students access to a wealth of environmental health information and helping them translate that information for dissemination to their own communities, the EHSI helps foster leadership qualities that will help them apply those skills to their future careers and their personal lives. Each summer the EHSI accepts about 15 students to the program, and houses them in student residence halls on the Iowa campus. According to David Osterberg, director of the EHSI, the primary goal of the program is to inspire students to consider the environmental health sciences as a possible future career. “We have students for a week, so we can aspire to do many things,” he says. “We help develop mentoring relationships between smart high school students and our scientists, expose students to cutting-edge research, show them a full range of environmental health topics, and give them some career options. 
I especially like to emphasize policy so students realize there are potential solutions to problems that impact the environment and human health.” Throughout the week, the students are exposed to information on environmental health and related research through lectures, interactive lab sessions, one-on-one mentoring, and field trips. In this year’s session, students attended lectures on such diverse topics as the relationship between cancer and the environment, nanotechnology in environmental health science, global climate change, and the connection between agriculture and health. The mentoring and lab sessions then give the students a first-hand glimpse of current research related to the lecture content. This summer’s lab activities included a pathology session in which students examined specimens of human organs to compare cancerous and healthy tissues. Another session focused on inhalation toxicology. Students dissected mouse lung tissue and examined the cells under a microscope to determine the effects of grain dust exposure on the lung. Afterwards, they watched a dust measurement and quantification demonstration in the EHSRC’s environmental modeling and assessment facilities. The program also exposes the students to initiatives taking place in the Iowa community that encourage environmental responsibility. One of this year’s field trips was a visit to the Amana Lily Pond, a wetland that has been planted with poplar trees to act as natural filters to prevent herbicides, insecticides, and fertilizers from entering the creek and emptying into the pond. The students also saw a demonstration by the Iowa Renewable Energy Association of the “solar traveler,” a mobile demonstration unit that produces electricity via solar power. 
Since public speaking is a crucial skill for scientists and public health workers, the program also includes a session on speaking in front of groups that helps the students improve their body language and voice projection to deliver an effective presentation. Once the summer session ends, each student chooses an environmental health science topic and organizes information learned over the week into a presentation that is delivered to a school group and to a community group in their hometown. The presentations, as well as the other aspects of the EHSI experience, give the students the opportunity to become environmental health science ambassadors who can potentially impact the lives of their families and neighbors. “We hope that in the course of EHSI Week, students gain an appreciation for environmental issues as well as their own personal stake in how these issues affect their health, their families, and their communities,” says Osterberg.

  • Research Article
  • 10.1289/ehp.118-a224
Dust: The Inside Story of Its Role in the September 11th Aftermath By Paul J. Lioy. New York: Rowman & Littlefield, 2010. 245 pp. ISBN: 978-1-4422-0148-4, $34.95
  • May 1, 2010
  • Environmental Health Perspectives
  • Timothy J Buckley


  • Front Matter
  • 10.1038/sj.jid.5700313
The Environment and Human Health
  • May 1, 2006
  • Journal of Investigative Dermatology
  • David A Schwartz


  • Research Article
  • 10.1289/ehp.115-a24
The Chosen ONES: Awards Fund Young Investigators
  • Jan 1, 2007
  • Environmental Health Perspectives
  • Ernie Hood

A major goal of the 2006 NIEHS Strategic Plan encompasses the institute’s desire to “recruit and train the next generation of environmental health scientists.” To begin to achieve that goal, the NIEHS has unveiled a new annual grants program called the Outstanding New Environmental Scientist (ONES) Award. The five-year grants are designed to identify, encourage, inspire, and support outstanding investigators early in their careers, who have not yet received their first R01 grant. The first ONES grants, totaling $3.6 million, were awarded in September 2006 to eight promising young scientists chosen from more than 70 applicants through a rigorous application, review, and interview process. The program is the brainchild of NIEHS director David Schwartz, who has been concerned for some time about the loss of promising young scientific talent from the field for lack of support. “As a faculty member at Duke,” he says, “I found that the individuals who were particularly vulnerable in terms of their career development were those at that transitional stage between mentored and independent research, and that many very bright, creative people simply were not supported in ways that enhanced their career development.” Schwartz says the awards are also intended to help attract innovative young investigators to the NIEHS and the environmental health sciences, as well as to support the institutions that are helping new scientists develop their careers. The program’s long-term impact on the field, in terms of both the science and the scientists, could be significant. “These individuals represent very promising early career trajectories that are likely to have a substantial effect on environmental health sciences, and hopefully will evolve into the leaders in the field in the future,” says Schwartz. To ease that tricky early-career transition, ONES grantees are encouraged to establish and meet annually with an advisory committee comprising senior experts in their disciplines. 
According to Pat Mastin, chief of the NIEHS Cellular, Organ, and Systems Pathobiology Branch, who helped coordinate the initial ONES process, the grants represent a hybrid between mentored career development awards and independent R01 grants. “We think the young investigators should continue to be mentored,” Mastin says. “So we encouraged them to identify not a specific mentor, but an advisory committee, to give not only scientific advice but also career path advice.” The grantees recognize and appreciate the value of this hybrid approach to mentoring. “It gives us access to people we wouldn’t normally be interacting with,” says ONES grantee Thomas Begley, an assistant professor in the Department of Biomedical Sciences at the University at Albany State University of New York. “Having a mechanism to ensure that will promote good science on my end, and also will help me network with others in the field.” Grantee Patricia Opresko, an assistant professor in the Department of Environmental and Occupational Health at the University of Pittsburgh, agrees. “The grant has funds that will allow the four investigators on my advisory committee to come to Pittsburgh and meet with me once a year to focus on my project and offer their ideas, insight, input, and criticisms,” she says. “It adds an additional layer of mentoring that is really critical for a young investigator’s development.”

  • News Article
  • Cite Count: 12
  • 10.1289/ehp.114-a350
Beyond the Bench: Bringing EXCITEment to the Classroom
  • Jun 1, 2006
  • Environmental Health Perspectives
  • Tanya Tillett

Who are the scientists, public health officials, and policy makers who will monitor our relationship with the environment 20 years from now? Right now a lot of them are students in middle and high schools throughout the country. And it’s a certainty that these future stakeholders will need to develop the diversity of skills required to tackle the complex issues that arise where environmental and human health intersect––skills that go beyond the practice of simple classroom science experiments. Answering this call to train is Project EXCITE (Environmental Health Science Explorations through Cross-Disciplinary and Investigative Team Experiences), an NIEHS-supported program at Bowling Green State University (BGSU) in Ohio.

  • Abstract
  • 10.5210/ojphi.v11i1.9772
Tracking environmental hazards and health outcomes to inform decision-making in the United States
  • May 30, 2019
  • Online Journal of Public Health Informatics
  • Heather Strosnider + 4 more

Objective: To increase the availability and accessibility of standardized environmental health data for public health surveillance and decision-making. Introduction: In 2002, the United States (US) Centers for Disease Control and Prevention (CDC) launched the National Environmental Public Health Tracking Program (Tracking Program) to address the challenges in environmental health surveillance described by the Pew Environmental Commission (1). The report cited gaps in our understanding of how the environment affects our health and attributed these gaps to a dearth of surveillance data for environmental hazards, human exposures, and health effects. The Tracking Program’s mission is to provide information from a nationwide network of integrated health and environmental data that drives actions to improve the health of communities. Accomplishing this mission requires a range of expertise from environmental health scientists to programmers to communicators employing the best practices and latest technical advances of their disciplines. Critical to this mission, the Tracking Program must identify and prioritize what data are needed, address any gaps found, and integrate the data into the network for ongoing surveillance. Methods: The Tracking Program identifies important environmental health topics with data challenges based on the recommendations in the Pew Commission report as well as input from federal, state, territorial, tribal, and local partners. For each topic, the first step is to formulate the key surveillance question, which includes identifying the decision-maker or end user. Next, available data are evaluated to determine if the data can answer the question and, if not, what enhancements or new data are needed. Standards are developed to establish data requirements and to ensure consistency and comparability. Standardized data are then integrated into the network at national, state, and local levels. 
Standardized measures are calculated to translate the data into the information needed. These measures are then publicly disseminated via national, state, and local web-based portals. Data are updated annually or as they are available and new data are added regularly. All data undergo a multi-step validation process that is semi-automated, routinized, and reproducible. Results: The first set of nationally consistent data and measures (NCDM) was released in 2008 and covered 8 environmental health topics. Since then the NCDM have grown to cover 14 topics. Additional standardized data and measures are integrated into the national network resulting in 23 topics with 450 standardized measures (Figure). On the national network, measures can be queried via the Data Explorer, viewed in the info-by-location application, or connected to via the network’s Application Program Interface (API). On average, 15,000 and 3300 queries are run every month on the Data Explorer and the API respectively. Additional locally relevant data are available on state and local tracking networks. Gaps in data have been addressed through standards for new data collections, models to extend available data, new methodologies for using existing data, and expansion of the utility of non-traditional public health data. For example, the program has collaborated with the Environmental Protection Agency to develop daily estimates of fine particulate matter and ozone for every county in the conterminous US and to develop the first national database of standardized radon testing data. The program also collaborated with the National Aeronautics and Space Administration and its academic partners to transform satellite data into data products for public health. The Tracking Program has analyzed the data to address important gaps in our understanding of the relationship between negative health outcomes and environmental hazards. 
Data have been used in epidemiologic studies to better quantify the association between fine particulate matter, ozone, wildfire smoke, and extreme heat on emergency department visits and hospitalizations. Results are translated into measures of health burden for public dissemination and can be used to inform regulatory standards and public health interventions. Conclusions: The scope of the Tracking Program’s mission and the volume of data within the network require the program to merge traditional public health expertise and practices with current technical and scientific advances. Data integrated into the network can be used to (1) describe temporal and spatial trends in health outcomes and potential environmental exposures, (2) identify populations most affected, (3) generate hypotheses about associations between health and environmental exposures, and (4) develop, guide, and assess the environmental public health policies and interventions aimed at reducing or eliminating health outcomes associated with environmental factors. The program continues to expand the data within the network and the applications deployed for others to access the data. Current data challenges include the need for more temporally and spatially resolved data to better understand the complex relationships between environmental hazards, health outcomes, and risk factors at a local level. National standards are in development for systematically generating, analyzing, and disseminating small area data and real-time data that will allow for comparisons between different datasets over geography and time.

  • Research Article
  • Cite Count: 13
  • 10.1289/ehp4067
Environmental Health Sciences in a Translational Research Framework: More than Benches and Bedsides
  • Apr 1, 2019
  • Environmental Health Perspectives
  • Joel D Kaufman + 1 more

Background: Environmental health scientists may find it challenging to fit the structure of the questions addressed in their discipline into the prevailing paradigm for translational research. Objective: We aim to frame the translational science paradigm to address the stages of scientific discovery, knowledge acquisition, policy development, and evaluation in a manner relevant to the environmental health sciences. Our intention is to characterize differences between environmental health sciences and clinical medicine, and to orient this effort towards public health goals. Discussion: Translational research is usually understood to have evolved from the bench-to-bedside framework by which basic science transitions to clinical treatment. Although many health-related fields have incorporated the terminology and context of translational science, environmental health research has not always found a clear fit into this paradigm. We describe a translational research framework applicable to environmental health sciences that retains the basic structure that underlies the original bench-to-bedside paradigm. We propose that scientific discovery (T1) in environmental health research frequently occurs through epidemiological or clinical observations. This discovery often involves understanding the potential for human health effects of exposure to a given environmental chemical or chemicals. The practical applications of this discovery evolve through an understanding of exposure–response relationships (T2) and identification of potential interventions to reduce exposure and improve health (T3). These stages of translation require an interdisciplinary partnership between exposure sciences, exposure biology, toxicology, epidemiology, biostatistics, risk assessment, and clinical sciences. Implementation science then plays a crucial role in the development of environmental and public health practice and policy interventions (T4). 
Outcome evaluation (T5) often takes the form of accountability research, as environmental health scientists work to quantify the costs and benefits of these interventions. Conclusion: We propose an easily visualized framework for translation of environmental health science knowledge–from discovery to public health practice–that reflects the crucial interactions between multiple disciplines in our field. https://doi.org/10.1289/EHP4067

  • Research Article
  • Cite Count: 4
  • 10.1089/env.2012.0014
Assessment and Impact of a Summer Environmental Justice and Health Enrichment Program: A Model for Pipeline Development
  • Dec 1, 2012
  • Environmental Justice
  • Sacoby M Wilson + 4 more

The need for more enrichment programs for underrepresented groups in the health sciences, particularly the environmental health sciences, is considerable. The implications of chronic racial/ethnic differences in scientific training are best illustrated by the disproportionate number of health professionals from underrepresented groups. For example, African Americans comprise 13% of the total U.S. population but only account for 4% of U.S. physicians, and in 2006, only 7.2% of all bachelor's degrees and 8.6% of all master's degrees awarded to African Americans were in the health sciences field. In an effort to increase the representation of persons of color in the health sciences, we used our existing community-university partnership between the Low Country Alliance for Model Communities, the University of South Carolina, and the University of Maryland-College Park as the basis for a summer enrichment pilot project. The major aim was to provide academic experiences for underrepresented undergraduate and graduate students in the environmental health sciences, specifically on how to perform scientific research in environmental health sciences, learn about environmental justice and public health issues, and work closely with environmental health researchers on community-based participatory research projects. We set out to evaluate the process and feasibility of the pilot, and several themes emerged from our qualitative inquiry: 1) need for strengthening research skills at the collegiate level; 2) lack of knowledge of environmental justice and environmental health issues; 3) need for practical experiences within the community; and 4) expansion of the project beyond the summer.

  • Book Chapter
  • Cite Count: 2
  • 10.4018/979-8-3693-6381-2.ch011
Big Data Analytics for Geospatial Application Using Python
  • May 10, 2024
  • Assefa Senbato Genale

Numerous organizations regularly produce enormous volumes of geospatial data due to the widespread use of sensors and location-based services. However, collecting, storing, managing, exploring, analyzing, and visualizing geospatial data has traditionally been a complex and time-consuming task. This study proposed a big data analytics approach to collect, store, manage, explore, process, and analyze massive amounts of geospatial data. A comprehensive literature review, various Python libraries for geospatial big data, challenges in geospatial big data analytics, and big data analytics techniques such as spatial clustering, spatial regression analysis, and spatial-temporal analysis were presented. In addition, geospatial big data analytics algorithms like K-means clustering, ordinary least squares (OLS), geographically weighted regression (GWR), spatio-temporal clustering algorithms, spatio-temporal regression models, and others were discussed. Finally, case studies on performing geospatial big data analytics using PySpark were addressed.

  • Research Article
  • Cite Count: 4
  • 10.1016/j.aej.2023.08.078
An interconnected IoT-inspired network architecture for data visualization in remote sensing domain
  • Sep 11, 2023
  • Alexandria Engineering Journal
  • Sunil K Panigrahi + 6 more

Geospatial Data Analytics (GDA) is a futuristic platform for analyzing and processing volumetric data in remote sensing and GIS applications. GDA utilizes the Internet of Spatial Things (IoST), mist, fog, and cloud computing architecture as a backend tool for analyzing and processing big geospatial data. This paper introduces utilizing these interconnected network architectures of cloud, fog, and mist to process a large volume of geospatial data. Also, the paper presents a flexible, interconnected distributed network system, i.e., IoST-mist-fog-cloud GIS architecture, to analyze and manage geospatial data. The proposed system helps cloud platforms when MIST devices are trying to cut down on latency and boost throughput at the edge of the IoST tier. It also performs the geospatial crime data visualization of the total number of stolen vehicles from 2001 to 2011 from all the states of India as a case study by using the proposed model. It explains the mathematical and analytical queueing model of the proposed system. In addition, it performs a performance evaluation and experimental findings on the proposed architecture and uses graphs to represent the various arithmetic outcomes. The experimentation result proves the proposed interconnected network architecture's efficacy in terms of reliability and efficiency.
