Mapping Research Trends with the CoLiRa Framework: A Computational Review of Semantic Enrichment of Tabular Data
This article introduces the CoLiRa (Computational Literature Review & Analysis) framework, a novel integration of established computational algorithms designed to quantitatively analyze and map the evolution of scientific fields. Employing a human-in-the-loop epistemological approach, CoLiRa combines the scalability of automated algorithms with the semantic coherence of expert-driven qualitative research. The multi-stage pipeline incorporates Latent Dirichlet Allocation (LDA) for thematic discovery, cluster analysis (K-Means and Multidimensional Scaling) for conceptual mapping, and Ordinary Least Squares (OLS) regression to monitor temporal trends. Algorithmic outputs are structurally validated by domain experts using quantitative metrics. The framework’s end-to-end capabilities are demonstrated through a proof-of-concept case study on the semantic enrichment of tabular data, encompassing studies up to 2024 that utilize Semantic Web ontologies, Linked Data, and knowledge graphs. The analysis identifies three core research topics and finds no statistically significant linear trends, suggesting thematic coexistence. This work provides a validated, hybrid computational approach for conducting robust literature reviews and mapping research trajectories.
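The trend-monitoring stage described above can be illustrated with a minimal sketch. It assumes that each topic is summarised as one prevalence value per publication year and that a simple linear model is fitted to those values; the function name `ols_trend` and the sample numbers are hypothetical and are not taken from the CoLiRa implementation.

```python
import math

def ols_trend(years, shares):
    """Fit shares = intercept + slope * year by ordinary least squares.

    Returns (slope, intercept, standard error of the slope).
    """
    n = len(years)
    if n < 3 or n != len(shares):
        raise ValueError("need at least three paired observations")
    mean_x = sum(years) / n
    mean_y = sum(shares) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, shares))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(years, shares)]
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    residual_variance = sum(r * r for r in residuals) / (n - 2)
    se_slope = math.sqrt(residual_variance / sxx)
    return slope, intercept, se_slope

# Made-up yearly prevalence of one topic, for illustration only.
years = [2019, 2020, 2021, 2022, 2023, 2024]
shares = [0.30, 0.34, 0.29, 0.33, 0.31, 0.32]

slope, intercept, se_slope = ols_trend(years, shares)
t_stat = slope / se_slope
# Two-sided 5% critical value of Student's t with 4 degrees of freedom.
T_CRIT_DF4 = 2.776
print("significant linear trend:", abs(t_stat) > T_CRIT_DF4)
```

With these invented values the slope is small relative to its standard error, which is the kind of outcome the abstract describes as "no statistically significant linear trends".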
- Research Article
- 10.1609/aaai.v28i1.8780
- Jun 21, 2014
- Proceedings of the AAAI Conference on Artificial Intelligence
This thesis seeks to address word reasoning problems from a semantic standpoint, proposing a uniform approach for generating solutions while also providing human-understandable explanations. Current state-of-the-art solvers of semantic problems rely on traditional machine learning methods. Therefore, their results are not easily reusable by algorithms or interpretable by humans. We propose leveraging web-scale knowledge graphs to determine a semantic frame of interpretation. Semantic knowledge graphs are graphs in which nodes represent concepts and the edges represent the relations between them. Our approach has the following advantages: (1) it reduces the space in which the problem is to be solved; (2) sparse and noisy data can be used without relying only on the relations deducible from the data itself; (3) the output of the inference algorithm is supported by an interpretable justification. We demonstrate our approach in two domains: (1) Topic Modeling: We form topics using connectivity in semantic graphs. We use the same topic models for two very different recommendation systems, one designed for high-noise interactive applications and the other for large amounts of web data. (2) Analogy Solving: For humans, analogies are a fundamental reasoning pattern, which relies on abstraction and comparative analysis. In order for an analogy to be understood, precise relations have to be identified and mapped. We introduce graph algorithms to assess the analogy strength in contexts derived from the analogy words. We demonstrate our approach by solving standardized test analogy questions.
- Research Article
1
- 10.3897/biss.3.37412
- Jun 26, 2019
- Biodiversity Information Science and Standards
The landscape of currently existing repositories of specimen data consists of isolated islands, with each applying its own underlying data model. Using standardized protocols such as DarwinCore or ABCD, specimen data and metadata are exchanged and published on web portals such as GBIF. However, data models differ across repositories. This can lead to problems when comparing and integrating content from different systems. For example, in one system there is a field with the label 'determination', in another there is a field with the label 'taxonomic identification'. Both might refer to the same concept of an organism identification process (e.g., 'obi:organism identification assay'; http://purl.obolibrary.org/obo/OBI_0001624), but the intuitive meaning of the content is not clear and the understanding of the providers of the information might differ from that of the users. Without additional information, data integration across isolated repositories is thus difficult and error-prone. As a consequence, interoperability and retrievability of data across isolated repositories are difficult. Linked Open Data (LOD) promises an improvement. URIs can be used for concepts that are ideally created and accepted by a community and that provide machine-readable meanings. LOD thereby supports transfer of data into information and then into knowledge, thus making the data FAIR (Findable, Accessible, Interoperable, Reusable; Wilkinson et al. 2016). Annotating specimen-associated data with LOD, therefore, seems to be a promising approach to guarantee interoperability across different repositories. However, all currently used specimen collection management systems are based on relational database systems, which lack semantic transparency and thus do not provide easily accessible, machine-readable meanings for the terms used in their data models. As a consequence, transferring their data contents into an LOD framework may lead to loss or misinterpretation of information.
This discrepancy between LOD and relational databases results from the lack of semantic transparency and machine-readability of data in relational databases. Storing specimen collection data as semantic Knowledge Graphs provides semantic transparency and machine-readability of data. Semantic Knowledge Graphs are graphs that are based on the syntax of ‘Subject – Property – Object’ of the Resource Description Framework (RDF). The ‘Subject’ and ‘Property’ positions are taken by URIs and the ‘Object’ position can be taken either by a URI or by a label or value. Since a given URI can take the ‘Subject’ position in one RDF statement and the ‘Object’ position in another RDF statement, several RDF statements can be connected to form a directed labeled graph, i.e. a semantic graph. Semantic Knowledge Graphs are graphs in which each described specimen and its parts and properties possess their own URI and thus can be individually referenced. These URIs are used to describe the respective specimen and its properties using the RDF syntax. Additional RDF statements specify the ontology class that each part and property instantiates. The reference to the URIs of the instantiated ontology classes guarantees the Findability, Interoperability, and Reusability of information contained in semantic Knowledge Graphs. Specimen collection data contained in semantic Knowledge Graphs can be made Accessible in a human-readable form through an interface and in a machine-readable form through a SPARQL endpoint (https://en.wikipedia.org/wiki/SPARQL). As a consequence, semantic Knowledge Graphs comply with the FAIR guiding principles. By using URIs for the semantic Knowledge Graph of each specimen in the collection, it is also available as LOD. With semantic Morph·D·Base, we have implemented a prototype of this approach that is based on Semantic Programming. We present the prototype and discuss different aspects of how specimen collection data are handled.
By using community created terminologies and standardized methods for the contents created (e.g. species identification) as well as URIs for each expression, we make the data and metadata semantically transparent and communicable. The source code for Semantic Programming and for semantic Morph·D·Base is available from https://github.com/SemanticProgramming. The prototype of semantic Morph·D·Base can be accessed here: https://proto.morphdbase.de.
- Research Article
2
- 10.3897/biss.3.37205
- Jun 19, 2019
- Biodiversity Information Science and Standards
Currently, morphological data and metadata are still mostly published as unstructured free texts, which lack semantic transparency, cannot be parsed by computers, and do not comply with the FAIR (Findable, Accessible, Interoperable, Reusable; Wilkinson et al. 2016) data principles, thus hampering their reuse by non-experts and their integration across many fields in the life sciences. With an ever-increasing number of available ontologies and the development of adequate semantic technology, however, a solution to this problem becomes available. Instead of free text descriptions, morphological data and metadata can be recorded, stored, and communicated through the Web in the form of Resource Description Framework (RDF) triple statements that use the ‘Subject – Property – Object’ syntax of RDF and URIs of ontology classes and properties as well as URIs for individual entities as terminology. Since a given URI can take the ‘Subject’ position in one and the ‘Object’ position in another RDF statement, several triples can be linked to form a highly formalized and structured directed graph (semantic graph). After introducing an instance-based approach of recording morphological descriptions and their accompanying metadata as semantic knowledge graphs (i.e. Anatomy Knowledge Graphs), we propose a knowledge graph template pattern for each type of anatomical observation and a pattern for documenting metadata. The use of template patterns for knowledge graphs provides Interoperability and Reusability of comparable anatomical observations and of their accompanying metadata and a means to meaningfully visualize information contained in semantic graphs in a user-friendly HTML representation. Stored in a tuple store, Anatomy Knowledge Graphs become Findable and Accessible through the store’s SPARQL endpoint. As a consequence, anatomy data and metadata documented as Anatomy Knowledge Graphs in a tuple store are FAIR.
Finally, we suggest a general scheme of how to efficiently organize Anatomy Knowledge Graphs in a tuple store framework based on instances of named graphs, with each individual named graph instantiating an ontology class that relates to a particular type of observation (e.g., weight measurement named graph class). A named graph is a fourth element in an RDF statement (‘Subject – Property – Object – Named Graph’), turning the triple into a quadruple. All RDF statements that share the same URI in the ‘Named Graph’ position belong to the same named graph. The use of named graph resources allows meaningful fragmentation of the contents of an Anatomy Knowledge Graph (Fig. 1), which in turn enables subsequent specification of all kinds of data views for managing and accessing morphological data and metadata. This scheme has been implemented in the description module of the prototype for semantic Morph∙D∙Base.
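The quadruple idea can be shown with a small sketch: statements that share the same URI in the fourth position are collected into one named graph. The URIs and property names below are hypothetical placeholders, not identifiers from the Morph∙D∙Base prototype, and the grouping function is an illustration rather than how a tuple store implements named graphs internally.

```python
from collections import defaultdict

EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Each statement is (subject, property, object, named graph).
quads = [
    (EX + "specimen1", EX + "hasWeightInMg", "12.5", EX + "weightMeasurement1"),
    (EX + "weightMeasurement1_device", EX + "hasLabel", "balance A", EX + "weightMeasurement1"),
    (EX + "specimen1", EX + "hasLengthInMm", "2.3", EX + "lengthMeasurement1"),
]

def group_by_named_graph(statements):
    """Collect the triples of every named graph: graph URI -> list of triples."""
    graphs = defaultdict(list)
    for subject, prop, obj, graph_uri in statements:
        graphs[graph_uri].append((subject, prop, obj))
    return dict(graphs)

named_graphs = group_by_named_graph(quads)
weight_view = named_graphs[EX + "weightMeasurement1"]
```

Selecting one key of `named_graphs` yields exactly the fragment that belongs to a single observation, which is the kind of fragmentation the abstract relies on for defining data views.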
- Conference Article
- 10.18653/v1/2023.newsum-1.10
- Jan 1, 2023
Recent work within the Argument Mining community has shown the applicability of Natural Language Processing systems for solving problems found within competitive debate. One of the most important tasks within competitive debate is for debaters to create high quality debate cases. We show that effective debate cases can be constructed using constrained shortest path traversals on Argumentative Semantic Knowledge Graphs. We study this potential in the context of a type of American Competitive Debate, called "Policy Debate", which already has a large scale dataset targeting it called "DebateSum". We significantly improve upon DebateSum by introducing 53180 new examples, as well as further useful metadata for every example, to the dataset. We leverage the txtai semantic search and knowledge graph toolchain to produce and contribute 9 semantic knowledge graphs built on this dataset. We create a unique method for evaluating which knowledge graphs are better in the context of producing policy debate cases. A demo which automatically generates debate cases, along with all other code and the Knowledge Graphs, are opensourced and made available to the public here: https://huggingface.
- Research Article
1
- 10.1108/bij-03-2024-0208
- Jul 11, 2024
- Benchmarking: An International Journal
Purpose This study aims to contribute to the ongoing assessment of executive compensation by investigating the nexus between managerial entrenchment factors, adopting a multifaceted perspective encompassing both economic and non-economic dimensions. Design/methodology/approach This research employs pooled cross-sectional Ordinary Least Squares (OLS) regression and Least Squares with Dummy Variables (LSDV) models with fixed effects to examine the determinants of Chief Executive Officer (CEO) compensation. Findings This research identifies firm size, performance (via ROA and Tobin’s Q), and CEO characteristics (age, tenure, stock ownership, MBA degree) as significant determinants of executive compensation at the 0.05 level. In contrast, the prestige of educational institutions, doctoral degrees, and the MBA’s relevance to short-term performance, along with CEO tenure, do not significantly affect pay. Additionally, the study highlights the significance of industry type (manufacturing vs technology) in shaping compensation, emphasizing the role of firm metrics and CEO credentials in designing executive pay packages. Originality/value This research introduces an innovative approach to controlling unobserved heterogeneity and adjusting for the dynamic nature of CEO compensation attributes across diverse CEO characteristics. By integrating both pooled Ordinary Least Squares (OLS) and Least Squares Dummy Variable (LSDV) models, the study addresses the challenges posed by time-invariant variables and unobservable heterogeneity. Such issues have historically skewed the accuracy of traditional OLS models in identifying the comprehensive array of factors—both economic and non-economic—that influence CEO compensation. 
This novel methodological framework significantly advances the examination of unobservable variables that may vary not only across the firms selected for analysis but also over time periods, thereby offering a more detailed understanding of the determinants of CEO pay.
- Research Article
3
- 10.56065/ijuev2022.66.3-4.198
- Dec 1, 2022
- Izvestiya Journal of the University of Economics – Varna
Financial inclusion involves reducing the unbanked population through a series of activities that enhance participation in the financial system. The objective of this study is to examine financial inclusion and its implications for the growth of small and medium-sized enterprises (SMEs) in Nigeria from 1992 to 2020, using data obtained from the Central Bank of Nigeria (CBN) Statistical Bulletin. The study used the classical linear regression model, estimated with Ordinary Least Squares (OLS) and Dynamic Ordinary Least Squares (DOLS), to analyse the data. The outcome of the analysis revealed that the growth of SMEs in Nigeria is positively and significantly influenced by financial inclusion. The findings further revealed that government needs to step up efforts to ensure that all banking services reach everyone at affordable fees, regardless of income group and location.
- Research Article
38
- 10.3390/su131910856
- Sep 29, 2021
- Sustainability
Facing the big data wave, this study applied artificial intelligence to cite knowledge and find a feasible process to play a crucial role in supplying innovative value in environmental education. Intelligence agents of artificial intelligence and natural language processing (NLP) are two key areas leading the trend in artificial intelligence; this research adopted NLP to analyze the research topics of environmental education research journals in the Web of Science (WoS) database during 2011–2020 and interpret the categories and characteristics of abstracts for environmental education papers. The corpus data were selected from abstracts and keywords of research journal papers, which were analyzed with text mining, cluster analysis, latent Dirichlet allocation (LDA), and co-word analysis methods. The decisions regarding the classification of feature words were determined and reviewed by domain experts, and the associated TF-IDF weights were calculated for the following cluster analysis, which involved a combination of hierarchical clustering and K-means analysis. The hierarchical clustering and LDA decided the number of required categories as seven, and the K-means cluster analysis classified the overall documents into seven categories. This study utilized co-word analysis to check the suitability of the K-means classification, analyzed the terms with high TF-IDF weights for distinct K-means groups, and examined the terms for different topics with the LDA technique. A comparison of the results demonstrated that most categories that were recognized with K-means and LDA methods were the same and shared similar words; however, two categories had slight differences. The involvement of field experts assisted with the consistency and correctness of the classified topics and documents.
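The TF-IDF weighting that feeds the cluster analysis can be sketched as follows. The abstract does not state which TF-IDF variant was used, so this sketch assumes one common textbook form (relative term frequency multiplied by the natural logarithm of the inverse document frequency); the function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenised documents.

    tf  = occurrences of the term in the document / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Toy tokenised abstracts, invented for illustration.
docs = [
    ["climate", "education", "policy"],
    ["climate", "curriculum"],
    ["climate", "policy", "policy"],
]
weights = tf_idf(docs)
```

A term that occurs in every document, such as "climate" here, receives a weight of zero under this variant, while terms concentrated in few documents are weighted up; those weights are what a subsequent K-means step would cluster on.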
- Research Article
- 10.12681/cclabs.9705
- Feb 24, 2026
- Ετήσιο Ελληνόφωνο Επιστημονικό Συνέδριο Εργαστηρίων Επικοινωνίας
The fifth industrial revolution is characterized by the rise of innovative technologies and the ongoing collaboration between humans and machines. In this context, journalism increasingly integrates Artificial Intelligence (AI) technologies and Semantic Web (SW) services to enhance the efficiency of news production. Accordingly, Semantic Knowledge Graphs have become key assets, enabling access to diverse data sources used in investigative reporting, yet their relationship with other journalistic forms, such as data journalism, has not been adequately examined. This paper aims to examine the relationship between data journalism and semantic knowledge graphs, focusing on their applications in news production and highlighting emerging opportunities for innovation in communication. The methodology is based on the analysis of two semantic knowledge graphs (case studies) used by media organizations that specialize in data journalism. The findings reveal that both case studies are applied in news production to access, organize, and analyze diverse data sources, uncovering hidden patterns and connections within complex information. They also highlight how both knowledge graphs leverage artificial intelligence to drive innovation in communication, enhancing the organization and presentation of complex data through automated categorization and visualization. Lastly, the discussion provides research directions based on the study’s analysis and findings.
- Book Chapter
5
- 10.5772/intechopen.92433
- Sep 16, 2020
Technologies of knowledge representation, inductive reasoning, and semantic annotation methods are considered in relation to knowledge graphs that are focused on the domain of nuclear physics and nuclear power engineering. Interactive visual navigation and inductive reasoning in knowledge graphs are performed using special search widgets and an intelligent RDF browser. As a toolkit for ontologies refinement and enrichment, a software agent for the context-sensitive searching for new knowledge in the WWW is presented. In order to evaluate the measure of compliance of the found content with respect to a specific domain, the binary Pareto relation and Levenshtein metrics are used. The proposed semantic annotation methods allow the knowledge engineer to calculate the measure of the proximity of an arbitrary network resource in relation to classes and objects of specific knowledge graphs. Operations with remote semantic repositories are implemented on cloud platforms using SPARQL queries and RESTful services. The proposed software solutions are based on cloud computing using DBaaS and PaaS service models to ensure scalability of data warehouses and network services. Examples of using the proposed technologies and software are given.
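The Levenshtein metric mentioned above counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is the standard dynamic-programming formulation; the normalised `similarity` helper is a hypothetical convenience added here, since the abstract does not say how the distance is converted into a measure of compliance.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]

def similarity(a, b):
    """Hypothetical normalisation to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

distance = levenshtein("kitten", "sitting")
```

Only two rows of the dynamic-programming table are kept at a time, so memory use grows with the length of the second string rather than with the product of both lengths.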
- Research Article
37
- 10.1016/j.eswa.2023.120955
- Jul 8, 2023
- Expert Systems with Applications
Food security is currently a major concern due to the growing global population, the exponential increase in food demand, the deterioration of soil quality, the occurrence of numerous diseases, and the effects of climate change on crop yield. Sustainable agriculture is necessary to solve this food security challenge. Disruptive technologies such as artificial intelligence, especially deep learning techniques, can contribute to agricultural sustainability. For example, applying deep learning techniques for early disease classification allows us to take timely action, thereby helping to increase the yield without inflicting unnecessary environmental damage, such as excessive use of fertilisers or pesticides. Several studies have been conducted on agricultural sustainability using deep learning techniques and also semantic web technologies such as ontologies and knowledge graphs. However, three major challenges remain: (i) the lack of explainability of deep learning-based systems (e.g. disease information), especially to non-experts like farmers; (ii) a lack of contextual information (e.g. soil or plant information) and domain-expert knowledge in deep learning-based systems; and (iii) the lack of pattern learning ability of systems based on the semantic web, despite their ability to incorporate domain knowledge. Therefore, this paper presents work on disease classification that addresses these challenges by combining deep learning and semantic web technologies, namely ontologies and knowledge graphs.
The findings are: (i) 0.905 (90.5%) prediction accuracy on a large noisy dataset; (ii) the ability to generate user-level explanations about disease and to incorporate contextual and domain knowledge; (iii) an average prediction latency of 3.8514 s on 5268 samples; (iv) 95% of users finding the explanation of the proposed method useful; and (v) 85% of users being able to understand generated explanations easily. These findings show that the proposed method is superior to the state-of-the-art in terms of performance and explainability and is also suitable for real-world scenarios.
- Conference Article
22
- 10.1109/dsaa.2016.51
- Oct 1, 2016
This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between each pair of nodes and their corresponding edge, enabling edges to materialize dynamically from underlying corpus statistics. As a result, any combination of nodes can have edges to any other nodes materialize and be scored to reveal latent relationships between the nodes. This provides numerous benefits: the knowledge graph can be built automatically from a real-world corpus of data, new nodes - along with their combined edges - can be instantly materialized from any arbitrary combination of preexisting nodes (using set operations), and a full model of the semantic relationships between all entities within a domain can be represented and dynamically traversed using a highly compact representation of the graph. Such a system has widespread applications in areas as diverse as knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendations systems. The main contribution of this paper is the introduction of a novel system - the Semantic Knowledge Graph - which is able to dynamically discover and score interesting relationships between any arbitrary combination of entities (words, phrases, or extracted concepts) through dynamically materializing nodes and edges from a compact graphical representation built automatically from a corpus of data representative of a knowledge domain. The source code for our Semantic Knowledge Graph implementation is being published along with this paper to facilitate further research and extensions of this work.
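The core idea of materialising edges from postings lists can be sketched with plain Python sets. This is an illustration under stated assumptions, not the published implementation: the toy documents are invented, whitespace tokenisation stands in for a real analysis chain, and Jaccard overlap is swapped in as a simple stand-in for the paper's relatedness scoring.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term (node) to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def edge(index, term_a, term_b):
    """Materialise an edge on demand: documents shared by both postings lists."""
    return index.get(term_a, set()) & index.get(term_b, set())

def relatedness(index, term_a, term_b):
    """Stand-in score (Jaccard overlap of postings lists), not the paper's."""
    postings_a = index.get(term_a, set())
    postings_b = index.get(term_b, set())
    union = postings_a | postings_b
    return len(postings_a & postings_b) / len(union) if union else 0.0

# Invented toy corpus, for illustration only.
docs = {
    1: "java developer spring",
    2: "java barista coffee",
    3: "coffee barista espresso",
}
index = build_inverted_index(docs)
shared = edge(index, "java", "barista")
```

No edge is stored anywhere: `edge` computes it at query time from the two postings lists, which is why any pair of terms in the corpus can be related without a precomputed graph.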
- Research Article
26
- 10.1080/13241583.2011.11465390
- Jan 1, 2011
- Australasian Journal of Water Resources
Regional flood frequency analysis (RFFA) techniques are commonly used to estimate design floods for ungauged catchments. In Australian Rainfall and Runoff (ARR), the probabilistic rational method (PRM) was recommended for eastern New South Wales (NSW). Recent studies in Australia have shown that regression-based RFFA methods can provide more accurate design flood estimates than the PRM. This paper compares ordinary least squares (OLS) and generalised least squares (GLS) based quantile regression techniques using data from 96 small- to medium-sized catchments across NSW for average recurrence intervals of 2 to 100 years. The advantage of GLS regression is that it accounts for the inter-station correlation and for record lengths that vary from site to site. An independent test based on both the split-sample and one-at-a-time validation approaches, employing a wide range of statistical diagnostics, indicates that the GLS regression provides more accurate flood quantile estimates than the OLS regression. The developed regression equations are relatively easy to apply, requiring data for only two to three predictors: catchment area, design rainfall intensity and stream density. The findings from this study, together with those from other RFFA studies being examined as a part of ARR upgrade projects, will inform the development of RFFA techniques for inclusion in the revised edition of ARR.
- Research Article
16
- 10.1016/j.eve.2024.100038
- Jan 1, 2024
- Evolving Earth
Exploring urban land surface temperature with geospatial and regression modelling techniques in Uttarakhand using SVM, OLS and GWR models
- Research Article
1
- 10.3897/biss.3.37206
- Jun 19, 2019
- Biodiversity Information Science and Standards
We would like to present FAIR Research Data: Semantic Knowledge Graph Infrastructure for the Life Sciences (in short, FAIR.ReD), a project initiative that is currently being evaluated for funding. FAIR.ReD is a software environment for developing data management solutions according to the FAIR (Findable, Accessible, Interoperable, Reusable; Wilkinson et al. 2016) data principles. It utilizes what we call a Data Sea Storage, which employs the idea of Data Lakes to decouple data storage from data access but modifies it by storing data in a semantically structured format as either semantic graphs or semantic tables, instead of storing them in their native form. Storage follows a top-down approach, resulting in a standardized storage model, which allows sharing data across all FAIR.ReD Knowledge Graph Applications (KGAs) connected to the same Sea, with newly developed KGAs automatically having access to all contents in the Sea. In contrast, access and export of data follow a bottom-up approach that allows the specification of additional data models to meet the varying domain-specific and programmatic needs for accessing structured data. The FAIR.ReD engine enables bidirectional data conversion between the two storage models and any additional data model, which will substantially reduce conversion workload for data-rich institutes (Fig. 1). Moreover, with the possibility to store data in semantic tables, FAIR.ReD provides high performance storage for incoming data streams such as sensory data. FAIR.ReD KGAs are modularly organized. Modules can be edited using the FAIR.ReD editor and combined to form coherent KGAs. The editor allows domain experts to develop their own modules and KGAs without any programming experience required, thus also allowing smaller projects and individual researchers to build their own FAIR data management solution.
Contents from FAIR.ReD KGAs can be published under a Creative Commons license as documents, micropublications, or nanopublications, each receiving their own DOI. A publication-life-cycle is implemented in FAIR.ReD and allows updating published contents for corrections or additions without overwriting the originally published version. Together with the fact that data and metadata are semantically structured and machine-readable, all contents from FAIR.ReD KGAs will comply with the FAIR Guiding Principles. Because all FAIR.ReD KGAs provide access to semantic knowledge graphs in both a human-readable and a machine-readable version, FAIR.ReD seamlessly integrates the complex RDF (Resource Description Framework) world with a more intuitively comprehensible presentation of data in the form of data entry forms, charts, and tables. Guided by use cases, the FAIR.ReD environment will be developed using semantic programming, where the source code of an application is stored in its own ontology. The set of source code ontologies of a KGA and its modules provides the steering logic for running the KGA. With this clear separation of steering logic from interpretation logic, semantic programming follows the idea of separating main layers of an application, analogous to the separation of interpretation logic and presentation logic. Each KGA and module is specified exactly in this way and their source code ontologies are stored in the Data Sea. Thus, all data and metadata are semantically transparent and so is the data management application itself, which substantially improves their sustainability on all levels of data processing and storing.
- Research Article
- 10.1093/ofid/ofac492.1837
- Dec 15, 2022
- Open Forum Infectious Diseases
Background Neisseria gonorrhea is the second most prevalent sexually transmitted infection (STI) in the US with increasing incidence. Untreated, it leads to devastating sequelae for nonpregnant women, pregnant women, and their neonates. Geographic clustering of gonorrhea coincides with clustering of other risk factors and may help guide interventions. We utilize Geographic Information Systems (GIS) software to map gonorrhea incidence among women in Chicago, and examine factors associated with the spatial distribution of infection. Methods Chicago Department of Public Health data sets and shape files were available through the Chicago Data Portal. ArcGIS was utilized for map illustrating and data analysis. Ordinary Least Squares (OLS) Regression was utilized using the dependent variable of Gonorrhea incidence in women per Chicago Community Area (CCA), with various explanatory variables. Within the OLS, the Koenker Statistic (KS) was reported and from this a Geographically Weighted Regression (GWR) tool was run for combinations with statistically significant KS. Results The highest incidence is shown in the southern and western CCA’s, with hospitals clustered in the northern and southeastern CCA’s & clinics with gaps primarily noted in the southern and northwestern CCA’s. (Figure 1.) Teen pregnancy incidence (TP), infant mortality (IM), and first trimester prenatal care, had statistical significance of the robust probability for all of their explanatory variables, and VIF’s less than 2 for all explanatory variables, JBS were not statistically significant, and AICc were approximately 1170 demonstrating stable OLS. (Figure 2&3) TP & IM did have a statistically significant KS and GWR was run demonstrating random spatial pattern. This confirms TP & IM as predictors for gonorrheal incidence had variable predictability throughout the CCA’s and GWR improves the model outcome. Both combinations spatially explain gonorrhea incidence. 
Figure 1: Map of Gonorrhea Incidence among Females in 2006-2010. The data on the incidence of gonorrhea in females were mapped with 5 separate classifications and displayed in the form of a map. Also included on the map were hospitals in the city, and Chicago Department of Public Health clinic locations where free STI management could take place. The map illustrates gonorrhea incidence rates (per 100,000) within the 77 Chicago Community Areas (CCA) of the city. The darker shades of green correspond with higher incidence. The highest incidence is in the southern and western CCA’s. Also illustrated are the locations of the hospitals within the city in addition to CDPH clinics that are Federally Qualified Health Clinics (FQHC) that offer services for treatment at a reduced or free cost depending on health insurance status. The majority of the hospitals are shown to be clustered in the northern and southeastern CCA’s. The clinics are more evenly spaced out with gaps primarily noted in the southern and northwestern CCA’s. The Ordinary Least Squares (OLS) regression results are illustrated in Table 1 above. The results are shown for six separate regressions: chlamydia incidence, Natality (teen pregnancy incidence [TP], percent of live births that were preterm [Preterm], fetal low birth percentage [LBW], prenatal care in the 1st trimester rate [PNC 1T], infant mortality rate [IM]), poverty (total # below poverty line per 100,000), poverty & chlamydia, TP & IM, TP & IM & PNC 1T, and all coefficients (natality incidence, poverty incidence, chlamydia incidence, fertility rate). Utilizing ArcGIS software, OLS Regression was mapped with the 77 Chicago Community Areas (CCA) with Gonorrhea Infection Incidence per CCA as the dependent variable and Teen Pregnancy (TP), Infant Mortality (IM) and 1st trimester prenatal care (1TM) incidence per CCA as the explanatory variables. Regression reported with standard deviation (SD) and color coded based on quantity of SD.
Map shapefile is of the 77 CCA’s. Overlying are Hospitals and Chicago Department of Public Health (CDPH) clinic locations as made available through CDPH public domain website. Conclusion Teen pregnancy rates, infant mortality rates, and prenatal care in the first trimester explain some of the spatial patterns of gonorrhea incidence among females in Chicago. Identification of these factors should prompt providers to ensure enhanced testing for gonorrhea and other STI’s to reduce burden in areas of high incidence. Disclosures All Authors: No reported disclosures.