Distance-based correlation analysis for graph databases
Abstract. Big data is often characterized by its volume, velocity, and variety, properties which imply that the data contains values and relationships too complex to be stored in standard, relational, or document databases. Graph databases, commonly used for their capacity to model complex relationships between sets of objects, provide an effective framework for storing and processing such data. Once stored, the data must be worked with further: analysed using descriptive statistics and statistical analysis, visualized using exploratory analysis techniques, and, above all, used to build analytical models for prediction and estimation. The main objective of the presented study is the design and implementation of a predictive potential metric for graph databases that is based on the structures found in the graph databases themselves. We focus on examining the correlation between the attribute values of individual database objects and the mutual distance of these objects in the defined graph space. The proposed metric is verified using standard prediction models built on a sizeable graph database.
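The idea of correlating attribute values with graph distance can be illustrated with a minimal sketch. It is not the metric defined in the study; it assumes an unweighted, undirected graph, hop count as the distance, a single numeric attribute, and the Pearson coefficient as the correlation measure. All names (`bfs_distances`, `distance_attribute_correlation`, the toy graph) are hypothetical.

```python
from collections import deque
from itertools import combinations
from math import sqrt


def bfs_distances(adj, source):
    """Hop counts from source to every reachable node of an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def distance_attribute_correlation(adj, attr):
    """Correlate pairwise hop distance with pairwise attribute difference."""
    all_dist = {node: bfs_distances(adj, node) for node in adj}
    hops, diffs = [], []
    for a, b in combinations(sorted(adj), 2):
        if b in all_dist[a]:  # skip pairs in different components
            hops.append(all_dist[a][b])
            diffs.append(abs(attr[a] - attr[b]))
    return pearson(hops, diffs)


# Toy path graph a - b - c - d whose attribute grows along the path
# (illustrative data only).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
attr = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
r = distance_attribute_correlation(adj, attr)
print(round(r, 6))
```

In this toy graph the attribute difference equals the hop distance for every pair, so the coefficient is at its maximum. On real data a value near zero would suggest that graph proximity carries little information about the attribute.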
- Book Chapter
- 10.1007/978-3-030-57881-7_59
- Jan 1, 2020
In order to improve the training effect of applied talents, quantitative evaluation and big data analysis are used to evaluate the training of applied talents. A quantitative analysis model of applied talents training based on big data and probability statistics is proposed. The statistical mathematical analysis model of applied talents training is constructed, and the significance of applied talents training is analyzed by using T statistical test analysis method, and the benefit distribution model of applied talents training is established. The cumulative average analysis method is used to evaluate the bilateral reliability of applied talents training, and the big data mining and feature extraction methods are used to analyze the characteristics of applied talents training. The big data robust mining model for the cultivation of applied talents is constructed, and the descriptive statistical analysis method of single variable is taken, the statistical analysis of probability theory and big data analysis method are used to realize the evaluation of benefit index for the cultivation of applied talents. The results of empirical analysis show that the model has good accuracy and high level of confidence in the quantitative evaluation of applied talents training.
- Conference Article
- 10.1109/icicas48597.2019.00063
- Dec 1, 2019
The selection of a teaching mode is related to the improvement of teaching quality. To improve teaching quality and promote the transformation of teaching results, a method for choosing a teaching mode based on big data statistical analysis is proposed. It constructs a statistical analysis model of teaching mode selection using descriptive statistical analysis and extracts the characteristic information that reflects the optimization of teaching method selection. A big data analysis model based on fuzzy C-means clustering is used to evaluate the performance of teaching methods, and segmental sample detection is used to carry out regression analysis and fit the samples for teaching method selection. The mathematical model of teaching mode selection is designed from the sample fitting result, and a statistical mathematical model of teaching mode selection based on big data is constructed. The simulation results show that the method can effectively extract the regular characteristic quantities that reflect teaching performance, and that it realizes association rule mining for the choice of teaching method. Selecting the teaching mode according to the results of big data mining and statistical analysis improves teaching quality and promotes the reform of teaching methods.
- Research Article
- 10.24959/sphhcj.24.325
- Jul 16, 2024
- Social Pharmacy in Health Care
Aim. To study modern professional competencies, namely hard skills (including knowledge, abilities, skills) and soft skills (personal skills), that determine the effective activity of pharmacy professionals (PP) in certain pharmaceutical positions (pharmacy manager, pharmacist, pharmacist’s assistant), and to determine the most important of them for pharmacy retail in today’s conditions. Materials and methods. To collect and process data, such scientific methods as the analysis of literary sources, questionnaire survey, comparison, and generalization were used. The results of the analysis were processed using licensed Microsoft Office Excel software. The questionnaire data were processed with methods of descriptive statistics and statistical analysis, using the Epitools service (Ausvet Ltd., Australia) and the STATISTICA 13 statistical package (TIBCO Software Inc., USA). Results. As part of our study, a survey was conducted in which 200 PP took part. The questionnaire contained open-ended questions asking respondents to specify 3-5 hard skills (professional skills) or soft skills (personal skills) necessary for the effective performance of the professional duties of PP in the relevant position, and to select the most important one from this list. Three positions were differentiated: pharmacist, pharmacist’s assistant, and pharmacy manager. From the survey results, the hard skills and soft skills that the majority of PP considered most necessary for each position were identified. Conclusions. Based on the results of the study, it has been found that updating professional competences and soft skills is an important step towards effective work in each position in pharmacy institutions in today’s conditions. 
It has been determined that such soft skills as sociability, responsibility, stress resistance, honesty, intelligence, purposefulness, hard work, love for the profession, experience, energy and reliability are equally important for the positions of a pharmacist, pharmacist’s assistant and pharmacy manager. At the same time, specific hard and soft skills are distinguished for each of the positions: empathy and attentiveness are more important for pharmacists and pharmacist’s assistants, while organizational skills, managerial qualities, desire for professional development and knowledge of legislation are important for pharmacy managers. Responsibility is a key quality for pharmacy managers, professionalism and knowledge – for pharmacists, and attentiveness – for pharmacist’s assistants. Differences in the perception of the importance of these hard and soft skills are also related to the age, qualification and type of pharmacy institutions where respondents work.
- Research Article
- 10.19075/2414-0031-2016-1-35-50
- Feb 11, 2016
- Internetnauka
Goal. Using stacked histograms, graphs, and volumetric bubble charts, to describe the changes in gross harvest, yield, and acreage of flax fiber in the federal districts (FD) of Russia over the past 5 years; to carry out a factor analysis of the effect of yield and acreage on changes in gross harvest; and to build an equation for forecasting the gross harvest of flax. Methods. We used the methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis. Results. To identify differences between the federal districts, figures for gross harvest, yield, and acreage of flax fiber in the federal districts of Russia over the past 5 years were summarized. With the help of stacked histograms and graphs, the trend of changes in these indicators can be traced for all FD (as well as the relative position of the FD), not just for one year but for all of the last 5 years at once. Bubble charts clearly demonstrated the interdependence of all three indicators, which subsequent statistical calculations confirmed. Factor analysis showed the importance of the influence of yield and acreage on the gross harvest of flax fiber in the federal districts of Russia. Conclusion. The methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis demonstrated great potential for assessing and analyzing the dynamics of the efficiency of flax production. 
Using descriptive statistics and visualization, the position of each FD in cultivating the crop was investigated, both over time and relative to the other FD. Appropriate recommendations on priorities were made for each FD. The factor analysis led to the conclusion that more attention should be paid to increasing both the yield of flax fiber and the acreage. The calculated multiple regression equation makes it possible to forecast gross harvest and suggests that the effect of yield on gross harvest is much more significant than that of acreage.
- Conference Article
- 10.4230/lipics.icdt.2015.177
- Jan 1, 2015
Graph databases are currently one of the most popular paradigms for storing data. One of the key conceptual differences between graph and relational databases is the focus on navigational queries that ask whether some nodes are connected by paths satisfying certain restrictions. This focus has driven the definition of several different query languages and the subsequent study of their fundamental properties. We define the graph query language of Regular Queries, which is a natural extension of unions of conjunctive 2-way regular path queries (UC2RPQs) and unions of conjunctive nested 2-way regular path queries (UCN2RPQs). Regular queries allow expressing complex regular patterns between nodes. We formalize regular queries as nonrecursive Datalog programs with transitive closure rules. This language has been previously considered, but its algorithmic properties are not well understood. Our main contribution is to show elementary tight bounds for the containment problem for regular queries. Specifically, we show that this problem is 2EXPSPACE-complete. For all extensions of regular queries known to date, the containment problem turns out to be non-elementary. Together with the fact that evaluating regular queries is not harder than evaluating UCN2RPQs, our results show that regular queries achieve a good balance between expressiveness and complexity, and constitute a well-behaved class that deserves further investigation.
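A navigational query of the kind described above can be sketched as a search over the product of the graph and a finite automaton. The sketch below evaluates a single two-way regular path query from one source node. It does not implement the Regular Queries language or the containment problem studied in the paper. The edge-list format, the `label + "-"` convention for traversing an edge backwards, and all names and data are assumptions made for illustration.

```python
from collections import deque


def eval_2rpq(edges, nfa, start_state, accept_states, source):
    """Return nodes reachable from source along a path whose label word
    is accepted by the NFA; edges may be traversed in both directions."""
    out = {}
    for u, label, v in edges:
        out.setdefault(u, []).append((label, v))
        out.setdefault(v, []).append((label + "-", u))  # inverse traversal

    seen = {(source, start_state)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in accept_states:
            answers.add(node)
        for label, nxt in out.get(node, []):
            for nstate in nfa.get((state, label), ()):
                if (nxt, nstate) not in seen:
                    seen.add((nxt, nstate))
                    queue.append((nxt, nstate))
    return answers


# Hypothetical property graph as (source, label, target) triples.
edges = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "worksAt", "acme"),
    ("dave", "worksAt", "acme"),
]

# NFA for the pattern  knows+ . worksAt . worksAt-
# (people working at the same place as someone alice transitively knows).
nfa = {
    (0, "knows"): {1},
    (1, "knows"): {1},
    (1, "worksAt"): {2},
    (2, "worksAt-"): {3},
}

result = eval_2rpq(edges, nfa, start_state=0, accept_states={3}, source="alice")
print(sorted(result))
```

Each (node, state) pair is visited at most once, so the search stays polynomial in the size of the graph and the automaton. This is the sense in which evaluation is cheap even though containment is far harder.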
- Research Article
- 10.19075/2414-0031-2016-3-180-197
- Apr 1, 2016
- Internetnauka
Goal. Using graphs, stacked histograms, and volumetric bubble charts, this paper describes the changes in gross harvest, yield, and acreage of vegetables in the regions of the Southern Federal District over the past 5 years. The position of each region of the Southern Federal District in cultivating the crop is studied, both over time and relative to the other regions of the district. A factor analysis of the effect of yield and acreage on changes in gross harvest is carried out, and an equation for forecasting the gross harvest of vegetables is built. Methods. We used the methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis. Results. To identify differences between the regions, figures for gross harvest, yield, and acreage of vegetables in the regions of the Southern Federal District over the past 5 years were summarized. With the help of stacked histograms and graphs, the trend of changes in these indicators can be traced for all regions of the Southern Federal District (as well as the relative position of the regions), not just for one year but for all of the last 5 years at once. Bubble charts clearly demonstrated the interdependence of all three indicators, which subsequent statistical calculations confirmed. Factor analysis showed the importance of the influence of yield and acreage on the gross harvest of vegetables in the regions of the Southern Federal District. Conclusion. The methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis demonstrated great potential for evaluating and analyzing the dynamics of the efficiency of vegetable production. Using descriptive statistics and visualization, the position of each region in cultivating the crop was investigated, both over time and relative to the other regions of the Southern Federal District. 
Appropriate recommendations on priorities were made for each region of the Southern Federal District. Correlation analysis confirms the conclusions, drawn from the bubble charts, about the relationships between yield, acreage, and gross harvest. The factor analysis led to the conclusion that considerable attention is paid to vegetable yield in the regions of the Southern Federal District, but less to increasing acreage. The calculated multiple regression equation makes it possible to forecast gross harvest and suggests that acreage has a greater effect on gross harvest than yield does. The results of the factor analysis allow the governing bodies of the Southern Federal District to take concrete measures to increase the gross harvest of vegetables.
- Research Article
- 10.19075/2414-0031-2016-3-98-118
- Apr 1, 2016
- Internetnauka
Goal. Using graphs, stacked histograms, and volumetric bubble charts, this paper describes the changes in gross harvest, yield, and acreage of vegetables in the regions of the Far Eastern Federal District over the past 5 years. The position of each region of the Far Eastern Federal District in cultivating the crop is studied, both over time and relative to the other regions of the district. A factor analysis of the effect of yield and acreage on changes in gross harvest is carried out, and an equation for forecasting the gross harvest of vegetables is built. Methods. We used the methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis. Results. To identify differences between the regions, figures for gross harvest, yield, and acreage of vegetables in the regions of the Far Eastern Federal District over the past 5 years were summarized. With the help of stacked histograms and graphs, the trend of changes in these indicators can be traced for all regions of the Far Eastern Federal District (as well as the relative position of the regions), not just for one year but for all of the last 5 years at once. Bubble charts clearly demonstrated the interdependence of all three indicators, which subsequent statistical calculations confirmed. Factor analysis showed the importance of the influence of yield and acreage on the gross harvest of vegetables in the regions of the Far Eastern Federal District. Conclusion. The methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis demonstrated great potential for evaluating and analyzing the dynamics of the efficiency of vegetable production. 
Using descriptive statistics and visualization, the position of each region in cultivating the crop was investigated, both over time and relative to the other regions of the Far Eastern Federal District. Appropriate recommendations on priorities were made for each region of the Far Eastern Federal District. Correlation analysis confirms the conclusions, drawn from the bubble charts, about the relationships between yield, acreage, and gross harvest. The factor analysis led to the conclusion that considerable attention is paid to vegetable yield in the regions of the Far Eastern Federal District, but less to increasing acreage. The calculated multiple regression equation makes it possible to forecast gross harvest and suggests that acreage has a greater effect on gross harvest than yield does. The results of the factor analysis allow the governing bodies of the Far Eastern Federal District to take concrete measures to increase the gross harvest of vegetables.
- Research Article
- 10.19075/2414-0031-2016-3-59-78
- Apr 1, 2016
- Internetnauka
Goal. Using graphs, stacked histograms, and volumetric bubble charts, this paper describes the changes in gross harvest, yield, and acreage of vegetables in the regions of the North Caucasus Federal District over the past 5 years. The position of each region of the North Caucasus Federal District in cultivating the crop is studied, both over time and relative to the other regions of the district. A factor analysis of the effect of yield and acreage on changes in gross harvest is carried out, and an equation for forecasting the gross harvest of vegetables is built. Methods. We used the methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis. Results. To identify differences between the regions, figures for gross harvest, yield, and acreage of vegetables in the regions of the North Caucasus Federal District over the past 5 years were summarized. With the help of stacked histograms and graphs, the trend of changes in these indicators can be traced for all regions of the North Caucasus Federal District (as well as the relative position of the regions), not just for one year but for all of the last 5 years at once. Bubble charts clearly demonstrated the interdependence of all three indicators, which subsequent statistical calculations confirmed. Factor analysis showed the importance of the influence of yield and acreage on the gross harvest of vegetables in the regions of the North Caucasus Federal District. Conclusion. The methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis demonstrated great potential for evaluating and analyzing the dynamics of the efficiency of vegetable production. 
Using descriptive statistics and visualization, the position of each region in cultivating the crop was investigated, both over time and relative to the other regions of the North Caucasus Federal District. Appropriate recommendations on priorities were made for each region of the North Caucasus Federal District. Correlation analysis confirms the conclusions, drawn from the bubble charts, about the relationships between yield, acreage, and gross harvest. The factor analysis led to the conclusion that considerable attention is paid to vegetable yield in the regions of the North Caucasus Federal District, but less to increasing acreage. The calculated multiple regression equation makes it possible to forecast gross harvest and suggests that acreage has a greater effect on gross harvest than yield does. The results of the factor analysis allow the governing bodies of the North Caucasus Federal District to take concrete measures to increase the gross harvest of vegetables.
- Research Article
- 10.19075/2414-0031-2016-3-144-161
- Apr 1, 2016
- Internetnauka
Goal. Using graphs, stacked histograms, and volumetric bubble charts, this paper describes the changes in gross harvest, yield, and acreage of vegetables in the regions of the Urals Federal District over the past 5 years. The position of each region of the Urals Federal District in cultivating the crop is studied, both over time and relative to the other regions of the district. A factor analysis of the effect of yield and acreage on changes in gross harvest is carried out, and an equation for forecasting the gross harvest of vegetables is built. Methods. We used the methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis. Results. To identify differences between the regions, figures for gross harvest, yield, and acreage of vegetables in the regions of the Urals Federal District over the past 5 years were summarized. With the help of stacked histograms and graphs, the trend of changes in these indicators can be traced for all regions of the Urals Federal District (as well as the relative position of the regions), not just for one year but for all of the last 5 years at once. Bubble charts clearly demonstrated the interdependence of all three indicators, which subsequent statistical calculations confirmed. Factor analysis showed the importance of the influence of yield and acreage on the gross harvest of vegetables in the regions of the Urals Federal District. Conclusion. The methods of descriptive statistics, visualization, factor analysis, and correlation and regression analysis demonstrated great potential for evaluating and analyzing the dynamics of the efficiency of vegetable production. Using descriptive statistics and visualization, the position of each region in cultivating the crop was investigated, both over time and relative to the other regions of the Urals Federal District. 
Appropriate recommendations on priorities were made for each region of the Urals Federal District. Correlation analysis confirms the conclusions, drawn from the bubble charts, about the relationships between yield, acreage, and gross harvest. The factor analysis led to the conclusion that considerable attention is paid to vegetable yield in the regions of the Urals Federal District, but less to increasing acreage. The calculated multiple regression equation makes it possible to forecast gross harvest and suggests that acreage has a greater effect on gross harvest than yield does. The results of the factor analysis allow the governing bodies of the Urals Federal District to take concrete measures to increase the gross harvest of vegetables.
- Conference Article
- 10.1145/2905055.2905194
- Mar 4, 2016
Big data is becoming an essential component for industries in which large volumes of data arriving at very high speed are used to solve particular data problems. Generally, big data is first analyzed and then used together with other data available in the company to make it more effective. Big data is therefore never operated in isolation. A variety of non-relational data stores (databases) are available, and these data stores can be used in combination with big data. Attributes of these databases are available to companies where big data is used. In the last few years, companies have required that these databases operate very fast, can be extended or contracted whenever required, and generate reports quickly. Different means of managing and organizing these massive databases are also required. This paper mainly focuses on data management methods such as key-value databases, document databases, tabular databases, object databases, and graph databases. Using an RDBMS for big data implementation is not practical because of performance, scale, or even cost. Nowadays, companies have adopted non-relational databases, known as NoSQL databases. Programmers and analysts may benefit from non-relational databases because they have simpler modeling constraints than relational databases. Analysts can carry out various types of analysis by choosing a different type of non-relational database each time, for example key-value databases or graph databases. Non-relational databases do not depend on traditional relational database management systems.
- Research Article
- 10.1088/1742-6596/1423/1/012036
- Dec 1, 2019
- Journal of Physics: Conference Series
Most experimental projects currently assess performance only individually and neglect teamwork. To solve this problem, this paper puts forward a performance assessment mode for experimental projects based on team communication, which treats all students participating in the project as a team and divides the team into several groups according to project contents and tasks. Team members complete the project tasks jointly through team communication, and groups and individuals are scored quantitatively according to task completion. A descriptive statistical analysis in Stata 15.1 of the experimental project assessment performance of 223 students from 8 classes of 4 undergraduate majors at a university shows that the performance data are normally distributed by major, class, group, and individual, that the assessment results are ideal, and that the assessment mode is feasible. The innovation lies in using Stata and descriptive statistical analysis to examine the proposed team-communication-based assessment mode in depth, and in carrying out a normality test to evaluate the assessment results objectively.
- Research Article
- 10.24891/ni.16.7.1223
- Jul 16, 2020
- National Interests: Priorities and Security
Subject. The article focuses on the modern financial system of Russia. Objectives. I determine the limit of the contemporary financial system in Russia. Methods. The study is based on methods of descriptive statistics, statistical and cluster analysis. Results. The article shows the possibility of determining the scope of the contemporary financial system in Russia by establishing monetary relations as the order of the internal system and concerted operation of subsystems, preserving the structure of the financial system, maintaining the operational regime, implementing the program and achieving the goal. I found that the Russian financial system correlated with the Angolan one, and the real scope of the contemporary financial system in Russia. Conclusions and Relevance. As an attempt to effectively establish monetary relations and manage them, the limit of the contemporary financial system is related to the possibility of using Monetary Aggregate M0 to maintain the balance of the Central Bank of Russia. To overcome the scope of Russia’s financial system, the economy should have changed its specialization, refocusing it on high-tech export and increasing the foreign currency reserves. This can be done if amendments to Russia’s Constitution are adopted. The findings expand the scope of knowledge and create new competence in the establishment of monetary relations, order of the internal system and concerted interaction of subsystems, structural preservation of the financial system and maintenance of its operational regime.
- Research Article
- 10.3760/cma.j.issn.1674-2907.2016.18.021
- Jun 26, 2016
- Chinese Journal of Modern Nursing
Objective: To obtain, based on multi-center large-sample statistics, the implementation effect of the traditional Chinese medical (TCM) nursing plan for heart failure in China, and to further optimize 'the TCM nursing plan on patients with heart failure (trial)'. Methods: A total of 8,530 heart failure patients were selected from 71 hospitals in Chinese cities including Beijing, Tianjin, and Shanghai during the period from November 2013 to October 2015. After the plan was implemented, clinical nursing staff at each center evaluated heart failure patients with the 'effect evaluation form of heart failure TCM nursing' issued by the State Administration of Traditional Chinese Medicine. The evaluation results were summarized and analyzed in the form of feedback. The data were analyzed using descriptive statistical analysis and described as percentages and composition ratios, and the feedback was discussed. Results: Among the 7,185 patients whose syndromes were consistent with the plan, the ratio of the chronic stable stage to acute exacerbation was 4.88:1. There were 1,345 patients whose syndromes were inconsistent with the plan, accounting for 16% of the total number of patients in the same period. There were 16 new symptoms that were not included in the plan. Conclusions: Improving the relevant methods and optimizing the plan make 'the TCM nursing plan on patients with heart failure (trial)' a professional, technical, and scientific guide book that guides clinical nurses more accurately. Key words: Heart failure; Traditional Chinese medical nursing plan; Optimizing ideas; Multi-center study
- Research Article
- 10.18178/joig.12.3.283-291
- Jan 1, 2024
- Journal of Image and Graphics
To efficiently store data where the relationships between individual objects are essential, the use of a graph database model is recommended. After storing the data, it is necessary to further analyze it using statistical methods or visualize it within the context of exploratory data analysis. Such visualization is crucial for understanding the structure and content of the database. However, commonly used visualization tools often fall short in terms of interactivity and effectiveness. The main objective of the presented work is the design and implementation of a novel model for the visualization of data structures stored in graph databases with the use of two natural graphical models—the standard topological layout of the database and the so-called clustered layout of a graph database. The presented graphical models are focused on interactive visualization, mainly scaling of visualization and development of database objects, and principles of effective visualization. Implementation of the proposed approach was evaluated via case studies on three model graph databases of various sizes—Messaging database (16 objects, 16 relationships), Library database (16 objects, 32 relationships) and Movie database (171 objects, 253 relationships). Compared to the standard Neo4j tool for the visual representation of property graphs in graph databases, the proposed model presents improvement in terms of the number of visualization models, effectivity of the visualization, and development of objects in the visualized database.
- Research Article
- 10.15294/ijeces.v6i1.15762
- Jun 23, 2017
- Indonesian Journal of Early Childhood Education Studies
The high demand for early childhood education calls for quality in the early childhood program itself, from the curriculum, facilities, and infrastructure to the teaching, assistance, and education given to early childhood students. Early childhood teachers are required to be competent in various fields, including personal, professional, pedagogical, and social competence. The aim of this research is to describe teachers’ Quality of Work Life (QWL) in early childhood education. Teacher quality of work life was measured with a QWL instrument (Indonesian version), which refers to the eight-dimensional construct of Walton and the five sub-dimensions of Hackman and Oldham (Walton, 1980). The QWL instrument comprises a) adequate and fair compensation, b) safe and healthy working conditions, c) balance of work and family (non-work life), d) social integration in the work organization, e) supervisory (the social relevance of work life), f) constitutionalization in the work organization, g) opportunity for career growth, h) opportunity to use and develop human capacities, and i) job characteristics. This research uses descriptive statistical analysis, and the results show that teachers' quality of work life in early childhood education in Jakarta is in the average category. The dimensions of quality of work life in the high category are co-worker, personal development, work-life balance, work culture, supervisory, and job characteristics. The dimensions in the low category are pay and benefit, promotion, and working condition.