Abstract

Purpose: To address the low efficiency of the Apriori algorithm in analyzing the risk factors of type 2 diabetes mellitus. Method: We use patient data from the information department of a tertiary referral hospital in Lanzhou, comprising disease course notes and health records from January 2009 to March 2014. We find that the improved FP-tree algorithm analyzes the risk factors of type 2 diabetes better, and we compare efficiency by implementing both the improved FP-tree algorithm and the Apriori algorithm in C. Result: The analysis covers running time versus number of records, running time versus support degree, and the main risk factors. Conclusion: The improved FP-tree algorithm can be used to analyze the risk factors of diabetes mellitus and is more efficient.

Introduction

Diabetes mellitus is a chronic disease caused by defects in insulin secretion and insulin action, characterized by chronic high blood sugar together with disordered metabolism of carbohydrates, fat, and protein. Type 2 diabetes mellitus, also called non-insulin-dependent diabetes mellitus, is due to insulin resistance with a relative lack of insulin secretion. It is characterized by adult onset, slow progression, and mild severity, is not accompanied by β-cell lesions, and accounts for most diabetes mellitus patients [1]. The number of patients with diabetes mellitus worldwide was an estimated 30 million in 1985, rose to 135 million within 10 years, and reached 171 million in 2000. It is forecast to exceed 300 million before 2025. Such a large and rapidly growing patient population shows the importance of research on diabetes mellitus.

In mining association rules for the risk factors of type 2 diabetes mellitus, we found two defects in the Apriori algorithm. First, it has to scan the database once each time it generates a set of frequent item sets. Second, when generating candidate k-item sets from frequent (k-1)-item sets, it produces many candidates that later prove unnecessary, so mining the risk factors takes a long time and efficiency is low. We propose a modified Frequent Pattern Tree algorithm to analyze the risk factors of type 2 diabetes mellitus, which is characterized by large and variable data [2].

Structuring the Mining Rules

Frequent Pattern Growth Algorithm

The Frequent Pattern Tree algorithm is a basic method that does not generate candidate item sets. The improved procedure, with its extended tree form, is called the Frequent Pattern Growth algorithm. It is based on divide and conquer: we first compress the original database into a single Frequent Pattern Tree, retaining the association information. We then divide the compressed database into conditional databases, each associated with one frequent item [3]. The Frequent Pattern Growth algorithm can be divided into two parts: constructing the tree from the original database and recursively mining the tree. The first step equals the one which
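
As an illustration of the two parts named above, the following is a minimal Python sketch of the standard FP-tree construction and of extracting the conditional pattern base for one item, which is the starting point for the recursive mining. It is not the authors' improved algorithm or their implementation. The names `FPNode`, `build_fp_tree`, and `conditional_pattern_base`, the risk-factor labels, and the support threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict


class FPNode:
    """One node of the FP-tree: an item, its count, and parent/child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree in two database scans.

    Returns (root, header), where header maps each frequent item to the
    list of tree nodes that carry it.
    """
    # Scan 1: count each item once per transaction, keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode(None)
    header = defaultdict(list)

    # Scan 2: insert each transaction, frequent items only, sorted by
    # descending support (ties broken by item name for determinism).
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, dict(header)


def conditional_pattern_base(header, item):
    """Return the prefix paths ending in `item`, each with that node's count."""
    base = []
    for node in header.get(item, []):
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((path[::-1], node.count))
    return base


# Hypothetical records (not the hospital data): each set lists the risk
# factors noted for one patient.
records = [
    {"obesity", "hypertension", "family_history"},
    {"obesity", "hypertension"},
    {"obesity", "smoking"},
    {"hypertension", "family_history"},
    {"obesity", "hypertension", "family_history", "smoking"},
]
root, header = build_fp_tree(records, min_support=2)
```

Only two passes over the records are needed, regardless of how long the frequent item sets turn out to be, which is where the approach differs from the repeated scans of Apriori described in the Introduction. Calling `conditional_pattern_base(header, "family_history")` returns the prefix paths that a recursive mining step would then turn into a smaller conditional tree.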
