On the Midpoint of a Set of XML Documents
The WWW contains a huge number of documents. Some of them share the same subject, but are generated by different people or even organizations. To guarantee the interchange of such documents, we can use XML, which allows sharing documents that do not have the same structure. However, this makes it difficult to understand the core of such heterogeneous documents (in general, a schema is not available). In this paper, we offer a characterization and an algorithm to obtain the midpoint (in terms of a resemblance function) of a set of semi-structured, heterogeneous documents without optional elements. The trivial case of a midpoint would be the elements common to all documents. Nevertheless, in cases with several heterogeneous documents this may result in an empty set. Thus, we consider that those elements present in a given number of documents belong to the midpoint. An exact schema could always be found by generating optional elements. However, the exact schema of the whole set may result in overspecialization (lots of optional elements), which would make it useless.
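The idea that elements present in a given number of documents belong to the midpoint can be illustrated with a minimal sketch. This is not the paper's algorithm and it does not use its resemblance function; it only counts, under the assumption that an element is identified by its root-to-element tag path, how many documents contain each element. The names `element_paths`, `frequent_elements`, and `min_fraction` are hypothetical, as are the sample documents.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_paths(xml_text):
    """Return the set of root-to-element tag paths occurring in one document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def frequent_elements(documents, min_fraction=0.5):
    """Keep the paths present in at least min_fraction of the documents.

    min_fraction is an assumed threshold parameter, not a value from the paper.
    """
    counts = Counter()
    for doc in documents:
        # Each document contributes at most once per path.
        counts.update(element_paths(doc))
    needed = min_fraction * len(documents)
    return {path for path, n in counts.items() if n >= needed}


# Hypothetical sample documents about the same subject.
docs = [
    "<book><title/><author/><year/></book>",
    "<book><title/><author/><isbn/></book>",
    "<book><title/><publisher/></book>",
]

common = frequent_elements(docs, min_fraction=1.0)    # present in all documents
majority = frequent_elements(docs, min_fraction=0.6)  # present in most documents
print(sorted(common))
print(sorted(majority))
```

With `min_fraction=1.0` the result is the trivial case of elements common to all documents; lowering the threshold admits elements such as `/book/author` that appear in most, but not all, documents.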
- Research Article
17
- 10.1007/s002160000433
- Jul 3, 2000
- Fresenius' Journal of Analytical Chemistry
Artificial neural networks with an unsupervised learning strategy, known as Self-Organizing Maps, were applied to classify ancient Roman glazed ceramics. Their clay ceramic bodies were analyzed by Inductively Coupled Plasma-Atomic Emission Spectroscopy and the chemical composition obtained was processed by this neural algorithm. The results obtained provide two types of information: firstly, classification of ceramic samples with identification of several groups and secondly, differentiation between the elemental chemical information. It was found that there are certain chemical elements which can be considered as principal and which can serve to differentiate between ceramics, whereas other elements give redundant information and do not contribute to sample differentiation. Seven chemical elements were considered principal and provide the necessary information. Two types of elements were identified: (1) a group formed by common elements, such as Ca, Fe, Mg, and Mn, and (2) another formed by optional elements: K or Na, Ba or Sr, and Al or Ti.
- Research Article
45
- 10.1044/2018_ajslp-17-0127
- Aug 6, 2018
- American Journal of Speech-Language Pathology
Our aim was to develop a taxonomy of elements comprising phonological interventions for children with speech sound disorders. We conducted a content analysis of 15 empirically supported phonological interventions to identify and describe intervention elements. Measures of element concentration, flexibility, and distinctiveness were used to compare and contrast interventions. Seventy-two intervention elements were identified using a content analysis of intervention descriptions then arranged to form the Phonological Intervention Taxonomy: a hierarchical framework comprising 4 domains, 15 categories, and 9 subcategories. Across interventions, mean element concentration (number of required or optional elements) was 45, with a range of 27 to 59 elements. Mean flexibility of interventions (percentage of elements considered optional out of all elements included in the intervention) was 44%, with a range of 29% to 62%. Distinctiveness of interventions (percentage of an intervention's rare elements and omitted common elements out of all elements included in the intervention [both optional and required]) ranged from 0% to 30%. An understanding of the elements that comprise interventions and a taxonomy that describes their structural relationships can provide insight into similarities and differences between interventions, help in the identification of elements that drive treatment effects, and facilitate faithful implementation or intervention modification. Research is needed to distil active elements and identify strategies that best facilitate replication and implementation.
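The concentration and flexibility measures defined above reduce to simple arithmetic, sketched here under the definitions given in the abstract. The function names and the example element names are hypothetical and are not taken from the taxonomy itself.

```python
def concentration(required, optional):
    """Element concentration: number of required or optional elements."""
    return len(required) + len(optional)


def flexibility(required, optional):
    """Flexibility: percentage of optional elements out of all included elements."""
    total = concentration(required, optional)
    if total == 0:
        raise ValueError("an intervention must include at least one element")
    return 100.0 * len(optional) / total


# Hypothetical intervention description; the element names are placeholders.
required = {"goal selection", "modelling", "feedback"}
optional = {"home practice", "group sessions"}

print(concentration(required, optional))  # 5 elements in total
print(flexibility(required, optional))    # 2 of 5 elements are optional: 40.0
```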
- Research Article
1
- 10.1016/j.jcomdis.2020.106071
- Dec 30, 2020
- Journal of Communication Disorders
Applying the phonological intervention taxonomy to expansion points intervention
- Research Article
35
- 10.1016/j.future.2017.10.022
- Nov 11, 2017
- Future Generation Computer Systems
Verifiable keyword search for secure big data-based mobile healthcare networks with fine-grained authorization control
- Conference Article
1
- 10.1109/icds47004.2019.8942298
- Oct 1, 2019
Extensible Markup Language (XML) is nowadays one of the most important standards for web information management and complex data representation. Massive amounts of data are still processed, tagged, and stored in XML documents. Yet most methods for processing XML data face difficulties, because the major portion of existing XML data consists of schema-centric and semi-structured information resources. Therefore, a huge amount of data is processed, transferred, and stored using temporal database systems. There is a need for integrated methods for storing data from XML documents in a temporal database. In this paper, we present the results of a research study concerning the migration of structured XML documents that include temporal data features into an object-relational database that manages time-varying data. First, we propose a data model based on a tree graph for tracking historical data in an XML document and for summarizing and indexing temporal XML documents. To achieve this, we construct an effective index using Dewey keys with an additional information structure. In our approach, we develop our own XML document storage and build the temporal database system according to object-relational database concepts. Next, we develop our TORDB design scheme in order to simplify the migration of data from XML files into an object-relational database implemented with time-varying features. In addition, an algorithm is implemented to validate our solution and show that our mapping strategy is feasible and efficient.
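Dewey keys label each node of an XML tree with the path of sibling positions leading to it, so that ancestor and document-order relationships can be read off the label. The following is a minimal sketch of plain Dewey labelling only; the additional information structure and the temporal features mentioned in the abstract are not reproduced, and the function name `dewey_labels` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def dewey_labels(xml_text):
    """Return (dewey_key, tag) pairs for every element, in document order."""
    root = ET.fromstring(xml_text)
    labels = []

    def walk(node, label):
        labels.append((label, node.tag))
        # The n-th child of a node labelled L receives the label L.n
        for position, child in enumerate(node, start=1):
            walk(child, f"{label}.{position}")

    walk(root, "1")
    return labels


# Hypothetical document with a time-related child element.
doc = "<employee><name/><salary><amount/><valid_from/></salary></employee>"
for key, tag in dewey_labels(doc):
    print(key, tag)
```

A node is an ancestor of another exactly when its key, followed by a dot, is a prefix of the other key, which is what makes such labels convenient as index keys.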
- Research Article
- 10.1541/ieejeiss.123.693
- Jan 1, 2003
- IEEJ Transactions on Electronics, Information and Systems
Since the number of computerized documents, e.g. XML documents, has increased, it is desirable to be able to find particular information in a huge number of XML documents. This paper proposes an automatic method for extracting keywords from valid XML documents. The structured elements of an XML document are defined by a DTD. We consider that certain elements of the structure indicate importance within the document. First, the importance of an element is determined by its definition in the DTD. For example, elements that cannot be omitted and elements that appear at most once in their parent elements are considered important. Second, all elements in the target XML document are scored according to the tree structure of elements and the texts contained in the document. Third, keyword candidates are extracted from the scored elements. Finally, scores are summed up and the highest-ranked candidates are selected as keywords of the XML document. The validity of this method is examined.
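The four steps above can be sketched roughly as follows. This is an illustration under several assumptions, not the scoring used in the paper: the per-element weights stand in for the importance derived from the DTD, the division by depth is an assumed way of using the tree structure, and candidates are simply lower-cased words with no stop-word filtering. The function `extract_keywords` and all sample values are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def extract_keywords(xml_text, element_weight, top_k=3):
    """Rank words by the summed scores of the elements that contain them."""
    root = ET.fromstring(xml_text)
    scores = defaultdict(float)

    def walk(node, depth):
        # Assumed scoring: importance weight, discounted by depth in the tree.
        weight = element_weight.get(node.tag, 1.0) / depth
        # Only the text directly inside the element is used (tail text is ignored).
        for word in re.findall(r"[a-z]+", (node.text or "").lower()):
            scores[word] += weight
        for child in node:
            walk(child, depth + 1)

    walk(root, 1)
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_k]]


# Hypothetical document and weights (a stand-in for DTD-derived importance).
doc = (
    "<article><title>XML keyword extraction</title>"
    "<body>Keyword extraction from XML documents uses element structure</body>"
    "</article>"
)
weights = {"title": 3.0, "body": 1.0}
print(extract_keywords(doc, weights))
```

Words that occur in the heavily weighted `title` element and again in `body` accumulate the highest totals, which mirrors the summing of scores across elements described in the final step.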
- Book Chapter
2
- 10.4018/978-1-4666-1975-3.ch013
- Jan 1, 2013
Extensible Markup Language (XML) is nowadays one of the most important standard media used for exchanging and representing data through the Internet. Storing, updating, and retrieving the huge amount of web services data such as XML is an attractive area of research for researchers and database vendors. In this chapter, the authors propose and develop a new mapping model, called MAXDOR, for storing, rebuilding, updating, and querying XML documents using a relational database without making use of any XML schemas in the mapping process. The model addresses the structural gap between ordered hierarchical XML and unordered tabular relational databases, enabling the use of relational database systems for storing, updating, and querying XML data. A multiple linked list is used to maintain XML document structure, manage the process of updating document contents, and retrieve document contents efficiently. Experiments are conducted to evaluate the MAXDOR model. MAXDOR is compared with other well-known models available in the literature (Tatarinov et al., 2002; Torsten et al., 2004) using the total expected value of the execution time for rebuilding an XML document and for inserting a token.
- Research Article
2
- 10.1007/s13740-018-0088-0
- Jun 1, 2018
- Journal on Data Semantics
The WWW contains a huge number of documents. Some of them share the same subject, but are generated by different people or even by different organizations. A semi-structured model allows sharing documents that do not have exactly the same structure. However, it does not facilitate the understanding of such heterogeneous documents. In this paper, we offer a characterization and an algorithm to obtain a representative (in terms of a resemblance function) of a set of heterogeneous semi-structured documents. We approximate the representative so that the resemblance function is maximized. Then, the algorithm is generalized to deal with repetitions and different classes of documents. Although an exact representative could always be found using an unlimited number of optional elements, it would cause an overfitting problem. The size of an exact representative for a set of heterogeneous documents may even make it useless. Our experiments show that, for users, it is easier and faster to deal with smaller representatives, even compensating for the loss in the approximation.
- Research Article
4
- 10.5539/cis.v2n1p35
- Jan 13, 2009
- Computer and Information Science
- Book Chapter
- 10.1016/b978-0-08-021665-2.50010-7
- Jan 1, 1979
- Fundamental Concepts of Mathematics
CHAPTER 4 - SETS AND TRUTH FUNCTIONS
- Conference Article
1
- 10.1109/icemms.2011.6015680
- Aug 1, 2011
Huge amounts of biomedical data are stored in various formats and accessed through numerous interfaces. Data integration and exchange is a crucial task in cancer research, and data elements play an important role in data integration. The NCI supports a broad initiative to standardize the common data elements (CDEs) used in cancer research data capture and reporting. The Taiwan Cancer Registry (TCR), established in 1979, is organized and funded by the Health Department of the central government. The TCR's primary goal is to survey the incidence of cancer in Taiwan. The aim of the Taiwan Cancer Common Data Element Project (TCCDEP) is to facilitate convergence towards a common metadata standard in Taiwan cancer registry data. The project is implemented using a set of open source software and tools developed by the NCI, such as the caCORE SDK and caGrid. The experience of building, learning, and using the open toolkit Cancer Data Standards Repository (caDSR), developed by the National Cancer Institute's Center for Bioinformatics (NCICB) in the USA, is reported. The caDSR is a metadata repository including CDEs used by NCI-sponsored organizations. The objective of this work is to develop a database of metadata for medical data elements, referred to as the TCCDEP, and to establish a common classification of data elements used in cancer registries. In this manuscript, we develop the common data elements using vocabulary standards, ontology, and semantic modeling methodology. The CDEs include demographic data, clinical history, pathology data, and clinical outcome data including treatment, recurrence, and vital status. These CDEs will be further extended to data sets across the participating cancer institutes to facilitate and supplement translational research. The Taiwan Cancer Registry (TCR) model and standard will be used as the basis for an electronic data standard repository for metadata or data descriptors.
The TCCDEP developed 40 data elements to annotate the cancer registry data collected. In this project, we describe the process required to develop the model, the caDSR CDEs, and the results of the modeling effort. We address the difficulties we encountered and the modifications made to solve them. The caBIG(TM) grid project, the grid model of the Taiwan Cancer Registry (girdTCR), which uses the caCORE tools to define data elements for the cancer registry, has been shown to the caBIG(TM) UML model project. Currently, the Taiwan cancer registry CDEs are released and available in the CDE browser for reuse. Furthermore, we will extend our CDEs to daily clinical practice and trials, along with the methods used for full implementation in hospitals and cancer research centers in Taiwan.
- Conference Article
36
- 10.2514/6.2002-5406
- Sep 4, 2002
- 9th AIAA/ISSMO Symposium on Multidisciplinary Analysis and Optimization
Multidisciplinary Design Optimization (MDO) techniques were successfully applied in sizing the wing boxes of the newly developed Fairchild Dornier regional jet family. A common finite element model for the whole aircraft was used for the static and aeroelastic optimization and analysis purposes. A detailed design model in the order of thousands of design variables was constructed. All relevant sizing requirements for structural strength, aeroelastic behavior and manufacturing, resulting in over 800,000 constraints, were applied under all loading conditions. Many auxiliary tools for automating the process of preparing the huge amount of required input data, as well as the rapid assessment of results, were developed. Most of these tools were developed in close coordination with the MSC Software GmbH, since the MDO implementation process is centered around the optimization procedure in MSC.Nastran SOL 200. A new MSC.Nastran feature called External Server was utilized to integrate company specific wing buckling constraints into the Nastran optimization loop. An independent and comprehensive analysis of the conceived wing box's structural sizes confirmed the validity of the results.
- Research Article
2
- 10.1089/bio.2018.29038.lbabstracts
- Jun 1, 2018
- Biopreservation and Biobanking
Background: Due to the rapid development of translational medicine research, the construction and application of clinical biobanks have received increasing attention. The information system is the core of the biobank and plays an important role in the sharing of clinical information and specimens. The development of a clinical biobank information system should therefore be based on supporting the overall process of operation, management, and service.
- Single Book
24
- 10.33918/virvelines
- Nov 14, 2018
CORDED WARE CULTURE IN LITHUANIA 2800–2400 cal BC