Linear Conditional Random Field Research Articles

Fingerprinting individual functions in binary code is useful in many security applications, ranging from digital forensic analysis of malware corpora to the detection of critical security vulnerabilities. However, existing approaches for fingerprinting functions are typically not resilient to code transformation methods or to the use of different compilers. Another common weakness of these approaches is that, when they report a similarity, they give reverse engineers no insight into the underlying evidence. To bridge this gap, our paper presents Plumeria, an obfuscation-resilient and scalable approach based on a stratified architecture composed of three layers. The first layer retrieves as many candidates as possible by capturing statistical characteristics, function behavior, and function neighborhood relationships. The second layer then trains a linear conditional random field to learn the correlations between the features of the function and its semantics; this layer is designed to reduce the number of false positives. Finally, the third layer provides insight into the underlying evidence by collecting the side effects exhibited by the candidates selected by the previous layer. Our study evaluates Plumeria in several scenarios: fingerprinting functions in obfuscated/de-obfuscated binaries; fingerprinting functions across different compilers; fingerprinting various vulnerabilities across compilers and versions; and fingerprinting standard library functions. We then benchmark Plumeria on real-world projects and malware binaries, comparing it with existing state-of-the-art solutions. Our results show that Plumeria outperforms existing solutions, with an average precision of over 89%.

Background: Small chemical molecules regulate biological processes at the molecular level. Those molecules are often involved in causing or treating pathological states. Automatically identifying such molecules in biomedical text is difficult because of both the diverse morphology of chemical names and the alternative types of nomenclature that are used simultaneously to describe them. To address these issues, the last BioCreAtIvE challenge proposed the CHEMDNER task, a Named Entity Recognition (NER) challenge that aims at labelling different types of chemical names in biomedical text.

Methods: To address this challenge, we tested various approaches to recognizing chemical entities in biomedical documents. These approaches range from linear Conditional Random Fields (CRFs) to a combination of CRFs with regular expression and dictionary matching, followed by a post-processing step to tag those chemical names in a corpus of Medline abstracts. We named our best performing systems CheNER.

Results: We evaluate the performance of the various approaches using the F-score; higher F-scores indicate better performance. The highest F-score we obtain in identifying unique chemical entities is 72.88%. The highest F-score we obtain in identifying all chemical entities is 73.07%. We also evaluate the F-score of combining our system with ChemSpot, and find an increase from 72.88% to 73.83%.

Conclusions: CheNER presents a valid alternative for automated annotation of chemical entities in biomedical documents. In addition, CheNER may be used to derive new features to train newer methods for tagging chemical entities. CheNER can be downloaded from http://metres.udl.cat and included in text annotation pipelines.
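For readers unfamiliar with how a linear (linear-chain) CRF assigns labels to a token sequence, the following is a minimal sketch of Viterbi decoding, the standard way to find the highest-scoring label sequence in such a model. It is not CheNER's implementation: the BIO-style labels (`O`, `B-CHEM`, `I-CHEM`), the example tokens, and all scores are hypothetical and hand-set for illustration, whereas a trained CRF would learn them from annotated text.

```python
from typing import Dict, List, Tuple


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain model.

    emissions[t][y]     -- score of label y at position t (hand-set here)
    transitions[(a, b)] -- score of moving from label a to label b
    Missing entries are treated as a score of 0.0.
    """
    if not emissions:
        return []
    # best[t][y] = best score of any label path ending in y at position t
    best = [{y: emissions[0].get(y, 0.0) for y in labels}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Pick the previous label that maximizes path score into y.
            prev = max(
                labels,
                key=lambda p: best[-1][p] + transitions.get((p, y), 0.0),
            )
            scores[y] = (
                best[-1][prev]
                + transitions.get((prev, y), 0.0)
                + emissions[t].get(y, 0.0)
            )
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label to recover the full path.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical labels, tokens, and scores (assumptions, not from the paper).
LABELS = ["O", "B-CHEM", "I-CHEM"]
TOKENS = ["aspirin", "inhibits", "cyclooxygenase"]
EMISSIONS = [
    {"O": 0.1, "B-CHEM": 2.0, "I-CHEM": 0.5},
    {"O": 2.0, "B-CHEM": 0.1, "I-CHEM": 0.3},
    {"O": 1.0, "B-CHEM": 0.8, "I-CHEM": 0.2},
]
# Penalize an "inside" tag that directly follows an "outside" tag.
TRANSITIONS = {("O", "I-CHEM"): -5.0}

if __name__ == "__main__":
    for token, label in zip(TOKENS, viterbi(EMISSIONS, TRANSITIONS, LABELS)):
        print(token, label)
```

The transition scores are what distinguish a CRF from labelling each token independently: in this sketch, the penalty on `O` followed by `I-CHEM` can override a locally higher emission score, so the decoded sequence stays well-formed.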

Articles published on Linear Conditional Random Field

A stratified approach to function fingerprinting in program binaries using diverse features

Guiding Attention in Sequence-to-Sequence Models for Dialogue Act Prediction

CheNER: a tool for the identification of chemical entities and their classes in biomedical literature.

Semantic-Based Requirements Content Management for Cloud Software

Named entity recognition for tweets

Detection and characterization of regulatory elements using probabilistic conditional random field and hidden Markov models

Two-stage NER for tweets with clustering

Complex Terminology Extraction Model from Unstructured Web Text Based Linguistic and Statistical Knowledge
