Prominent Field Research Articles

Abstract: Worm infections that spread over computer networks have made source identification a prominent research field. The field investigates how an infection propagates, including when each node was infected and when it was removed. This information helps identify patient zero in a worm attack, and computer forensic investigators and network administrators can use it to find the culprits and the network vulnerabilities involved. In this paper, we tackle the problem by developing new probabilistic models based on Bayesian networks. Using historical data and features extracted from the network and application layers, we learn a probability distribution that gives, at every time step, the probability that each node is infected by a scanning worm. With this distribution, the infection status of a node can be inferred from its feature values at each time step.

We propose a four-step method to investigate the infection and removal time of each node probabilistically. First, features are extracted and derived from network traffic data. No suitable training and test datasets are publicly available, so we built both from simulations of the Code Red II worm. Second, a prior model is built from the training data. Third, the probabilistic model is built with an estimation of distribution algorithm. Fourth, the infection probability of each node is inferred from the probability distribution and the feature values at each time step. It has already been shown that the number of infectious nodes can be probabilistically approximated backward in time through the stochastic Back-to-Origin Markov model. We combine our first model with that earlier stochastic Back-to-Origin Markov model to develop our second model.

To evaluate both models, we conducted experiments, which show that the models can pinpoint the source node and the infection time of nodes with acceptable accuracy. The method could also be applied to other types of propagating worms, including ransomware worms.
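The fourth step in the abstract above (inferring a node's infection probability from feature values at each time step) can be illustrated with a much simpler stand-in than the paper's model. The sketch below is not the authors' Bayesian network or their estimation of distribution algorithm. It assumes binary features that are conditionally independent given the infection status (a naive Bayes simplification), and the function names, parameters, and threshold are all hypothetical.

```python
from math import prod


def infection_posterior(prior, likelihood_infected, likelihood_clean, features):
    """Posterior P(infected | features) for one node at one time step.

    prior: P(infected) before the features are observed.
    likelihood_infected[i]: P(feature i is on | infected).
    likelihood_clean[i]: P(feature i is on | not infected).
    features: observed binary feature values (0 or 1).
    Assumption: features are conditionally independent given the status.
    """
    def joint(likelihoods):
        # Probability of the observed feature vector under one hypothesis.
        return prod(p if f else 1.0 - p for p, f in zip(likelihoods, features))

    numerator = prior * joint(likelihood_infected)
    evidence = numerator + (1.0 - prior) * joint(likelihood_clean)
    # If both hypotheses assign zero probability, fall back to the prior.
    return numerator / evidence if evidence > 0 else prior


def first_infected_step(posteriors, threshold=0.5):
    """Index of the first time step whose posterior reaches the threshold, else None."""
    for step, probability in enumerate(posteriors):
        if probability >= threshold:
            return step
    return None
```

Calling `infection_posterior` once per time step yields a sequence of posteriors for a node, and `first_infected_step` then gives a crude estimate of that node's infection time. The threshold rule is an illustrative choice, not something stated in the abstract.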

Background: In recent years, research on cancer predisposition germline variants has emerged as a prominent field. Identifying somatic mutations depends on a reliable mapping of the patient's germline variants. In addition, statistics on germline variant frequencies in healthy individuals and cancer patients are the basis for seeking candidate cancer predisposition genes. The Cancer Genome Atlas (TCGA) is one of the main sources of such data, providing a diverse collection of molecular data, including deep sequencing, for more than 30 types of cancer from more than 10,000 patients.

Methods: We hypothesize that whole exome sequences from blood samples of cancer patients should show no systematic differences among cancer types. To test this hypothesis, we analyzed common and rare germline variants across six cancer types, covering 2241 samples from TCGA. Our analysis accounted for inherent variables in the data, including the different variant calling protocols, sequencing platforms, and ethnicity.

Results: We report substantial batch effects in germline variants associated with cancer types. We attribute the effect to the specific sequencing centers that produced the data. Specifically, we measured 30% variability in the number of reported germline variants per sample across sequencing centers. The batch effect also appears in nucleotide composition and variant frequencies. Importantly, it causes substantial differences in germline variant distribution patterns across numerous genes, including prominent cancer predisposition genes such as BRCA1, RET, MAX, and KRAS. For most known cancer predisposition genes, we found a distinct batch-dependent difference in germline variants.

Conclusion: TCGA germline data is exposed to strong batch effects, with substantial variability among TCGA sequencing centers. We claim that these batch effects are consequential for numerous TCGA pan-cancer studies. In particular, they may compromise the reliability of new cancer predisposition gene detection and the power to detect such genes. Furthermore, interpretation of pan-cancer analyses should be revisited in view of the source of the genomic data, after accounting for the reported batch effects.
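The Results above report variability in the number of germline variants per sample across sequencing centers. A minimal sketch of how such a comparison could be set up follows. The abstract does not define how the 30% figure was computed, so the measure used here (range of per-center means divided by their average) is an assumption, and the function names and sample values are hypothetical toy inputs, not TCGA data.

```python
from collections import defaultdict
from statistics import mean


def per_center_means(samples):
    """Group (center, variant_count) pairs and return {center: mean count}."""
    by_center = defaultdict(list)
    for center, count in samples:
        by_center[center].append(count)
    return {center: mean(counts) for center, counts in by_center.items()}


def relative_spread(center_means):
    """(max - min) / mean of the per-center means.

    One possible variability measure; assumed here, not taken from the paper.
    """
    values = list(center_means.values())
    return (max(values) - min(values)) / mean(values)


# Toy numbers for illustration only (not from TCGA).
toy_samples = [("center_A", 100), ("center_A", 120),
               ("center_B", 90), ("center_B", 70)]
toy_means = per_center_means(toy_samples)
toy_spread = relative_spread(toy_means)
```

A real analysis would also need to control for the variables the Methods mention (variant calling protocol, sequencing platform, ethnicity) before attributing a spread like this to the sequencing center.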

Articles published on Prominent Field

Eco-Systems Mapping and Forecasting of Techno-Science Linkages at the Level of Scholarly Journals and Fields

A probability distribution function for investigating node infection and removal times

A New Generalized Projection and Its Application to Acceleration of Audio Declipping

Multigroup Classification using Privacy Preserving Data Mining

A Survey of Security Services, Attacks, and Applications for Vehicular Ad Hoc Networks (VANETs).

Substantial batch effects in TCGA exome sequences undermine pan-cancer analysis of germline variants

Computer Vision in Autonomous Unmanned Aerial Vehicles—A Systematic Mapping Study

Responsibility, planning and risk management: moralizing everyday finance through financial education.

Analysis of the 1.3–1.7 yr Oscillation Relationship between Solar and Geomagnetic Activities

Improving public data for building segmentation from Convolutional Neural Networks (CNNs) for fused airborne lidar and image data using active contours

PTMphinder: an R package for PTM site localization and motif extraction from proteomic datasets.

SFTRD: A novel information propagation model in heterogeneous networks: Modeling and restraining strategy

Process Technology, Applications and Thermal Resistivity of Basalt Fiber Reinforced SiOC Composites

Nonradiating photonics with resonant dielectric nanostructures

Subsurface Sediment Mobilization in the Southern Chryse Planitia on Mars

Bringing compassion into information systems research: A research agenda and call to action

Theoretical Analysis of the Optical Response of Silicon/Silica/Gold Multishell Nanoparticles in Biological Tissue

SEMANTIC PHOTOGRAMMETRY – BOOSTING IMAGE-BASED 3D RECONSTRUCTION WITH SEMANTIC LABELING

Cloud Computing-Positive Impacts and Challenges in Business Perspective

Improving the visibility of developmental biology: time for induction and specification.
