Empirical Bayes Procedure Research Articles

To overcome the scarcity of Korean credit rating migration data, we consider an empirical Bayes procedure for estimating credit rating migration matrices. We derive the posterior probabilities of Korean credit rating transitions by using Moody's rating migration data as prior information and the credit rating assignments from a Korean rating agency as the likelihood. To compare the resulting Bayesian migration matrices with other matrices, we develop metrics that express the dynamic behavior of a migration matrix in terms of the average transition probability, and we use them to compare the time series and statistical properties of the matrices. The time series of these metrics show that the Bayesian matrices are stable, whereas the matrices based on Korean data vary widely over time and show a higher proportion of transitions to adjacent ratings than either the Moody's or the Bayesian matrices. Bootstrap tests show that the results from the three estimation methods differ significantly and suggest that the Bayesian matrices are influenced more by the Korean data than by the Moody's data. Finally, Monte Carlo simulations that account for changes in portfolio value due to rating migration are used to compute and compare credit VaRs. The Korean migration matrix yields both the largest mean and the largest credit risk, and the simulation results under the Bayesian estimates are closer to those under the Korean data.
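The abstract does not state the exact form of the prior or likelihood, so the following is only a minimal sketch of the general idea, assuming a row-wise Dirichlet-multinomial model: each row of an external migration matrix is scaled into Dirichlet pseudo-counts and combined with locally observed transition counts. The function name `eb_transition_matrix`, the parameter `prior_strength`, and all numbers are hypothetical; in an empirical Bayes setting the prior weight would typically be estimated from the data rather than fixed by hand.

```python
import numpy as np

def eb_transition_matrix(prior_matrix, local_counts, prior_strength=50.0):
    """Posterior-mean migration matrix under an assumed Dirichlet-multinomial model.

    prior_matrix   : (K, K) row-stochastic matrix from an external source
    local_counts   : (K, K) observed rating transition counts from local data
    prior_strength : hypothetical pseudo-count weight given to each prior row
    """
    prior_matrix = np.asarray(prior_matrix, dtype=float)
    local_counts = np.asarray(local_counts, dtype=float)
    alpha = prior_strength * prior_matrix      # Dirichlet parameters, one row per rating
    posterior = alpha + local_counts           # conjugate update, row by row
    return posterior / posterior.sum(axis=1, keepdims=True)

# Toy example with three ratings (illustrative numbers only, not from the paper).
prior = np.array([[0.90, 0.08, 0.02],
                  [0.05, 0.85, 0.10],
                  [0.00, 0.00, 1.00]])
counts = np.array([[40,  9,  1],
                   [ 6, 30,  4],
                   [ 0,  0, 10]])
post = eb_transition_matrix(prior, counts, prior_strength=50.0)
print(np.round(post, 3))
```

Each posterior row is a weighted average of the prior row and the observed transition frequencies, so rows with few local observations stay close to the external matrix while well-observed rows move toward the local data.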

The aim of this paper is to discuss the effect of missing values on the detection of differentially expressed genes in a cDNA microarray experiment in the context of a one-sample problem. We conducted a cDNA microarray experiment to detect differentially expressed genes for the metastasis of colorectal cancer, based on twenty patients who underwent liver resection due to liver metastasis from colorectal cancer. Total RNAs from the metastatic liver tumor and the adjacent normal liver tissue of a single patient were labeled with Cy5 and Cy3, respectively, and competitively hybridized to a cDNA microarray with 7775 human genes. We used M = log2(R/G) for the signal evaluation, where R and G denote the fluorescent intensities of the Cy5 and Cy3 dyes, respectively. The statistical problem is a one-sample test of E(M) = 0 for each gene and therefore involves multiple testing. The data from the twenty cDNA microarrays would form a 7775 by 20 matrix if there were no missing values. However, missing values occur for various reasons. For each gene, the non-missing proportion (NMP) was defined as the proportion of non-missing values out of twenty. In detecting differentially expressed (DE) genes, we first used the genes whose NMP was greater than or equal to 0.4 and then raised the NMP threshold in steps of 0.1 to investigate its effect on the detection of DE genes. For each fixed NMP threshold, we imputed the missing values with the K-nearest neighbor method (K = 10) and applied the nonparametric t-test of Dudoit et al. (2002), SAM by Tusher et al. (2001) and the empirical Bayes procedure of Lönnstedt and Speed (2002) to assess the effect of missing values on the final outcome. These three procedures yielded substantially concordant results in detecting DE genes. Of the three, we used SAM to explore the acceptable NMP level. The optimum NMP for this data set turned out to be 80%.
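The NMP filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: missing values are assumed to be encoded as NaN, the helper names are hypothetical, the data are simulated, and the K-nearest neighbor imputation and the SAM, Dudoit et al. and empirical Bayes procedures are omitted. A plain one-sample t statistic computed on the available values stands in for the per-gene test of E(M) = 0.

```python
import numpy as np

def non_missing_proportion(M):
    """Fraction of non-missing (non-NaN) log-ratios for each gene (row)."""
    return np.mean(~np.isnan(M), axis=1)

def filter_by_nmp(M, threshold):
    """Keep genes whose NMP is at least `threshold`.

    Returns the filtered matrix and the indices of the retained rows.
    """
    nmp = non_missing_proportion(M)
    keep = np.flatnonzero(nmp >= threshold)
    return M[keep], keep

def one_sample_t(M):
    """Per-gene t statistic for H0: E(M) = 0, ignoring missing values."""
    n = np.sum(~np.isnan(M), axis=1)
    mean = np.nanmean(M, axis=1)
    sd = np.nanstd(M, axis=1, ddof=1)
    return mean / (sd / np.sqrt(n))

# Simulated 5-gene by 20-array matrix of log-ratios (illustrative only).
rng = np.random.default_rng(0)
M = rng.normal(size=(5, 20))
M[0, :13] = np.nan   # 7 of 20 observed, NMP = 0.35
M[1, :12] = np.nan   # 8 of 20 observed, NMP = 0.40
M[2, :4] = np.nan    # 16 of 20 observed, NMP = 0.80

# Raise the threshold in steps of 0.1 and record which genes survive.
for thr in (0.4, 0.5, 0.6, 0.7, 0.8):
    sub, keep = filter_by_nmp(M, thr)
    print(thr, keep.tolist(), np.round(one_sample_t(sub), 2))
```

The thresholds are written as literals rather than accumulated by repeated addition of 0.1, which avoids floating-point drift when a gene sits exactly on a threshold.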
It is more desirable to find the optimum NMP level for each data set by applying the method described in this note whenever the plot of (NMP, number of overlapping genes) shows a turning point.

Corresponding author: Byung Soo Kim (Tel: +82-22123-4541, Fax: +82-2-313-5331, Email: bskim@yonsei.ac.kr). B.S. Kim’s study was supported by the Yonsei University Research Fund of 2001. S.Y. Rha’s study was supported by a grant of the IMT-2000 project, Ministry of Health & Welfare, Republic of Korea (01-PJ11-PG9-01BT00A-0028).

Introduction

The DNA microarray has been established as a major tool in biological research because it can monitor the expression levels of thousands of genes simultaneously under different conditions (Jin et al., 2001; Gibson, 2002; Hedenfalk, 2002; Olesiak, 2002; Ramaswamy, 2002; Huang, 2003; Keshave and Ong, 2003). Analyzing the data from a microarray experiment is not trivial, not merely because of the amount of data involved, but because the data pose a non-standard statistical problem that is often referred to as a “large p, small n” problem (West, 2003). Typically, a microarray experiment has thousands of genes (= p) and tens of microarrays (= n). Several analysis tools, including SAM (Tusher et al., 2001) and BRB-ArrayTools (Simon and Peng), have been introduced in the public domain to guide laboratory scientists in the statistical analysis of microarray data.

Microarray experiment data can be represented by a p x n matrix, where the (i, j)th element is the expression level of the i-th gene on the j-th microarray, i = 1, ..., p and j = 1, ..., n. Missing values are quite often observed in this p x n matrix. They occur for various reasons, arising not only from technical problems but also from biological characteristics. Currently, as chip quality and hybridization techniques have improved to a certain level, missing values usually arise for biological reasons, such as no expression of the specific genes in the sample or the

Articles published on Empirical Bayes Procedure

Accuracy of genotypic value predictions for marker-based selection in biparental plant populations

Bayesian detection of non-sinusoidal periodic patterns in circadian expression data

A Bayesian machine learning method for sensor selection and fusion with application to on-board fault diagnostics

Empirical Bayes Estimation and Comparison of Credit Rating Migration Matrices

Selecting the Best Process Based on Capability Index via Empirical Bayes Approach

Appraisal of the Interactive Highway Safety Design Model’s Crash Prediction and Design Consistency Modules: Case Studies from Pennsylvania

Multitask Compressive Sensing

Tests for gene‐environment interaction from case‐control data: a novel study of type I error, power and designs

In-season prediction of batting averages: A field test of empirical Bayes and Bayes methodologies

Semi-Parametric Differential Expression Analysis via Partial Mixture Estimation

Selection of the best Bernoulli population provided it is better than a control: An empirical Bayes approach

Empirical Bayes Inference of Pairwise FST and Its Distribution in the Genome

Safety Index for Evaluation of Two-Lane Rural Highways

Sampling Survey and Statistical Genetics in Fishery Resource Management and Conservation

Effect of missing values in detecting differentially expressed genes in a cDNA microarray experiment

Estimating the mean effect size in meta-analysis: bias, precision, and mean squared error of different weighting methods.

Uncertainty in most probable number calculations for microbiological assays.

Effects of clofibric acid on mRNA expression profiles in primary cultures of rat, mouse and human hepatocytes

Non-parametric empirical Bayes procedure

An Empirical Bayes Group-Testing Approach to Estimating Small Proportions
