Wenjiang J. Fu

MOTIVATION: Estimation of misclassification error has received increasing attention in clinical diagnosis and bioinformatics studies, especially in small-sample studies with microarray data. Current error estimation methods are not satisfactory because they have either large variability (such as leave-one-out cross-validation) or large bias (such as ...
Real-time RT-PCR is frequently used in quantitative research in molecular biology and bioinformatics. It provides a remarkably useful technology for assessing gene expression. Although mathematical models for the gene amplification process have been studied, statistical models and methods for data analysis in real-time RT-PCR have received little attention. ...
MOTIVATION: The standard paradigm for classifier design is to obtain a sample of feature-label pairs and then to apply a classification rule to derive a classifier from the sample data. Typically, in laboratory situations the sample size is limited by cost, time, or availability of sample material. Thus, an investigator may wish to consider a sequential ...
Affymetrix SNP arrays have been widely used for single-nucleotide polymorphism (SNP) genotype calling and DNA copy number variation inference. Although numerous methods have achieved high accuracy in these fields, most studies have paid little attention to the modeling of hybridization of probes to off-target allele sequences, which can affect the accuracy ...
BACKGROUND: The genetic etiology of complex diseases in humans has commonly been viewed as a complex process involving both genetic and environmental factors functioning in a complicated manner. Quite often the interactions among genetic variants play major roles in determining the susceptibility of an individual to a particular disease. Statistical methods ...
Oligonucleotide microarrays are commonly adopted for detecting and qualifying the abundance of molecules in biological samples. Analysis of microarray data starts with recording and interpreting hybridization signals from CEL images. However, many CEL images may be blemished by noise from various sources, observed as "bright spots", "dark clouds", and ...
The genomic wave has been identified as a major artifact in genome data and is highly correlated with the sequence GC content. Although statistical methods have been developed to filter this artifact, the mechanism underlying the genomic wave has not yet been studied. Understanding the artifact, specifically its sources, may lead to ...
Recent genomic research increases our understanding of the causes of complex diseases and strengthens the evidence that many complex diseases, even those with late age of onset, are caused in part by genetically induced, adverse prenatal environments. This special issue of the Journal of Biomedicine and Biotechnology specifically examines the evidence that ...
BACKGROUND: Increasing evidence shows that whole genomes of eukaryotes are almost entirely transcribed into both protein-coding genes and an enormous number of non-protein-coding RNAs (ncRNAs). Therefore, revealing the underlying regulatory mechanisms of transcripts becomes imperative. However, for a complete understanding of transcriptional regulatory ...