Chengyong Yang

BACKGROUND Whole-genome sequencing may revolutionize medical diagnostics through rapid identification of alleles that cause disease. However, even in cases with simple patterns of inheritance and unambiguous diagnoses, the relationship between disease phenotypes and their corresponding genetic changes can be complicated. Comprehensive diagnostic assays must(More)
In this paper, we modify a constrained clustering algorithm to perform exploratory analysis on gene expression data using prior knowledge presented in the form of constraints. We also study the effectiveness of various constraint sets. To address the problem of automatically generating constraints from biological text literature, we(More)
There is very little information available with regard to gene regulatory relationships in Plasmodium falciparum. In an attempt to discover transcription factor binding motifs (TFBMs) in P. falciparum, we considered two approaches. In the first approach, gene expression data of all the conditions were fed into the Iterative Signature Algorithm (ISA), which(More)
PURPOSE Dynamic contrast-enhanced T2*-weighted MR imaging has been helpful in characterizing intracranial mass lesions by providing information on vascularity. Tumefactive demyelinating lesions (TDLs) can mimic intracranial neoplasms on conventional MR images, can be difficult to diagnose, and often result in surgical biopsy for suspected tumor. The purpose(More)
Support vector machines (SVM) and K-nearest neighbors (KNN) are two machine learning tools that perform supervised classification. This paper presents a novel application of these supervised tools to profile microbial communities and to distinguish patterns among ecosystems. Amplicon length heterogeneity (ALH) profiles from(More)
Clustering of gene expression data is a standard technique used to identify closely related genes. In this paper, we develop a new clustering algorithm, MSC (Multi-Source Clustering), to perform exploratory analysis using two or more diverse sources of data. In particular, we investigate the problem of improving the clustering by integrating information(More)
Background There is considerable ongoing effort towards making DNA sequencing machines faster and more affordable. Improving the accuracy of next-generation sequencers directly lowers sequencing costs by reducing the need for resequencing, making genome-based diagnostics and research more affordable [1]. In this paper, we show how the accuracy of(More)
Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance in the analysis of gene expression data. These data provide a means to begin elucidating the large-scale modular organization of the cell. The authors consider the challenging task(More)
This paper presents an innovative, adaptive variant of Kohonen’s self-organizing maps called ASOM, which is an unsupervised clustering method that adaptively decides on the best architecture for the self-organizing map. Like the traditional SOMs, this clustering technique also provides useful information about the relationship between the resulting clusters.(More)