Due to the rapidly increasing amount of biomedical literature, automatic processing of biomedical papers is extremely important. Named Entity Recognition (NER) in this type of writing poses several difficulties. In this paper we present a system to find phenotype names in biomedical literature. The system is based on MetaMap and makes use of the UMLS ...
Many multimeric transcription factors recognize DNA sequence patterns by cooperatively binding to bipartite elements composed of half sites separated by a flexible spacer. We developed a novel bipartite algorithm, bipartite pattern discovery (Bipad), which produces a mathematical model based on information maximization or Shannon's entropy minimization ...
BACKGROUND: We present Delila-genome, a software system for identification, visualization and analysis of protein binding sites in complete genome sequences. Binding sites are predicted by scanning genomic sequences with information theory-based (or user-defined) weight matrices. Matrices are refined by adding experimentally-defined binding sites to ...
We developed single copy probes from the draft genome sequence for fluorescence in situ hybridization (scFISH) which precisely delineate chromosome abnormalities at a resolution equivalent to genomic Southern analysis. This study illustrates how scFISH probes detect cryptic and subtle abnormalities and localize the sites of chromosome rearrangements. scFISH ...
Diagnostic DNA hybridization relies on probes composed of single copy (sc) genomic sequences. Using sc sequences in probe design ensures high specificity and avoids cross-hybridization to other regions of the genome, which could lead to ambiguous results that are difficult to interpret. We examine how the distribution and composition of repetitive sequences in the ...
Cross-hybridization of repetitive sequences in genomic and expression arrays is reported to be suppressed with repeat-blocking nucleic acids (C(o)t-1 DNA). Contrary to expectation, we demonstrated that C(o)t-1 also enhanced non-specific hybridization between probes and genomic targets. When added to target DNA, C(o)t-1 enhanced hybridization (2.2- to ...
Interpretation of variants present in complete genomes or exomes reveals numerous sequence changes, only a fraction of which are likely to be pathogenic. Mutations have traditionally been inferred from allele frequencies and inheritance patterns in such data. Variants predicted to alter mRNA splicing can be validated by manual inspection of transcriptome ...
BACKGROUND: Segmental duplicons (SDs) predispose to an increased frequency of chromosomal rearrangements. These rearrangements can cause a diverse range of phenotypes due to haploinsufficiency, in cis positional effects or gene interruption. Genomic microarray analysis has revealed gene dosage changes adjacent to duplicons, but the high degree of similarity ...
BACKGROUND: The identification of promoter regions that are regulated by a given transcription factor has traditionally relied upon the identification and distribution of binding sites recognized by the factor. In this study, we have developed a tandem machine learning approach for the identification of regulatory target genes based on these parameters and ...