Developing a high-quality reference sequence is a daunting task in crops like wheat, which has a large (~17 Gb), highly repetitive (>80%), polyploid genome. To achieve complete sequence assembly of such genomes, development of a high-quality physical map is a necessary first step. However, due to the lack of recombination in certain regions of the …
The uneven distribution of recombination along the length of chromosomes results in inaccurate estimates of genetic-to-physical distances. In wheat (Triticum aestivum L.) chromosome 3B, it has been estimated that 90% of crossover events occur in distal sub-telomeric regions representing 40% of the chromosome. Radiation hybrid (RH) mapping, which does …
The species cytoplasm specific (scs) genes affect nuclear-cytoplasmic interactions in interspecific hybrids. A radiation hybrid (RH) mapping population of 188 individuals was employed to refine the location of the scs^ae locus on Triticum aestivum chromosome 1D. "Wheat Zapper," a comparative genomics tool, was used to predict synteny between wheat …
Keywords: Knowledge discovery; Pattern mining; Financial applications; Stock market; Time series data
Abstract: Similarities among subsequences are typically regarded as categorical features of sequential data. We introduce an algorithm for capturing the relationships among similar, contiguous subsequences. Two time series are considered to be similar …
Doubts have been raised that time series subsequences can be clustered in a meaningful way. This paper introduces a kernel-density-based algorithm that detects meaningful patterns in the presence of a vast number of random-walk-like subsequences. The value of density-based algorithms for noise elimination in general has long been demonstrated. The challenge …
Biofilms are communities of bacteria whose formation on surfaces requires a large portion of the bacteria’s transcriptional network. To identify environmental conditions and transcriptional regulators that contribute to sensing these conditions, we used a high-throughput approach to monitor biofilm biomass produced by an isogenic set of Escherichia coli …
The representation of multiple continuous attributes as dimensions in a vector space has been among the most influential concepts in machine learning and data mining. We consider sets of related continuous attributes as vector data and search for patterns that relate a vector attribute to one or more items. The presence of an item set defines a subset of …
Clustering of time series subsequence data commonly produces results that are not specific to the data set. This paper introduces a clustering algorithm that creates clusters exclusively from those subsequences that occur more frequently in a data set than would be expected by random chance. As such, it partially adopts a pattern mining perspective into …