The sequential context modeling framework is generalized to a non-sequential one by relaxing the context from a consecutive suffix of the subsequence of symbols to a permutation of the preceding symbols, motivated by the complex context structures found in sources such as video and program binaries. The context weighting tree is also extended to a series of…
MOTIVATION: Genome-wide association studies (GWAS) have been widely used to discover associations between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information, and unprotected disclosure of such information might put individuals' privacy at risk. It is therefore important to protect human genome data. Exact logistic…
BACKGROUND: The increasing availability of genome data motivates large-scale research in personalized treatment and precision medicine. Public cloud services provide a flexible way to mitigate the storage and computation burden of conducting genome-wide association studies (GWAS). However, data privacy has become a widespread concern when sharing the sensitive…
INTRODUCTION: Genome data play an increasingly important role in modern medicine, e.g., in personalized medicine and earlier detection of diseases. With the increasing demand for genome data, advanced sequencing techniques have been developed, among which flexible miniaturized sequencing devices [1] are very promising, especially due to their…
Previous reference-based compression methods for DNA sequences do not fully exploit the intrinsic statistics, because they consider only approximate matches. In this paper, an adaptive difference distribution-based coding framework is proposed that operates on fragments of nucleotides with a hierarchical tree structure. To keep the distribution of difference sequence from the…
BACKGROUND: In biomedical research, data sharing and information exchange are very important for improving quality of care, accelerating discovery, and promoting the meaningful secondary use of clinical data. A major concern in biomedical data sharing is the protection of patient privacy, because inappropriate information leakage can put patient privacy at…
This paper proposes generalized context modeling (GCM) for heterogeneous data compression. The proposed model extends the suffix of predicted subsequences in classic context modeling to arbitrary combinations of symbols in multiple directions. To address the selection of contexts, GCM constructs a model graph with a combinatorial structuring of finite…
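To make the idea of relaxing a suffix context concrete, the following is a minimal sketch. It is not the GCM model graph or the authors' method, only an adaptive count-based predictor. The function name `code_length`, the `offsets` parameter, and the toy data are all hypothetical choices made for illustration. A classic order-1 suffix context corresponds to `offsets=(1,)`, while a relaxed context can pick any earlier position, such as the symbol one row above in a raster-scanned image.

```python
import math
from collections import defaultdict


def code_length(data, offsets, alphabet_size):
    """Ideal adaptive code length in bits when each symbol is predicted from
    the symbols at the given backward offsets (1 = immediately preceding).

    Hypothetical helper for illustration; it uses Laplace-smoothed counts,
    not the context selection or weighting described in the abstract.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> total count
    bits = 0.0
    for i, sym in enumerate(data):
        # Build the context from the chosen offsets; None pads before the start.
        ctx = tuple(data[i - d] if i - d >= 0 else None for d in offsets)
        # Laplace-smoothed probability estimate from the counts seen so far.
        p = (counts[ctx][sym] + 1) / (totals[ctx] + alphabet_size)
        bits -= math.log2(p)
        # Update the model only after coding the symbol (adaptive coding).
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


# Toy "image": one fixed binary row of width 8 repeated 50 times, scanned row by row.
row = [0, 1, 1, 0, 1, 0, 0, 1]
width = len(row)
data = row * 50

suffix_bits = code_length(data, offsets=(1,), alphabet_size=2)       # classic suffix context
relaxed_bits = code_length(data, offsets=(width,), alphabet_size=2)  # symbol directly above

print(f"suffix context:  {suffix_bits:.1f} bits")
print(f"relaxed context: {relaxed_bits:.1f} bits")
```

In this toy source the symbol directly above determines the current symbol, while the immediately preceding symbol does not, so the relaxed context yields a shorter code length. Real heterogeneous data would call for a principled way to select among many candidate contexts, which is the problem the abstract says the model graph addresses.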