India, a multilingual nation with 22 officially recognised languages, has literature in all of them; this literature is represented in the Digital Library of India (DLI), which holds over 120,000 books. DLI has driven the creation of a large number of applications to process and present Indian-language content. In this paper, we present the ...
BACKGROUND: Biological processes in cells are carried out by means of protein-protein interactions. Determining by wet-lab experiments whether a pair of proteins interacts is resource-intensive; only about 38,000 interactions, out of a few hundred thousand expected, are known today. Active machine learning can guide the selection of pairs of ...
Autosomal dominant leukodystrophy (ADLD) is an adult-onset demyelinating disorder that is caused by duplications of the lamin B1 (LMNB1) gene. However, as only a few cases have been analyzed in detail, the mechanisms underlying LMNB1 duplications are unclear. We report the detailed molecular analysis of the largest collection of ADLD families studied, to ...
BACKGROUND: Prediction of transmembrane (TM) helices by statistical methods suffers from a lack of sufficient training data. Current best methods use hundreds or even thousands of free parameters in their models, which are tuned to fit the little data available for training. Further, they are often restricted to the generally accepted topology ...
A better mechanistic understanding of membrane protein folding is urgently needed because of the discovery of an increasing number of human diseases in which membrane protein instability and misfolding are involved. Towards this goal, we investigated the folding and stability of 7-transmembrane (TM) helical bundles by computational methods. We compared ...
Similar retinitis pigmentosa (RP) phenotypes can result from mutations affecting different rhodopsin regions, and distinct amino acid substitutions can cause different RP severity and progression rates. Specifically, both the R135L and R135W mutations (cytoplasmic end of H3) result in diffuse, severe disease (class A), but R135W causes more severe and more ...
Here, we systematically studied the association between amino acids, the constituents of protein sequences, in datasets at different hierarchical levels, i.e. genome (human), protein type (membrane proteins), protein family (specific types of membrane receptors and transporters) and transmembrane helices versus loops (either for membrane proteins in general or ...
Telugu is an Indian language spoken by over 50 million people in the country. The language is rich in literature and has been studied extensively by native and foreign linguists, yet it has not benefited significantly from the recent advances in computational approaches for linguistic or statistical processing of natural language texts. However, with the ...
Gene and protein sequence analyses, central components of studies in modern biology, are easily amenable to string matching and pattern recognition algorithms. The growing need to analyse whole genome sequences more efficiently and thoroughly has led to the emergence of new computational methods. Suffix trees and suffix arrays are data structures, well ...
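As a rough illustration of the suffix-array idea mentioned in the preceding abstract, the sketch below builds a suffix array for a short sequence and uses binary search to locate every occurrence of a pattern. It is not the method of the paper itself: the function names `build_suffix_array` and `find_occurrences` are hypothetical, the example sequence is made up, and the naive construction by sorting suffixes is chosen for readability rather than the efficiency that genome-scale data would require.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order.

    Naive construction (sorts full suffixes); fine for short strings only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions where pattern occurs in text."""
    m = len(pattern)

    def prefix(rank):
        # First m characters of the suffix at the given rank.
        start = suffix_array[rank]
        return text[start:start + m]

    # Lower bound: first rank whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first rank whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


if __name__ == "__main__":
    sequence = "GATTACAGATTA"  # made-up example sequence
    sa = build_suffix_array(sequence)
    print(find_occurrences(sequence, sa, "ATTA"))
```

Because all suffixes that begin with the pattern sit next to each other in sorted order, two binary searches are enough to find the whole block of matches without scanning the sequence.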
Understanding the structure, dynamics and function of proteins strongly parallels the mapping of words to meaning in natural language. Availability of large amounts of text in digital form has led to the convergence of linguistics with computational science, and has resulted in applications such as information retrieval and document summarization. In direct ...