Maria Federico

Finding motifs in biological sequences is one of the most intriguing problems for string algorithm designers, owing both to its numerous applications in molecular biology and to the challenging aspects of the computational problem. Indeed, when dealing with biological sequences it is necessary to work with …
With a growing number of online videos, many producers feel the need to use video captions to expand content accessibility, and they face two main issues: production and alignment of the textual transcript. Both activities are expensive, owing either to the human labor involved or to the use of dedicated software. In this paper, we focus on …
We present a tool for detecting long similar fragments that occur two or more times in a set of biological sequences. The problem has interesting applications in the analysis of biological sequences and their correlation, and becomes computationally challenging when a non-negligible number of insertions, deletions and substitutions are allowed. For …
The notion of DNA motif is a mathematical abstraction used to model regions of the DNA (known as Transcription Factor Binding Sites, or TFBSs) that are bound by a given Transcription Factor to regulate gene expression or repression. In turn, DNA structured motifs are a mathematical counterpart that models sets of TFBSs that work in concert in the gene …
We present an algorithm for detecting long similar fragments occurring at least twice in a set of biological sequences. The problem becomes computationally challenging when the frequency of a repeat is allowed to increase and when a non-negligible number of insertions, deletions and substitutions are allowed. We introduce in this paper an algorithm, Rime1 …
This paper presents new optimizations designed to improve a state-of-the-art algorithm for filtering sequences as a preprocessing step to the task of finding multiple repeats, allowing a given pairwise edit distance between occurrences. The target application is to find possibly long repeats having two or more occurrences, such that each …
This paper reports on experiments in porting the ITC-irst Italian broadcast news recognition system to two spontaneous dialogue domains. Porting was investigated by applying state-of-the-art adaptation methods to acoustic and language models, and by evaluating the trade-off between performance and the required amount of task-specific annotated data. The use of …
In this paper we present an algorithm for the problem of planted structured motif extraction from a set of sequences. This problem is closely related to the structured motif extraction problem, which has many important applications in molecular biology. We propose an algorithm that uses a simple two-stage approach: first it extracts simple motifs, then the …
Radiation therapy (RT) is a component of the treatment of patients with head and neck malignancies. This therapy may damage the nearby carotid arteries, thereby initiating or accelerating the atherosclerotic process (atheroma formation). Dentists treating patients who have been irradiated should examine the patient's panoramic radiograph for evidence of …