M. Oguzhan Külekci

Given a pattern x of length m and a text y of length n, both over an ordered alphabet, the order-preserving pattern matching problem consists in finding all substrings of the text with the same relative order as the pattern. It is an approximate variant of the well-known exact pattern matching problem and has gained attention in recent years. This…
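The problem statement above can be illustrated with a naive checker. This is only a minimal sketch of the definition, not any of the algorithms proposed in the papers listed here; the function names `relative_order` and `op_match` are hypothetical, and the sketch assumes that equal values must map to equal ranks.

```python
def relative_order(seq):
    """Map each element to its rank among the distinct values of seq."""
    ranks = {value: rank for rank, value in enumerate(sorted(set(seq)))}
    return [ranks[value] for value in seq]


def op_match(pattern, text):
    """Return every start index where a window of text has the same
    relative order as pattern (naive check, window by window)."""
    m = len(pattern)
    target = relative_order(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if relative_order(text[i:i + m]) == target
    ]


# The pattern "low, high, middle" occurs at offsets 0 and 2.
print(op_match([1, 3, 2], [10, 20, 15, 30, 25, 40, 5]))
```

Each window is re-ranked from scratch, so the sketch is far slower than the published solutions; it is meant only to make the matching condition concrete.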
Given a pattern and a text, both over a common ordered alphabet, the order-preserving pattern matching problem consists in finding all substrings of the text with the same relative order as the pattern. This problem, an approximate variant of the well-known exact pattern matching problem, finds applications in such fields as time series analysis (e.g., share…
We revisit the problem of finding a shortest unique substring (SUS), proposed recently by [6]. We propose an optimal O(n) time and space algorithm that can find an SUS for every location of a string of size n. Our algorithm significantly improves on the O(n^2) time complexity needed by [6]. We also support finding all the SUSes covering every location, whereas the…
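To make the SUS definition concrete, here is a brute-force sketch. It is not the O(n) algorithm described in the abstract; it simply tries every substring covering a position, shortest first. The names `occurrences` and `shortest_unique_substring` are hypothetical, positions are assumed to be 0-based, and ties are broken by the leftmost start.

```python
def occurrences(s, sub):
    """Count occurrences of sub in s, including overlapping ones."""
    count = 0
    start = s.find(sub)
    while start != -1:
        count += 1
        start = s.find(sub, start + 1)
    return count


def shortest_unique_substring(s, p):
    """Return (start, end) with start <= p < end such that s[start:end]
    occurs exactly once in s and is as short as possible."""
    n = len(s)
    if not 0 <= p < n:
        raise IndexError("position out of range")
    for length in range(1, n + 1):
        # Only windows of this length that cover position p.
        for start in range(max(0, p - length + 1), min(p, n - length) + 1):
            if occurrences(s, s[start:start + length]) == 1:
                return start, start + length
    raise AssertionError("unreachable: the whole string is always unique")


start, end = shortest_unique_substring("abcab", 0)
print("abcab"[start:end])
```

The loop always terminates with an answer because the full string occurs exactly once in itself.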
Finding repetitive structures in genomes and proteins is important to understand their biological functions. Many data compressors for modern genomic sequences rely heavily on finding repeats in the sequences. Small-scale and local repetitive structures are better understood than large and complex interspersed ones. The notion of maximal repeats captures…
Searching for all occurrences of a given set of patterns in a text is a fundamental problem in computer science with applications in many fields, such as computational biology and intrusion detection systems. In the last two decades there has been a general trend toward exploiting the power of the word RAM model to speed up the performance of classical string…
The problem of order-preserving matching is to find all substrings in the text which have the same relative order and length as the pattern. Several online solutions and one offline solution have been proposed for the problem. In this paper, we introduce three new solutions based on filtration. The two online solutions rest on the SIMD (Single Instruction Multiple…
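As a generic illustration of what filtration means for this problem, the sketch below first reduces pattern and text to strings of up/down/equal steps, finds exact matches of the reduced pattern, and then verifies each candidate against the full relative order. This particular encoding and the names `ranks`, `shape`, and `op_match_filtered` are assumptions made for the example; the abstract is truncated and does not confirm that the paper's filters work this way, and no SIMD instructions are used here.

```python
def ranks(seq):
    """Map each element to its rank among the distinct values of seq."""
    order = {value: rank for rank, value in enumerate(sorted(set(seq)))}
    return [order[value] for value in seq]


def shape(seq):
    """Encode consecutive comparisons as 'u' (up), 'd' (down), 'e' (equal)."""
    return "".join(
        "u" if b > a else "d" if b < a else "e"
        for a, b in zip(seq, seq[1:])
    )


def op_match_filtered(pattern, text):
    """Filter with an exact search on step shapes, then verify candidates."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    if m == 1:
        return list(range(len(text)))  # any single element matches
    p_shape, t_shape = shape(pattern), shape(text)
    target = ranks(pattern)
    hits = []
    i = t_shape.find(p_shape)
    while i != -1:
        # The filter can report false positives, so verify each candidate.
        if ranks(text[i:i + m]) == target:
            hits.append(i)
        i = t_shape.find(p_shape, i + 1)
    return hits


print(op_match_filtered([1, 3, 2], [10, 20, 15, 30, 25, 40, 5]))
```

In this example the filter reports candidates at offsets 0, 2, and 4, and verification rejects offset 4, whose window goes up and then down but ends below its first value.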
We investigate the usage of the wavelet tree and the rank/select-dictionary data structures on hybrid-structured variable-length codes, which represent an integer in the form of a unary code section followed by a binary section. We propose to handle unary and binary partitions as separate streams and create wavelet trees or R/S dictionaries over the unary…
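The idea of splitting a unary-plus-binary code into two streams can be sketched with the Elias gamma code, chosen here only as a familiar example of such a hybrid code; the abstract does not say which codes the paper studies. The function names are hypothetical, bit streams are modelled as Python strings for readability, and no wavelet tree or rank/select dictionary is built.

```python
def encode_split(values):
    """Encode positive integers with Elias gamma, keeping the unary
    length prefixes and the binary payloads in separate streams."""
    unary, binary = [], []
    for x in values:
        if x < 1:
            raise ValueError("Elias gamma encodes positive integers only")
        bits = bin(x)[2:]                       # binary form, leading bit is 1
        unary.append("0" * (len(bits) - 1) + "1")
        binary.append(bits[1:])                 # drop the implicit leading 1
    return "".join(unary), "".join(binary)


def decode_split(unary, binary):
    """Rebuild the integers by walking both streams in parallel."""
    values, pos, zeros = [], 0, 0
    for bit in unary:
        if bit == "0":
            zeros += 1                          # one more payload bit to read
        else:
            payload = binary[pos:pos + zeros]
            values.append(int("1" + payload, 2))
            pos += zeros
            zeros = 0
    return values


unary_stream, binary_stream = encode_split([1, 5, 2, 9])
print(unary_stream, binary_stream)
print(decode_split(unary_stream, binary_stream))
```

Because every codeword ends its unary part with a single 1, locating the k-th 1 in the unary stream identifies the k-th codeword, which is the kind of query a select structure over that stream would answer.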
The usual way of ensuring the confidentiality of compressed data is to encrypt it with a standard encryption algorithm. Although the computational cost of encryption is tolerable in most practical cases, the main disadvantage is that the encryption layer limits the ability to perform pattern matching on the compressed data. Another alternative…