The edit distance is a basic string similarity measure for many applications such as string searching, text mining, signal processing, and bioinformatics. However, its high computational cost often prevents its use on large sets of strings, as in similar-string searching. A promising solution to the problem is to approximate the edit…
Given an undirected graph G, we consider enumerating all Eulerian trails, that is, walks that contain each edge of G exactly once. We approach this through the enumeration of Hamiltonian paths with the zero-suppressed decision diagram (ZDD), a data structure that can efficiently store a family of sets satisfying given conditions. First we…
We study large-scale classification problems in changing environments where a small part of the dataset is modified, and the effect of the data modification must be quickly incorporated into the classifier. When the entire dataset is large, even if the amount of the data modification is fairly small, the computational cost of retraining the classifier…
Privacy concerns have become increasingly important in many machine learning (ML) problems. We study empirical risk minimization (ERM) problems under secure multi-party computation (MPC) frameworks. The main technical tools for MPC have been developed based on cryptography. One of the limitations of current cryptographically private ML is that it is computationally…
In this paper we consider the problem of similar substring searching in the q-gram distance. The q-gram distance d_q(x, y) is a similarity measure between two strings x and y defined by the number of q-grams that differ between them. The distance can be used instead of the edit distance due to its lower computation cost, O(|x| + |y|) vs. O(|x||y|), and its…
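As a minimal sketch of the measure described above, the following Python assumes the common definition of the q-gram distance as the sum, over all q-grams, of the absolute difference in occurrence counts between the two strings. The abstract is truncated, so this definition and the names `qgram_profile` and `qgram_distance` are assumptions for illustration, not the paper's own formulation or search algorithm.

```python
from collections import Counter


def qgram_profile(s: str, q: int) -> Counter:
    """Count every length-q substring of s (empty if len(s) < q)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def qgram_distance(x: str, y: str, q: int) -> int:
    """Assumed definition: L1 distance between the q-gram count profiles."""
    px, py = qgram_profile(x, q), qgram_profile(y, q)
    # A Counter returns 0 for missing keys, so the union of keys is enough.
    return sum(abs(px[g] - py[g]) for g in px.keys() | py.keys())


# "abcd" has 2-grams ab, bc, cd; "abce" has ab, bc, ce.
print(qgram_distance("abcd", "abce", 2))
```

Each profile is built in one pass over its string, which is where the linear cost quoted in the abstract comes from when q is treated as a constant.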
The edit distance is a basic string similarity measure used in many applications such as text mining, signal processing, and bioinformatics. However, the computational cost can be a problem when many distance calculations are repeated, as in real-life search scenarios. A promising way to cope with the problem is to approximate the edit…
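For context on the cost the abstract refers to, here is a sketch of the standard dynamic-programming computation of the edit distance (Levenshtein distance with unit costs). It is the textbook exact method, not the approximation the abstract goes on to propose, and the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(x: str, y: str) -> int:
    """Exact edit distance in O(|x||y|) time, keeping two rows of the table."""
    # prev[j] is the distance between the processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, 1):
        curr = [i]  # distance from x[:i] to the empty string
        for j, cy in enumerate(y, 1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
```

The nested loops visit every pair of positions once, so repeating this over a large collection of strings quickly becomes expensive.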
The q-gram distance d_q(x, y) between two strings x and y is a string similarity measure correlated with a famous string distance: the edit distance. In addition, it can be computed much faster, in linear (O(|x|+|y|)) time, than the edit distance in quadratic (O(|x||y|)) time, where |·| denotes the string length. However, it does not mean…