Algorithms for Generating Fundamental Cycles in a Graph
TLDR
It is shown that for regular graphs of order n the expected value of the total length of a minimum fundamental cycle set does not exceed O(n^2).
A random walk method for alleviating the sparsity problem in collaborative filtering
TLDR
A novel item-oriented algorithm that first infers transition probabilities between items based on their similarities and models finite length random walks on the item space to compute predictions, and suggests a method to enhance similarity matrices under sparse data as well.
Finding communities by clustering a graph into overlapping subgraphs
TLDR
Two novel efficient algorithms are designed, implemented, and tested, which find communities according to the definition of a community: a community is a subset of actors who induce a locally optimal subgraph with respect to a density function defined on subsets of actors.
Syntactic Segmentation and Labeling of Digitized Pages from Technical Journals
TLDR
It is shown that families of technical documents that share the same layout conventions can be readily analyzed, and that backtracking for error recovery and branch and bound for maximum-area labeling can be implemented with Unix Shell programs.
A Mechanizable Induction Principle for Equational Specifications
TLDR
A new induction principle based on a constructor model of a data structure is developed that can be used for proving properties by induction for data structures, such as integers and finite sets, whose values cannot be freely constructed.
Modeling and Multiway Analysis of Chatroom Tensors
TLDR
This work identifies the limitations of n-way data analysis techniques in multidimensional stream data, establishes a link between data collection and performance of these techniques, and extends data analysis to multiple dimensions by constructing n-way data arrays known as high order tensors.
LOGML: Log Markup Language for Web Usage Mining
TLDR
The usefulness of LOGML in web usage mining is illustrated; the simplicity with which mining algorithms can be specified and implemented efficiently using LOGML is shown.
Graph Theoretic and Spectral Analysis of Enron Email Data
TLDR
This paper analyzes the Enron email data set to discover structures within the organization and shows that preprocessing of the data has a significant impact on the results, so a standard form is needed to establish a benchmark data set.
An NP-hard problem in bipartite graphs
Checking for a Hamiltonian circuit in bipartite graphs is shown to be NP-hard.
...
...