This paper proposes a new method for computing page importance, referred to as BrowseRank. The conventional approach to compute page importance is to exploit the link graph of the web and to build a model based on that graph. For instance, PageRank is such an algorithm, which employs a discrete-time Markov process as the model. Unfortunately, the link graph …
Regularized empirical risk minimization (R-ERM) is an important branch of machine learning, since it constrains the capacity of the hypothesis space and guarantees the generalization ability of the learning algorithm. Two classic proximal optimization algorithms, i.e., proximal stochastic gradient descent (ProxSGD) and proximal stochastic coordinate descent …
This paper is concerned with the generalization ability of learning to rank algorithms for information retrieval (IR). We point out that the key for addressing the learning problem is to look at it from the viewpoint of query. We define a number of new concepts, including query-level loss, query-level risk, and query-level stability. We then analyze …
Learning to rank has become an important research topic in machine learning. While most learning-to-rank methods learn the ranking functions by minimizing loss functions, it is the ranking measures (such as NDCG and MAP) that are used to evaluate the performance of the learned ranking functions. In this work, we reveal the relationship between ranking …
MOTIVATION: Cancer is well known to be the end result of somatic mutations that disrupt normal cell division. The number of such mutations that have to be accumulated in a cell before cancer develops depends on the type of cancer. The waiting time T(m) until the appearance of m mutations in a cell is thus an important quantity in population genetics models …
The length of ancestral tracks decays with the passing of generations, a property that can be used to infer population admixture histories. Previous studies have shown the power of recovering the histories of admixed populations via the length distributions of ancestral tracks, even under simple models. We believe that the deduction of length distributions under a …
This paper presents a theoretical framework for ranking, and demonstrates how to perform generalization analysis of listwise ranking algorithms using the framework. Many learning-to-rank algorithms have been proposed in recent years. Among them, the listwise approach has shown higher empirical ranking performance when compared to the other approaches. …
We propose a General Markov Framework for computing page importance. Under the framework, a Markov Skeleton Process is used to model the random walk conducted by the web surfer on a given graph. Page importance is then defined as the product of page reachability and page utility, which can be computed from the transition …