This paper proposes a new method for computing page importance, referred to as BrowseRank. The conventional approach to computing page importance is to exploit the link graph of the web and to build a model based on that graph. PageRank is one such algorithm; it employs a discrete-time Markov process as the model. Unfortunately, the link graph …
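For context, the discrete-time Markov model behind PageRank can be sketched as a power iteration over the link graph. This illustrates only the conventional link-graph baseline mentioned above, not BrowseRank itself. The function name `pagerank`, the damping factor of 0.85, the uniform handling of dangling pages, and the toy graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration on the random-surfer Markov chain.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> stationary probability estimate.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Teleportation mass that every page receives at each step.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split the page's rank evenly across its out-links.
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page (assumption): spread its rank uniformly.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank


# Hypothetical three-page graph: a -> b, a -> c, b -> c, c -> a.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = pagerank(toy_graph)
```

In this toy graph, page `c` collects rank from both `a` and `b`, so it ends up with the highest score; the scores sum to 1 because each iteration is one step of a Markov chain.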
This paper is concerned with rank aggregation, the task of combining the ranking results of individual rankers in meta-search. Previously, rank aggregation was performed mainly by means of unsupervised learning. To further enhance ranking accuracy, we propose employing supervised learning to perform the task, using labeled data. We refer to …
MicroRNAs (miRNAs) are endogenous, noncoding, small RNAs that have essential regulatory functions in plant growth, development, and stress response processes. However, limited information is available about their functions in sexual reproduction of flowering plants. Pollen development is an important process in the life cycle of a flowering plant and is a …
Learning to rank has become an important research topic in machine learning. While most learning-to-rank methods learn the ranking functions by minimizing loss functions, it is the ranking measures (such as NDCG and MAP) that are used to evaluate the performance of the learned ranking functions. In this work, we reveal the relationship between ranking …
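The two ranking measures named above can be sketched as follows. This is a minimal illustration under common textbook definitions, not the formulation used in the paper: the exponential gain `2**rel - 1`, the `log2` position discount, and the assumption that every relevant document appears in the ranked list are choices made here, and the function names are hypothetical.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of graded relevance labels."""
    rels = relevances[:k] if k is not None else relevances
    # Position 0 is the top result, so its discount is log2(2) = 1.
    return sum((2 ** rel - 1) / math.log2(pos + 2) for pos, rel in enumerate(rels))


def ndcg(relevances, k=None):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


def average_precision(binary_relevances):
    """Average precision for one query, given 0/1 labels in ranked order.

    Assumes every relevant document appears somewhere in the list.
    """
    hits, total = 0, 0.0
    for pos, rel in enumerate(binary_relevances, start=1):
        if rel:
            hits += 1
            total += hits / pos  # precision at this relevant position
    return total / hits if hits else 0.0


def mean_average_precision(queries):
    """MAP: the mean of average precision over a collection of queries."""
    return sum(average_precision(q) for q in queries) / len(queries)
```

Both measures depend only on the order of the results, not on the scores that produced the order, which is why they are not directly usable as smooth training losses.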
This paper presents a theoretical framework for ranking, and demonstrates how to perform generalization analysis of listwise ranking algorithms using the framework. Many learning-to-rank algorithms have been proposed in recent years. Among them, the listwise approach has shown higher empirical ranking performance when compared to the other approaches. …
This paper is concerned with the generalization ability of learning-to-rank algorithms for information retrieval (IR). We point out that the key to addressing the learning problem is to look at it from the viewpoint of the query. We define a number of new concepts, including query-level loss, query-level risk, and query-level stability. We then analyze …
Regularized empirical risk minimization (R-ERM) is an important branch of machine learning, since it constrains the capacity of the hypothesis space and guarantees the generalization ability of the learning algorithm. Two classic proximal optimization algorithms, i.e., proximal stochastic gradient descent (ProxSGD) and proximal stochastic coordinate descent …
Since the website is one of the most important organizational structures of the Web, ranking websites effectively has been essential to many Web applications, such as Web search and crawling. To obtain the ranks of websites, researchers used to describe the inter-connectivity among websites with a so-called HostGraph in which the nodes denote …
Motivation: Cancer is well known to be the end result of somatic mutations that disrupt normal cell division. The number of such mutations that have to be accumulated in a cell before cancer develops depends on the type of cancer. The waiting time T(m) until the appearance of m mutations in a cell is thus an important quantity in population genetics models …
The inference of demographic history of populations is an important undertaking in population genetics. A few recent studies have developed identity-by-descent (IBD) based methods to reveal the signature of the relatively recent historical events. Notably, Pe'er and his colleagues have introduced a novel method (named PIBD here) by employing IBD …