Learn More
LSHTC is a series of challenges which aims to assess the performance of classification systems in large-scale classification in a a large number of classes (up to hundreds of thousands). This paper describes the dataset that have been released along the LSHTC series. The paper details the construction of the datsets and the design of the tracks as well as(More)
This article provides an overview of the first BioASQ challenge, a competition on large-scale biomedical semantic indexing and question answering (QA), which took place between March and September 2013. BioASQ assesses the ability of systems to semantically index very large numbers of biomedical scientific articles, and to return concise and(More)
Some Statistical Software Testing approaches rely on sampling the feasible paths in the control flow graph of the program; the difficulty comes from the tiny ratio of feasible paths. This paper presents an adaptive sampling mechanism called EXIST for Ex-ploration/eXploitation Inference for Software Testing , able to retrieve distinct feasible paths with(More)
We consider the problem of discovering links of an evolving undirected graph given a series of past snapshots of that graph. The graph is observed through the time sequence of its adjacency matrix and only the presence of edges is observed. The absence of an edge on a certain snapshot cannot be distinguished from a missing entry in the adjacency matrix.(More)
How to determine <i>a priori</i> whether a learning algorithm is suited to a learning problem instance is a major scientific and technological challenge. A first step toward this goal, inspired by the Phase Transition (PT) paradigm developed in the Constraint Satisfaction domain, is presented in this paper.Based on the PT paradigm, extensive and principled(More)
Motivated by Structural Statistical Software Testing (SSST), this paper is interested in sampling the feasible execution paths in the control flow graph of the program being tested. For some complex programs , the fraction of feasible paths becomes tiny, ranging in [10 −10 , 10 −5 ]. When relying on the uniform sampling of the program paths, SSST is thus(More)
Distributed term models are a powerful tool for classifying, clustering and representing documents. For dynamic collections, we need to model both temporal and topical evolution. In this work, we present a model that uses a continuous time distribution and that represent words as trajectories in a continuous space. We perform some preliminary experiments on(More)
Context: Finite State Machine (FSM) inference from execution traces has received a lot of attention over the past few years. Various approaches have been explored, each holding different properties for the resulting models, but the lack of standard benchmarks limits the ability of comparing the proposed techniques. Evaluation is usually performed on a few(More)