In this paper we describe an efficient and scalable implementation for grammar induction based on the EMILE approach ([2], [3], [4], [5], [6]). The current EMILE 4.1 implementation ([11]) is one of the first efficient grammar induction algorithms that work on free text. Although EMILE 4.1 is far from perfect, it enables researchers to do empirical grammar induction…
This introductory paper to the special issue on Data Mining Lessons Learned presents lessons from data mining applications, including experience from science, business, and knowledge management in a collaborative data mining setting.
Large-scale scientific applications require extensive support from middleware and frameworks that provide the capabilities for distributed execution in the Grid environment. One example of such a framework is a Grid-enabled workflow management system. In this paper we present the WS-VLAM workflow management system, describe its current…
BACKGROUND: Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed.…
This paper describes how binary associations in databases of items can be organised and clustered. Two similarity measures are presented that can be used to generate a weighted graph of associations. Each measure focuses on different kinds of regularities in the database. By calculating a Minimum Spanning Tree on the graph of associations, the most…
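The pipeline outlined in this abstract (item associations, a similarity measure, a weighted graph, a Minimum Spanning Tree) can be illustrated with a minimal sketch. The abstract does not name its two similarity measures, so the Jaccard coefficient is used here purely as a stand-in assumption. The function names, the conversion of similarity to distance as `1 - similarity`, and the toy transactions are likewise hypothetical and not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed stand-in measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def association_graph(transactions):
    """Build weighted edges (distance, item_a, item_b) for co-occurring items."""
    # Map each item to the set of transaction ids in which it occurs.
    occurrences = {}
    for tid, items in enumerate(transactions):
        for item in items:
            occurrences.setdefault(item, set()).add(tid)
    edges = []
    for a, b in combinations(sorted(occurrences), 2):
        sim = jaccard(occurrences[a], occurrences[b])
        if sim > 0:  # keep only items that co-occur at least once
            edges.append((1.0 - sim, a, b))
    return edges


def minimum_spanning_tree(edges):
    """Kruskal's algorithm with union-find.

    Returns a spanning forest if the graph is disconnected.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for dist, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((a, b, dist))
    return tree


# Hypothetical toy database of item sets.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"beer", "chips"},
    {"beer", "chips", "bread"},
]
mst = minimum_spanning_tree(association_graph(transactions))
for a, b, dist in mst:
    print(f"{a} - {b}: distance {dist:.2f}")
```

In this sketch the tree keeps only the strongest association that links each item to the rest, so weaker redundant links (here, bread and jam, already connected through butter) are dropped.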
In e-Science, meaningful experiment processes and workflow engines emerge as important scientific resources. A complex experiment often involves services and processes developed in different scientific domains. Aggregating different workflows into one meta workflow avoids unnecessary rewriting of experiment processes and thus improves the reuse efficiency.…