The Weka4WS framework for distributed data mining in service‐oriented Grids

@article{Talia2008TheWF,
  title={The Weka4WS framework for distributed data mining in service‐oriented Grids},
  author={Domenico Talia and Paolo Trunfio and Oreste Verta},
  journal={Concurrency and Computation: Practice and Experience},
  year={2008},
  volume={20}
}
The service‐oriented architecture paradigm can be exploited for the implementation of data and knowledge‐based applications in distributed environments. The Web services resource framework (WSRF) has recently emerged as the standard for the implementation of Grid services and applications. WSRF can be exploited for developing high‐level services for distributed data mining applications. This paper describes Weka4WS, a framework that extends the widely used open source Weka toolkit to support…
How distributed data mining tasks can thrive as knowledge services
TLDR
A strategy and a model based on the use of services for the design of distributed knowledge discovery services are described, along with how Grid frameworks can be developed as a collection of services and used to develop distributed data analysis tasks and knowledge discovery processes using the SOA model.
A new web-based solution for modelling data mining processes
TLDR
A new web-based solution named DAMIS, inspired by the Cloud, is proposed and implemented; it makes massive data mining simpler, more effective, and easily understandable for data scientists and business intelligence professionals by constructing scientific workflows for data mining using a drag and drop interface.
An Approach to Enhance Web Service Resource Framework using the Improved PLWAP Algorithm for Large Scale Hybrid Data in Distributed Environment
With the accessing of millions of web pages for business and personal transactions, huge amounts of web page access data have been stored in web servers. The existing web pattern mining systems do…
Development of a Grid-based Framework for High-Performance Scientific Knowledge Discovery
TLDR
The SINDI-Grid framework provides a variety of grid services for distributed data analysis and scientific knowledge processing, and the SINDI-Workflow tool exploits these services to design and execute scientific and technological knowledge discovery applications that integrate various information processing algorithms.
Scaling Effectivity of Research Contributions in Distributed Data mining over Grid Infrastructures
With the increasing need of data availability and cloud-based services, distributed database management has already gained a maximum momentum in technological advancement. With the data stored in…
A Testbed for Collecting QoS Data of Cloud-Based Analytic Services
TLDR
This paper proposes a testbed system that can be used to collect QoS data for software services hosted in the cloud, focusing mainly on analytic services, whose QoS values could depend on the data they are used to process and analyze.
Data-Centered Service Composition for Information Analysis
TLDR
This research avoided the rerun of the workflow by storing service invocation results on a platform and realized data-centered service composition by adding and deleting rules to be fired.
Data Mining Techniques in Parallel and Distributed Environment – A Comprehensive Survey
TLDR
Various data mining tools and techniques that can be used in distributed environment are described and different algorithmic and architectural approaches are followed in various distributed mining techniques.
Grid-based framework for high-performance processing of scientific knowledge
TLDR
This study proposes a scientific-knowledge processing framework, which offers high performance by using grid computing technology for extracting important entities and their relations from the scientific literature by flexibly adjusting the number of computing nodes that constitute the grid environment as the number of documents for processing increases.
Data stream mining in ubiquitous environments: state‐of‐the‐art and current directions
TLDR
This article reviews the state‐of‐the‐art techniques in mining data streams for mobile and ubiquitous environments, identifies the key characteristics of these algorithms, and presents illustrative applications.

References

SHOWING 1-10 OF 31 REFERENCES
Weka4WS: A WSRF-Enabled Weka Toolkit for Distributed Data Mining on Grids
TLDR
The paper describes the design and the implementation of Weka4WS, a framework that extends the Weka toolkit for supporting distributed data mining on Grid environments using a first release of the WSRF library.
Web services composition for distributed data mining
A Web services-based toolkit for supporting distributed data mining is presented. A workflow engine is provided within the toolkit to enable a user to compose Web services to implement particular…
Alternative Software Stacks for OGSA-based Grids
TLDR
This paper uniquely shows that there could be alternative software stacks for OGSA-based Grids, based on WS-Transfer and WS-Eventing, by qualitatively and quantitatively evaluating each approach.
The Physiology of the Grid
In both e-business and e-science, we often need to integrate services across distributed, heterogeneous, dynamic “virtual organizations” formed from the disparate resources within a single enterprise…
Towards an open service architecture for data mining on the grid
TLDR
This paper describes design principles and a service based software architecture of a novel infrastructure for distributed and high-performance data mining in Grid environments.
Mining Large Data Sets on Grids: Issues and Prospects
TLDR
The main issues, requirements, and design approaches for the implementation of grid-based knowledge discovery systems are discussed and some prospects and promising research directions in datacentric and knowledge-discovery oriented grids are outlined.
Discovery net: towards a grid of knowledge discovery
TLDR
This paper shows how this architecture will behave during a typical KDD process design and deployment, how it enables the execution of complex and distributed data mining tasks with high performance and how it provides a community of e-scientists with means to collaborate, retrieve and reuse both KDD algorithms, discovery processes and knowledge in a visual analytical environment.
The WS-Resource Framework
TLDR
The WS-Resource framework is introduced, a set of proposed Web services specifications that define a rendering of the WS-Resource approach in terms of specific message exchanges and related XML definitions, and is provided for review and evaluation only.
Globus Toolkit Version 4: Software for Service-Oriented Systems
TLDR
The principal characteristics of the latest release, the Web services-based GT4, which provides significant improvements over previous releases in terms of robustness, performance, usability, documentation, standards compliance, and functionality are summarized.
The Anatomy of the Grid: Enabling Scalable Virtual Organizations
TLDR
The authors present an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing.