Andrey Kashlev

Apache Cassandra is a leading distributed database of choice for big data management with zero downtime, linear scalability, and seamless multiple data center deployment. With increasingly wide adoption of Cassandra for online transaction processing by hundreds of Web-scale companies, there is a growing need for a rigorous and practical data…
In this new era of Big Data, there is a growing need to enable scientific workflows to perform computations at a scale far exceeding a single workstation's capabilities. When such data-intensive workflows run in the cloud, distributed across several physical locations, the execution time and the resource utilization efficiency depend heavily on the…
Scientific workflows have become an important paradigm for domain scientists to formalize and structure complex data-intensive scientific processes. The ever-increasing volumes of scientific data motivate researchers to extend scientific workflow management systems (SWFMSs) to utilize the power of Cloud computing to perform big data analyses. Unlike…
Article history: received 21 December 2011; received in revised form 30 August 2013; accepted 31 August 2013. Provenance has become increasingly important in scientific workflows to understand, verify, and reproduce the result of scientific data analysis. Most existing systems store provenance data in provenance stores with proprietary…
When designing scientific workflows, users often face the so-called shimming problem when connecting two related but incompatible components. The problem is addressed by inserting special adaptors, called shims, that perform appropriate data transformations to resolve data type inconsistencies. However, existing shimming techniques provide limited…
Provenance, which records the history of an in silico experiment, has been identified as an important requirement for scientific workflows to support reproducibility of scientific discoveries, result interpretation, and problem diagnosis. Large provenance datasets are composed of many smaller provenance graphs, each of which corresponds to a single workflow…
Geosciences Web portals are becoming increasingly important for supporting geoscientists in their research. The GEO-SEED portal is a repository of geosciences web services metadata, represented in Resource Description Framework (RDF), which supports management and discovery by machines and automated agents. This project uses SPARQL, the W3C standard for…
There is an increasing demand for data-intensive applications in which scientists use scientific workflows to integrate data management, analysis, simulation, and visualization services over often voluminous, complex, and distributed scientific data and services. One major limitation of current scientific workflow models is that each workflow task is…
When composing Web services into scientific workflows, users often face the so-called shimming problem when connecting two related but incompatible components. The problem is addressed by inserting special adaptors, called shims, that perform appropriate data transformations to resolve data type inconsistencies. However, existing shimming…