A view of cloud computing
Clearing the clouds away from the true potential and obstacles posed by this computing capability.
Above the Clouds: A Berkeley View of Cloud Computing
This work focuses on SaaS Providers (Cloud Users) and Cloud Providers, which have received less attention than SaaS Users, and uses the term Private Cloud to refer to internal datacenters of a business or other organization that are not made available to the general public.
Spark SQL: Relational Data Processing in Spark
Spark SQL is a new module in Apache Spark that integrates relational processing with Spark's functional programming API, and includes a highly extensible optimizer, Catalyst, built using features of the Scala programming language.
Apache Spark: a unified engine for big data processing
This open source computing framework unifies streaming, batch, and interactive big data workloads to unlock new applications.
Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark
Structured Streaming is a new high-level streaming API in Apache Spark, based on experience with Spark Streaming. It achieves high performance via Spark SQL's code generation engine and can outperform Apache Flink by up to 2x and Apache Kafka Streams by 90x.
Drizzle: Fast and Adaptable Stream Processing at Scale
Drizzle is a system that decouples the processing interval from the coordination interval used for fault tolerance and adaptability. It exhibits better adaptability and can recover from failures 4x faster than Flink, with up to 13x lower latency during recovery.
PIQL: Success-Tolerant Query Processing in the Cloud
- Michael Armbrust, Kristal Curtis, Tim Kraska, A. Fox, M. Franklin, D. Patterson
- Computer Science, Proc. VLDB Endow.
- 1 November 2011
This paper proposes PIQL, a declarative language that provides scale independence by calculating an upper bound on the number of key/value store operations that will be performed for any query. It also presents the PIQL query processing system and evaluates its scale independence on hundreds of machines using two benchmarks.
Scaling Spark in the Real World: Performance and Usability
This paper describes the main challenges and requirements that appeared in taking Spark to a wide set of users, and the usability and performance improvements made to the engine in response.
Above the Clouds: A View of Cloud Computing
Cloud Computing, the long-held dream of computing as a utility, has the potential to transform a large part of the IT industry, making software even more attractive as a service and shaping the way…