'The First Day of Summer': Parsing Temporal Expressions with Distributed Semantics
SUTime, a state-of-the-art NLP system, is extended to incorporate the proposed alternate paradigm of distributed temporal semantics, in which a probability density function models the relative probabilities of the various interpretations.
HarmonicIO: Scalable Data Stream Processing for Scientific Datasets
- Preechakorn Torruangwatthana, Håkan Wieslander, Ben Blamey, A. Hellander, S. Toor
- Computer Science, IEEE 11th International Conference on Cloud…
- 1 July 2018
HarmonicIO is presented, a lightweight streaming framework specialized for scientific datasets that boasts a smart dynamic architecture, is highly elastic, and enforces a clear separation between framework components and application execution environment using container technology.
R U :-) or :-( ? Character- vs. Word-Gram Feature Selection for Sentiment Classification of OSN Corpora
This work investigates the application of the character n-gram model to text classification of corpora from online social networks, where text is known to be rich in so-called unnatural language. It is the first such documented study, and it also introduces a novel corpus of Facebook photo comments.
Apache Spark Streaming, Kafka and HarmonicIO: A Performance Benchmark and Architecture Comparison for Enterprise and Scientific Computing
A benchmark of stream processing throughput is presented, comparing Apache Spark Streaming with a prototype P2P stream processing framework, HarmonicIO, and suggesting which frameworks and streaming sources are likely to offer good performance for a given load.
Adapting the Secretary Hiring Problem for Optimal Hot-Cold Tier Placement Under Top-K Workloads
- Ben Blamey, Fredrik Wrede, Johan Karlsson, A. Hellander, S. Toor
- Computer Science, 19th IEEE/ACM International Symposium on Cluster…
- 22 January 2019
An approach for optimal tiered storage allocation under stream processing workloads using top-K queries is presented; it derives expressions for optimal parameter values in terms of tier storage and transport costs a priori, without needing to monitor the application.
Differentiated Assessments for Advanced Courses that Reveal Issues with Prerequisite Skills: A Design Investigation
This work conducted an inductive qualitative analysis of existing assessment questions from instructors and from a concept inventory with a validity argument, and found dependencies on a variety of prerequisite knowledge and mixed potential for diagnosing difficulties with prerequisites.
Resource- and Message Size-Aware Scheduling of Stream Processing at the Edge with application to Realtime Microscopy
This paper investigates scheduling stream processing in hybrid cloud/edge deployment settings with sensitivity to CPU costs and message size, with the aim of maximizing throughput with respect to limited edge resources.
Apache Spark Streaming and HarmonicIO: A Performance and Architecture Comparison
A performance benchmark comparison between Apache Spark Streaming (ASS), under both file and TCP streaming modes, and HarmonicIO is presented, comparing maximum throughput over a broad domain of message sizes and CPU loads.
Smart Resource Management for Data Streaming using an Online Bin-packing Strategy
- Oliver Stein, Ben Blamey, S. Toor
- Computer Science, IEEE International Conference on Big Data (Big…
- 29 January 2020
Two different auto-scaling strategies, implemented in the HarmonicIO and Spark Streaming frameworks for efficient resource utilization, are compared on a real-world use case from large-scale microscopy pipelines.
Apache Spark Streaming, Kafka and HarmonicIO: A Performance and Architecture Comparison for Enterprise and Scientific Computing
A benchmark of stream processing throughput comparing Apache Spark Streaming (under file-, socket- and Kafka-based stream integration) with a prototype P2P stream processing framework, HarmonicIO, suggests which frameworks and integrations are likely to offer good performance for a given load.