Corpus ID: 160009450

Ingesting High-Velocity Streaming Graphs from Social Media Sources

Subhasish Dasgupta, Aditya Bagchi, Amarnath Gupta
Many data science applications, such as social network analysis, use graphs as their primary form of data. [...] We have developed an adaptive buffering mechanism and a graph compression technique that effectively mitigate the problem. A novel aspect of our method is that the adaptive buffering algorithm uses the data rate, the data content, and the CPU resources of the database machine to determine an optimal data ingestion mechanism. We further show that an ingestion-time graph-compression…
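The abstract names the inputs to the adaptive buffering algorithm but not the algorithm itself. The following is only a minimal sketch of the general idea, assuming a buffer whose flush threshold is recomputed from the observed arrival rate and the database machine's CPU load. The class name, the 0.8 load cutoff, the time windows, and the size limits are all hypothetical, and the data-content signal mentioned in the abstract is left out.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveBuffer:
    """Hypothetical edge buffer; not the authors' algorithm."""
    min_batch: int = 100
    max_batch: int = 10_000
    batch_size: int = 1_000
    edges: list = field(default_factory=list)

    def resize(self, arrival_rate: float, cpu_load: float) -> int:
        """Pick a new flush threshold.

        arrival_rate: observed edges per second (assumed measured upstream).
        cpu_load: database CPU utilisation in [0, 1] (assumed measured upstream).
        """
        # Busy database: batch about two seconds of traffic so writes are rarer.
        # Idle database: batch about half a second so data lands sooner.
        window_seconds = 2.0 if cpu_load > 0.8 else 0.5
        target = int(arrival_rate * window_seconds)
        # Clamp the threshold to the configured limits.
        self.batch_size = max(self.min_batch, min(self.max_batch, target))
        return self.batch_size

    def add(self, edge: tuple) -> list:
        """Buffer one edge; return a full batch once the threshold is reached."""
        self.edges.append(edge)
        if len(self.edges) >= self.batch_size:
            batch, self.edges = self.edges, []
            return batch
        return []
```

A caller would invoke `resize` periodically with fresh measurements and pass every non-empty list returned by `add` to the database writer.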



An experimental survey on big data frameworks

Evolving Centralities in Temporal Graphs: A Twitter Network Analysis

Twitter users are found to be fairly dynamic: from one moment to the next, they can assume central roles in the network. The paper also shows how to compute closeness and betweenness centralities using fastest paths.

Data Ingestion for the Connected World

It is argued that in many “Big Data” applications, getting data into the system correctly and at scale via traditional ETL processes is a fundamental roadblock to being able to perform timely analytics or make real-time decisions.

GraphChallenge.org: Raising the Bar on Graph Analytic Performance

Graph Challenge 2017 received 22 submissions by 111 authors from 36 organizations and highlighted graph analytic innovations in hardware, software, algorithms, systems, and visualization that produced many comparable performance measurements that can be used for assessing the current state of the art of graph analysis.

Influence Ranking Model for Social Networks Users

This paper presents an influence ranking model (IRM) that ranks social network (SN) users by their contribution to spreading specific content, inspired by the pruning process of the k-shell decomposition methodology.
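For readers unfamiliar with the pruning process mentioned above, here is a minimal sketch of standard k-shell decomposition on an undirected graph. It is the textbook procedure, not the IRM itself, and the function name and adjacency-dictionary input format are assumptions made for illustration.

```python
def k_shell(adj):
    """Return {node: shell index} for an undirected graph.

    adj maps each node to the set of its neighbours (assumed symmetric).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Nodes whose remaining degree is at most k belong to shell k.
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1
            continue
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue  # already pruned via a duplicate queue entry
            remaining.discard(v)
            shell[v] = k
            # Pruning v lowers its neighbours' degrees, which may prune them too.
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        peel.append(u)
    return shell
```

For a triangle with one extra node hanging off a corner, the hanging node is pruned first and lands in shell 1, while the three triangle nodes land in shell 2.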

A Graph Database of Yelp Dataset Challenge 2018 and Using Cypher for Basic Statistics and Graph Pattern Exploration

This paper uses Neo4j, a popular graph database, to store the real-world dataset from the 2018 Yelp Dataset Challenge, and uses Cypher with a graph algorithm library to explore interesting graph patterns such as bipartite structures and connected components.

Analytics-driven data ingestion and derivation in the AWESOME polystore

ADIL, the data ingestion language of AWESOME, allows a user to flexibly specify the placement of original and derived data into and across component stores and the computation engine.

Data Ingestion in AsterixDB

This paper describes the support for data ingestion in AsterixDB, an open-source Big Data Management System (BDMS) that provides a platform for storing and analyzing large volumes of semi-structured data. It also describes how to make this component fault-tolerant so that the system continues to manage input in the presence of failures.

An IDEA: An Ingestion Framework for Data Enrichment in AsterixDB

This paper presents a new data ingestion framework that supports data ingestion at scale, enrichments requiring complex operations, and adaptiveness to reference data changes, built on top of Apache AsterixDB.

A Survey on Graph Processing Accelerators: Challenges and Opportunities

This paper reviews the relevant techniques in three core components of a graph processing accelerator (preprocessing, parallel graph computation, and runtime scheduling) and finds that there is no absolute winner across all three aspects of graph acceleration.