Ingesting High-Velocity Streaming Graphs from Social Media Sources
@article{Dasgupta2019IngestingHS,
  title={Ingesting High-Velocity Streaming Graphs from Social Media Sources},
  author={Subhasish Dasgupta and Aditya Bagchi and Amarnath Gupta},
  journal={ArXiv},
  year={2019},
  volume={abs/1905.08337}
}
Many data science applications, such as social network analysis, use graphs as their primary form of data. […] We have developed an adaptive buffering mechanism and a graph compression technique that effectively mitigate the problem. A novel aspect of our method is that the adaptive buffering algorithm uses the data rate, the data content, and the CPU resources of the database machine to determine an optimal data ingestion mechanism. We further show that an ingestion-time graph-compression…
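The abstract does not spell out the buffering algorithm, so the following is only a minimal sketch of the general idea of sizing ingestion batches from the observed data rate and the CPU load of the database machine. The class `AdaptiveBuffer`, its parameters, and the scaling rule are all hypothetical and are not taken from the paper; the data-content signal mentioned in the abstract is not modeled here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Edge = Tuple[str, str]  # (source node, target node)


@dataclass
class AdaptiveBuffer:
    """Hypothetical buffer whose flush threshold follows data rate and CPU load."""

    min_batch: int = 100
    max_batch: int = 10_000
    target_latency_s: float = 1.0  # assumption: aim for roughly one flush per second
    edges: List[Edge] = field(default_factory=list)

    def batch_size(self, rate_per_s: float, cpu_load: float) -> int:
        """Pick a flush threshold from the arrival rate and a CPU load in [0, 1]."""
        # Base size: number of edges expected to arrive within the target latency.
        base = rate_per_s * self.target_latency_s
        # Assumed rule: under CPU pressure, grow batches so the database
        # receives fewer, larger writes.
        scaled = base * (1.0 + cpu_load)
        return int(min(self.max_batch, max(self.min_batch, scaled)))

    def add(self, edge: Edge, rate_per_s: float, cpu_load: float) -> List[Edge]:
        """Buffer one edge; return a batch to write once the threshold is reached."""
        self.edges.append(edge)
        if len(self.edges) >= self.batch_size(rate_per_s, cpu_load):
            batch, self.edges = self.edges, []
            return batch
        return []
```

A caller would measure the arrival rate and CPU load itself, pass them to `add` for every incoming edge, and write any non-empty batch it gets back to the graph database. Clamping between `min_batch` and `max_batch` keeps the threshold bounded when either measurement spikes.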