Managing data lakes in big data era: What's a data lake and why has it became popular in data management ecosystem

@article{Fang2015ManagingDL,
  title={Managing data lakes in big data era: What's a data lake and why has it became popular in data management ecosystem},
  author={Huang Fang},
  journal={2015 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER)},
  year={2015},
  pages={820--824}
}
  • Huang Fang
  • Published 8 June 2015
  • Computer Science
  • 2015 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER)
The concept of a data lake is emerging as a popular way to organize and build the next generation of systems to master new big data challenges, but large enterprises still face many concerns and open questions when implementing data lakes. The paper discusses the concept of data lakes and shares the author's thoughts on and practices with data lakes.

Metadata Systems for Data Lakes: Models and Features

TLDR
Data querying and analysis depend on a metadata system that must be efficient and comprehensive, yet metadata management in data lakes remains an open issue, and criteria for evaluating its effectiveness are largely nonexistent.

Data lake: a new ideology in big data era

TLDR
Data Lake is one of the debatable concepts that appeared in the era of big data, and its potential to change the data landscape makes research on Data Lake worthwhile.

Data Lakes: Trends and Perspectives

TLDR
This work surveys the existing literature, proposes a complete definition and a generic, extensible architecture for data lakes, and introduces three future research axes related to metadata management, which comprises intra- and inter-metadata.

Big Data Lakes: Models, Frameworks, and Techniques

  • A. Cuzzocrea
  • Computer Science, Business
    2021 IEEE International Conference on Big Data and Smart Computing (BigComp)
  • 2021
TLDR
An overview of the state-of-the-art approaches at the foundations of big data lake research is provided, and innovative open problems and issues that drive future research directions for advancing the big data lake research trend are proposed.

Current Trends in Building and Managing Data Lakes

TLDR
This paper discusses the development and management of data lakes to date, with a focus on the challenges and benefits.

Textual Data Analysis from Data Lakes

TLDR
This thesis proposes a methodological approach to enable textual data analyses from data lakes through an efficient metadata system.

Data lake concept and systems: a survey

TLDR
This survey reviews the development, definition, and architectures of data lakes and classifies the existing data lake systems based on their provided functions, which makes this survey a useful technical reference for designing, implementing and applying data lakes.

Enterprise Data Lake Management in Business Intelligence and Analytics

TLDR
Concrete analytics projects of a global industrial enterprise are used to identify existing practical challenges and derive requirements for enterprise data lakes, in order to identify research gaps in analytics practice.

Leveraging the Data Lake: Current State and Challenges

TLDR
This work investigates existing data lake literature and discusses various design and realization aspects for data lakes, such as governance or data models, to identify challenges and research gaps and identify a comprehensive strategy to realize data lakes.

Modeling Data Lake Metadata with a Data Vault

TLDR
This paper instantiates the metadata conceptual model into relational and document-oriented logical and physical models, respectively, and compares the physical models in terms of metadata storage and query response time.
...

References

Predicts 2015 - Managing Data Lakes of Unprecedented Enormity, Gartner

  • 2014

Predicts 2015 - Managing Data Lakes of Unprecedented Enormity

  • 2014

The Data Lake Fallacy: All Water and Little Substance, Gartner

  • 2014

Market Overview – Big Data Integration

  • 2014

Putting The Data Lake To Work

  • CITO Research
  • 2014

The Data Lake Fallacy: All Water and Little Substance

  • 2014

The Data Lake Dream

  • 2014