The Fourth Paradigm: Data-Intensive Scientific Discovery

@inproceedings{Hey2009TheFP,
  title={The Fourth Paradigm: Data-Intensive Scientific Discovery},
  author={Tony (Anthony) John Grenville Hey},
  year={2009}
}
This presentation will set out the eScience agenda by explaining the current scientific data deluge and the case for a “Fourth Paradigm” for scientific exploration. Examples of data-intensive science will be used to illustrate the explosion of data and the associated new challenges for data capture, curation, analysis, and sharing. The role of cloud computing, collaboration services, and research repositories will be discussed.
Science in the Cloud: Accelerating Discovery in the 21st Century
TLDR
With appropriate software support, scientists can do large-scale computation using unused cycles in commercial clouds to promote data sharing and collaborations that will accelerate scientific discovery.
Coping with “Big Data”: eScience
TLDR
This chapter looks at research data from different perspectives and shows how academic libraries and other stakeholders are engaging in supporting eScience, a term that refers to large datasets and tools that facilitate the acquisition, management, and exchange of digital scientific data.
Enhancing the impact of science data toward data discovery and reuse
TLDR
This work proposes exploiting Semantic Web technologies and best practices to make metadata both discoverable and easy to publish, and shares experiences in curating metadata to illustrate the cumbersome nature of data reuse in the current research environment.
The Myria Big Data Management and Analytics System and Cloud Services
TLDR
This work presents an overview of the Myria stack for big data management and analytics, which was developed in the database group at the University of Washington and has been operating as a cloud service aimed at domain scientists around the UW campus.
Data Centric Discovery with a Data-Oriented Architecture
TLDR
It is argued that a more systematic perspective is required; in particular, a data-centric approach is proposed in which discovery stands on a foundation of data and data collections rather than on fleeting transformations and operations.
Active provenance for data intensive research
TLDR
This work explores how, in data-intensive systems, flexibility in managing provenance and adapting it to different users and application contexts can open new opportunities for its exploitation and improve productivity.
A Science Cloud for Data Intensive Sciences
TLDR
This work proposes a cloud system designed specifically for science, a new concept based on the fact that most scientific data are digitized and the amount of data is huge.
Data sharing in a technological-driven research environment
TLDR
An overview of how metadata can be a useful asset for improving data sharing between researchers is presented, along with a case study from the biodiversity domain that illustrates this scenario.
Data-Intensive Research: making best use of research data (Draft 1)
This report focuses on the dramatic change in the ways in which research is undertaken, christened “The Fourth Paradigm” by Jim Gray, that is sweeping the world. This is driven by and responding to ...
A Vision for Global Research Data Infrastructures
TLDR
The main challenges faced by the future GRDIs are identified, a conceptual framework for GRDIs based on the ecosystem metaphor is defined, a core set of functionality that these GRDIs must provide is described, and a set of recommendations for building the future GRDIs is given.

References

Beyond the Data Deluge
TLDR
The demands of data-intensive science represent a challenge for diverse scientific communities and this paper presents a response to this challenge in the context of climate change.
The Beta Workbench: a computational tool to study the dynamics of biological systems
TLDR
The Beta Workbench, a scalable tool built on top of the newly defined BlenX language to model, simulate and analyse biological systems, is introduced, and a comparison with related approaches is provided.
Mapping evolutionary trajectories: Applications to the growth and transformation of medical knowledge
TLDR
A large-scale empirical analysis of the development of treatments for coronary artery disease reveals the structure of medical understanding of the disease and the path-dependent co-evolution of scientific and technical knowledge in the search for solutions to the relevant set of problems.
Stochastic pi-Calculus
  • C. Priami
  • Mathematics, Computer Science
  • Comput. J.
  • 1995
TLDR
This work defines a stratified transition system that is finitely branching and gives a transition rule to directly yield a continuous time Markov chain from an Sπ specification, with no transition system manipulation.
Routinely-collected general practice data are complex, but with systematic processing can be used for quality improvement and research.
TLDR
Routinely collected primary care data could contribute more to the process of health improvement; however, those working with these data need to understand fully the complexity of the context within which data entry takes place.
Electronic Health Records Should Support Clinical Research
TLDR
Electronic records could facilitate new interfaces between care and research environments; given sufficient development of the care-research interface, this could lead to great improvements in the scope and efficiency of research.
Variational Message Passing
TLDR
Variational Message Passing, a general-purpose algorithm for applying variational inference to Bayesian networks, is introduced; because it uses a factorised variational approximation, it can be applied to a very general class of conjugate-exponential models.
The big three concept: a way to tackle the health care crisis?
TLDR
The Big Three concept is suggested, which aims for identification of susceptible smokers; screening for early diagnosis; development of new treatment modalities that target shared disease mechanisms, thus having the potential to affect more than one of the comorbidities; increased awareness of these co-existing diseases and modification of current guidelines across specialties.
Randomized Controlled Trials: Do They Have External Validity for Patients With Multiple Comorbidities?
TLDR
Results from this study suggest that RCTs targeting a chronic medical condition such as hypertension could find that, in a sample taken from family practice, most eligible patients have comorbid conditions.
Growth and decentralization of the medical literature: implications for evidence-based medicine.
The article examines trends in the articles indexed in MEDLINE between 1978 and 2001 in terms of volume, authorship, content, and funding. This period is characterized by strong ...