The European Bioinformatics Institute’s data resources

Cath Brooksbank, Graham Cameron, Janet M. Thornton
Nucleic Acids Research, pages D17-D25
The wide uptake of next-generation sequencing and other ultra-high throughput technologies by life scientists with a diverse range of interests, spanning fundamental biological research, medicine, agriculture and environmental science, has led to unprecedented growth in the amount of data generated. It has also put the need for unrestricted access to biological data at the centre of biology. The European Bioinformatics Institute (EMBL-EBI) is unique in Europe and is one of only two… 

The European Bioinformatics Institute in 2016: Data growth and integration
The Embassy Cloud service, which allows users to run large analyses in a virtual environment next to EMBL-EBI's vast public data resources, is launched.
EBI Genome Resources
The European Bioinformatics Institute (EBI) is the main European repository of nucleotide sequence data and has resources for protein sequences and structures, as well as access to sequences from patents.
Visualizing Next-Generation Sequencing Cancer Data Sets with Cloud Computing
An overview is presented of a novel cloud-computing-based next-generation sequencing research management software system that has simplicity, scalability, speed and reproducibility at its core.
The development of computational methods for large-scale comparisons and analyses of genome evolution
The work described in this thesis details the application of comparative bioinformatics analyses on inter- and intra-genomic datasets, to elucidate those genomic changes that may underlie organismal adaptations and contribute to changes in the complexity of genome content and structure over time.
Biological Databases for Human Research
Analysis Tool Web Services from the EMBL-EBI
Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces, which allow their integration into other tools, applications, web sites, pipeline processes and analytical workflows.
The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI
The EBI Search engine, referred to here as ‘EBI Search’, is presented: an easy-to-use, fast text search and indexing system with powerful data navigation and retrieval capabilities.
PGP repository: a plant phenomics and genomics data publication infrastructure
A newly developed data submission tool was made available to the consortium; it features a high level of automation to lower the barriers to data publication and enable PGP to fulfil the FAIR data principles (findable, accessible, interoperable, reusable).
A new bioinformatics analysis tools framework at EMBL–EBI
A new framework is presented, aimed at both novice and expert users, that exposes novel methods of obtaining annotations and visualizing sequence analysis results through one uniform and consistent interface.
Comparative ranking of human chromosomes based on post-genomic data.
This work introduces a scoring method for chromosome ranking based on several characteristics, including relevance to health problems, existing published knowledge, and current transcriptome and proteome coverage, and is advantageous in that it takes into account currently available information.


DNA Data Bank of Japan (DDBJ) for genome scale research in life science
The DNA Data Bank of Japan (DDBJ) has made an effort to collect as much data as possible mainly from Japanese researchers, and developed the Genome Information Broker (GIB) and HGS, a database of the human genome, which have been updated incorporating newly available data and retrieval tools.
Web services at the European Bioinformatics Institute-2009
The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997, and APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress that allow users to systematically access databases and analytical tools.
The Ensembl genome database project
The Ensembl database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic…
The Proteomics Identifications database: 2010 update
Several new and improved features in PRIDE are described, including the revised submission process, which now includes direct submission of fragment ion annotations. The importance of data sharing in the proteomics field, and the corresponding integration of PRIDE with other databases in the ProteomExchange consortium, are also highlighted.
Improvements to services at the European Nucleotide Archive
The content and scope of the European Nucleotide Archive are described, and major improvements to the services, together with metadata formats for capillary and next-generation sequencing traces, are introduced.
Using the Reactome Database
This unit describes how to use the Reactome database to learn the steps of a biological pathway and see how one pathway interacts with another, and how to use the Pathfinder tool to search the database for possible connections within and between pathways.
The EMBL Nucleotide Sequence Database
Changes over the past year include the removal of the sequence length limit, the launch of the EMBLCDSs dataset, extension of the Sequence Version Archive functionality and the revision of quality rules for TPA data.
NCBI GEO: archive for high-throughput functional genomic data
The Gene Expression Omnibus at the National Center for Biotechnology Information (NCBI) is the largest public repository for high-throughput gene expression data and offers many tools and features that allow users to effectively explore, analyze and download expression data from both gene-centric and experiment-centric perspectives.
Ensembl Genomes: Extending Ensembl across the taxonomic space
Ensembl Genomes is a new portal offering integrated access to genome-scale data from non-vertebrate species of scientific interest, developed using the Ensembl genome annotation and visualisation platform.
Recent improvements to the SMART domain-based sequence annotation resource
The SMART database now contains information on intrinsic sequence features such as transmembrane regions, coiled-coils, signal peptides and internal repeats and new advanced queries provide direct access to the SMART relational database using SQL.