An Assessment of Google Books’ Metadata

@article{James2012AnAO,
  title={An Assessment of Google Books’ Metadata},
  author={Ryan James and Andrew Philip Weiss},
  journal={Journal of Library Metadata},
  year={2012},
  volume={12},
  pages={15--22}
}
This article reports on a study of error rates found in the metadata records of texts scanned by the Google Books digitization project. A review of the author, title, publisher, and publication year metadata elements for 400 randomly selected Google Books records was undertaken. The results show 36% of sampled books in the digitization project contained metadata errors. This error rate is higher than one would expect to find in a typical library online catalog. 
Metadata Extraction from Books with Facts about Austria
TLDR
This paper proposes an approach for finding relevant data by extracting metadata for each page and allowing pages to be searched by their metadata as an alternative to full-text search.
Assessing the coverage of Hawaiian and Pacific books in the Google Books Digitization Project
TLDR
Results show that Google Books has a sizable number of metadata records for Hawaiian and Pacific books, but has only a limited number available for full‐text searching.
An Examination of Massive Digital Libraries' Coverage of Spanish Language Materials: Issues of Multi-lingual Accessibility in a Decentralized, Mass-Digitized World
  • A. Weiss, Ryan James
  • Computer Science
    2013 International Conference on Culture and Computing
  • 2013
TLDR
An ensuing study examining the coverage and accessibility of Spanish language books in four Massive Digital Libraries (Google Books, HathiTrust, Internet Archive, and Open Library) shows little difference in accessibility between Spanish and English books in Google Books.
Google books' coverage of Hawai'i and Pacific books
TLDR
A recent quantitative study of Google Books' coverage of Hawaiian and Pacific books using the University of Hawaii's collection as a benchmark shows that Google Books has a sizable number of metadata records for Hawaiian and Pacific books, but has only a limited number available for full-text searching.
An automatic method for extracting citations from Google Books
TLDR
A method to automatically remove false and irrelevant matches from Google Books (GB) citation searches is introduced, along with refinements to a previous manual GB citation extraction method.
Producing “one vast index”: Google Book Search as an algorithmic system
TLDR
It is argued that far from simply “scanning” books, Google’s efforts may be characterized as algorithmic digitization, strongly shaped by an equation of digital access with full-text searchability, which enacts one possible future for books in which they are defined largely by their textual content.
Comparing the Access to and Legibility of Japanese Language Texts in Massive Digital Libraries
  • A. Weiss, Ryan James
  • Computer Science
    2015 International Conference on Culture and Computing (Culture Computing)
  • 2015
TLDR
A random sample of 800 Japanese-language books with publication dates prior to 1943 was extracted from the OCLC WorldCat database, and 409 were examined for their level of typical user access, the accuracy of their metadata, and their scan quality.
Examining Massive Digital Libraries (MDLs) and Their Impact on Reference Services
TLDR
Some of the flaws and unintended consequences of relying on Massive Digital Libraries at the expense of local print collections are examined.
Massive Digital Libraries (MDLs) and the Impact of Mass-Digitized Book Collections
This chapter describes the characteristics of massive digital libraries (MDLs) and outlines their impact upon current information science issues, especially digital collection metadata, copyright and
"Google Libros" y la digitalización masiva: La aportación de la Universidad Complutense
Study of the mass digitization project that has enabled Google Books to scan more than 20 million books worldwide (80% from participating libraries and the rest from more than 50,000 publishers
