Advantages and disadvantages in the use of internet as a corpus: the case of the online dictionaries of Spanish Valladolid-UVa

Sven Tarp and Pedro A. Fuertes-Olivera
This paper first discusses some of the consequences that technological development has for lexicography, especially in terms of the different types of empirical basis that can be used in dictionary projects. The most important advantages and disadvantages of using the Internet as a corpus are then listed and compared to the usefulness of "traditional" corpora. As an example, the paper shows how the Internet is used as the main empirical source in order to select lemmata and meaning…
9 Citations

New Insights in the Design and Compilation of Digital Bilingual Lexicographical Products: The Case of the Diccionarios Valladolid-UVa

The article presents the philosophy underpinning the project, highlights some of the innovations introduced, e.g. the use of logfiles for compiling the initial lemma list and the order of compilation, and illustrates a compilation methodology which starts by assuming the relevance of new concepts.

A window to the future: Proposal for a lexicography-assisted writing assistant

The paper initially discusses some of the challenges posed to contemporary lexicography and stresses the need to move upstream in the value chain to guarantee future work. Today’s

Designing and making commercially driven integrated dictionary portals: the Diccionarios Valladolid-UVa

This paper will be mostly concerned with presenting the concept of typological individualization, i.e. offering specific data types for specific user types in specific situation types.

How to select and present cultural data: a challenge to lexicography

Foreign language learners need to get cultural information during their learning process for their oral and written comprehension and expression activities. Current lexicographic products

Connecting the Dots:

This article botanizes in the history of lexicography trying to connect the dots and get a deeper understanding of what is happening to the discipline in the framework of the Fourth Industrial

La metalexicografía del siglo XXI: un estado de la cuestión

Twenty-first-century lexicography faces a decisive transition. In a context of technological disruption, the proliferation of open-access online dictionaries, the consolidation of platforms

Diccionarios del español para la producción de textos

The Spanish dictionary for text production is a reference tool designed to help a typical user, for example a human being or a computer program, to solve

Diccionario Español de Definiciones

The article is set within the theoretical framework of the Function Theory of Lexicography, arguing that a dictionary, for example the Spanish dictionary of definitions presented here, is a



Lexicography and the Internet as a (Re-)source

This article presents the concept of the Internet as a lexicographical corpus, which implies three hypotheses. Firstly, although there are many corpus-based and/or corpus-driven dictionaries, e.g.,

e-Lexicography : the internet, digital initiatives and lexicography

Ten Key Issues in e-Lexicography for the Future, Eva Samaniego Fernandez & Beatriz Perez Cabello de Alba References Index Notes on Contributors.

Lexicography in the Borderland between Knowledge and Non-Knowledge: General Lexicographical Theory with Particular Focus on Learner's Lexicography

The book contains a state-of-the-art summary of the theoretical discussions within the field of lexicography during the last decades. On this basis it presents and argues for a new general theory,

From Print to Digital: Implications for Dictionary Policy and Lexicographic Conventions

This paper looks at some familiar editorial and presentational conventions of dictionaries which are no longer appropriate in the digital medium — and what new policies might replace them.

"I Don’t Believe in Word Senses"

It is suggested, by contrast, that word senses exist only relative to a task, and whether and how word sense ambiguity is in fact a problem for current NLP applications is explored.

Structures in the communication between lexicographer and programmer: Database and interface / Strukturen in der Kommunikation zwischen Lexikograph und Programmierer: Datenbasis und Schnittstelle / Les structures de la communication entre lexicographe et programmeur: base de données et interface

This paper sets out to answer the question of how much a lexicographer in charge of a new e-dictionary project should know about lexicographical structures, and how this knowledge could be communicated to the IT programmer designing the underlying database and the corresponding user interfaces.

Introduction to the Special Issue on the Web as Corpus

This special issue of Computational Linguistics explores ways in which the dream of language data that is freely available in vast quantity is being pursued.

Mind over Machine: The Power of Human Intuition and Expertise in the Era of the Computer

The authors, one a philosopher and the other a computer scientist, argue that even highly advanced systems only correspond to the very early stages of human learning and that there are many human skills that computers will never be able to emulate.

Excesos en el uso de corpus en la lexicografía: “pesca” de términos y definiciones

This contribution discusses the choice of empirical basis for selecting and defining specialized terms in both specialized and general dictionaries. Without denying the value of

Korpusbaseret Leksikografi

  • LexicoNordica 3: 1-15.
  • 1996