From D-Coi to SoNaR: a reference corpus for Dutch

Nelleke Oostdijk, Martin Reynaert, Paola Monachesi, Gertjan van Noord, Roeland Ordelman, Ineke Schuurman, Vincent Vandeghinste
The computational linguistics community in The Netherlands and Belgium has long recognized the dire need for a major reference corpus of written Dutch. In part to answer this need, the STEVIN programme was established. To pave the way for the effective building of a 500-million-word reference corpus of written Dutch, a pilot project was established. The Dutch Corpus Initiative project or D-Coi was highly successful in that it not only realized about 10% of the projected large reference corpus…