Helka Folch

The increasing use of methods in natural language processing (NLP) which are based on huge corpora requires that the lexical, morpho-syntactic and syntactic homogeneity of texts be mastered. We have developed a methodology and associated tools for text calibration or "profiling" within the ELRA benchmark called "Contribution to the construction of(More)
OCEAN is a tool for a posteriori visual data mining that uses the output of a text miner to help users better explore a document space. Clustered documents are transformed into a hierarchical 3D representation analogous to Reconfigurable Disk Trees. An intermediary document representation allows for interface customization and offers a generic approach to 3D(More)
Very large corpora are increasingly exploited to improve Natural Language Processing (NLP) systems. This, however, implies that the lexical, morpho-syntactic and syntactic homogeneity of the data used is mastered. This control in turn requires the development of tools aimed at text calibration or profiling. We are implementing such profiling tools and(More)
Our work, carried out as part of the Scriptorium project, has confronted us with a variety of problems which shed light on important issues related to corpus architectural design, such as the definition of fine-grained textual units, extraction of relevant subsections of the corpus, and in particular linking techniques enabling text annotation with(More)
We describe a novel biotope at 633 to 762 m depth on a vertical wall in the Whittard Canyon, an extensive canyon system reaching from the shelf to the deep sea on Ireland's continental margin. We explored this wall with an ROV and compiled a photomosaic of the habitat. The assemblage contributing to the biotope was dominated by large limid bivalves, Acesta(More)
In 2009, the Marine Biodiscovery Laboratory was set up at the Marine Institute with funds from the Marine Institute and the Beaufort Marine Biodiscovery Research Programme. The Marine Biodiscovery Laboratory has already processed over 130 marine specimens from coastal zones and from the Deep Sea (≤3,000 m) within the Marine Irish Exclusive Economic Zone.(More)
ABSTRACT To discover the concepts surrounding a business domain, 4,600 Web pages were harvested and analyzed by the TEMIS software into 900 hierarchical classes. We describe here a spatialized visualization interface for exploring these classes, along with a qualitative evaluation by six professionals. The chosen approach relies on a(More)
Extensible Markup Language (XML) is playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere. It is a simple, very flexible text format, used to annotate data by means of markup. XML documents can be checked for syntactic well-formedness and semantic coherence through DTD and schema validation, which makes(More)