This paper presents a new approach to text processing, based on textemes. These are atomic text units generalising the concepts of character and glyph by merging them in a common data structure, together with an arbitrary number of user-defined properties. In the first part, we give a survey of the notions of character and glyph and their relation with …
The distinction between "characters" and "glyphs" is a rather new issue in computing, although the problem is as old as humanity: our species turns out to be a writing one because, amongst other things, our brain is able to interpret images as symbols belonging to a given writing system. Computers deal with text in a more abstract way. When we agree …
The code for the Ω Typesetting System has been substantially reorganised. All fixed-size arrays implemented in Pascal Web have been replaced with interfaces to extensible C++ classes. The code for interaction with fonts and Ω Translation Processes (ΩTP's) has been completely rewritten and placed in C++ libraries, whose methods are called by the (now) …
State-of-the-art multilingual ontology matchers use machine translation to reduce the problem to the monolingual case. We investigate an alternative, self-contained solution based on semantic matching where labels are parsed by multilingual natural language processing and then matched using a language-independent knowledge base acting as an interlingua. As …