Małgorzata Marciniak

In this paper we present the general assumptions and goals of the LUNA (spoken Language UNderstanding in multilinguAl communication systems) project. We describe the process of collecting a Polish corpus of spoken dialogs and the annotation schema adopted for this corpus at several levels, from transcription of dialogs and morphosyntactic analysis, to semantic…
BACKGROUND: Hospital documents contain free text describing the most important facts relating to patients and their illnesses. These documents are written in a specific language containing medical terminology related to hospital treatment. Their automatic processing can help in verifying the consistency of hospital documentation and obtaining statistical data.
The paper focuses on resolving natural language issues that have been affecting the performance of our system for processing Polish medical data. In particular, we address phenomena such as ellipsis, anaphora, comparisons, coordination and negation occurring in mammogram reports. We propose practical data-driven solutions which allow us to improve the system's…
In this paper we present a method for the automatic recognition and annotation of proper names that occur in dialogs gathered at the Warsaw city transportation information center. We describe different types of proper names and how people use them in dialogs. We present rules for the automatic recognition and lemmatization of proper names in the transportation…
The paper presents both conceptual and technical issues related to the construction of an HPSG test-suite for Polish. The test-suite consists of sentences of written Polish, both grammatical and ungrammatical. Each sentence is annotated with a list of the linguistic phenomena it illustrates. Additionally, grammatical sentences are encoded in HPSG-style AVM…