The goal of this paper is to present CROVALLEX - the first Croatian Verb Valence Lexicon. It contains 1739 verbs with 5118 valence frames. It also contains 173 syntactic-semantic classes (72 classes with two further levels of subdivision). Functional Generative Description (FGD) is used as the background theory for the description of valence frames.
The goal of this paper is to discuss the language identification problem for Croatian, a language that even state-of-the-art language identification tools find hard to distinguish from similar languages such as Serbian, Slovenian, or Slovak. We developed a tool that implements a list of the most frequent Croatian words with the threshold that each …
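As a rough illustration of the frequent-word approach mentioned in this abstract, the following is a minimal sketch and not the authors' tool. The word lists, the scoring rule, and the `threshold` value are all assumptions made for the example; the abstract is truncated before it states how its threshold is defined.

```python
import re

# Hypothetical, tiny frequent-word lists for illustration only.
# A real tool would derive much longer lists from large corpora.
FREQUENT_WORDS = {
    "hr": {"je", "i", "u", "se", "da", "na", "što", "koji", "tko", "također"},
    "sr": {"je", "i", "u", "se", "da", "na", "šta", "koji", "ko", "takođe"},
    "sl": {"je", "in", "v", "se", "da", "na", "kaj", "ki", "kdo", "tudi"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def score(text, words):
    """Fraction of tokens in `text` that appear in the frequent-word list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in words) / len(tokens)


def identify(text, threshold=0.3):
    """Return the best-scoring language code, or None if no list
    covers at least `threshold` of the tokens (assumed decision rule)."""
    scores = {lang: score(text, words) for lang, words in FREQUENT_WORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None
```

Because closely related languages share many function words, a sketch like this depends heavily on the few words that differ between the lists, which is consistent with the difficulty the abstract describes.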
OBJECTIVE: Despite their importance in achieving good glycemic control, few real-world data on insulin dosing irregularities and hypoglycemia are available. The multinational, online Global Attitude of Patients and Physicians (GAPP2) survey was conducted to address this situation. METHODS: Insulin-treated patients with type 2 diabetes and healthcare …
The aim of this paper is to compare different methods for automatic extraction of semantic similarity measures from corpora. Semantic similarity measures have proven very useful for many tasks in natural language processing, such as information retrieval, information extraction, and machine translation. Additionally, one of the main problems in natural …
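The abstract does not name the methods it compares. As one common way to derive a similarity measure from a corpus, the sketch below builds co-occurrence count vectors and compares them with cosine similarity. The function names, the whitespace tokenization, and the `window` size are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented for the example.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the car needs fuel",
]
vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
```

On this toy corpus, "cat" and "dog" share more context words than "cat" and "car", so the first similarity is higher. Raw counts are used here for brevity; weighting schemes such as PMI or TF-IDF are typical refinements.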
The goal of this paper is to discuss the origin of the higher education (HE) issues in Croatia and to introduce service learning methodology into our faculties. Service learning (SL) is a form of education where learning occurs when students apply what they learn to community problems and reflect upon their experience as they seek to achieve real objectives …
This research is a first step towards a system for translating Croatian weather forecasts into multiple languages. This step deals with the Croatian-English language pair. The parallel corpus is a one-year sample of weather forecasts for the Adriatic, consisting of 7,893 sentence pairs. Evaluation is performed by the best-known automatic evaluation …
This paper describes methods used for generating a morphological lexicon of organization entity names in Croatian. This resource is intended for two primary tasks: template-based natural language generation and named entity identification. The main problems in generating the lexicon are the high level of inflection in Croatian and the low linguistic quality …
The paper describes automatic summarization of scientific papers in Croatian. The goal of CROSUM is to generate extracts with a high percentage of extract-worthiness and about the same size as the author's abstract. This preliminary research shows that extracts generated using the lemmatized wordforms dictionary are not quite different from …
A CIP catalogue record for this book is available from the National and University Library in Zagreb under 678366. ABSTRACT: This paper sheds new light on the left periphery of subjunctive clauses in Balkan languages by comparing the sentential complements to verbs in two groups: Romance versus Slavic Balkan. The tests indicate a systematic contrast in the …
The paper describes automatic summarization of newspaper texts in Croatian. The goal of CroWebSum is to generate high-quality extracts that are both coherent and keep relevant information from the original text. The preliminary evaluation shows that extracts 10% the size of the original text have good coherence, while the extract in the …