Compact Representations by Finite-State Transducers

  • Mehryar Mohri
  • Published in ACL 27 June 1994
  • Biology
Finite-state transducers give efficient representations of many natural language phenomena. They make it possible to account for complex lexicon restrictions without resorting to a large set of complex rules that are difficult to analyze. We show here that these representations can be made very compact, indicate how to perform the corresponding minimization, and point out interesting linguistic side effects of this operation.
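
As a rough illustration of why minimization makes such representations compact, the sketch below handles the simpler acceptor case only: it builds a trie over a small word list and merges states that have identical futures. It is not the paper's transducer algorithm, and the names `State`, `build_trie`, `minimize`, `count_states` and `accepts` are hypothetical helpers introduced for this example.

```python
from typing import Dict, Iterable, Tuple


class State:
    """One state of an acyclic deterministic automaton (hypothetical helper)."""

    def __init__(self) -> None:
        self.final = False
        self.arcs: Dict[str, "State"] = {}


def build_trie(words: Iterable[str]) -> State:
    """Build an unminimized automaton with one path per word."""
    root = State()
    for word in words:
        state = root
        for symbol in word:
            state = state.arcs.setdefault(symbol, State())
        state.final = True
    return root


def minimize(root: State) -> State:
    """Merge states with the same finality and the same outgoing arcs.

    Children are canonicalized first, so two states are merged exactly
    when they accept the same set of suffixes (acyclic input assumed).
    """
    registry: Dict[Tuple, State] = {}

    def canonical(state: State) -> State:
        for symbol, target in list(state.arcs.items()):
            state.arcs[symbol] = canonical(target)
        signature = (
            state.final,
            tuple(sorted((s, id(t)) for s, t in state.arcs.items())),
        )
        return registry.setdefault(signature, state)

    return canonical(root)


def count_states(root: State) -> int:
    """Count the distinct states reachable from the root."""
    seen = set()
    stack = [root]
    while stack:
        state = stack.pop()
        if id(state) in seen:
            continue
        seen.add(id(state))
        stack.extend(state.arcs.values())
    return len(seen)


def accepts(root: State, word: str) -> bool:
    """Check whether the automaton accepts the given word."""
    state = root
    for symbol in word:
        state = state.arcs.get(symbol)
        if state is None:
            return False
    return state.final
```

On the word list `talk, talks, talked, walk, walks, walked`, the trie has 15 states and the minimized automaton has 7, because the two stems share everything after their first letter. Transducers need the additional step of normalizing where outputs sit on the arcs before states can be merged, which this sketch does not attempt.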

On some applications of finite-state automata theory to natural language processing

  • M. Mohri
  • Computer Science
    Nat. Lang. Eng.
  • 1996
We describe new applications of the theory of automata to natural language processing: the representation of very large scale dictionaries and the indexation of natural language texts. They are based

Finite-State Transducers in Language and Speech Processing

  • M. Mohri
  • Computer Science
    Comput. Linguistics
  • 1997
This work recalls classical theorems and gives new ones characterizing sequential string-to-string transducers, including algorithms for determinizing and minimizing these transducers very efficiently, and characterizations of the transducers admitting determinization and the corresponding algorithms.

Finite State Transducers with Predicates and Identities

An extension to finite-state transducers is presented in which atomic symbols are replaced by arbitrary predicates over symbols. The extension is fairly trivial for finite-state acceptors, but the introduction of predicates is more interesting for transducers.
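
A minimal sketch of the idea, assuming a deterministic machine in which the first matching arc wins: each arc carries a predicate over input symbols, and its output is either a fixed string or an identity marker that copies the input symbol. The names `IDENTITY` and `transduce` and the arc layout are assumptions made for this example, not the formalism of the cited work.

```python
from typing import Callable, List, Optional, Set, Tuple

# Marker meaning "emit the input symbol unchanged" (hypothetical convention).
IDENTITY = object()

# An arc is (source state, predicate on the input symbol, output, target state).
Arc = Tuple[int, Callable[[str], bool], object, int]


def transduce(arcs: List[Arc], start: int, finals: Set[int], text: str) -> Optional[str]:
    """Run the transducer on text; return the output, or None if rejected."""
    state = start
    out = []
    for symbol in text:
        for source, predicate, output, target in arcs:
            if source == state and predicate(symbol):
                out.append(symbol if output is IDENTITY else output)
                state = target
                break
        else:
            # No arc leaving the current state matches this symbol.
            return None
    return "".join(out) if state in finals else None


# One state: digits are rewritten to "#", every other symbol is copied.
mask_digits: List[Arc] = [
    (0, str.isdigit, "#", 0),
    (0, lambda s: not s.isdigit(), IDENTITY, 0),
]
```

With these two arcs, `transduce(mask_digits, 0, {0}, "room 42")` yields `"room ##"`. The identity marker is what makes the transducer case less trivial than the acceptor case: without it, a predicate arc could not express "copy whatever symbol matched".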

Weighted Automata in Text and Speech Processing

An efficient composition algorithm for weighted transducers is described, and examples illustrating the value of determinization and minimization algorithms for weighted automata are given.

An Efficient Compiler for Weighted Rewrite Rules

This work describes a new algorithm for compiling rewrite rules into finite-state transducers, and shows it to be simpler and more efficient than existing algorithms.

Speech Recognition by Composition of Weighted Finite Automata

A single composition algorithm is used both to combine in advance information sources such as language models and dictionaries, and to combine acoustic observations and information sources dynamically during recognition.
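
The composition operation itself can be sketched as a product construction. The version below assumes epsilon-free transducers and adds weights along matched arcs (as with costs or negative log probabilities). The dictionary layout and the function name `compose` are assumptions for illustration, not the data structures of the cited work.

```python
from collections import deque
from typing import Dict


def compose(a: Dict, b: Dict) -> Dict:
    """Compose two epsilon-free weighted transducers.

    Each transducer is a dict with keys "start", "finals" (a set) and
    "arcs", a list of (source, input, output, weight, target) tuples.
    An arc of `a` pairs with an arc of `b` when a's output symbol
    equals b's input symbol; the paired weights are added.
    """
    start = (a["start"], b["start"])
    arcs = []
    finals = set()
    seen = {start}
    queue = deque([start])
    while queue:
        qa, qb = queue.popleft()
        if qa in a["finals"] and qb in b["finals"]:
            finals.add((qa, qb))
        for src_a, sym_in, sym_mid, w_a, dst_a in a["arcs"]:
            if src_a != qa:
                continue
            for src_b, sym_mid_b, sym_out, w_b, dst_b in b["arcs"]:
                if src_b != qb or sym_mid_b != sym_mid:
                    continue
                target = (dst_a, dst_b)
                arcs.append(((qa, qb), sym_in, sym_out, w_a + w_b, target))
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return {"start": start, "finals": finals, "arcs": arcs}
```

For example, composing a transducer that maps `a` to `b` with weight 1.0 and one that maps `b` to `c` with weight 2.0 gives a single arc mapping `a` to `c` with weight 3.0. Only state pairs reachable from the start pair are built, which is what allows the same routine to be applied either ahead of time or on demand.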

Syntactic Analysis by Local Grammars Automata: an Efficient Algorithm

An efficient algorithm for the application of local grammars put in this form to lemmatized texts is described and illustrated.

Building Automata on Schemata and Acceptability Tables: Application to French Date Adverbials

  • D. Maurel
  • Computer Science
    Workshop on Implementing Automata
  • 1996
A lexical finite-state automaton for parsing French date adverbials is presented, built on an original model of representation whose computation and use are explained in the paper.

The Design Principles of a Weighted Finite-State Transducer Library



Finite-State Parsing And Disambiguation

A language-independent method of finite-state surface syntactic parsing and word-disambiguation is discussed, which excludes all ungrammatical possibilities leaving the correct interpretation of the sentence.

Finite-State Approximation of Phrase Structure Grammars

An algorithm is described that computes finite-state approximations for context-free grammars and equivalent augmented phrase-structure grammar formalisms. The approximation is exact for certain context-free grammars generating regular languages, including all left-linear and right-linear context-free grammars.

Regular Models of Phonological Rule Systems

This paper shows in detail how this framework applies to ordered sets of context-sensitive rewriting rules and also to grammars in Koskenniemi's two-level formalism.

Minimization of Sequential Transducers

An algorithm for minimizing sequential transducers is presented and shown to be efficient: in the case of acyclic transducers, its number of steps is bounded in terms of |E|, |V|, |F| and |P_max|.

Two-Level Morphology with Composition

Lauri Karttunen, Ronald M. Kaplan, and Annie Zaenen. Xerox Palo Alto Research Center; Center for the Study of Language and Information, Stanford University.

The Design and Analysis of Computer Algorithms

This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.

Analyse syntaxique transformationnelle du francais par transducteurs et lexique-grammaire

With rare exceptions, the course of automatic syntactic analysis follows the creation of formal grammar models (GB, HPSG, etc.) that are supposed to reflect the internal mechanisms of

Méthodes algorithmiques et lexicales de phonétisation de textes : applications au français

Methods for automatic phonetization are presented. Some of these methods are based on algorithms and rule systems, the others on phonetic dictionaries.

Analyse et représentation par automates de structures syntaxiques composées. Application aux complétives

The analysis relies on underlying support-verb sentences that are described in the lexicon-grammar, including the coreference relations that can be observed in them even though the latter bears no explicit mark of them.

Finite-state Parsing and Disambiguation

  • Proceedings of the thirteenth International Conference on Computational Linguistics (COLING'90)
  • 1990