Technical Report: Adding Missing Words to Regular Expressions

@inproceedings{Rebele2018TechnicalR,
  title={Technical Report: Adding Missing Words to Regular Expressions},
  author={Thomas Rebele and Katerina Tzompanaki and Fabian M. Suchanek},
  year={2018}
}
Regular expressions (regexes) are patterns that are used in many applications to extract words or tokens from text. However, even hand-crafted regexes may fail to match all the intended words. In this paper, we propose a novel way to generalize a given regex so that it also matches a set of missing (previously unmatched) words. Our method finds an approximate match between the missing words and the regex, and adds disjunctions for the unmatched parts as appropriate. We show that this method…
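To make the idea in the abstract concrete, here is a minimal Python sketch of "align the missing word with the regex, then add a disjunction for the unmatched part". It is not the authors' algorithm, only a simplified illustration under stated assumptions: the regex is supplied already split into a list of fragments (`parts`) whose concatenation is the original pattern, no fragment contains a top-level alternation, and the alignment is a simple greedy scan from both ends. The function name `add_missing_word` and the phone-number example are hypothetical.

```python
import re


def add_missing_word(parts, word):
    """Return a regex matching everything "".join(parts) matches, plus `word`.

    `parts` is a hypothetical pre-split form of the regex (an assumption of
    this sketch): a list of fragments whose concatenation is the pattern.
    """
    compiled = [re.compile(p) for p in parts]

    # Align fragments with the word from the left, one fragment at a time.
    left, pos = 0, 0
    while left < len(parts):
        m = compiled[left].match(word, pos)
        if m is None:
            break
        pos = m.end()
        left += 1
    if left == len(parts) and pos == len(word):
        return "".join(parts)  # the word already matches; nothing to add

    # Align the remaining fragments with the word from the right.
    right, end = len(parts), len(word)
    while right > left:
        start = next(
            (s for s in range(pos, end + 1)
             if compiled[right - 1].fullmatch(word, s, end)),
            None,
        )
        if start is None:
            break
        end = start
        right -= 1

    # Whatever remains in the middle did not align: add it as an alternative.
    unmatched_regex = "".join(parts[left:right])
    unmatched_word = re.escape(word[pos:end])
    return (
        "".join(parts[:left])
        + "(?:" + unmatched_regex + "|" + unmatched_word + ")"
        + "".join(parts[right:])
    )


# Usage: the original regex \d{3}-\d{4} misses the word "555.1234".
generalized = add_missing_word([r"\d{3}", "-", r"\d{4}"], "555.1234")
print(generalized)  # \d{3}(?:-|\.)\d{4}
```

The generalized regex still matches every word the original matched, because the original middle fragment is kept as one branch of the new disjunction. A trivial alternative would be to append the whole missing word as a top-level alternative; the sketch instead confines the change to the part of the regex that failed to align.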
