Automatic Segmentation of Text into Structured Records

Authors: Vinayak R. Borkar, Kaustubh Deshmukh, Sunita Sarawagi
Published in: SIGMOD Conference

In this paper we present a method for automatically segmenting unformatted text records into structured elements. Several useful data sources today are human-generated as continuous text, whereas convenient use requires the data to be organized as structured records. A prime motivation is the warehouse address cleaning problem: transforming dirty addresses, stored in large corporate databases as a single text field, into subfields like “City” and “Street”. Existing tools rely on hand-tuned…
This paper has highly influenced 26 other papers.
Semantic Scholar estimates that this paper has 200 citations, based on the available data.

Referenced Papers

  • D. Freitag, A. McCallum. Information extraction using HMMs and shrinkage. In Papers from the AAAI-99 Workshop on Machine…, 1999.
