AutoDict: Automated Dictionary Discovery

  • Fei Chiang, P. Andritsos, Erkang Zhu, R. Miller
  • Published 2012
  • Computer Science
  • 2012 IEEE 28th International Conference on Data Engineering
  • An attribute dictionary is a set of attributes together with a set of common values for each attribute. Such dictionaries are valuable in understanding unstructured or loosely structured textual descriptions of entity collections, such as product catalogs. Dictionaries provide the supervised data for learning product or entity descriptions. In this demonstration, we will present AutoDict, a system that analyzes input data records and discovers high-quality dictionaries using information…
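
To make the definition above concrete, here is a minimal sketch of an attribute dictionary as a plain data structure, along with one way such a dictionary might be used to label tokens in a loosely structured product description. The attribute names, the values, and the `label_tokens` helper are all hypothetical and chosen only for illustration; this shows how a dictionary could be consumed once it exists, not how AutoDict discovers one.

```python
from collections import defaultdict

# Hypothetical attribute dictionary: attribute name -> set of common values.
# These attributes and values are invented for illustration only.
ATTRIBUTE_DICTIONARY = {
    "brand": {"canon", "nikon", "sony"},
    "color": {"black", "silver", "red"},
    "category": {"camera", "lens", "tripod"},
}


def label_tokens(description, dictionary):
    """Group the tokens of `description` under every attribute whose value set contains them."""
    labels = defaultdict(list)
    for token in description.lower().split():
        for attribute, values in dictionary.items():
            if token in values:
                labels[attribute].append(token)
    return dict(labels)


result = label_tokens("Canon EOS digital camera black", ATTRIBUTE_DICTIONARY)
print(result)
```

Tokens that match no attribute value (here "eos" and "digital") are simply left unlabeled. A real system would likely need multi-word values and some handling of ambiguous tokens, which this sketch deliberately omits.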
    8 Citations


    Data Driven Discovery of Attribute Dictionaries
    Exploiting Pre-Existing Datasets to Support IETS
    DeepEC: An Approach for Deep Web Content Extraction and Cataloguing
    DeepEC: uma abordagem para extração e catalogação de conteúdo presente na Deep Web
    Transactions on Computational Collective Intelligence XXI
    Bringing semantic structures to user intent detection in online medical queries
    MC2: MPEG-7 content modelling communities

