Distilling Task Knowledge from How-To Communities
@article{Chu2017DistillingTK, title={Distilling Task Knowledge from How-To Communities}, author={Cuong Xuan Chu and Niket Tandon and Gerhard Weikum}, journal={Proceedings of the 26th International Conference on World Wide Web}, year={2017} }
Knowledge graphs have become a fundamental asset for search engines. A fair amount of user queries seek information on problem-solving tasks such as building a fence or repairing a bicycle. However, knowledge graphs completely lack this kind of how-to knowledge. This paper presents a method for automatically constructing a formal knowledge base on tasks and task-solving steps, by tapping the contents of online communities such as WikiHow. We employ Open-IE techniques to extract noisy candidates…
38 Citations
Task2KB: A Public Task-Oriented Knowledge Base
- Computer Science
- 2023
A novel knowledge base, ‘Task2KB’, is proposed, which is constructed using data crawled from WikiHow, an online knowledge resource offering instructional articles on a wide range of tasks, which encapsulates various types of task-related information and attributes.
Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases
- Computer Science
- Found. Trends Databases
- 2021
Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as…
Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data
- Computer Science
- ACL
- 2022
This work develops a simple and efficient method that links steps in an article to other articles with similar goals, recursively constructing an open-domain hierarchical knowledge-base of procedures based on wikiHow, a website containing more than 110k instructional articles.
Know-How in Programming Tasks: From Textual Tutorials to Task-Oriented Knowledge Graph
- Computer Science
- 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME)
- 2019
The resulting knowledge graph, TaskKG, includes a hierarchical taxonomy of activities, three types of activity relationships and five types of activity attributes; it enables activity-centric knowledge search and is promising in helping developers find correct answers to programming how-to questions.
Information to Wisdom: Commonsense Knowledge Extraction and Compilation
- Computer Science
- WSDM
- 2021
This tutorial presents state-of-the-art methodologies towards the compilation and consolidation of commonsense knowledge (CSK), covering text-extraction-based, multi-modal and Transformer-based techniques, with special focus on the issues of web search and ranking, as of relevance to the WSDM community.
What Computers Should Know, Shouldn't Know, and Shouldn't Believe
- Computer Science
- WWW
- 2017
Automatically constructed knowledge bases are a powerful asset for search, analytics, recommendations and data integration, with intensive use at big industrial stake-holders, forming the Web of Linked Open Data.
Reasoning about Goals, Steps, and Temporal Ordering with WikiHow
- Computer Science
- EMNLP
- 2020
This work proposes a suite of reasoning tasks on two types of relations between procedural events: goal-step relations and step-step temporal relations, and introduces a dataset targeting these two relations based on wikiHow, a website of instructional how-to articles.
Procedural Knowledge Mining - A New Method for Extracting Best Practices by Applying Machine Learning on Data Graph
- Computer Science
- Revue d'Intelligence Artificielle
- 2022
This work presents a new method for formalizing good practices extracted from the web, and extracting the best practice for a given request by applying the techniques of artificial learning and text summary on data graphs.
HealthAid: Extracting domain targeted high precision procedural knowledge from on-line communities
- Computer Science
- Inf. Process. Manag.
- 2020
MyFixit: An Annotated Dataset, Annotation Tool, and Baseline Methods for Information Extraction from Repair Manuals
- Computer Science
- LREC
- 2020
This paper introduces a semi-structured dataset of repair manuals and proposes methods that can serve as baselines for information extraction (IE) from the instructional text in repair manuals, including an unsupervised method based on a bags-of-n-grams similarity for extracting the needed tools in each repair step, and a deep-learning-based sequence labeling model for extracting the identity of disassembled parts.
References
SHOWING 1-10 OF 59 REFERENCES
Leveraging Procedural Knowledge for Task-oriented Search
- Computer Science
- SIGIR
- 2015
A set of textual and structural features is proposed to identify key search phrases from task descriptions; similar features are then adapted to extract wikiHow-style procedural knowledge descriptions from search queries and relevant text snippets.
Acquiring Comparative Commonsense Knowledge from the Web
- Computer Science
- AAAI
- 2014
This paper relies on open information extraction methods to obtain large amounts of comparisons from the Web and develops a joint optimization model for cleaning and disambiguating this knowledge with respect to WordNet, which relies on integer linear programming and semantic coherence scores.
Open Information Extraction: The Second Generation
- Computer Science
- IJCAI
- 2011
The second generation of Open IE systems is described; these systems rely on a novel model of how relations and their arguments are expressed in English sentences to double precision/recall compared with previous systems such as TEXTRUNNER and WOE.
Open Language Learning for Information Extraction
- Computer Science
- EMNLP
- 2012
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary…
Knowlywood: Mining Activity Knowledge From Hollywood Narratives
- Computer Science
- CIKM
- 2015
A pipeline for semantic parsing and knowledge distillation is developed, to systematically compile semantically refined activity frames, mined from about two million scenes of movies, TV series, and novels.
Creating Causal Embeddings for Question Answering with Minimal Supervision
- Computer Science
- EMNLP
- 2016
This work argues that a better approach is to look for answers that are related to the question in a relevant way, according to the information need of the question, which may be determined through task-specific embeddings, and implements causality as a use case.
Cross Sentence Inference for Process Knowledge
- Computer Science
- EMNLP
- 2016
This work extends standard within sentence joint inference to inference across multiple sentences, which promotes role assignments that are compatible across different descriptions of the same process.
POLY: Mining Relational Paraphrases from Multilingual Sentences
- Computer Science
- EMNLP
- 2016
A new method for building language resources that systematically organize paraphrases for binary relations is presented, along with the resource itself, called POLY, which shows significant improvements in precision and recall over the prior works on PATTY and DEFIE.
PATTY: A Taxonomy of Relational Patterns with Semantic Types
- Computer Science
- EMNLP
- 2012
PATTY is a large resource for textual patterns that denote binary relations between entities that are semantically typed and organized into a subsumption taxonomy that harnesses the rich type system and entity population of large knowledge bases.
Question-Answer Driven Semantic Role Labeling: Using Natural Language to Annotate Natural Language
- Computer Science
- EMNLP
- 2015
The results show that non-expert annotators can produce high-quality QA-SRL data, and also establish baseline performance levels for future work on this task, and introduce simple classifier-based models for predicting which questions to ask and what their answers should be.