Mapping Complex Technologies via Science-Technology Linkages; The Case of Neuroscience - A transformer based keyword extraction approach
@article{Hain2022MappingCT, title={Mapping Complex Technologies via Science-Technology Linkages; The Case of Neuroscience - A transformer based keyword extraction approach}, author={Daniel Stefan Hain and Roman Jurowetzki and Mariagrazia Squicciarini}, journal={ArXiv}, year={2022}, volume={abs/2205.10153} }
In this paper, we present an efficient deep learning based approach to extract technology-related topics and keywords within scientific literature, and identify corresponding technologies within patent applications. Specifically, we utilize transformer based language models, tailored for use with scientific text, to detect coherent topics over time and describe these by relevant keywords that are automatically extracted from a large text corpus. We identify these keywords using Named Entity…
References
SPECTER: Document-level Representation Learning using Citation-informed Transformers
- Computer Science, ACL
- 2020
This work proposes SPECTER, a new method to generate document-level embeddings of scientific papers based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph, and shows that SPECTER outperforms a variety of competitive baselines on the benchmark.
Deep learning, deep change? Mapping the evolution and geography of a general purpose technology
- Computer Science, Scientometrics
- 2021
An analysis of Deep Learning, a core technique of artificial intelligence systems increasingly being recognized as the latest example of a transformational general purpose technology, finds that strong research clusters tend to appear in regions that specialise in research and industrial activities related to Deep Learning.
A topic model analysis of science and technology linkages: A case study in pharmaceutical industry
- Computer Science, 2017 IEEE Technology & Engineering Management Conference (TEMSCON)
- 2017
Qualitative analysis of the clusters shows that the topic clustering algorithm, and hence Latent Dirichlet Allocation, is a valuable approach for detecting linkages between patents and publications.
SciBERT: A Pretrained Language Model for Scientific Text
- Computer Science, EMNLP
- 2019
SciBERT leverages unsupervised pretraining on a large multi-domain corpus of scientific publications to improve performance on downstream scientific NLP tasks and demonstrates statistically significant improvements over BERT.
Detecting potential technological fronts by comparing scientific papers and patents
- Computer Science
- 2011
A comparative study was performed to measure the semantic similarity between academic papers and patents in order to discover research fronts that do not correspond to any patents.
Delineating the scientific footprint in technology: Identifying scientific publications within non-patent references
- Computer Science, Scientometrics
- 2011
The results of a machine-learning algorithm that identifies scientific references in an automated manner are introduced; they signal the relevance of delineating scientific references when using NPRs to assess the occurrence and impact of science–technology interactions.
Measuring industry-science links through inventor-author relations: A profiling methodology
- Computer Science, Scientometrics
- 2007
Text-based profile methodology performs significantly better than a random matching of patents and publications, suggesting that text-based profiling is a valuable complementary tool to the name searches used in previous studies.
Measuring science-technology interaction using rare inventor-author names
- Environmental Science, J. Informetrics
- 2008
Linking science to technology: Using bibliographic references in patents to build linkage schemes
- Computer Science, Scientometrics
- 2004
A method to design a linkage scheme that links the systems of science and technology through the use of patent citation data is developed and tested on and applied to subsets of USPTO patents.