Corpus ID: 235591781

Document Matching for Job Descriptions

Author: Lum Yao Jun
We train a document encoder to match online job descriptions to one of many standardized job roles from Singapore’s Skills Framework. The encoder generates semantically meaningful document encodings from textual descriptions of job roles, which are then compared using cosine similarity to determine matches. During training, we follow the methodology of Sentence-BERT, fine-tuning pre-trained BERT models with a Siamese network architecture on labelled document pairs. Overall, we find… 
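The matching step described in the abstract — comparing a job description's encoding against each standardized role's encoding by cosine similarity and picking the best match — can be sketched as follows. This is a minimal illustration: the toy vectors and role names below are placeholders standing in for the outputs of the paper's fine-tuned BERT encoder.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def match_role(job_vec, role_vecs):
    """Return (index, score) of the best-matching standardized role."""
    scores = [cosine_similarity(job_vec, v) for v in role_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Toy stand-ins for encoder outputs; in the paper these would be
# document encodings produced by the Siamese-trained BERT model.
role_vecs = [
    [1.0, 0.0, 0.0],  # hypothetical role A encoding
    [0.0, 1.0, 0.0],  # hypothetical role B encoding
]
job_vec = [0.9, 0.1, 0.0]  # hypothetical job-description encoding

idx, score = match_role(job_vec, role_vecs)
```

In practice the encodings would come from a Sentence-BERT-style model, so the vectors are high-dimensional and the comparison is the same cosine-similarity argmax shown here.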


Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Sentence-BERT (SBERT) is presented, a modification of the pretrained BERT network that uses Siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
HuggingFace's Transformers: State-of-the-art Natural Language Processing
The Transformers library is an open-source library that provides carefully engineered state-of-the-art Transformer architectures under a unified API, along with a curated collection of pretrained models made by and available for the community.