Beyond crosswalks: reliability of exposure assessment following automated coding of free-text job descriptions for occupational epidemiology.

@article{Burstyn2014BeyondCR,
  title={Beyond crosswalks: reliability of exposure assessment following automated coding of free-text job descriptions for occupational epidemiology.},
  author={Igor Burstyn and Anton Slutsky and Derrick G. Lee and Alison B Singer and Yuan An and Yvonne Michael},
  journal={The Annals of occupational hygiene},
  year={2014},
  volume={58},
  number={4},
  pages={482--492}
}
Epidemiologists typically collect narrative descriptions of occupational histories because these are less prone than self-reported exposures to recall bias of exposure to a specific hazard. However, the task of coding these narratives can be daunting and prohibitively time-consuming in some settings. The aim of this manuscript is to evaluate the performance of a computer algorithm to translate the narrative description of occupational codes into standard classification of jobs (2010 Standard… 
Occupation Coding of Job Titles: Iterative Development of an Automated Coding Algorithm for the Canadian National Occupation Classification (ACA-NOC) (Preprint)
TLDR
The ACA-NOC is a rigorous algorithm for automatically coding job titles to the Canadian NOC system; it has been evaluated using real-world data and is readily extensible upon further benchmarking on larger data sets.
Feasibility and Utility of Lexical Analysis for Occupational Health Text
TLDR
Use of free text rather than narrowly defined numerically coded fields is feasible, flexible, and efficient and has potential to encourage workers and clinicians to provide more data and to support automated knowledge creation.
Occupation Coding of Job Titles: Iterative Development of an Automated Coding Algorithm for the Canadian National Occupation Classification (ACA-NOC)
TLDR
The ACA-NOC is a rigorous algorithm for automatically coding job titles to the Canadian NOC system; evaluation using real-world data indicates that it has state-of-the-art performance, and it is readily extensible upon further benchmarking on larger data sets.
Computer-based coding of free-text job descriptions to efficiently identify occupations in epidemiological studies
TLDR
An algorithm called SOCcer (Standardized Occupation Coding for Computer-assisted Epidemiologic Research) to assign SOC-2010 codes based on free-text job description components may improve the efficiency of incorporating occupation into large-scale epidemiological studies.
Feasibility and Reliability of Automated Coding of Occupation in the Health and Retirement Study
TLDR
It is found that NIOCCS is a tool that might be best used to reduce the number of cases human coders must code, either in coding historical data to a consistent codeframe or in coding data from future HRS waves, but it is not yet ready to fully replace human coders.
Trends in OSHA Compliance Monitoring Data 1979-2011: Statistical Modeling of Ancillary Information across 77 Chemicals.
TLDR
The relationships observed between exposure levels and ancillary variables across a vast majority of agents suggest that certain elements of OSHA's process of selecting worksites for inspection influence the exposure levels that OSHA inspectors encounter.
Statistical Modeling of Occupational Exposure to Polycyclic Aromatic Hydrocarbons Using OSHA Data
TLDR
Mixed-effects logistic models were used to predict the exceedance fraction (EF), i.e., the probability of exceeding OSHA's Permissible Exposure Limit (PEL) for PAHs based on industry and occupation, and will be used to create a job-exposure matrix for use in a population-based case-control study exploring PAH exposure and breast cancer risk.
A Working Semantic Model for the Integration of Occupation, Function and Health
TLDR
An integrated semantic model is defined, and coded patient data representing disease (ICD), functional impairment (ICF), occupation (NOC), and job attributes (NOC Career Handbook) are populated.
Occupation Coding During the Interview in a Web-First Sequential Mixed-Mode Survey
Abstract Coding respondent occupation is one of the most challenging aspects of survey data collection. Traditionally performed manually by office coders post-interview, previous research has
Women’s occupational exposure to polycyclic aromatic hydrocarbons and risk of breast cancer
TLDR
It is suggested that prolonged occupational exposure to PAH may increase breast cancer risk, especially among women with a family history of breast cancer.

References

JEMs and incompatible occupational coding systems: effect of manual and automatic recoding of job codes on exposure assignment.
TLDR
Results of this study indicate that using automated crosswalks to recode job codes from one occupational classification system to another results only in a limited loss in agreement in assigned occupational exposure estimates compared with direct manual recoding.
Automatic approaches to clustering occupational description data for prediction of probability of workplace exposure to beryllium
TLDR
The study indicated that the Tolerance Rough Set with Jaccard similarity was a better combination overall and the predictive power of the automatically generated classifications closely approached that of the manually assembled classification of the same 12,148 records.
Inside the black box: starting to uncover the underlying decision rules used in a one-by-one expert assessment of occupational exposure in case-control studies
TLDR
CART and random forest models extracted decision rules and accurately predicted an expert's exposure decisions for the majority of jobs, and identified questionnaire response patterns that would require further expert review if the rules were applied to other jobs in the same or different study.
Comparison of exposure estimates in the Finnish job-exposure matrix FINJEM with a JEM derived from expert assessments performed in Montreal
TLDR
The authors' observations suggest that information concerning several agents can be successfully transported from Finland to Canada and probably other countries; however, for other agents there was considerable disagreement, and hence transportability of FINJEM cannot be assumed by default.
Occupational exposure assessment in case–control studies: opportunities for improvement
TLDR
Methods to improve assessments are suggested, including the incorporation of hygiene measurements: using data from administrative exposure databases, using results of studies identifying determinants of exposure to develop questionnaires, and where reasonable given latency and biological half life considerations, directly measuring exposures of study subjects.
Development of an asthma specific job exposure matrix and its application in the epidemiological study of genetics and environment in asthma (EGEA)
TLDR
This asthma JEM, when enhanced by expert re-evaluation of exposure estimates from job title texts, may be a useful tool in general population studies of asthma.
Data linkage to estimate the extent and distribution of occupational disease: new onset adult asthma in Alberta, Canada.
TLDR
Data linkage of administrative records can demonstrate under-reporting of occupational asthma and indicate areas for prevention.
Formaldehyde Exposure in U.S. Industries from OSHA Air Sampling Data
TLDR
Although limited by availability of relevant exposure determinants and potential selection biases in IMIS, these results provide useful insight on formaldehyde occupational exposure in the United States in the last two decades.
Estimating Occupational Beryllium Exposure from Compliance Monitoring Data
TLDR
In plausible models to estimate occupational airborne beryllium exposure, probability of exposure decreased over time, was highest in full-shift personal samples, and varied with industry and job.
Estimating the extent and distribution of new-onset adult asthma in British Columbia using frequentist and Bayesian approaches.
TLDR
The distribution of NOAA in BC appeared somewhat similar to that in Alberta, except for isocyanates, and Bayesian analyses allowed incorporation of prior evidence into risk estimates, permitting reconsideration of the apparently protective effect of isocyanate exposure.