Using the Census Bureau’s surname list to improve estimates of race/ethnicity and associated disparities

@article{Elliott2009UsingTC,
  title={Using the Census Bureau’s surname list to improve estimates of race/ethnicity and associated disparities},
  author={Marc N. Elliott and Peter A. Morrison and Allen M. Fremont and Daniel F. McCaffrey and Philip M. Pantoja and Nicole Lurie},
  journal={Health Services and Outcomes Research Methodology},
  year={2009},
  volume={9},
  pages={69--83}
}
Commercial health plans need member racial/ethnic information to address disparities, but often lack it. We incorporate the U.S. Census Bureau’s latest surname list into a previous Bayesian method that integrates surname and geocoded information to better impute self-reported race/ethnicity. We validate this approach with data from 1,921,133 enrollees of a national health plan. Overall, the new approach correlated highly with self-reported race-ethnicity (0.76), which is 19% more efficient than…
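
The abstract describes a Bayesian method that combines surname and geocoded information. A minimal sketch of that kind of update follows, assuming surname and residence are conditionally independent given race/ethnicity. The function name `bayes_update`, the group labels, and all probabilities are hypothetical illustrations, not values or code from the paper.

```python
def bayes_update(surname_prior, geo_likelihood):
    """Combine P(group | surname) with P(location | group) via Bayes' rule.

    Assumes surname and location are conditionally independent given group.
    """
    unnormalized = {
        group: surname_prior[group] * geo_likelihood[group]
        for group in surname_prior
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no group has nonzero probability")
    # Normalize so the posterior probabilities sum to 1.
    return {group: value / total for group, value in unnormalized.items()}


# Hypothetical numbers for illustration only (not from the paper).
# Prior: share of people with a given surname in each group.
surname_prior = {"hispanic": 0.70, "white": 0.20, "black": 0.05, "asian": 0.05}
# Likelihood: share of each group's population living in a given area.
geo_likelihood = {"hispanic": 0.001, "white": 0.004, "black": 0.002, "asian": 0.001}

posterior = bayes_update(surname_prior, geo_likelihood)
for group, prob in sorted(posterior.items(), key=lambda item: -item[1]):
    print(f"{group}: {prob:.3f}")
```

With these made-up inputs, the geographic information shifts probability away from the group the surname alone favors, which is the kind of correction a combined estimate is meant to capture.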

Using First Name Information to Improve Race and Ethnicity Classification
This paper uses a recent first name list to improve on a previous Bayesian classifier, the Bayesian Improved Surname Geocoding (BISG) method, which combines surname and geography information to…
Developing and evaluating methods to impute race/ethnicity in an incomplete dataset
The availability of race data is essential for identifying and addressing racial/ethnic disparities in the health care system; however, patient self-reported racial/ethnic information is often…
Using First Name Information to Improve Race and Ethnicity Classification
This article uses a recent first name list to develop an improvement to an existing Bayesian classifier, namely the Bayesian Improved Surname Geocoding (BISG) method, which combines surname…
Imputation of race/ethnicity to enable measurement of HEDIS performance by race/ethnicity
Medicare Bayesian Improved Surname Geocoding (MBISG) 2.0 represents a substantial improvement over MBISG 1.0 and the use of CMS administrative data on race/ethnicity alone.
When Race/Ethnicity Data Are Lacking: Using Advanced Indirect Estimation Methods to Measure Disparities.
Advances in methods for estimating race/ethnicity are enabling health plans and other health care organizations to overcome a long-standing barrier to routine monitoring and actions to reduce disparities in care, and new estimation methods are promising.
Race and Ethnicity Data Quality and Imputation Using U.S. Census Data in an Integrated Health System
The Bayesian Improved Surname Geocoding method produced imputation results far better than chance assignment for the four most common race/ethnicity groups in the health system: Whites, Hispanics, Blacks, and Asians.
Indirect Estimation of Race/Ethnicity for Survey Respondents Who Do Not Report Race/Ethnicity
It may be worthwhile to impute race/ethnicity when this information is unavailable in survey data sets due to item nonresponse, especially when missingness is high.
Comparison of Imputation Methods for Race and Ethnic Information in Administrative Health Data
In the United States of America, where there is no national health care, All-Payer Claims Databases provide great resources to investigate and address disparities in access to, utilization, and…
Improving Occupational Health Disparity Research: Testing a method to estimate race and ethnicity in a working population
The BISG estimation method was accurate for White, Black, Latino, and Asian Pacific Islanders in a sample of workers and using it in administrative datasets will expand research into occupational health disparities.
Imputing race and ethnic information in administrative health data.
Predictive models using Census information and patients' demographic characteristics can be used to accurately populate race/ethnicity information in health care databases, enhancing opportunities to investigate and address disparities in access to, utilization of, and outcomes of care.

References

A new method for estimating race/ethnicity and associated disparities where administrative records lack self-reported race/ethnicity.
The Bayesian Surname and Geocoding (BSG) method presented here efficiently integrates administrative data, substantially improving upon what is possible with a single source or from other hybrid methods; it offers a powerful tool that can help health care organizations address disparities until self-reported race/ethnicity data are available.
Use of geocoding and surname analysis to estimate race and ethnicity.
Geocoding and surname analysis show promise for estimating racial/ethnic health plan composition of enrollees when direct data on major racial and ethnic groups are lacking and can be used to assess disparities in care, pending availability of self-reported race/ethnicity data.
From single-race reporting to multiple-race reporting: using imputation methods to bridge the transition.
Exploratory analyses of data from the National Health Interview Survey suggest that imputation methods that use demographic and contextual covariate information to predict primary race can have advantages with respect to lower bias and improved variance estimation compared to simpler methods discussed by the Office of Management and Budget.
Sample designs for measuring the health of small racial/ethnic subgroups.
Three potentially promising sample design strategies for increasing the accuracy of national health estimates for a small target subgroup when used to supplement a small probability sample of that group are identified and applied to American Indians/Alaska Natives (AI/AN) and Chinese using National Health Interview Survey data.
Asian American ethnic identification by surname
Few data sources include ethnicity-level classification for Asian Americans. However, it is often more informative to study the ethnic groups separately than to use an aggregate Asian American category,…
Surname analysis for estimating local concentration of Hispanics and Asians
Surname analysis is a potentially useful technique for identifying members of particular racial, ethnic, or language communities within a population. We review the existing state of the art for…
Power of tests for a dichotomous independent variable measured with error.
The information loss from not observing actual values of dichotomous predictors can be quite large, and direct substitution is easy to implement and interpret and nearly as efficient as the PIMLE.
Composite Estimates from Incomplete and Complete Frames for Minimum-MSE Estimation in a Rare Population: An Application to Families with Young Children
Random digit dialing (RDD) can be costly for a rare population, but inexpensive convenience samples are unrepresentative by themselves. We combine biased estimates from an incomplete frame (a listed…
Hypersegregation in U.S. Metropolitan Areas: Black and Hispanic Segregation Along Five Dimensions
It is concluded that blacks occupy a unique and distinctly disadvantaged position in the U.S. urban environment and are likely to be segregated on all five dimensions simultaneously, which never occurs for Hispanics.
The meaning and use of the area under a receiver operating characteristic (ROC) curve.
A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics,…