When estuarine benthic indices disagree in their assessments of benthic condition, they are often reflecting different aspects of that condition. We describe a process for screening indices for associations and, once candidate metrics have been identified, evaluating each metric individually against the indices. We use radar plots as a multi-metric visualization tool, and conditional probability plots and receiver operating characteristic curves to evaluate the associations seen in the radar plots. Using data collected in New York Harbor, we investigated differences between two indices: the US EPA Environmental Monitoring and Assessment Program's benthic index for the Virginian Province and the New York Harbor benthic index of biotic integrity. We evaluated the overall agreement of the indices and the associations between each index and measures of habitat and sediment contamination. The indices agreed in approximately 78% of cases. The New York Harbor benthic index of biotic integrity showed stronger associations with sediment metal contamination and grain size.
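The three quantities named above (overall agreement between two indices, the conditional probability of a degraded classification given a metric value, and the area under a receiver operating characteristic curve) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the threshold, and all data values are hypothetical, and the sketch assumes each index has already been reduced to a degraded / not degraded call per station.

```python
from typing import Sequence


def percent_agreement(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Fraction of stations at which two indices give the same classification."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def conditional_probability(
    metric: Sequence[float], degraded: Sequence[bool], threshold: float
) -> float:
    """Estimate P(index says degraded | metric >= threshold)."""
    selected = [d for m, d in zip(metric, degraded) if m >= threshold]
    if not selected:
        return float("nan")
    return sum(selected) / len(selected)


def roc_auc(metric: Sequence[float], degraded: Sequence[bool]) -> float:
    """Area under the ROC curve, computed as the probability that a degraded
    station has a higher metric value than a non-degraded one (ties count 0.5)."""
    pos = [m for m, d in zip(metric, degraded) if d]
    neg = [m for m, d in zip(metric, degraded) if not d]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data for eight stations (True = classified as degraded).
index_a = [True, True, False, False, True, False, True, False]
index_b = [True, False, False, False, True, False, True, True]
# Hypothetical sediment metal concentrations in arbitrary units.
metal = [5.0, 3.0, 1.0, 2.0, 6.0, 1.5, 4.0, 2.5]

agreement = percent_agreement(index_a, index_b)
auc_a = roc_auc(metal, index_a)
auc_b = roc_auc(metal, index_b)
cp_a = conditional_probability(metal, index_a, threshold=3.0)
cp_b = conditional_probability(metal, index_b, threshold=3.0)
print(agreement, auc_a, auc_b, cp_a, cp_b)
```

In this toy data set the index with the larger area under the curve is the one more strongly associated with the metal metric. Evaluating the conditional probability over a range of thresholds gives the points of a conditional probability plot.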