OBJECTIVE To develop an algorithm for the discovery of drug treatment patterns for endocrine breast cancer therapy within an electronic medical record and to test the hypothesis that information extracted using it is comparable to the information found by traditional methods. MATERIALS The electronic medical charts of 1507 patients diagnosed with histologically confirmed primary invasive breast cancer. METHODS The automatic drug treatment classification tool consisted of components for: (1) extraction of drug treatment-relevant information from clinical narratives using natural language processing (clinical Text Analysis and Knowledge Extraction System); (2) extraction of drug treatment data from an electronic prescribing system; (3) merging information to create a patient treatment timeline; and (4) final classification logic. RESULTS Agreement between results from the algorithm and from a nurse abstractor is measured for categories: (0) no tamoxifen or aromatase inhibitor (AI) treatment; (1) tamoxifen only; (2) AI only; (3) tamoxifen before AI; (4) AI before tamoxifen; (5) multiple AIs and tamoxifen cycles in no specific order; and (6) no specific treatment dates. Specificity (all categories): 96.14%-100%; sensitivity (categories (0)-(4)): 90.27%-99.83%; sensitivity (categories (5)-(6)): 0-23.53%; positive predictive values: 80%-97.38%; negative predictive values: 96.91%-99.93%. DISCUSSION Our approach illustrates a secondary use of the electronic medical record. The main challenge is event temporality. CONCLUSION We present an algorithm for automated treatment classification within an electronic medical record to combine information extracted through natural language processing with that extracted from structured databases. The algorithm has high specificity for all categories, high sensitivity for five categories, and low sensitivity for two categories.