Database of somatic mutations in EGFR with analyses revealing indel hotspots but no smoking-associated signature.


We created an Epidermal Growth Factor Receptor (EGFR) Mutation Database that compiles somatic EGFR mutations in non-small-cell lung cancer (NSCLC) together with associated epidemiological and methodological data, including response to the tyrosine kinase inhibitors Gefitinib and Erlotinib. Herein, we analyze 809 mutations collected from 26 publications. Four super hotspots account for 70% of reported mutations, while two-thirds of the 131 unique mutations have been reported only once and account for only 11% of reported mutations. Consistent with strong biological selection for gain of function, the reported mutations are virtually all missense substitutions or in-frame microdeletions, microinsertions, or microindels (colocalized insertion and deletion with a net gain or loss of 1-50 nucleotides). Microdeletions and microindels are common in a region of exon 19. Microindels, which account for 8% of mutations, have smaller inserted sequences (95% are 1 to 5 bp) and are elevated 16-fold relative to mouse somatic microindels and to human germline microindels. Microdeletions/microindels are significantly more frequent in responders to Gefitinib or Erlotinib (P = 0.003). In addition, EGFR mutations in smokers do not carry signatures of mutagens in cigarette smoke. Otherwise, the mutation pattern does not differ significantly with respect to gender, age, or tumor histology. The EGFR Mutation Database is a central resource of EGFR sequence variant data for clinicians, geneticists, and other researchers. Authors are encouraged to submit new publications with EGFR sequence variants to be included in the database or to provide direct submissions via The WayStation submission and publication process.
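The mutation classes named in the abstract can be illustrated with a minimal Python sketch. It is not the database's own classification code: the function names, the `MAX_NET_CHANGE` constant, and the example sequences are hypothetical, and applying the 1-50 nucleotide bound to pure deletions and insertions is an assumption (the abstract states that bound only for microindels).

```python
# Hypothetical sketch: classify a sequence change from the nucleotides it
# deletes and inserts, following the definitions quoted in the abstract.
MAX_NET_CHANGE = 50  # upper bound on net gain/loss; assumed to apply to all "micro" classes


def classify_variant(deleted: str, inserted: str) -> str:
    """Return a coarse mutation class for a colocalized deletion/insertion."""
    n_del, n_ins = len(deleted), len(inserted)
    if n_del == 0 and n_ins == 0:
        raise ValueError("variant must delete or insert at least one nucleotide")
    if n_del == 1 and n_ins == 1:
        return "substitution"
    net = abs(n_ins - n_del)
    if n_ins == 0:
        return "microdeletion" if net <= MAX_NET_CHANGE else "other"
    if n_del == 0:
        return "microinsertion" if net <= MAX_NET_CHANGE else "other"
    # Both a deletion and an insertion at the same site: a microindel only
    # if there is a net gain or loss of 1-50 nucleotides.
    return "microindel" if 1 <= net <= MAX_NET_CHANGE else "other"


def is_in_frame(deleted: str, inserted: str) -> bool:
    """A change preserves the reading frame when the net length change is a multiple of 3."""
    return (len(inserted) - len(deleted)) % 3 == 0
```

Under these assumptions, a made-up 15-nucleotide deletion such as `classify_variant("ACGTACGTACGTACG", "")` is classed as a microdeletion and is in frame, while deleting 12 nucleotides and inserting 3 at the same site is classed as an in-frame microindel with a net loss of 9.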

Cite this paper

@article{Gu2007DatabaseOS,
  title={Database of somatic mutations in EGFR with analyses revealing indel hotspots but no smoking-associated signature.},
  author={Dongqing Gu and William A Scaringe and Kai Li and Juan-Sebastian Saldivar and Kathleen A Hill and Zhenbin Chen and Kelly D Gonzalez and Steve S Sommer},
  journal={Human mutation},
  year={2007},
  volume={28},
  number={8},
  pages={760--770}
}