Activation-induced cytidine deaminase (AID) plays an essential role in the generation of a highly competent repertoire of antibodies by participating in class switch recombination (CSR) and somatic hypermutation (SHM). After B cell stimulation by antigens, AID initiates SHM and CSR by deamination of cytidine to uridine in the variable and constant regions of Ig genes. An adverse effect of AID’s ability to directly modify genome sequences is its mutagenic potential. AID has been shown to occasionally target non-Ig genes, and its abnormal expression is strongly associated with tumorigenesis. Aberrant AID expression can also be triggered by some oncogenic pathogens, such as Helicobacter pylori and several viruses. Recent reports show that AID might also act beyond the immune system under physiological conditions. Data exist suggesting that AID can play a role in active genome demethylation – the heart of epigenetic gene activation and reprogramming. Moreover, it has been shown that abnormal genome demethylation mediated by AID might be associated with human colon cancers. The potential role of AID in the active demethylation process is still controversial, but the hypothesis that aberrant AID expression may cause carcinogenesis by changing genome methylation patterns appears highly attractive. As a unique human enzyme able to induce both genetic and epigenetic alterations under physiological and pathological conditions, AID could be a promising and versatile drug target. In this review we present the current state of knowledge on this topic and the controversies surrounding the pleiotropic effect of AID function.

The AID/APOBEC protein family

AID was discovered in 1999 as a protein selectively expressed in activated B cells in germinal centers (Muramatsu et al., 1999). More detailed genetic analyses permitted classification of AID as a member of the AID/APOBEC family.
Members of this family represent a unique group of enzymes that function as DNA/RNA mutators. They can introduce mutations in DNA and/or RNA as a result of their ability to deaminate cytidine to uridine (Conticello, 2008; Prochnow et al., 2009). This family is found in vertebrates and comprises zinc-coordinating deaminases. In the human genome 11 genes encoding AID/APOBEC proteins have been identified. AID, APOBEC1, APOBEC2 and APOBEC4 are encoded by single genes. There are, however, 7 genes encoding APOBEC3 proteins (A, B, C, D/E, F, G and H). It is believed that the AID/APOBEC family originated from tRNA adenosine deaminases (Tad/ADAT2) that edit adenosine to inosine at the anticodon of various tRNAs in eukaryotes and prokaryotes (Conticello, 2008). Recent phylogenetic analyses suggest that the rise of the AID/APOBEC gene family was concurrent with the appearance of the vertebrate lineage and the evolution of adaptive immunity. AID has been found in some primitive vertebrates, such as lamprey, whose immunity is based on variable lymphocyte receptors (VLRs) instead of immunoglobulins (Rogozin et al., 2007). This indicates that AID is one of the ancestral family members. All AID/APOBECs catalyze the same reaction but seem to have different biological functions. AID and APOBEC3 target ssDNA and play key roles in the adaptive and innate immune responses, respectively (Holmes et al., 2007). APOBEC1 is responsible for apolipoprotein B pre-mRNA editing (Teng et al., 1993). The functions of APOBEC2 and APOBEC4 are still unknown (Liao et al., 1999; Rogozin et al., 2005). Because of the extraordinary mutagenic potential of all APOBEC proteins, it is not surprising that their localization is mostly cytoplasmic, even though at least some of them act in the nucleus. Due to the specific properties of AID/APOBEC proteins, their activity and the expression of their genes have to be very strictly and precisely controlled at all possible levels (transcriptional, post-transcriptional, translational and post-translational) (Smith et al., 2011). As a result, studies of AID/APOBEC proteins are extremely complex and difficult. Consequently, our knowledge of the role of these enzymes, the cellular and molecular mechanisms of their action and their regulatory networks is very limited, and the biological importance of the AID/APOBEC family as a whole remains a mystery.

L. Budzko, P. Jackowiak, M. Figlerowicz

The structure of AID/APOBEC proteins

Currently, a crystal structure is available for only two members of the AID/APOBEC family. One of them is the functionally uncharacterized APOBEC2 and the second is APOBEC3G (Holden et al., 2008; Prochnow et al., 2007; Smith et al., 2011). However, for the latter only the structure of the carboxy-terminal domain has been established. For the remaining members, structural features therefore have to be predicted with bioinformatic tools, by comparing primary and secondary structures and by building models based on the crystal structures of APOBEC2, APOBEC3G, and other zinc-dependent deaminases, especially TadA from the Tad/ADAT2 family (Losey et al., 2006). So far, the generated models indicate that each AID/APOBEC family member contains either one or two deaminase domains with a characteristic Zn-coordinating motif: H[AV]E-x[24-36]-PCxxC (where x is any amino acid) (Bransteitter et al., 2009). The motif forms a catalytic center responsible for the enzyme activity. The histidine (H) and the two cysteines (C) coordinate a zinc atom and form a pocket in which the cytidine is bound. The mechanism of cytidine deamination presumes a direct nucleophilic attack at position 4 of the pyrimidine ring by an activated water molecule coordinated by the zinc atom and the glutamate (which acts as a proton donor) (Samaranayake et al., 2006).
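The zinc-coordinating motif given above, H[AV]E-x[24-36]-PCxxC, can be written as a regular expression and used to scan a protein sequence. The following is only a minimal sketch: the function name `find_deaminase_motifs` and the toy sequence are hypothetical and chosen for illustration, and a match in a real sequence is no more than a hint that a deaminase domain may be present.

```python
import re

# H[AV]E, then 24 to 36 residues of any kind, then PCxxC (x = any amino acid),
# written in one-letter amino acid code.
ZN_MOTIF = re.compile(r"H[AV]E.{24,36}PC..C")


def find_deaminase_motifs(protein):
    """Return (start_index, matched_text) for each non-overlapping match.

    Indices are 0-based. Overlapping candidates are not reported, which is a
    simplification of this sketch.
    """
    return [(m.start(), m.group()) for m in ZN_MOTIF.finditer(protein.upper())]


# Hypothetical toy sequence (not a real protein): HAE + 24 residues + PCGGC.
toy = "MKT" + "HAE" + "G" * 24 + "PCGGC" + "LLK"
print(find_deaminase_motifs(toy))
```

A spacer shorter than 24 residues between the HAE and PCxxC elements produces no match, in line with the length range stated in the motif.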
The overall structure of the AID/APOBECs resembles that of other zinc-dependent deaminases. A series of five β strands forms the backbone of the enzyme, and two α helices that contain the histidine, two cysteines and glutamic acid shape a catalytic pocket. A comparison of the crystal structures and models of AID/APOBEC proteins with the crystal structure of the bacterial deaminase (TadA) reveals the presence of a conserved loop that may participate in substrate recognition. This loop, together with the catalytic center, forms a channel where the DNA strand could be positioned and recognized. Recognition of the substrate through these loops might explain the observation that different AID/APOBECs display sequence-context preferences with regard to the nucleotides immediately upstream of the cytidine to be deaminated (Bransteitter et al., 2009; Conticello, 2008). Dimerization and oligomerization of the AID/APOBEC proteins have been reported. In the case of APOBEC3G, the collected data indicate that this process can occur in an RNA-dependent manner (Holden et al., 2008). The crystal structure of the functionally uncharacterized APOBEC2 reveals that it forms a tetramer via head-to-head interaction of two APOBEC2 dimers (Prochnow et al., 2007). However, in the case of AID, the formation of dimers or oligomers does not seem to be necessary for the enzymatic activity (Brar et al., 2008). Consequently, no conclusion can be drawn with regard to the potential oligomerization of AID.

AID and antibody diversification

AID was first identified by Honjo’s group in 1999 as an enzyme both necessary and sufficient in the process of secondary antibody diversification (Muramatsu et al., 1999). AID is selectively expressed in germinal center B cells in response to stimulation by an antigen. AID initiates the diversification of antibodies, which leads to the generation of a repertoire of antibodies highly specific to the antigen and with different effector functions (Muramatsu et al., 2000).
Each antibody consists of four polypeptide chains – two identical heavy chains and two identical light chains. Within each of the four chains one can distinguish a variable region (V) and a constant region (C). V regions of light and heavy chains form an antigen-binding site. The structure of the C region in heavy chains defines the antibody class (IgM, IgD, IgG, IgA and IgE), and the structure of the C region in light chains determines the antibody type (κ or λ). Generation of antibody diversity occurs in two steps: primary and secondary diversification. Primary antibody diversification is an AID-independent process. Secondary diversification includes: (i) somatic hypermutation (SHM), in which the immunoglobulin genes are mutated to increase the antibody affinity for particular antigens; (ii) class switch recombination (CSR), in which activated B cells change their expression from IgM to other antibody classes; and (iii) gene conversion (GC), a form of genetic recombination in which one allele replaces its counterpart. SHM and CSR have been shown to occur in humans, and GC is the process through which immunoglobulins are diversified in chickens, rabbits, cattle and pigs. AID has been shown to be a key factor in all these processes (Arakawa et al., 2004; Chaudhuri and Alt, 2004; Delker et al., 2009).

Activation-induced cytidine deaminase (AID): single activity – pleiotropic effect

Fig. 1. The role of AID in antibody diversification in humans. A) Organization of the immunoglobulin locus. The locus contains a VDJ segment (V) that encodes the antigen-binding region of the antibody heavy chain, and several constant region coding sequences (C) separated by switch regions (S). B) Somatic hypermutation. AID targets the V region and induces mutations (nucleotide substitutions), thus enabling the generation of antibodies with higher affinity for antigens. Mutations are indicated as red stars in the V region. C) Class switch recombination. AID targets switch regions (S) and triggers the formation of double-stranded breaks in DNA, thereby enabling the exchange of constant regions (C). As a result, antibodies with new effector functions are generated.

The process of class switch recombination (CSR) initiated by AID in humans is a unique intrachromosomal deletion-recombination reaction which occurs within GC-rich tandem repeated DNA sequences (called switch or S regions), located upstream of each heavy chain C region coding sequence except Cδ. The second process of secondary antibody diversification identified in humans – somatic hypermutation (SHM) – is a programmed process of mutation (mainly single base substitutions) affecting the V regions of immunoglobulin genes in mature B cells (Fig. 1) (Chaudhuri and Alt, 2004; Peled et al., 2008). AID is believed to initiate SHM and CSR by deaminating cytidine to uridine in the variable or constant regions of immunoglobulin genes, respectively. The resultant U:G (U – uridine) mismatch is then subjected to one of a number of fates. The first scenario assumes that the double-stranded DNA carrying the U:G mismatch is replicated. As a result, two daughter DNA molecules are created: one that remains unmutated (it contains a C:G pair), and one that carries a C to T transition mutation (it contains a T:A pair) (Peled et al., 2008). The second scenario is that the U:G mismatch is recognized by the cellular DNA repair machinery and U is excised by uracil DNA glycosylase (UNG), resulting in an abasic site. In the next step, the double-stranded DNA can be replicated. Replication of DNA that contains the abasic site results in random incorporation of any of the four nucleotides (i.e., A, G, C or T). Alternatively, the abasic site may be cleaved by apurinic endonuclease (APE), which generates a single-strand break (SSB) in DNA.
This break may then be subjected to normal DNA repair, or the second DNA strand may be cleaved (in a similar process), leading to the formation of a double-strand break (DSB). It is thought that the formation of these DSBs in either the switch region or the variable region of the immunoglobulin gene can lead to specific recombination (CSR or GC). According to the third scenario, the U:G mismatch is recognized by the DNA mismatch repair (MMR) machinery. The MMR proteins recognize single base distortions (such as a U:G mismatch) and remove a fragment of the mutated DNA strand. Then, the exposed single-stranded region of the complementary DNA serves as a template for an error-prone DNA polymerase which fills the gap. This error-prone polymerase is thought to introduce additional mutations randomly across the DNA gap (Peled et al., 2008; Stavnezer, 2011).

The fact that AID enzymatic activity is indispensable for SHM and CSR has been confirmed by studies of both patients lacking AID and AID knock-out mice (Revy et al., 2000; Stavnezer, 2011). In both cases, the lack of AID led to the hyper IgM syndrome – only one class of antibodies (IgM) with low affinity to the antigen was produced. Experiments performed in vitro and in E. coli seem to confirm these observations, indicating that both SHM and CSR are initiated by AID-mediated deamination of C to U (Chaudhuri et al., 2003; Petersen-Mahrt et al., 2002). In addition, several AID mutants have been identified in patients with diagnosed immunological disorders (Durandy et al., 2006). Next, the specific mutations were correlated with the reduced AID capacity to induce either SHM, CSR or both (Shinkura et al., 2004). Most of the data indicate that AID preferentially deaminates cytidines located within the WRCY hotspot motifs (W – adenosine or thymidine, R – purine, C – cytidine, Y – pyrimidine) (Bransteitter et al., 2003; Peled et al., 2008).
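The WRCY hotspot definition above translates directly into a pattern search. The sketch below is illustrative only: the function name `find_wrcy_hotspots` and the example sequence are hypothetical. It assumes the input is one strand read 5' to 3', and it also reports RGYW, the reverse complement of WRCY, so that hotspots whose target cytidine lies on the opposite strand are found as well.

```python
import re

# W = A or T, R = A or G (purine), Y = C or T (pyrimidine).
# Lookahead groups let overlapping motifs be reported.
WRCY = re.compile(r"(?=([AT][AG]C[CT]))")
# Reverse complement of WRCY: the target C sits on the opposite strand,
# paired with the G of RGYW.
RGYW = re.compile(r"(?=([AG]G[CT][AT]))")


def find_wrcy_hotspots(dna):
    """Return sorted (position, strand, motif) tuples for candidate hotspots.

    `position` is the 0-based index, on the given strand, of the C ("+") or
    of the G paired with the target C on the opposite strand ("-").
    """
    dna = dna.upper()
    hits = [(m.start() + 2, "+", m.group(1)) for m in WRCY.finditer(dna)]
    hits += [(m.start() + 1, "-", m.group(1)) for m in RGYW.finditer(dna)]
    return sorted(hits)


# Hypothetical example: AGCT is palindromic, so it is a hotspot on both strands.
print(find_wrcy_hotspots("TTAGCTAA"))
```

A scan of this kind only lists sequence candidates. As the text notes, AID activity in vivo also depends on transcription and on factors that are still unknown, so a motif match does not imply that a site is actually deaminated.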
It is believed that AID activity in vivo is connected with transcription and requires the unwinding of the DNA helix. AID can target both strands but preferentially modifies the DNA strand that is not used as a template during transcription (Chaudhuri et al., 2003; Shen and Storb, 2004). Formation of DNA secondary structures in transcription bubbles (like R-loops in GC-rich S regions) has been shown to be important for AID targeting (Zarrin et al., 2004). Unfortunately, the exact mechanism of locus recognition by AID remains largely unknown. Only a few proteins that interact with AID have been identified so far. One of them is MDM2, a regulatory protein shuttling between the cytoplasm and the nucleus (MacDuff et al., 2006). Another is replication protein A (RPA), a ubiquitous protein that binds single-stranded regions of DNA during DNA replication and repair (Yamane et al., 2011). AID has also been shown to interact with the RNA exosome and the RNA Pol II complex (through the Spt5 protein) (Basu et al., 2011; Pavri et al., 2010). Unfortunately, none of these interactions can account for AID specificity to the immunoglobulin locus. Interestingly, it has also been hypothesized that AID displays only partial specificity, and it is the DNA-repair machinery that actually maintains the stability of all other genomic regions (Unniraman and Schatz, 2007). AID is modified post-translationally; however, the exact role of this modification remains unclear. It has been shown that in mice AID needs to be phosphorylated (most likely by protein kinase A, PKA, at serine 38) in order to trigger antibody diversification. The data collected indicate that phosphorylation-defective AID mutants show either delayed or substantially decreased activity in SHM/CSR (Cheng et al., 2009). At the same time, it has been demonstrated that AID phosphorylation is not specific to B cells and that phosphorylation is not required for the fish homolog to act (Basu et al., 2008; Chatterji et al., 2007).
In general, all these findings suggest that phosphorylation is more likely related to the modulation of AID activity than to its targeting. Therefore, both cis-acting factors (encoded in AID structure) and trans-acting factors (encoded in DNA and other protein structures) determining the AID selectivity and site-specificity remain unknown and will be the subject of future investigations.

AID in demethylation processes

The latest reports show that under physiological conditions AID might act beyond the immune system. Accumulating evidence suggests that AID may participate in active genome demethylation – a key element in the process of epigenetic gene activation and reprogramming. Genome methylation is one of the most important epigenetic factors that determine gene expression. Methylation of DNA occurs mainly at the C5 position of cytosine, which leads to the formation of 5-methylcytosine (5meC) (Illingworth and Bird, 2009). In mammals, a number of enzymes have been identified that are responsible for de novo methylation (DNMT3A, DNMT3B) and for the maintenance of methylation during DNA replication (DNMT1) (Bestor, 2000). Much less is known about the molecular basis of the demethylation process. It may occur passively (where the methyl groups are not incorporated into new DNA strands during replication) or in an active way (when methyl groups are removed from DNA). Active, global genome demethylation occurs in mammalian development at two distinct stages: in the early embryo immediately after fertilization and in primordial germ cells (PGCs) (Gehring et al., 2009; Hajkova et al., 2002). Demethylation restores a pluripotent state and leads to the reprogramming of the genome, so it is essential for differentiation and development.
The loss of DNA methylation is in fact global: in female mouse PGCs only 7% of CpGs remain methylated, as compared to 70-80% in embryonic stem (ES) cells and somatic cells (Popp et al., 2010). A lack of AID in PGCs reduces the level of global DNA demethylation, suggesting that AID is responsible, at least partially, for this process. Moreover, AID has been shown to be a key factor in active demethylation in zebrafish embryos (Rai et al., 2008). Several recent observations also indicate that AID is involved in the demethylation of key pluripotency genes (such as Oct4 and Nanog) during the reprogramming of somatic cells to a pluripotent state (Bhutani et al., 2010). Importantly, AID is expressed at high levels in oocytes, embryonic germ cells and embryonic stem cells, which strongly suggests the involvement of AID in development (Morgan et al., 2004). The biochemical process behind active DNA demethylation has been an area of controversy. Several mechanisms of active demethylation have been proposed. One of them is based on the reported ability of AID to deaminate 5meC directly to thymidine, albeit with low efficiency (Bransteitter et al., 2003; Morgan et al., 2004). The AID-mediated deamination of 5meC leads to a T:G mismatch that can be resolved by the DNA repair machinery, which inserts unmethylated cytosine in place of the mismatched base (thus removing the epigenetic mark). In this context, AID appears to be an important element in epigenetic reprogramming, a critical event in ontogenesis as well as in the generation of induced pluripotent stem cells. Nevertheless, the low efficiency of 5meC deamination by AID, as well as some unanswered mechanistic questions (e.g. the lack of transcriptional activity in methylated regions), make this new potential AID function still highly controversial.
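The proposed deamination-and-repair route can be followed step by step in a toy model. The sketch below is an illustration under stated assumptions, not a description of the actual repair enzymology: the letter `M` is a made-up symbol for 5meC, the function names are hypothetical, and the partner strand is passed as the base paired with each position (same index order), not as a 5' to 3' reverse complement.

```python
# Toy alphabet: 'C' = cytosine, 'M' = 5-methylcytosine (5meC, hypothetical symbol).
# Deamination converts C to U and 5meC to T.
DEAMINATION = {"C": "U", "M": "T"}


def deaminate(strand, position):
    """Return the strand after deaminating the C or 5meC at `position`."""
    base = strand[position]
    if base not in DEAMINATION:
        raise ValueError(f"base {base!r} at {position} is not C or 5meC")
    return strand[:position] + DEAMINATION[base] + strand[position + 1:]


def repair_against_g(strand, paired):
    """Replace every U or T that sits opposite a G with unmethylated C.

    `paired[i]` is the base on the partner strand paired with `strand[i]`.
    A T opposite A is a normal pair and is left untouched.
    """
    return "".join(
        "C" if base in "UT" and partner == "G" else base
        for base, partner in zip(strand, paired)
    )


strand, paired = "AMGT", "TGCA"       # 5meC at index 1, paired with G
mismatched = deaminate(strand, 1)      # 5meC becomes T: a T:G mismatch
print(mismatched, repair_against_g(mismatched, paired))
```

In this model the methyl mark disappears because the repaired position carries plain C, which mirrors the mechanism described above. The low efficiency of 5meC deamination reported in the text is not represented.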