The single copy gene coding for human α1 (IV) procollagen is located at the terminal end of the long arm of chromosome 13
We report the isolation and characterization of cDNA clones coding for part of the pro-alpha1(IV) chain of human type IV procollagen. A cDNA library was prepared from total RNA isolated from a cultured human tumor cell line, HT-1080, and screened with a cloned mouse cDNA coding for the pro-alpha1(IV) chain. The largest cDNA clone encoded 185 amino acid residues of the -Gly-X-Y- sequence of the human pro-alpha1(IV) chain, all of the globular carboxyl-terminal domain, and the 3' noncoding region. The results provide the first complete sequence for the carboxyl-terminal globular portion of a type IV procollagen chain. A striking feature of the carboxyl-terminal globular domain was a homology between the first and second halves of the structure. The homology involved all 12 cysteine residues, the spacing between the cysteine residues, and many adjacent amino acids. The results raised the possibility that evolution of the globular domain involved duplication of an ancestral sequence coding for about 100 amino acids, 6 of which were cysteine. The homology, however, was more apparent in the amino acid sequence than in the nucleotide sequence, and, therefore, the results suggested that the homology reflects selective pressure on the function of the protein more than conservation of the nucleotide sequences in the gene. The 3' noncoding region of the cDNAs contained four AATAAA polyadenylation signals. Three or four of the polyadenylation signals were probably used in transcription, since one major and two minor, smaller RNA species from human skin fibroblasts hybridized with the cDNAs. In further studies, sorted human chromosomes were used to locate the gene for the pro-alpha1(IV) chain on chromosome 13.