Human insulin gene, complete cds
GenBank: J00265.1
LOCUS       HUMINS01                4044 bp    DNA     linear   PRI 12-FEB-2001
DEFINITION  Human insulin gene, complete cds.
ACCESSION   J00265
VERSION     J00265.1  GI:186429
KEYWORDS    GC rich region; insulin; polymorphic variation; tandem repeat.
SEGMENT     1 of 2
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   2  (bases 1925 to 3715)
  AUTHORS   Bell,G.I., Pictet,R.L., Rutter,W.J., Cordell,B., Tischer,E. and
            Goodman,H.M.
  TITLE     Sequence of the human insulin gene
  JOURNAL   Nature 284 (5751), 26-32 (1980)
   PUBMED   6243748
REFERENCE   4  (bases 1928 to 3651)
  AUTHORS   Ullrich,A., Dull,T.J., Gray,A., Brosius,J. and Sures,I.
  TITLE     Genetic variation in the human insulin gene
  JOURNAL   Science 209 (4456), 612-615 (1980)
   PUBMED   6248962
REFERENCE   5  (bases 2414 to 2610)
  AUTHORS   Bell,G.I., Swain,W.F., Pictet,R., Cordell,B., Goodman,H.M. and
            Rutter,W.J.
  TITLE     Nucleotide sequence of a cDNA clone encoding human preproinsulin
  JOURNAL   Nature 282 (5738), 525-527 (1979)
   PUBMED   503234
REFERENCE   6  (bases 2411 to 2610)
  AUTHORS   Sures,I., Goeddel,D.V., Gray,A. and Ullrich,A.
  TITLE     Nucleotide sequence of human preproinsulin complementary DNA
  JOURNAL   Science 208 (4439), 57-59 (1980)
   PUBMED   6927840
REFERENCE   7  (bases 1 to 4044)
  AUTHORS   Bell,G.I., Pictet,R. and Rutter,W.J.
  TITLE     Analysis of the regions flanking the human insulin gene and
            sequence of an Alu family member
  JOURNAL   Nucleic Acids Res. 8 (18), 4091-4109 (1980)
   PUBMED   6253909
REFERENCE   8  (bases 1 to 2227)
  AUTHORS   Bell,G.I., Selby,M.J. and Rutter,W.J.
  TITLE     The highly polymorphic region near the human insulin gene is
            composed of simple tandemly repeating sequences
  JOURNAL   Nature 295 (5844), 31-35 (1982)
   PUBMED   7035959
REFERENCE   9  (bases 917 to 1428; 1828 to 2185; 3615 to 4044)
  AUTHORS   Ullrich,A., Dull,T.J., Gray,A., Philips,J.A. III and Peter,S.
  TITLE     Variation in the sequence and modification state of the human
            insulin gene flanking regions
  JOURNAL   Nucleic Acids Res. 10 (7), 2225-2240 (1982)
   PUBMED   6283472
COMMENT     The human insulin gene region consists of three exons and two
            introns coding for a signal peptide, a b-chain, a c-peptide, and an
            a-chain. Present evidence favors a single insulin gene per haploid
            genome; however, allelic and polymorphic variation are conspicuous.
            The two major alleles studied thus far are denoted alpha and beta.
            The 5' flanks for these are so different, largely because of the
            presence of tandem repeats not found elsewhere in the human genome,
            that separate entries have been made for this region (see J00266
            and J00267). Thus differences in the first 2000 bases are not
            annotated below. This sequence heterogeneity is generated largely,
            though not exclusively, by a family of G+C-rich oligonucleotides
            whose consensus sequence is ACAGGGGTGTGGGG. In the 5' sequence
            reported below (from [5]), these occur most obviously between bases
            1340 and 1823. While the variation in the 5' flank may be
            significant for gene expression, it has not been associated to date
            with diabetic conditions. [4],[5],[6] discuss this variation in
            detail. Variation in the form of base modification is observed in
            the 3' flanking sequence ([6]). Conflicts between [5],[6] in this
            region may ultimately prove to be polymorphic variations.  This
            sequence of 4044 bases (which most closely represents the beta
            allele) was communicated with revisions by G.I.Bell. An additional
            stretch of about 950 bases in the 3' flank, which has not been
            published, is available through G.I.Bell or this library. See other
            loci beginning <humins> and other loci with ins as the 4th-6th
            characters of the locus name.
FEATURES             Location/Qualifiers
     source          1..4044
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
                     /map="11p15.5"
                     /tissue_type="liver"
                     /dev_stage="foetus"
     gene            join(2186..4044,J00268.1:1..825)
                     /gene="INS"
     exon            2186..2227
                     /gene="INS"
                     /note="G00-119-349"
                     /number=1
     intron          2228..2406
                     /gene="INS"
                     /note="G00-119-349"
                     /number=1
     variation       2401
                     /gene="INS"
                     /note="a in alpha-allele; t in beta allele ([4])"
     CDS             join(2424..2610,3397..3542)
                     /gene="INS"
                     /note="precursor"
                     /codon_start=1
                     /product="insulin"
                     /protein_id="AAA59172.1"
                     /db_xref="GI:386828"
                     /db_xref="GDB:G00-119-349"
                     /translation="MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCG
                     ERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSL
                     YQLENYCN"
     sig_peptide     2424..2495
                     /gene="INS"
                     /note="G00-119-349"
     mat_peptide     join(2496..2610,3397..3539)
                     /gene="INS"
                     /product="c peptide; G00-119-349"
     intron          2611..3396
                     /gene="INS"
                     /note="G00-119-349"
                     /number=2
     variation       3229
                     /gene="INS"
                     /note="c in alpha-allele; g in beta-allele ([4])"
     exon            3397..>3615
                     /gene="INS"
                     /note="G00-119-349"
                     /number=2
     variation       3551
                     /gene="INS"
                     /note="c in alpha-allele; t in beta-allele ([4])"
     variation       3564
                     /gene="INS"
                     /note="c in alpha-allele; a in beta-allele ([4])"
ORIGIN      
        1 ctcgaggggc ctagacattg ccctccagag agagcaccca acaccctcca ggcttgaccg
       61 gccagggtgt ccccttccta ccttggagag agcagcccca gggcatcctg cagggggtgc
      121 tgggacacca gctggccttc aaggtctctg cctccctcca gccaccccac tacacgctgc
      181 tgggatcctg gatctcagct ccctggccga caacactggc aaactcctac tcatccacga
      241 aggccctcct gggcatggtg gtccttccca gcctggcagt ctgttcctca cacaccttgt
      301 tagtgcccag cccctgaggt tgcagctggg ggtgtctctg aagggctgtg agcccccagg
      361 aagccctggg gaagtgcctg ccttgcctcc ccccggccct gccagcgcct ggctctgccc
      421 tcctacctgg gctcccccca tccagcctcc ctccctacac actcctctca aggaggcacc
      481 catgtcctct ccagctgccg ggcctcagag cactgtggcg tcctggggca gccaccgcat
      541 gtcctgctgt ggcatggctc agggtggaaa gggcggaagg gaggggtcct gcagatagct
      601 ggtgcccact accaaacccg ctcggggcag gagagccaaa ggctgggtgt gtgcagagcg
      661 gccccgagag gttccgaggc tgaggccagg gtgggacata gggatgcgag gggccggggc
      721 acaggatact ccaacctgcc tgcccccatg gtctcatcct cctgcttctg ggacctcctg
      781 atcctgcccc tggtgctaag aggcaggtaa ggggctgcag gcagcagggc tcggagccca
      841 tgccccctca ccatgggtca ggctggacct ccaggtgcct gttctgggga gctgggaggg
      901 ccggaggggt gtaccccagg ggctcagccc agatgacact atgggggtga tggtgtcatg
      961 ggacctggcc aggagagggg agatgggctc ccagaagagg agtgggggct gagagggtgc
     1021 ctggggggcc aggacggagc tgggccagtg cacagcttcc cacacctgcc cacccccaga
     1081 gtcctgccgc cacccccaga tcacacggaa gatgaggtcc gagtggcctg ctgaggactt
     1141 gctgcttgtc cccaggtccc caggtcatgc cctccttctg ccaccctggg gagctgaggg
     1201 cctcagctgg ggctgctgtc ctaaggcagg gtgggaacta ggcagccagc agggagggga
     1261 cccctccctc actcccactc tcccaccccc accaccttgg cccatccatg gcggcatctt
     1321 gggccatccg ggactgggga caggggtcct ggggacaggg gtccggggac agggtcctgg
     1381 ggacaggggt gtggggacag gggtctgggg acaggggtgt ggggacaggg gtgtggggac
     1441 aggggtctgg ggacaggggt gtggggacag gggtccgggg acaggggtgt ggggacaggg
     1501 gtctggggac aggggtgtgg ggacaggggt gtggggacag gggtctgggg acaggggtgt
     1561 ggggacaggg gtcctgggga caggggtgtg gggacagggg tgtggggaca ggggtgtggg
     1621 gacaggggtg tggggacagg ggtcctgggg ataggggtgt ggggacaggg gtgtggggac
     1681 aggggtcccg gggacagggg tgtggggaca ggggtgtggg gacaggggtc ctggggacag
     1741 gggtctgagg acaggggtgt gggcacaggg gtcctgggga caggggtcct ggggacaggg
     1801 gtcctgggga caggggtctg gggacagcag cgcaaagagc cccgccctgc agcctccagc
     1861 tctcctggtc taatgtggaa agtggcccag gtgagggctt tgctctcctg gagacatttg
     1921 cccccagctg tgagcaggga caggtctggc caccgggccc ctggttaaga ctctaatgac
     1981 ccgctggtcc tgaggaagag gtgctgacga ccaaggagat cttcccacag acccagcacc
     2041 agggaaatgg tccggaaatt gcagcctcag cccccagcca tctgccgacc cccccacccc
     2101 gccctaatgg gccaggcggc aggggttgac aggtagggga gatgggctct gagactataa
     2161 agccagcggg ggcccagcag ccctcagccc tccaggacag gctgcatcag aagaggccat
     2221 caagcaggtc tgttccaagg gcctttgcgt caggtgggct cagggttcca gggtggctgg
     2281 accccaggcc ccagctctgc agcagggagg acgtggctgg gctcgtgaag catgtggggg
     2341 tgagcccagg ggccccaagg cagggcacct ggccttcagc ctgcctcagc cctgcctgtc
     2401 tcccagatca ctgtccttct gccatggccc tgtggatgcg cctcctgccc ctgctggcgc
     2461 tgctggccct ctggggacct gacccagccg cagcctttgt gaaccaacac ctgtgcggct
     2521 cacacctggt ggaagctctc tacctagtgt gcggggaacg aggcttcttc tacacaccca
     2581 agacccgccg ggaggcagag gacctgcagg gtgagccaac cgcccattgc tgcccctggc
     2641 cgcccccagc caccccctgc tcctggcgct cccacccagc atgggcagaa gggggcagga
     2701 ggctgccacc cagcaggggg tcaggtgcac ttttttaaaa agaagttctc ttggtcacgt
     2761 cctaaaagtg accagctccc tgtggcccag tcagaatctc agcctgagga cggtgttggc
     2821 ttcggcagcc ccgagataca tcagagggtg ggcacgctcc tccctccact cgcccctcaa
     2881 acaaatgccc cgcagcccat ttctccaccc tcatttgatg accgcagatt caagtgtttt
     2941 gttaagtaaa gtcctgggtg acctggggtc acagggtgcc ccacgctgcc tgcctctggg
     3001 cgaacacccc atcacgcccg gaggagggcg tggctgcctg cctgagtggg ccagacccct
     3061 gtcgccagcc tcacggcagc tccatagtca ggagatgggg aagatgctgg ggacaggccc
     3121 tggggagaag tactgggatc acctgttcag gctcccactg tgacgctgcc ccggggcggg
     3181 ggaaggaggt gggacatgtg ggcgttgggg cctgtaggtc cacacccagt gtgggtgacc
     3241 ctccctctaa cctgggtcca gcccggctgg agatgggtgg gagtgcgacc tagggctggc
     3301 gggcaggcgg gcactgtgtc tccctgactg tgtcctcctg tgtccctctg cctcgccgct
     3361 gttccggaac ctgctctgcg cggcacgtcc tggcagtggg gcaggtggag ctgggcgggg
     3421 gccctggtgc aggcagcctg cagcccttgg ccctggaggg gtccctgcag aagcgtggca
     3481 ttgtggaaca atgctgtacc agcatctgct ccctctacca gctggagaac tactgcaact
     3541 agacgcagcc tgcaggcagc cccacacccg ccgcctcctg caccgagaga gatggaataa
     3601 agcccttgaa ccagccctgc tgtgccgtct gtgtgtcttg ggggccctgg gccaagcccc
     3661 acttcccggc actgttgtga gcccctccca gctctctcca cgctctctgg gtgcccacag
     3721 gtgccaacgc cggccaggcc cagcatgcag tggctctccc caaagcggcc atgcctgttg
     3781 gctgcctgct gcccccaccc tgtggctcag ggtccagtat gggagcttcg ggggtctctg
     3841 aggggccagg gatggtgggg ccactgagaa gtgacttctt gttcagtagc tctggactct
     3901 tggagtcccc agagaccttg ttcaggaaag ggaatgagaa cattccagca attttccccc
     3961 cacctagccc tcccaggttc tatttttaga gttatttctg atggagtccc tgtggaggga
     4021 ggaggctggg ctgagggagg gggt
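
The CDS feature above is given as a join() of two 1-based, inclusive ranges, so recovering the coding sequence means cutting those ranges out of the ORIGIN block and concatenating them. The following is a minimal sketch of that step, not the tooling NCBI uses. The function names (parse_origin, splice, translate) are hypothetical. It assumes the standard genetic code and forward-strand locations only, and it skips remote segments such as J00268.1:1..825 because their sequence is not part of this entry.

```python
import re

# Standard genetic code (NCBI translation table 1), bases ordered t, c, a, g.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_origin(lines):
    """Join ORIGIN lines into one lowercase string, dropping counters and spaces."""
    return "".join(re.sub(r"[^a-z]", "", line.lower()) for line in lines)


def splice(seq, location):
    """Extract 'a..b' or 'join(a..b,c..d)' from seq (1-based, inclusive).

    Assumption: segments prefixed with an accession (for example
    'J00268.1:1..825') refer to another entry and are skipped here.
    """
    spans = re.findall(r"(?<![:\d])<?(\d+)\.\.>?(\d+)", location)
    return "".join(seq[int(start) - 1:int(end)] for start, end in spans)


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unknown codon
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)
```

With the full ORIGIN block of this entry parsed into seq, translate(splice(seq, "join(2424..2610,3397..3542)")) can be compared against the /translation qualifier of the CDS feature.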