Appendix B: The Short (S) Form Of The Drosophila melanogaster Histone Gene Cluster

GenBank-style presentation of the D. melanogaster histone cluster (S form). Primary sequence regions are listed in the FEATURES section, and the third-strand target is shown in lowercase letters. The sequence was modified from GenBank accession number X14215 (Matsuo 1989a), the L form of the histone cluster, and given its own identifier, "DMHISTS".
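Because the third-strand target is marked only by letter case, its coordinates can be recovered directly from the numbered sequence lines. The following is a minimal Python sketch, not part of the original record: the helper names `parse_origin` and `lowercase_runs` are hypothetical, and the two input lines are copied from the sequence block of this appendix.

```python
import re

def parse_origin(lines):
    """Join numbered sequence lines ('  51  ACGT ACGT ...') into one string.

    Letter case is preserved; lines that do not start with a position
    number (headers, blank lines) are skipped.
    """
    parts = []
    for line in lines:
        match = re.match(r"\s*\d+\s+(.*)$", line)
        if match:
            parts.append(re.sub(r"\s+", "", match.group(1)))
    return "".join(parts)

def lowercase_runs(seq, first_pos=1):
    """Return 1-based inclusive (start, end) pairs for each lowercase run.

    first_pos is the sequence coordinate of seq[0] (an assumption the
    caller must supply when only part of the record is parsed).
    """
    return [(first_pos + m.start(), first_pos + m.end() - 1)
            for m in re.finditer(r"[a-z]+", seq)]

# Two lines copied from the sequence block (they start at base 2751).
lines = [
    "    2751  ATGTTCGTTC GCTTTTCGCT CGTCAAATGA AATGGCctct gtttttctct",
    "    2801  ctCTCTCTCT CTCTCTTTCA CCGTCCACGA TTGCTATATA AGTAGGTAGC",
]
fragment = parse_origin(lines)
print(lowercase_runs(fragment, first_pos=2751))
```

For these two lines the sketch reports a single run spanning bases 2787 to 2802, which matches the misc_feature location given in the feature table.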

LOCUS       DMHISTS      4801 bp    DNA             INV       22-AUG-1997
DEFINITION  Drosophila histone gene cluster, S form of repeating unit (4.8 kb).
ACCESSION   X14215a (non-GenBank)
SOURCE      fruit fly.
  ORGANISM  Drosophila melanogaster
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Arthropoda;
            Tracheata; Insecta; Pterygota; Diptera; Brachycera; Muscomorpha;
            Ephydroidea; Drosophilidae; Drosophila.
REFERENCE   1  (bases 1 to 4801)
  AUTHORS   Matsuo,Y. and Yamazaki,T.
  MODIFIED  Niederstrasser, H.

FEATURES             Location/Qualifiers
     source          1..4801
                     /organism="Drosophila melanogaster"
                     /strain="AK-194"
                     /clone_lib="part. MboI in lambda EMBL4"
                     /chromosome="IIR 39D-E."
     CDS             1..463
                     /note="H1 histone (AA )"
                     /label=H1_End
     CDS             complement(870..1241)
                     /note="H2B histone (AA 1-123)"
                     /label=H2B
     CDS             1468..1842
                     /note="H2A histone (AA 1-124)"
                     /label=H2A
     CDS             complement(2317..2628)
                     /note="H4 histone (AA 1-103)"
                     /label=H4
     CDS             2925..3335
                     /note="H3 histone (AA 1-136)"
                     /label=H3
     CDS             4494..4801
                     /note="H1 histone (AA 1-256)"
                     /label=H1_Start
     misc_feature    2787..2802
                     /note="Homopurine target"
                     /label=Homopurine_Target

BASE COUNT     1433 a   1019 c   1013 g   1576 t

ORIGIN

Dmhists.gb  Length: 4801  April 13, 1998 14:54  Type: N  Check: 889  ..

       1  TATTCAAACT AAGGGAAAGG GTGCATCTGG ATCTTTCAAA CTGTCGGCCT
      51  CTGCCAAGAA GGAAAAGGAT CCGAAGGCAA AGTCGAAGGT TTTGTCTGCT
     101  GAGAAAAAAG TTCAAAGCAA GAAGGTAGCC TCTAAGAAGA TTGGTGTCTC
     151  CTCCAAAAAA ACTGCCGTTG GGGCTGCTGA CAAAAAGCCC AAAGCTAAGA
     201  AGGCTGTGGC TACCAAAAAG ACTGCCGAAA ATAAGAAAAC TGAGAAGGCA
     251  AAAGCCAAGG ATGCCAAGAA AACTGGAATC ATAAAGTCGA AGCCCGCCGC
     301  AACAAAGGCG AAAGTGACTG CAGCGAAGCC AAAGGCTGTA GTAGCGAAAG
     351  CGTCAAAGGC AAAGCCAGCG GTGTCTGCAA AACCCAAAAA GACGGTGAAG
     401  AAAGCATCGG TTTCTGCTAC CGCCAAGAAG CCGAAAGCGA AGACTACGGC
     451  TGCCAAAAAG TAAATTGTGA AAAAGTGCAG TATTTGGTAC ATGTTCGCAA
     501  TTAAAATTTT AGATTTATGA TTTATAGATC TGAAATTTGT TTAAACAAGT
     551  CCTTTTCAGG GCTACAACGT TCCGTTGCAA GAGAAAAAAA CTTTTATTTT
     601  CTTCCACTTA TTTATTAGCT GACGTTCGCA GCAACAATAA AACGTTTCAT
     651  GTCATGAATT ACATTGAATG TTGGTCGCAT TCAGTTTTCG TTCCCGATTT
     701  TTTTGTATTT ATTTGAACAT TACCCAATTA CCCATATTGC GGGTAAATAA
     751  GTTTTATTTG TAAATTCATA TTCGATGATT GGTGGTTGAA AAATGCATTT
     801  CTTTGGTATA ACACATTGTG GCCCTGAAAA GGGCCGTTTT GGATTATTGT
     851  CCGCATTCGC AGGAGAAAAT TATTTAGAGC TGGTGTACTT GGTGACAGCC
     901  TTGGTTCCCT CACTGACAGC ATGCTTGGCC AACTCTCCAG GCAAAAGCAG
     951  GCGAACAGCC GTTTGGATCT CCCGACTGGT GATGGTCGAG CGCTTGTTGT
    1001  AGTGAGCTAG ACGAGACGCT TCGGCAGCAA TTCGCTCGAA AATATCATTT
    1051  ACAAAGCTGT TCATTATGCT CATCGCCTTC GACGAAATTC CGGTGTCAGG
    1101  ATGGACCTGC TTGAGAACCT TGTAAATGTA GATGGCATAG CTCTCCTTCC
    1151  TTTTGCGCTT CTTTTTCTTG TCGGTCTTGG TGATGTTCTT CTGAGCCTTG
    1201  CCAGCCTTCT TGGCTGCCTT TCCACTAGTT TTCGGAGGCA TTGTTCACGT
    1251  TACTTATATT TTCACAAACA CAATTCACTT ATCGTAATGT GGGCCCGAAC
    1301  GCGTTCACGT TTATACTTTT TTTCGAGCAG TCAATTCAGG TCTAAGTCAC
    1351  CCACCCCTAA CTGAATGCGC AGGCAAACGG AAAAGTATAA ATATTTCGCT
    1401  GTCTGGGTTA GGCGAGCATT CGTGTTCCGT GTGTAAAGTG AACTAAGTGA
    1451  AATAAACGCA AAGCAAAATG TCTGGACGTG GAAAAGGTGG CAAAGTGAAG
    1501  GGAAAGGCAA AGTCCCGCTC AAACCGTGCC GGTCTTCAAT TCCCTGTGGG
    1551  CCGTATTCAC CGTTTGCTCC GGAAGGGAAA CTACGCAGAG CGTGTTGGTG
    1601  CAGGCGCTCC AGTTTACCTA GCTGCCGTAA TGGAATATCT GGCCGCTGAG
    1651  GTTCTCGAGT TGGCTGGCAA TGCTGCTCGT GACAACAAGA AGACTAGAAT
    1701  TATTCCGCGT CATCTGCAAC TGGCCATCCG CAACGACGAG GAGTTAAACA
    1751  AGCTGCTCTC CGGCGTCACA ATTGCACAAG GTGGCGTGTT GCCTAATATA
    1801  CAGGCTGTTC TGTTGCCCAA GAAGACCGAG AAGAAGGCCT AAACGTTTCA
    1851  AAGGCTAAGC TAAAAACCTA CATGTACATA AAATCGTCAA TCAAACCGTC
    1901  CTTTTCAGGA CGACCAAATT ATTACCAAAG AATTGAAAAA TTTTTTAGCT
    1951  TGGCAATTTG TTGTAATTAA TAAATCATAA AGAATTATTA ACGTAAAGAT
    2001  GGTAATGTAG TAAGGGTTTT CTACTATATG CGGTATAAAC TATAATTTGC
    2051  TTCTTTAAAC AATCGCACAC CACGATGTGA TGCTGTACAT GCGGTGTCTG
    2101  AAACCATTTG TACAGTCTGT ACAAATCCAT GTTAGAAATA CACATTCTAT
    2151  TTGAAAGAGT ACGAACGACA GACATTTATT TTTAGTTTAA CATATTTTTT
    2201  GGGAGTCCCG ACCAATAAAA TTAAATACTT TTTGAAAATC TTCCTCCTTT
    2251  TAAAAACTGA ATGGTGGTCC TGAAAAGGAC CGATTGCTTA ATAGGGGTAC
    2301  ACAGGATGTA CACTTTTTAA CCGCCAAATC CGTAGAGGGT GCGGCCTTGC
    2351  CTCTTCAGAG CGTACACAAC ATCCATGGCT GTAACTGTCT TCCTCTTGGC
    2401  GTGTTCCGTG TAGGTCACGG CATCACGAAT TACGTTCTCC AAGAAAACCT
    2451  TCAGAACGCC ACGCGTTTCC TCGTATATGA GTCCAGATAT GCGCTTCACA
    2501  CCGCCTCGAC GGGCCAAACG GCGGATAGCA GGCTTCGTGA TACCTTGGAT
    2551  GTTATCACGC AGCACTTTGC GATGACGCTT GGCGCCACCC TTTCCCAAGC
    2601  CTTTGCCTCC TTTACCACGA CCAGTCATTT TTCACTGTTC TATACTATTA
    2651  TACACGCACA GCACGAAAGT CACTAAAGAA CTAATTTCAA CGTTTCTGTG
    2701  TGCCCCTATT TATAGGTAAA ACGACAAAAA CCCGAGAGAG TACGAACGAT
    2751  ATGTTCGTTC GCTTTTCGCT CGTCAAATGA AATGGCctct gtttttctct
    2801  ctCTCTCTCT CTCTCTTTCA CCGTCCACGA TTGCTATATA AGTAGGTAGC
    2851  AAATGCTCTG ATCGTTNNNN NNNNTTTCAA ACGTGAAGTA GTGAACGTGA
    2901  ACTTTAGTGA AACCNNNNNN NNNNNNNNCT CGTACCAAGC AAACTGCTCG
    2951  CAAATCGACT GGTGGAAAGG CGCCACGCAA ACAACTGGCT ACTAAGGCCG
    3001  CTCGCAAGAG TGCTCCAGCC ACCGGAGGTG TGAAGAAGCC CCACCGCTAT
    3051  CGCCCTGGAA CCGTGGCCTT GCGTGAAATT CGTCGCTACC AAAAGAGCAC
    3101  CGAGCTTCTA ATCCGCAAGC TGCCTTTCCA GCGTCTGGTG CGTGAAATCG
    3151  CTCAGGACTT TAAGACGGAC TTGCGATTCC AGAGCTCGGC GGTTATGGCT
    3201  CTGCAGGAAG CTAGCGAAGC CTACCTGGTT GGTCTCTTCG AAGATACCAA
    3251  CTTGTGTGCC ATTCATGCCA AGCGTATCAC CATAATGCCC AAAGACATCC
    3301  AGTTAGCGCG ACGCATTCGC GGCGAGCGTG CTTAAGCTGA CACGGCATTA
    3351  ACTTGCAGAT AAAGCGCTAG CGTACTCTAT AATCGGTCCT TTTCAGGACC
    3401  ACAAACCAGA TTCAATGAGA TAAAATTTTC TGTTGCCGAC TATTTATAAC
    3451  TTAAAAAAAA TAAGAACAAA ATTCATATTC TATTATTTAT GGCGCAAACG
    3501  GTACTGGGTC TTAAATCATA TGTAAAAATA ATATTTATAA AATAACAGAA
    3551  AATAATAAAA TAAAACTAGC TATTTTATAT TTTTTCCATG TGTTAACTGA
    3601  AGAATGTGTT ATTATTGAAG AGGTCGTACG GGACAATTGA CACTGTCCCT
    3651  TCAAACGTCT GTAAAAAATA AAACCTATGT AAAATTCAGC ACGGAAATTG
    3701  GCTAATTTTG TTGCGGAATG TAATATATAT TACATAATAA AGGATAATAC
    3751  AAAAATTGTT TCTTTTTATT TTTTATTTGA TTTATTTATT TGACTACATA
    3801  GACGGTAATG CATATGTGGC GAGGAAATCG ATTGATTTCA GAACAAATTA
    3851  TTTTAAAATA TGCATGAAAA CACATTAATA ACAAGCAAAC ACATTAATAA
    3901  TTTAAGAAAA TATTATTTAT TATATTAATA TTATGTTATT TAAGAAAGTA
    3951  TCTGTATTTT TAACGATCGA AAATTATTTC TGAATGCTGC TTTAAAGCAA
    4001  ATTTTTCTGT AGTTCAATGT GAACTTAAAT CAAGTAATAA AGTATCTTAA
    4051  TTAATAATAG ACGCTTCTTT CAGAAGCCTT CTAGGGATGA ACGTTTCAAT
    4101  TTTAATAAAC ATAACGAATT AGTGAAATAT TTGCCATGAT TCTTATTTTA
    4151  ATAGATGTTT TTTTATAAAT TGGTCCAGTT AAAAATTTGA TTATAAAAAT
    4201  TCAATCAACA TTTGAAAGTC TCAAAACCCA TATTACATCC TTTTAAAAAT
    4251  GGAAAGTGAC GAAAAAATTA TTTAAAAGTG TAGAACTATT AAAACCTTAT
    4301  TTTTATTAAT GATTTAAAAT ACTAAAAAAT TAAAAAAAGT TTACACTTCA
    4351  AGCAAACTTT GACATAGTAA ATGACTGATG TCAGTAGCAT TGTTAAAGTG
    4401  CTCTCCTCCT CGATTCTCAT CAGAGCAAAG GAGGTTGGTA GGCAGCGCGC
    4451  GAGCCATTTT TAACAGAAAA AAAGTGTTCT CAGTGAAAAA AAGATGTCTG
    4501  ATTCTGCAGT TGCAACGTCC GCTTCCCCAG TGGCTGCCCC ACCAGCGACA
    4551  GTTGAGAAGA AAGTGGTCCA AAAAAAGGCA TCTGGATCTG CTGGCACAAA
    4601  GGCAAAGAAA GCCTCTGCGA CGCCGTCACA TCCGCCAACT CAGCAAATGG
    4651  TGGACGCTTC CATTAAAAAT TTAAAGGAAC GTGGCGGTTC ATCACTTCTG
    4701  GCAATCAAAA AATATATCAC TGCCACTTAT AAATGCGACG CCCAAAAGTT
    4751  AGCGCCATTC ATCAAGAAGT ACTTAAAATC GGCCGTGGTC AATGGAAAGC
    4801  T
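
The feature locations and the BASE COUNT line can be checked against the sequence itself. The sketch below is an illustration under stated assumptions rather than the procedure used to build this record: the helper names `reverse_complement`, `extract`, and `base_count` are hypothetical, coordinates are treated as 1-based and inclusive as in the feature table, and the test fragment is the first 30 bases of the line numbered 2601. Note that the four values printed on the BASE COUNT line sum to 5041 rather than 4801, so recomputing them from the sequence is the safer choice.

```python
from collections import Counter

# Complement table covering both letter cases plus N for unknown bases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]

def extract(seq, start, end, complement=False, first_pos=1):
    """Return bases start..end (1-based, inclusive) of seq.

    first_pos is the coordinate of seq[0]; complement=True mirrors a
    'complement(start..end)' location in the feature table.
    """
    sub = seq[start - first_pos:end - first_pos + 1]
    return reverse_complement(sub) if complement else sub

def base_count(seq):
    """Count a, c, g and t case-insensitively; other symbols are ignored."""
    counts = Counter(seq.lower())
    return {base: counts[base] for base in "acgt"}

# Bases 2601-2630, copied from the sequence block above.
fragment = "CTTTGCCTCCTTTACCACGACCAGTCATTT"

# H4 is annotated as complement(2317..2628), so its first codon is the
# reverse complement of bases 2626..2628.
print(extract(fragment, 2626, 2628, complement=True, first_pos=2601))
print(base_count(fragment))
```

On this fragment the first call yields ATG, consistent with the H4 coding region starting at base 2628 on the complementary strand. Applying `base_count` to the full 4801-base sequence would give values to compare with the BASE COUNT line.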