CENP-B box gene transcriptions

Editor-In-Chief: Henry A. Hoff

"Centromere protein B (CENP-B) specifically binds to the centromeric 17 base-pair CENP-B box DNA, which contains two CpG dinucleotides."[1]

Methylations

"In eukaryotes, CpG methylation is an epigenetic DNA modification that is important for heterochromatin formation."[1]

"CENP-B preferentially binds to the unmethylated CENP-B box DNA."[1]

The "CpG methylations of the CENP-B box sequence may function in RNAi-dependent [RNA interference-dependent] heterochromatin formation by regulating CENP-B-binding to the CENP-B box sequence in the α-satellite repeats."[1]

Centromeres

"The centromere of eukaryotic chromosomes plays an essential role in the proper segregation of chromosomes at mitosis and meiosis, and has a special heterochromatin structure, which is composed of α-satellite DNA repeats and their associated proteins. The human centromere proteins A, B and C (CENP-A, CENP-B and CENP-C, respectively) are such centromere-specific DNA-binding proteins [1–7]."[1]

The centromere "has several functions, including sister chromatid adhesion, linking chromosomes to spindle microtubules, and synchronous separation of sister chromatids at anaphase onset (Choo, 1997a). These centromere functions are important for maintaining chromosomal euploidy in eukaryotic organisms."[2]

"CENP-B, a highly conserved protein in humans and mice, is specifically localized at the centromere (Earnshaw and Rothfield, 1985). This protein binds to the 17-bp motif of the CENP-B box sequence in alphoid DNA at its amino-terminal region and forms homodimers at its carboxy-terminal region (Masumoto et al., 1989; Yoda et al., 1992)."[2]

"CENP-B–CENP-B box interaction is involved in the centromere assembly mechanism."[2]

DNA binding

"Neither CENP-A nor CENP-C shows any sequence specificity in DNA binding. In contrast, CENP-B is known to specifically bind a 17 base-pair sequence (the CENP-B box), which appears in every other α-satellite repeat (171 base-pairs) in human centromeres [8–10]."[1]

Minichromosomes

The "CENP-B box is essential for the formation of functional minichromosomes [15,16]."[1]

Functional homologues

"CENP-B-like proteins have been identified in humans, and the functional redundancy of CENP-B homologues has also been found in the fission yeast Schizosaccharomyces pombe [20–22]."[1]

Consensus sequences

The 17-bp motif of the CENP-B box repeats in DNA monomers. Credit: Jun-ichirou Ohzeki, Megumi Nakano, Teruaki Okada, Hiroshi Masumoto.

"The human α-satellite consensus sequence contains only three CpG sequences within its 171 base-pair sequence [23]. Interestingly, two of the three CpG sequences in the α-satellite consensus sequence are located within site 1 (5′-pTpTpCpG-3′) and site 3 (5′-pCpGpGpG-3′) of the CENP-B box (Fig. 1A;[9])."[1]
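The location of the two CpG dinucleotides can be checked directly on the CENP-B box as it is written elsewhere on this page (TTTCGTTGGAAGCGGGA). The following is a minimal Python sketch, not one of the Basic programs described below; the 1-based position convention and the variable names are assumptions made for illustration.

```python
# Illustrative sketch: locate CpG dinucleotides and sites 1 and 3
# in the 17-bp CENP-B box as written on this page.
CENP_B_BOX = "TTTCGTTGGAAGCGGGA"

# 1-based start positions of every CpG ("CG") dinucleotide.
cpg_positions = [
    i + 1
    for i in range(len(CENP_B_BOX) - 1)
    if CENP_B_BOX[i:i + 2] == "CG"
]

# 1-based start positions of site 1 (TTCG) and site 3 (CGGG).
site1_start = CENP_B_BOX.find("TTCG") + 1
site3_start = CENP_B_BOX.find("CGGG") + 1

print(len(CENP_B_BOX))   # 17
print(cpg_positions)     # [4, 13]
print(site1_start)       # 2
print(site3_start)       # 13
```

Under these assumptions each of the two sites contains exactly one CpG, which agrees with the quoted statement that the box contains two CpG dinucleotides.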

Nucleosomes

"CENP-B has the potential to induce nucleosome assembly in the vicinity of the CENP-B box sequence [14]."[1]

Human alphoid DNAs

"Human alphoid DNA contains a huge repetitive sequence, exists only at the centromeric region, and is found in all human chromosomes (Alexandrov et al., 2001). Alphoid sequences consist of tandem repeats of an AT-rich 171-bp alphoid monomer unit, and some alphoid monomers form chromosome-specific higher-order repeated units (Willard, 1985; for review see Willard and Waye, 1987). The repetitive structure of alphoid DNA can be classified into two types of repeats (Ikeno et al., 1994): units composed of several monomers (type-I alphoid repeat; Fig. 1 a, α21-I) and monomeric organization consisting of diverged alphoid monomer units (type-II alphoid repeat; Fig. 1 a, α21-II). Centromere components are mainly assembled on type-I alphoid sequences (Ikeno et al., 1994; Ando et al., 2002, Politi et al., 2002). Human artificial chromosome formation is associated only with type-I alphoid sequences (Harrington et al., 1997; Ikeno et al., 1998; Masumoto et al., 1998; Schueler et al., 2001). The CENP-B box appears only in type-I alphoid sequences (Masumoto et al., 1989; Muro et al., 1992; Haaf and Ward, 1994) of autosomes and X chromosomes."[2]

Neocentromeres "(a rare phenomenon in which centromeres form on fragmented chromosomes) have no significant centromeric DNA sequences, not even alphoid DNA (du Sart et al., 1997; Lo et al., 2001). Like the Y chromosome, neocentromere-containing chromosomes are stably maintained in cells that undergo mitosis (Tyler-Smith et al., 1999)."[2]

Ohzeki samplings

The Basic programs (starting with SuccessablesCENPB.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the program, the sequence it looks for, and the number of occurrences found:

  1. negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesCENPB--.bas, looking for 3'-TTTCGTTGGAAGCGGGA-5', 0,
  2. negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesCENPB-+.bas, looking for 3'-TTTCGTTGGAAGCGGGA-5', 0,
  3. positive strand in the negative direction is SuccessablesCENPB+-.bas, looking for 3'-TTTCGTTGGAAGCGGGA-5', 0,
  4. positive strand in the positive direction is SuccessablesCENPB++.bas, looking for 3'-TTTCGTTGGAAGCGGGA-5', 0,
  5. complement, negative strand, negative direction is SuccessablesCENPBc--.bas, looking for 3'-AAAGCAACCTTCGCCCT-5', 0,
  6. complement, negative strand, positive direction is SuccessablesCENPBc-+.bas, looking for 3'-AAAGCAACCTTCGCCCT-5', 0,
  7. complement, positive strand, negative direction is SuccessablesCENPBc+-.bas, looking for 3'-AAAGCAACCTTCGCCCT-5', 0,
  8. complement, positive strand, positive direction is SuccessablesCENPBc++.bas, looking for 3'-AAAGCAACCTTCGCCCT-5', 0,
  9. inverse complement, negative strand, negative direction is SuccessablesCENPBci--.bas, looking for 3'-TCCCGCTTCCAACGAAA-5', 0,
  10. inverse complement, negative strand, positive direction is SuccessablesCENPBci-+.bas, looking for 3'-TCCCGCTTCCAACGAAA-5', 0,
  11. inverse complement, positive strand, negative direction is SuccessablesCENPBci+-.bas, looking for 3'-TCCCGCTTCCAACGAAA-5', 0,
  12. inverse complement, positive strand, positive direction is SuccessablesCENPBci++.bas, looking for 3'-TCCCGCTTCCAACGAAA-5', 0,
  13. inverse, negative strand, negative direction is SuccessablesCENPBi--.bas, looking for 3'-AGGGCGAAGGTTGCTTT-5', 0,
  14. inverse, negative strand, positive direction is SuccessablesCENPBi-+.bas, looking for 3'-AGGGCGAAGGTTGCTTT-5', 0,
  15. inverse, positive strand, negative direction, is SuccessablesCENPBi+-.bas, looking for 3'-AGGGCGAAGGTTGCTTT-5', 0,
  16. inverse, positive strand, positive direction, is SuccessablesCENPBi++.bas, looking for 3'-AGGGCGAAGGTTGCTTT-5', 0.

Increasing the number of nts from A1BG to ZNF497 from 958 to 4445 has yielded no CENP-B boxes.
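The four query sequences in the list above (the box itself, its complement, its inverse complement, and its inverse) can be derived mechanically from the CENP-B box. The sketch below is written in Python rather than Basic and is only an illustration of that derivation; the function names are hypothetical and are not taken from the Successables programs.

```python
# Illustrative sketch: derive the four query variants of the CENP-B box.
CENP_B_BOX = "TTTCGTTGGAAGCGGGA"

# Translation table that swaps each base for its Watson-Crick partner.
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Swap each base for its partner, keeping the order."""
    return seq.translate(COMPLEMENT_TABLE)


def inverse(seq):
    """Reverse the order of the bases without complementing them."""
    return seq[::-1]


def inverse_complement(seq):
    """Complement the bases and reverse their order."""
    return complement(seq)[::-1]


variants = {
    "original": CENP_B_BOX,
    "complement": complement(CENP_B_BOX),
    "inverse complement": inverse_complement(CENP_B_BOX),
    "inverse": inverse(CENP_B_BOX),
}

for name, query in variants.items():
    print(f"{name}: {query}")
```

The printed sequences match the four query strings listed above: TTTCGTTGGAAGCGGGA, AAAGCAACCTTCGCCCT, TCCCGCTTCCAACGAAA and AGGGCGAAGGTTGCTTT.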

Ohzeki samplings

The Basic programs testing the consensus sequence TTGGAA (starting with SuccessablesTTGGAA.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand and direction tested, the number of occurrences found, and their positions:

  1. negative strand, negative direction: 1, TTGGAA at 316.
  2. positive strand, negative direction: 2, TTGGAA at 2556, TTGGAA at 330.
  3. negative strand, positive direction: 0.
  4. positive strand, positive direction: 1, TTGGAA at 2581.
  5. inverse complement, negative strand, negative direction: 1, TTCCAA at 3347.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 1, TTCCAA at 2921.
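A search of this kind can be sketched as a scan that reports every position at which the motif or its inverse complement occurs. The example below is a Python illustration, not the Basic source; the toy sequence is hypothetical, and the 1-based start positions it reports are an assumption, since the numbering convention of the Successables programs is not stated here.

```python
# Illustrative sketch: report all positions of a motif and of its
# inverse complement in a sequence (toy data, 1-based start positions).
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def inverse_complement(seq):
    """Complement the bases and reverse their order."""
    return seq.translate(COMPLEMENT_TABLE)[::-1]


def find_all(sequence, motif):
    """Return 1-based start positions of every match, overlaps included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


toy_sequence = "ACTTGGAATTCCAAGT"  # hypothetical, not from the A1BG region
motif = "TTGGAA"

hits = {
    motif: find_all(toy_sequence, motif),
    inverse_complement(motif): find_all(toy_sequence, inverse_complement(motif)),
}
print(hits)  # {'TTGGAA': [3], 'TTCCAA': [9]}
```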

Ohzeki (4560-2846) UTRs

  1. Negative strand, negative direction: TTCCAA at 3347.

Ohzeki negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TTGGAA at 316.
  2. Positive strand, negative direction: TTGGAA at 2556, TTGGAA at 330.

Ohzeki positive direction (4050-1) distal promoters

  1. Positive strand, positive direction: TTGGAA at 2581.
  2. Positive strand, positive direction: TTCCAA at 2921.
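The assignment of each hit to a UTR, core, proximal or distal promoter region follows from the coordinate ranges given in the headings above. A minimal Python sketch of that binning is shown here; how the shared boundary values (for example 2596 or 4050) are treated is an assumption, chosen so that each range starts one position above the previous one, and the names `REGIONS` and `classify` are hypothetical.

```python
# Illustrative sketch: bin a hit position into the regions named in the
# headings above. Boundary handling is an assumption, not from the source.
REGIONS = {
    "negative": [
        ("distal promoter", 1, 2596),
        ("proximal promoter", 2597, 2811),
        ("core promoter", 2812, 2846),
        ("UTR", 2847, 4560),
    ],
    "positive": [
        ("distal promoter", 1, 4050),
        ("proximal promoter", 4051, 4265),
        ("core promoter", 4266, 4445),
    ],
}


def classify(position, direction):
    """Return the region name for a position, or None if outside all ranges."""
    for name, low, high in REGIONS[direction]:
        if low <= position <= high:
            return name
    return None


print(classify(316, "negative"))   # distal promoter
print(classify(3347, "negative"))  # UTR
print(classify(2581, "positive"))  # distal promoter
```

Because the negative and positive directions use different ranges, the same position can fall in different regions depending on the direction considered.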

Ohzeki random dataset samplings

  1. Ohr0: 1, TTGGAA at 3205.
  2. Ohr1: 1, TTGGAA at 2035.
  3. Ohr2: 0.
  4. Ohr3: 3, TTGGAA at 4303, TTGGAA at 4224, TTGGAA at 2844.
  5. Ohr4: 2, TTGGAA at 2262, TTGGAA at 1385.
  6. Ohr5: 3, TTGGAA at 3100, TTGGAA at 1890, TTGGAA at 26.
  7. Ohr6: 1, TTGGAA at 2491.
  8. Ohr7: 2, TTGGAA at 2943, TTGGAA at 101.
  9. Ohr8: 3, TTGGAA at 3165, TTGGAA at 2303, TTGGAA at 1837.
  10. Ohr9: 1, TTGGAA at 3838.
  11. Ohr0ci: 1, TTCCAA at 1121.
  12. Ohr1ci: 2, TTCCAA at 2095, TTCCAA at 295.
  13. Ohr2ci: 5, TTCCAA at 4465, TTCCAA at 2935, TTCCAA at 1518, TTCCAA at 742, TTCCAA at 118.
  14. Ohr3ci: 4, TTCCAA at 3665, TTCCAA at 3592, TTCCAA at 1642, TTCCAA at 263.
  15. Ohr4ci: 2, TTCCAA at 2725, TTCCAA at 551.
  16. Ohr5ci: 0.
  17. Ohr6ci: 1, TTCCAA at 3062.
  18. Ohr7ci: 0.
  19. Ohr8ci: 3, TTCCAA at 3655, TTCCAA at 615, TTCCAA at 55.
  20. Ohr9ci: 2, TTCCAA at 2317, TTCCAA at 1694.
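How the random datasets above were generated is not described on this page. As a rough point of comparison only, the Python sketch below draws a sequence with the four bases equally likely and independent, and computes the number of hits expected by chance for a motif of a given length. The sequence length of 4560 is taken from the upper coordinate used in the headings, and the uniform base composition, the seed and all function names are assumptions.

```python
# Illustrative sketch: a uniformly random sequence and the chance
# expectation for a motif. Not the method used for the Ohr datasets.
import random

ALPHABET = "ACGT"


def random_sequence(length, seed):
    """Draw `length` bases independently and uniformly from ACGT."""
    rng = random.Random(seed)
    return "".join(rng.choices(ALPHABET, k=length))


def find_all(sequence, motif):
    """Return 1-based start positions of every match, overlaps included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


def expected_count(length, motif_length):
    """Expected hits for one fixed motif in a uniform random sequence."""
    return (length - motif_length + 1) / 4 ** motif_length


sequence = random_sequence(4560, seed=0)
hits = find_all(sequence, "TTGGAA")
print(len(hits), "hits; expected about", round(expected_count(4560, 6), 2))
```

Under these assumptions a 6-nucleotide motif is expected about 1.11 times per 4560-nucleotide sequence, and a 5-nucleotide motif about 4.45 times.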

Ohr arbitrary (evens) (4560-2846) UTRs

  1. Ohr0: TTGGAA at 3205.
  2. Ohr8: TTGGAA at 3165.
  3. Ohr2ci: TTCCAA at 4465, TTCCAA at 2935.
  4. Ohr6ci: TTCCAA at 3062.
  5. Ohr8ci: TTCCAA at 3655.

Ohr alternate (odds) (4560-2846) UTRs

  1. Ohr3: TTGGAA at 4303, TTGGAA at 4224.
  2. Ohr5: TTGGAA at 3100.
  3. Ohr7: TTGGAA at 2943.
  4. Ohr3ci: TTCCAA at 3665, TTCCAA at 3592.

Ohr alternate negative direction (odds) (2846-2811) core promoters

  1. Ohr3: TTGGAA at 2844.

Ohr arbitrary positive direction (odds) (4445-4265) core promoters

  1. Ohr3: TTGGAA at 4303.

Ohr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. Ohr4ci: TTCCAA at 2725.

Ohr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. Ohr3: TTGGAA at 4224.

Ohr arbitrary negative direction (evens) (2596-1) distal promoters

  1. Ohr4: TTGGAA at 2262, TTGGAA at 1385.
  2. Ohr6: TTGGAA at 2491.
  3. Ohr8: TTGGAA at 2303, TTGGAA at 1837.
  4. Ohr0ci: TTCCAA at 1121.
  5. Ohr2ci: TTCCAA at 1518, TTCCAA at 742, TTCCAA at 118.
  6. Ohr4ci: TTCCAA at 551.
  7. Ohr8ci: TTCCAA at 615, TTCCAA at 55.

Ohr alternate negative direction (odds) (2596-1) distal promoters

  1. Ohr1: TTGGAA at 2035.
  2. Ohr5: TTGGAA at 1890, TTGGAA at 26.
  3. Ohr7: TTGGAA at 101.
  4. Ohr1ci: TTCCAA at 2095, TTCCAA at 295.
  5. Ohr3ci: TTCCAA at 1642, TTCCAA at 263.
  6. Ohr9ci: TTCCAA at 2317, TTCCAA at 1694.

Ohr arbitrary positive direction (odds) (4050-1) distal promoters

  1. Ohr1: TTGGAA at 2035.
  2. Ohr3: TTGGAA at 2844.
  3. Ohr5: TTGGAA at 3100, TTGGAA at 1890, TTGGAA at 26.
  4. Ohr7: TTGGAA at 2943, TTGGAA at 101.
  5. Ohr1ci: TTCCAA at 2095, TTCCAA at 295.
  6. Ohr3ci: TTCCAA at 3665, TTCCAA at 3592, TTCCAA at 1642, TTCCAA at 263.
  7. Ohr9ci: TTCCAA at 2317, TTCCAA at 1694.

Ohr alternate positive direction (evens) (4050-1) distal promoters

  1. Ohr0: TTGGAA at 3205.
  2. Ohr4: TTGGAA at 2262, TTGGAA at 1385.
  3. Ohr6: TTGGAA at 2491.
  4. Ohr8: TTGGAA at 3165, TTGGAA at 2303, TTGGAA at 1837.
  5. Ohr0ci: TTCCAA at 1121.
  6. Ohr2ci: TTCCAA at 2935, TTCCAA at 1518, TTCCAA at 742, TTCCAA at 118.
  7. Ohr4ci: TTCCAA at 2725, TTCCAA at 551.
  8. Ohr6ci: TTCCAA at 3062.
  9. Ohr8ci: TTCCAA at 3655, TTCCAA at 615, TTCCAA at 55.

Ohzeki analysis and results

The Centromere protein B (CENP-B) box TTTCGTTGGAAGCGGGA[2] in Table 2 number 91, order 16, is absent.

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--1,+-0)
Randoms | UTR | arbitrary negative | 6 | 10 | 0.6 | 0.6 ± 0
Randoms | UTR | alternate negative | 6 | 10 | 0.6 | 0.6 ± 0
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0.05 ± 0.05
Randoms | Core | alternate negative | 1 | 10 | 0.1 | 0.05 ± 0.05
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05 ± 0.05
Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05 ± 0.05
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.05 ± 0.05
Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0.05 ± 0.05
Reals | Distal | negative | 3 | 2 | 1.5 | 1.5 ± 0.5 (--1,+-2)
Randoms | Distal | arbitrary negative | 12 | 10 | 1.2 | 1.1 ± 0.1
Randoms | Distal | alternate negative | 10 | 10 | 1.0 | 1.1 ± 0.1
Reals | Distal | positive | 2 | 2 | 1 | 1 ± 1 (-+0,++2)
Randoms | Distal | arbitrary positive | 15 | 10 | 1.5 | 1.65 ± 0.15
Randoms | Distal | alternate positive | 18 | 10 | 1.8 | 1.65 ± 0.15
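The Occurrences and Averages columns appear to follow a simple rule: occurrences are the number of hits divided by the number of strands, and each average is the midpoint of two values with half their difference as the ± term. This rule is inferred from the table values rather than stated in the source, so the Python sketch below should be read as an assumption; the function names are hypothetical. It reproduces the distal negative rows.

```python
# Illustrative sketch: reproduce the distal negative rows of the table.
# The midpoint ± half-range rule is inferred from the table, not stated.
def occurrences(numbers, strands):
    """Hits per strand (or per random dataset)."""
    return numbers / strands


def midpoint_and_half_range(a, b):
    """Midpoint of two values and half of their absolute difference."""
    return (a + b) / 2, abs(a - b) / 2


def intervals_overlap(mean_a, spread_a, mean_b, spread_b):
    """True if [mean - spread, mean + spread] intervals share any value."""
    return (mean_a - spread_a <= mean_b + spread_b
            and mean_b - spread_b <= mean_a + spread_a)


# Reals, distal negative: 1 hit on the -- strand, 2 hits on the +- strand.
real_mean, real_spread = midpoint_and_half_range(1, 2)

# Randoms, distal negative: arbitrary 12 hits and alternate 10 hits,
# each over 10 random datasets.
rand_mean, rand_spread = midpoint_and_half_range(
    occurrences(12, 10), occurrences(10, 10)
)

print(f"reals   {real_mean:.2f} ± {real_spread:.2f}")
print(f"randoms {rand_mean:.2f} ± {rand_spread:.2f}")
print("overlap:", intervals_overlap(real_mean, real_spread,
                                    rand_mean, rand_spread))
```

With these inputs the reals come out at 1.5 ± 0.5 and the randoms at 1.1 ± 0.1, and the two intervals overlap, matching the distal negative rows.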

Comparison:

The occurrences of real Ohzeki TTGGAA UTRs and positive distals are greater than the randoms, while the negative distals overlap the randoms. This suggests that the real Ohzeki TTGGAAs are likely active or activable.

CGGGA samplings

Copying the responsive element consensus sequence CGGGA into a "⌘F" search finds no occurrences between ZNF497 and A1BG or between ZSCAN22 and A1BG, whereas the computer programs report the occurrences listed below.

The Basic programs testing the consensus sequence CGGGA (starting with SuccessablesCGGGA.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand and direction tested, the number of occurrences found, and their positions:

  1. Negative strand, negative direction: 3, CGGGA at 4492, CGGGA at 3930, CGGGA at 3577.
  2. Positive strand, negative direction: 3, CGGGA at 4001, CGGGA at 2456, CGGGA at 1682.
  3. Negative strand, positive direction: 10, CGGGA at 4441, CGGGA at 4431, CGGGA at 4261, CGGGA at 4246, CGGGA at 3672, CGGGA at 1682, CGGGA at 1592, CGGGA at 912, CGGGA at 812, CGGGA at 408.
  4. Positive strand, positive direction: 15, CGGGA at 4293, CGGGA at 4229, CGGGA at 4212, CGGGA at 3501, CGGGA at 3494, CGGGA at 1771, CGGGA at 1674, CGGGA at 1658, CGGGA at 1571, CGGGA at 1151, CGGGA at 1067, CGGGA at 731, CGGGA at 647, CGGGA at 454, CGGGA at 94.
  5. inverse complement, negative strand, negative direction: 3, TCCCG at 3566, TCCCG at 1820, TCCCG at 89.
  6. inverse complement, positive strand, negative direction: 1, TCCCG at 1508.
  7. inverse complement, negative strand, positive direction: 2, TCCCG at 995, TCCCG at 895.
  8. inverse complement, positive strand, positive direction: 6, TCCCG at 3907, TCCCG at 3480, TCCCG at 2767, TCCCG at 1899, TCCCG at 517, TCCCG at 19.

CGGGA (4560-2846) UTRs

  1. Negative strand, negative direction: CGGGA at 4492, CGGGA at 3930, CGGGA at 3577.
  2. Negative strand, negative direction: TCCCG at 3566.
  3. Positive strand, negative direction: CGGGA at 4001.

CGGGA positive direction (4445-4265) core promoters

  1. Negative strand, positive direction: CGGGA at 4441, CGGGA at 4431.
  2. Positive strand, positive direction: CGGGA at 4293.

CGGGA positive direction (4265-4050) proximal promoters

  1. Negative strand, positive direction: CGGGA at 4261, CGGGA at 4246.
  2. Positive strand, positive direction: CGGGA at 4229, CGGGA at 4212.

CGGGA negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TCCCG at 1820, TCCCG at 89.
  2. Positive strand, negative direction: CGGGA at 2456, CGGGA at 1682.
  3. Positive strand, negative direction: TCCCG at 1508.

CGGGA positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CGGGA at 3672, CGGGA at 1682, CGGGA at 1592, CGGGA at 912, CGGGA at 812, CGGGA at 408.
  2. Negative strand, positive direction: TCCCG at 995, TCCCG at 895.
  3. Positive strand, positive direction: CGGGA at 3501, CGGGA at 3494, CGGGA at 1771, CGGGA at 1674, CGGGA at 1658, CGGGA at 1571, CGGGA at 1151, CGGGA at 1067, CGGGA at 731, CGGGA at 647, CGGGA at 454, CGGGA at 94.
  4. Positive strand, positive direction: TCCCG at 3907, TCCCG at 3480, TCCCG at 2767, TCCCG at 1899, TCCCG at 517, TCCCG at 19.

CGGGA random dataset samplings

  1. CGGGAr0: 6, CGGGA at 4123, CGGGA at 3971, CGGGA at 2326, CGGGA at 2101, CGGGA at 2071, CGGGA at 92.
  2. CGGGAr1: 3, CGGGA at 2793, CGGGA at 2722, CGGGA at 396.
  3. CGGGAr2: 6, CGGGA at 4097, CGGGA at 3499, CGGGA at 2078, CGGGA at 763, CGGGA at 155, CGGGA at 92.
  4. CGGGAr3: 5, CGGGA at 3425, CGGGA at 2913, CGGGA at 2432, CGGGA at 1992, CGGGA at 1890.
  5. CGGGAr4: 5, CGGGA at 3897, CGGGA at 2412, CGGGA at 1602, CGGGA at 1286, CGGGA at 1221.
  6. CGGGAr5: 6, CGGGA at 2904, CGGGA at 2882, CGGGA at 1755, CGGGA at 1742, CGGGA at 1190, CGGGA at 813.
  7. CGGGAr6: 3, CGGGA at 3341, CGGGA at 2691, CGGGA at 2211.
  8. CGGGAr7: 7, CGGGA at 3464, CGGGA at 3102, CGGGA at 2918, CGGGA at 1916, CGGGA at 1779, CGGGA at 1501, CGGGA at 58.
  9. CGGGAr8: 11, CGGGA at 4422, CGGGA at 4123, CGGGA at 4092, CGGGA at 3724, CGGGA at 3280, CGGGA at 2582, CGGGA at 2097, CGGGA at 1972, CGGGA at 1555, CGGGA at 1064, CGGGA at 839.
  10. CGGGAr9: 4, CGGGA at 2136, CGGGA at 1935, CGGGA at 1023, CGGGA at 934.
  11. CGGGAr0ci: 6, TCCCG at 3918, TCCCG at 3827, TCCCG at 2562, TCCCG at 1596, TCCCG at 820, TCCCG at 263.
  12. CGGGAr1ci: 5, TCCCG at 2379, TCCCG at 1659, TCCCG at 1417, TCCCG at 935, TCCCG at 433.
  13. CGGGAr2ci: 9, TCCCG at 4117, TCCCG at 3632, TCCCG at 3496, TCCCG at 3181, TCCCG at 2812, TCCCG at 2434, TCCCG at 1479, TCCCG at 812, TCCCG at 631.
  14. CGGGAr3ci: 3, TCCCG at 3474, TCCCG at 2520, TCCCG at 2031.
  15. CGGGAr4ci: 5, TCCCG at 4354, TCCCG at 3702, TCCCG at 3676, TCCCG at 1463, TCCCG at 208.
  16. CGGGAr5ci: 9, TCCCG at 4237, TCCCG at 3679, TCCCG at 2887, TCCCG at 2547, TCCCG at 2320, TCCCG at 2019, TCCCG at 790, TCCCG at 728, TCCCG at 94.
  17. CGGGAr6ci: 7, TCCCG at 4335, TCCCG at 4143, TCCCG at 3708, TCCCG at 3497, TCCCG at 2364, TCCCG at 1492, TCCCG at 896.
  18. CGGGAr7ci: 9, TCCCG at 3862, TCCCG at 3086, TCCCG at 2159, TCCCG at 1372, TCCCG at 1092, TCCCG at 892, TCCCG at 446, TCCCG at 39, TCCCG at 33.
  19. CGGGAr8ci: 5, TCCCG at 4051, TCCCG at 3145, TCCCG at 2212, TCCCG at 626, TCCCG at 428.
  20. CGGGAr9ci: 6, TCCCG at 4230, TCCCG at 2592, TCCCG at 2487, TCCCG at 1790, TCCCG at 1647, TCCCG at 48.

CGGGAr arbitrary (evens) (4560-2846) UTRs

  1. CGGGAr0: CGGGA at 4123, CGGGA at 3971.
  2. CGGGAr2: CGGGA at 4097, CGGGA at 3499.
  3. CGGGAr4: CGGGA at 3897.
  4. CGGGAr6: CGGGA at 3341.
  5. CGGGAr8: CGGGA at 4422, CGGGA at 4123, CGGGA at 4092, CGGGA at 3724, CGGGA at 3280.
  6. CGGGAr0ci: TCCCG at 3918, TCCCG at 3827.
  7. CGGGAr2ci: TCCCG at 4117, TCCCG at 3632, TCCCG at 3496, TCCCG at 3181.
  8. CGGGAr4ci: TCCCG at 4354, TCCCG at 3702, TCCCG at 3676.
  9. CGGGAr6ci: TCCCG at 4335, TCCCG at 4143, TCCCG at 3708, TCCCG at 3497.
  10. CGGGAr8ci: TCCCG at 4051, TCCCG at 3145.

CGGGAr alternate (odds) (4560-2846) UTRs

  1. CGGGAr5: CGGGA at 2904, CGGGA at 2882.
  2. CGGGAr7: CGGGA at 3464, CGGGA at 3102, CGGGA at 2918.
  3. CGGGAr3ci: TCCCG at 3474.
  4. CGGGAr5ci: TCCCG at 4237, TCCCG at 3679, TCCCG at 2887.
  5. CGGGAr7ci: TCCCG at 3862, TCCCG at 3086.
  6. CGGGAr9ci: TCCCG at 4230.

CGGGAr arbitrary negative direction (evens) (2846-2811) core promoters

  1. CGGGAr2ci: TCCCG at 2812.

CGGGAr alternate positive direction (evens) (4445-4265) core promoters

  1. CGGGAr8: CGGGA at 4422.
  2. CGGGAr4ci: TCCCG at 4354.
  3. CGGGAr6ci: TCCCG at 4335.

CGGGAr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. CGGGAr6: CGGGA at 2691.

CGGGAr alternate negative direction (odds) (2811-2596) proximal promoters

  1. CGGGAr1: CGGGA at 2793, CGGGA at 2722.

CGGGAr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. CGGGAr5ci: TCCCG at 4237.
  2. CGGGAr9ci: TCCCG at 4230.

CGGGAr alternate positive direction (evens) (4265-4050) proximal promoters

  1. CGGGAr0: CGGGA at 4123.
  2. CGGGAr2: CGGGA at 4097.
  3. CGGGAr8: CGGGA at 4123, CGGGA at 4092.
  4. CGGGAr2ci: TCCCG at 4117.
  5. CGGGAr6ci: TCCCG at 4143.
  6. CGGGAr8ci: TCCCG at 4051.

CGGGAr arbitrary negative direction (evens) (2596-1) distal promoters

  1. CGGGAr0: CGGGA at 2326, CGGGA at 2101, CGGGA at 2071, CGGGA at 92.
  2. CGGGAr2: CGGGA at 2078, CGGGA at 763, CGGGA at 155, CGGGA at 92.
  3. CGGGAr4: CGGGA at 2412, CGGGA at 1602, CGGGA at 1286, CGGGA at 1221.
  4. CGGGAr6: CGGGA at 2211.
  5. CGGGAr8: CGGGA at 2582, CGGGA at 2097, CGGGA at 1972, CGGGA at 1555, CGGGA at 1064, CGGGA at 839.
  6. CGGGAr0ci: TCCCG at 2562, TCCCG at 1596, TCCCG at 820, TCCCG at 263.
  7. CGGGAr2ci: TCCCG at 2434, TCCCG at 1479, TCCCG at 812, TCCCG at 631.
  8. CGGGAr4ci: TCCCG at 1463, TCCCG at 208.
  9. CGGGAr6ci: TCCCG at 2364, TCCCG at 1492, TCCCG at 896.
  10. CGGGAr8ci: TCCCG at 2212, TCCCG at 626, TCCCG at 428.

CGGGAr alternate negative direction (odds) (2596-1) distal promoters

  1. CGGGAr1: CGGGA at 396.
  2. CGGGAr5: CGGGA at 1755, CGGGA at 1742, CGGGA at 1190, CGGGA at 813.
  3. CGGGAr7: CGGGA at 1916, CGGGA at 1779, CGGGA at 1501, CGGGA at 58.
  4. CGGGAr9: CGGGA at 2136, CGGGA at 1935, CGGGA at 1023, CGGGA at 934.
  5. CGGGAr1ci: TCCCG at 2379, TCCCG at 1659, TCCCG at 1417, TCCCG at 935, TCCCG at 433.
  6. CGGGAr3ci: TCCCG at 2520, TCCCG at 2031.
  7. CGGGAr5ci: TCCCG at 2547, TCCCG at 2320, TCCCG at 2019, TCCCG at 790, TCCCG at 728, TCCCG at 94.
  8. CGGGAr7ci: TCCCG at 2159, TCCCG at 1372, TCCCG at 1092, TCCCG at 892, TCCCG at 446, TCCCG at 39, TCCCG at 33.
  9. CGGGAr9ci: TCCCG at 2592, TCCCG at 2487, TCCCG at 1790, TCCCG at 1647, TCCCG at 48.

CGGGAr arbitrary positive direction (odds) (4050-1) distal promoters

  1. CGGGAr1: CGGGA at 2793, CGGGA at 2722, CGGGA at 396.
  2. CGGGAr5: CGGGA at 2904, CGGGA at 2882, CGGGA at 1755, CGGGA at 1742, CGGGA at 1190, CGGGA at 813.
  3. CGGGAr7: CGGGA at 3464, CGGGA at 3102, CGGGA at 2918, CGGGA at 1916, CGGGA at 1779, CGGGA at 1501, CGGGA at 58.
  4. CGGGAr9: CGGGA at 2136, CGGGA at 1935, CGGGA at 1023, CGGGA at 934.
  5. CGGGAr1ci: TCCCG at 2379, TCCCG at 1659, TCCCG at 1417, TCCCG at 935, TCCCG at 433.
  6. CGGGAr3ci: TCCCG at 3474, TCCCG at 2520, TCCCG at 2031.
  7. CGGGAr5ci: TCCCG at 3679, TCCCG at 2887, TCCCG at 2547, TCCCG at 2320, TCCCG at 2019, TCCCG at 790, TCCCG at 728, TCCCG at 94.
  8. CGGGAr7ci: TCCCG at 3862, TCCCG at 3086, TCCCG at 2159, TCCCG at 1372, TCCCG at 1092, TCCCG at 892, TCCCG at 446, TCCCG at 39, TCCCG at 33.
  9. CGGGAr9ci: TCCCG at 2592, TCCCG at 2487, TCCCG at 1790, TCCCG at 1647, TCCCG at 48.

CGGGAr alternate positive direction (evens) (4050-1) distal promoters

  1. CGGGAr0: CGGGA at 3971, CGGGA at 2326, CGGGA at 2101, CGGGA at 2071, CGGGA at 92.
  2. CGGGAr2: CGGGA at 3499, CGGGA at 2078, CGGGA at 763, CGGGA at 155, CGGGA at 92.
  3. CGGGAr4: CGGGA at 3897, CGGGA at 2412, CGGGA at 1602, CGGGA at 1286, CGGGA at 1221.
  4. CGGGAr6: CGGGA at 3341, CGGGA at 2691, CGGGA at 2211.
  5. CGGGAr8: CGGGA at 3724, CGGGA at 3280, CGGGA at 2582, CGGGA at 2097, CGGGA at 1972, CGGGA at 1555, CGGGA at 1064, CGGGA at 839.
  6. CGGGAr0ci: TCCCG at 3918, TCCCG at 3827, TCCCG at 2562, TCCCG at 1596, TCCCG at 820, TCCCG at 263.
  7. CGGGAr2ci: TCCCG at 3632, TCCCG at 3496, TCCCG at 3181, TCCCG at 2812, TCCCG at 2434, TCCCG at 1479, TCCCG at 812, TCCCG at 631.
  8. CGGGAr4ci: TCCCG at 3702, TCCCG at 3676, TCCCG at 1463, TCCCG at 208.
  9. CGGGAr6ci: TCCCG at 3708, TCCCG at 3497, TCCCG at 2364, TCCCG at 1492, TCCCG at 896.
  10. CGGGAr8ci: TCCCG at 3145, TCCCG at 2212, TCCCG at 626, TCCCG at 428.

CGGGA analysis and results

The CGGGA sequence tested here forms the last five nucleotides of the Centromere protein B (CENP-B) box TTTCGTTGGAAGCGGGA.[2]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 5 | 2 | 2.5 | 2.5 ± 1.5 (--4,+-1)
Randoms | UTR | arbitrary negative | 26 | 10 | 2.6 | 1.9 ± 0.7
Randoms | UTR | alternate negative | 12 | 10 | 1.2 | 1.9 ± 0.7
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 1 | 10 | 0.1 | 0.05 ± 0.05
Randoms | Core | alternate negative | 0 | 10 | 0 | 0.05 ± 0.05
Reals | Core | positive | 3 | 2 | 1.5 | 1.5 ± 0.5 (-+2,++1)
Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0.15 ± 0.15
Randoms | Core | alternate positive | 3 | 10 | 0.3 | 0.15 ± 0.15
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 1 | 10 | 0.1 | 0.15 ± 0.05
Randoms | Proximal | alternate negative | 2 | 10 | 0.2 | 0.15 ± 0.05
Reals | Proximal | positive | 4 | 2 | 2 | 2 ± 0 (-+2,++2)
Randoms | Proximal | arbitrary positive | 2 | 10 | 0.2 | 0.45 ± 0.25
Randoms | Proximal | alternate positive | 7 | 10 | 0.7 | 0.45 ± 0.25
Reals | Distal | negative | 5 | 2 | 2.5 | 2.5 ± 0.5 (--2,+-3)
Randoms | Distal | arbitrary negative | 25 | 10 | 2.5 | 3.15 ± 0.65
Randoms | Distal | alternate negative | 38 | 10 | 3.8 | 3.15 ± 0.65
Reals | Distal | positive | 26 | 2 | 13 | 13 ± 5 (-+8,++18)
Randoms | Distal | arbitrary positive | 50 | 10 | 5.0 | 5.15 ± 0.15
Randoms | Distal | alternate positive | 53 | 10 | 5.3 | 5.15 ± 0.15

Comparison:

The occurrences of real CGGGA UTRs are outside the randoms; the cores, proximals and positive distals are greater than the randoms; and the negative distals overlap the low randoms. This suggests that the real CGGGAs are likely active or activable.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. Yoshinori Tanaka, Hitoshi Kurumizaka, and Shigeyuki Yokoyama (January 2005). "CpG methylation of the CENP-B box reduces human CENP-B binding". The FEBS Journal. 272 (1): 282–289. doi:10.1111/j.1432-1033.2004.04406.x. PMID 15634350. Retrieved 2017-02-05.
  2. Jun-ichirou Ohzeki, Megumi Nakano, Teruaki Okada, Hiroshi Masumoto (2 December 2002). "CENP-B box is required for de novo centromere chromatin assembly on human alphoid DNA". The Journal of Cell Biology. 159 (5): 765. doi:10.1083/jcb.200207112. Retrieved 2017-02-05.
