CAT box gene transcriptions: Difference between revisions

 
(27 intermediate revisions by the same user not shown)
Line 380: Line 380:
==M-CAT box samplings==


Copying the responsive element consensus sequence GCGGCCTC and pasting it into "⌘F" finds none between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can also be found with the computer programs.
The Basic programs testing consensus sequence GCGGCCTC (starting with SuccessablesMCAT.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:
# negative strand, negative direction: 0.
# positive strand, negative direction: 0.
# negative strand, positive direction: 0.
# positive strand, positive direction: 0.
# inverse complement, negative strand, negative direction: 0.
# inverse complement, positive strand, negative direction: 0.
# inverse complement, negative strand, positive direction: 0.
# inverse complement, positive strand, positive direction: 0.
 
==Shue box samplings==
 
The subunit homologous upstream element (SHUE) box homology is an 18-nucleotide sequence CAATCCCTGCCTGGGATC, where the sequence CCCTG(C/G) was conserved in all four subunits.<ref name=Crowder>Crowder, C. Michael & Merlie, J. P. (December 1988) "Stepwise Activation of the Mouse Acetylcholine Receptor 𝛅- and 𝛄-Subunit Genes in Clonal Cell Lines" Molecular and Cellular Biology 8(12), 5257-5267, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC365628/pdf/molcellb00072-0209.pdf</ref>
 
The Basic programs testing consensus sequence CCCTG(C/G) (starting with SuccessablesShue.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:
# Negative strand, negative direction: 2, CCCTGG at 3868, CCCTGC at 1151.
# Positive strand, negative direction: 3, CCCTGG at 4494, CCCTGG at 3744, CCCTGC at 3579.
# Negative strand, positive direction: 6, CCCTGG at 4424, CCCTGC at 4231, CCCTGG at 3496, CCCTGG at 1815, CCCTGC at 1075, CCCTGC at 191.
# Positive strand, positive direction: 5, CCCTGG at 3545, CCCTGG at 3172, CCCTGC at 410, CCCTGC at 323, CCCTGG at 37.
# inverse complement, negative strand, negative direction: 0.
# inverse complement, positive strand, negative direction: 2, CCAGGG at 3886, CCAGGG at 3565.
# inverse complement, negative strand, positive direction: 13, GCAGGG at 3663, CCAGGG at 3537, GCAGGG at 3467, CCAGGG at 2781, CCAGGG at 2575, GCAGGG at 2297, CCAGGG at 1894, GCAGGG at 1789, GCAGGG at 659, CCAGGG at 516, GCAGGG at 380, GCAGGG at 319, CCAGGG at 34.
# inverse complement, positive strand, positive direction: 2, CCAGGG at 4421, GCAGGG at 3204.
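
A degenerate consensus such as CCCTG(C/G) matches both CCCTGC and CCCTGG, and its inverse complement matches GCAGGG and CCAGGG, as in the list above. The following is a minimal Python sketch of that matching, not the Basic programs themselves. The function names are hypothetical, the toy sequence is made up, and reporting a hit by its 1-based start position is an assumption, since the source does not state its position convention.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def consensus_to_regex(consensus):
    """Turn '(C/G)' style alternatives into a regex character class '[CG]'."""
    return re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )

def inverse_complement_regex(consensus):
    """Complement every base and reverse the order, keeping alternatives grouped."""
    tokens = re.findall(r"\([ACGT/]+\)|[ACGT]", consensus)
    parts = []
    for token in reversed(tokens):
        bases = token.strip("()").replace("/", "").translate(COMPLEMENT)
        parts.append(bases if len(bases) == 1 else "[" + "".join(sorted(bases)) + "]")
    return "".join(parts)

def find_all(sequence, regex):
    """Return (1-based start, matched text) for every match, overlaps included."""
    lookahead = re.compile("(?=(" + regex + "))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]

if __name__ == "__main__":
    toy = "AACCCTGGTTGCAGGGA"  # made-up sequence, not the A1BG promoter
    print(find_all(toy, consensus_to_regex("CCCTG(C/G)")))
    print(find_all(toy, inverse_complement_regex("CCCTG(C/G)")))
```

The lookahead wrapper is a design choice so that overlapping occurrences are not skipped.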
 
===Shue (4560-2846) UTRs===
 
# Negative strand, negative direction: CCCTGG at 3868.
# Positive strand, negative direction: CCCTGG at 4494, CCCTGG at 3744, CCCTGC at 3579.
# Positive strand, negative direction: CCAGGG at 3886, CCAGGG at 3565.
 
===Shue positive direction (4445-4265) core promoters===
 
# Negative strand, positive direction: CCCTGG at 4424.
# Positive strand, positive direction: CCAGGG at 4421.
 
===Shue positive direction (4265-4050) proximal promoters===
 
# Negative strand, positive direction: CCCTGC at 4231.
 
===Shue negative direction (2596-1) distal promoters===
 
# Negative strand, negative direction: CCCTGC at 1151.
 
===Shue positive direction (4050-1) distal promoters===
 
# Negative strand, positive direction: CCCTGG at 3496, CCCTGG at 1815, CCCTGC at 1075, CCCTGC at 191.
# Negative strand, positive direction: GCAGGG at 3663, CCAGGG at 3537, GCAGGG at 3467, CCAGGG at 2781, CCAGGG at 2575, GCAGGG at 2297, CCAGGG at 1894, GCAGGG at 1789, GCAGGG at 659, CCAGGG at 516, GCAGGG at 380, GCAGGG at 319, CCAGGG at 34.
# Positive strand, positive direction: CCCTGG at 3545, CCCTGG at 3172, CCCTGC at 410, CCCTGC at 323, CCCTGG at 37.
# Positive strand, positive direction: GCAGGG at 3204.
 
==Shue box random dataset samplings==
 
# Shuer0: 4, CCCTGG at 4268, CCCTGG at 3641, CCCTGG at 2435, CCCTGG at 682.
# Shuer1: 2, CCCTGC at 4090, CCCTGG at 1740.
# Shuer2: 1, CCCTGC at 2557.
# Shuer3: 2, CCCTGC at 3410, CCCTGC at 2138.
# Shuer4: 10, CCCTGG at 3231, CCCTGG at 3109, CCCTGG at 2861, CCCTGG at 2756, CCCTGG at 1481, CCCTGG at 1168, CCCTGG at 1153, CCCTGG at 970, CCCTGG at 203, CCCTGG at 84.
# Shuer5: 2, CCCTGG at 4384, CCCTGG at 917.
# Shuer6: 3, CCCTGC at 3875, CCCTGC at 821, CCCTGC at 51.
# Shuer7: 1, CCCTGG at 782.
# Shuer8: 2, CCCTGG at 3140, CCCTGG at 2353.
# Shuer9: 3, CCCTGG at 3000, CCCTGG at 2225, CCCTGG at 1829.
# Shuer0ci: 2, CCAGGG at 3744, CCAGGG at 66.
# Shuer1ci: 5, GCAGGG at 3932, GCAGGG at 3740, GCAGGG at 2653, GCAGGG at 1982, CCAGGG at 668.
# Shuer2ci: 3, CCAGGG at 4272, CCAGGG at 2280, CCAGGG at 1572.
# Shuer3ci: 6, GCAGGG at 3332, CCAGGG at 1878, CCAGGG at 1589, GCAGGG at 1413, GCAGGG at 817, CCAGGG at 525.
# Shuer4ci: 2, CCAGGG at 4559, CCAGGG at 3648.
# Shuer5ci: 9, CCAGGG at 4369, GCAGGG at 3907, CCAGGG at 3503, GCAGGG at 3481, CCAGGG at 3229, GCAGGG at 1514, GCAGGG at 1306, CCAGGG at 768, CCAGGG at 332.
# Shuer6ci: 6, CCAGGG at 3885, CCAGGG at 3515, GCAGGG at 3138, GCAGGG at 2792, CCAGGG at 1755, CCAGGG at 963.
# Shuer7ci: 1, CCAGGG at 2343.
# Shuer8ci: 4, GCAGGG at 3535, GCAGGG at 2681, GCAGGG at 1697, CCAGGG at 1591.
# Shuer9ci: 3, GCAGGG at 4366, GCAGGG at 2212, GCAGGG at 2173.
 
===Shuer arbitrary (evens) (4560-2846) UTRs===
 
# Shuer0: CCCTGG at 4268, CCCTGG at 3641.
# Shuer4: CCCTGG at 3231, CCCTGG at 3109, CCCTGG at 2861.
# Shuer6: CCCTGC at 3875.
# Shuer8: CCCTGG at 3140.
# Shuer0ci: CCAGGG at 3744.
# Shuer2ci: CCAGGG at 4272.
# Shuer4ci: CCAGGG at 4559, CCAGGG at 3648.
# Shuer6ci: CCAGGG at 3885, CCAGGG at 3515, GCAGGG at 3138.
# Shuer8ci: GCAGGG at 3535.
 
===Shuer alternate (odds) (4560-2846) UTRs===
 
# Shuer1: CCCTGC at 4090.
# Shuer3: CCCTGC at 3410.
# Shuer5: CCCTGG at 4384.
# Shuer9: CCCTGG at 3000.
# Shuer1ci: GCAGGG at 3932, GCAGGG at 3740.
# Shuer3ci: GCAGGG at 3332.
# Shuer5ci: CCAGGG at 4369, GCAGGG at 3907, CCAGGG at 3503, GCAGGG at 3481, CCAGGG at 3229.
# Shuer9ci: GCAGGG at 4366.
 
===Shuer arbitrary positive direction (odds) (4445-4265) core promoters===
 
# Shuer5: CCCTGG at 4384.
# Shuer5ci: CCAGGG at 4369.
# Shuer9ci: GCAGGG at 4366.
 
===Shuer alternate positive direction (evens) (4445-4265) core promoters===
 
# Shuer0: CCCTGG at 4268.
# Shuer2ci: CCAGGG at 4272.


===Shuer arbitrary negative direction (evens) (2811-2596) proximal promoters===
# Shuer4: CCCTGG at 2756.
# Shuer6ci: GCAGGG at 2792.
# Shuer8ci: GCAGGG at 2681.


===Shuer alternate negative direction (odds) (2811-2596) proximal promoters===


# Shuer1ci: GCAGGG at 2653.


===Shuer arbitrary positive direction (odds) (4265-4050) proximal promoters===


# Shuer1: CCCTGC at 4090.


===Shuer arbitrary negative direction (evens) (2596-1) distal promoters===


# Shuer0: CCCTGG at 2435, CCCTGG at 682.
# Shuer2: CCCTGC at 2557.
# Shuer4: CCCTGG at 1481, CCCTGG at 1168, CCCTGG at 1153, CCCTGG at 970, CCCTGG at 203, CCCTGG at 84.
# Shuer6: CCCTGC at 821, CCCTGC at 51.
# Shuer8: CCCTGG at 2353.
# Shuer0ci: CCAGGG at 66.
# Shuer2ci: CCAGGG at 2280, CCAGGG at 1572.
# Shuer6ci: CCAGGG at 1755, CCAGGG at 963.
# Shuer8ci: GCAGGG at 1697, CCAGGG at 1591.


===Shuer alternate negative direction (odds) (2596-1) distal promoters===


# Shuer1: CCCTGG at 1740.
# Shuer3: CCCTGC at 2138.
# Shuer5: CCCTGG at 917.
# Shuer7: CCCTGG at 782.
# Shuer9: CCCTGG at 2225, CCCTGG at 1829.
# Shuer1ci: GCAGGG at 1982, CCAGGG at 668.
# Shuer3ci: CCAGGG at 1878, CCAGGG at 1589, GCAGGG at 1413, GCAGGG at 817, CCAGGG at 525.
# Shuer5ci: GCAGGG at 1514, GCAGGG at 1306, CCAGGG at 768, CCAGGG at 332.
# Shuer7ci: CCAGGG at 2343.
# Shuer9ci: GCAGGG at 2212, GCAGGG at 2173.


===Shuer arbitrary positive direction (odds) (4050-1) distal promoters===
# Shuer1: CCCTGG at 1740.
# Shuer3: CCCTGC at 3410, CCCTGC at 2138.
# Shuer5: CCCTGG at 917.
# Shuer7: CCCTGG at 782.
# Shuer9: CCCTGG at 3000, CCCTGG at 2225, CCCTGG at 1829.
# Shuer1ci: GCAGGG at 3932, GCAGGG at 3740, GCAGGG at 2653, GCAGGG at 1982, CCAGGG at 668.
# Shuer3ci: GCAGGG at 3332, CCAGGG at 1878, CCAGGG at 1589, GCAGGG at 1413, GCAGGG at 817, CCAGGG at 525.
# Shuer4ci: CCAGGG at 3648.
# Shuer5ci: GCAGGG at 3907, CCAGGG at 3503, GCAGGG at 3481, CCAGGG at 3229, GCAGGG at 1514, GCAGGG at 1306, CCAGGG at 768, CCAGGG at 332.
# Shuer7ci: CCAGGG at 2343.
# Shuer9ci: GCAGGG at 2212, GCAGGG at 2173.


===Shuer alternate positive direction (evens) (4050-1) distal promoters===


# Shuer0: CCCTGG at 3641, CCCTGG at 2435, CCCTGG at 682.
# Shuer2: CCCTGC at 2557.
# Shuer4: CCCTGG at 3231, CCCTGG at 3109, CCCTGG at 2861, CCCTGG at 2756, CCCTGG at 1481, CCCTGG at 1168, CCCTGG at 1153, CCCTGG at 970, CCCTGG at 203, CCCTGG at 84.
# Shuer6: CCCTGC at 3875, CCCTGC at 821, CCCTGC at 51.
# Shuer8: CCCTGG at 3140, CCCTGG at 2353.
# Shuer0ci: CCAGGG at 3744, CCAGGG at 66.
# Shuer2ci: CCAGGG at 4272, CCAGGG at 2280, CCAGGG at 1572.
# Shuer6ci: CCAGGG at 3885, CCAGGG at 3515, GCAGGG at 3138, GCAGGG at 2792, CCAGGG at 1755, CCAGGG at 963.
# Shuer8ci: GCAGGG at 3535, GCAGGG at 2681, GCAGGG at 1697, CCAGGG at 1591.


==Shue box analysis and results==
{{main|Complex locus A1BG and ZNF497#Shues}}
The sequence CCCTG(C/G) was conserved in all four subunits.<ref name=Crowder/>


{|class="wikitable"
|-
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 6 || 2 || 3 || 3 ± 2 (--1,+-5)
|-
| Randoms || UTR || arbitrary negative || 15 || 10 || 1.5 || 1.4
|-
| Randoms || UTR || alternate negative || 13 || 10 || 1.3 || 1.4
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || positive || 2 || 2 || 1 || 1 ± 0 (-+1,++1)
|-
| Randoms || Core || arbitrary positive || 3 || 10 || 0.3 || 0.25
|-
| Randoms || Core || alternate positive || 2 || 10 || 0.2 || 0.25
|-
| Reals || Proximal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary negative || 3 || 10 || 0.3 || 0.2
|-
| Randoms || Proximal || alternate negative || 1 || 10 || 0.1 || 0.2
|-
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5 ± 0.5 (-+1,++0)
|-
| Randoms || Proximal || arbitrary positive || 1 || 10 || 0.1 || 0.05
|-
| Randoms || Proximal || alternate positive || 0 || 10 || 0 || 0.05
|-
| Reals || Distal || negative || 1 || 2 || 0.5 || 0.5 ± 0.5 (--1,+-0)
|-
| Randoms || Distal || arbitrary negative || 19 || 10 || 1.9 || 1.95
|-
| Randoms || Distal || alternate negative || 20 || 10 || 2.0 || 1.95
|-
| Reals || Distal || positive || 23 || 2 || 11.5 || 11.5 ± 5.5 (-+17,++6)  
|-
| Randoms || Distal || arbitrary positive || 31 || 10 || 3.1 || 3.25
|-
| Randoms || Distal || alternate positive || 34 || 10 || 3.4 || 3.25
|}


Comparison:


The occurrences of real Shue UTRs are outside the range of the randoms; the real cores, proximals, and positive distals are greater than the randoms; the real negative distals are less than the randoms. This suggests that the real Shues are likely active or activable.


==Sp1 element samplings==


The Basic programs testing consensus sequence GGGGCGGGT (starting with SuccessablesSp1B.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:
# negative strand, negative direction: 0.
# positive strand, negative direction: 0.
# negative strand, positive direction: 0.
# positive strand, positive direction: 0.
# inverse complement, negative strand, negative direction: 0.
# inverse complement, positive strand, negative direction: 0.
# inverse complement, negative strand, positive direction: 0.
# inverse complement, positive strand, positive direction: 0.
 
==AP-2 (Roesler) samplings==
 
The skeletal muscle acetylcholine receptor (AchR) promoter of the mouse has four copies of CCCCACC(A/C), where the repeats of CCCCACCC have perfect homology with two known AP-2-binding sites.<ref name=Roesler>Roesler, W. J., G. R. Vandenbark, and R. W. Hanson (1988), "Cyclic AMP and the induction of eukaryotic gene transcription" J. Biol. Chem. 263:9063-9066.</ref>
 
The Basic programs testing consensus sequence CCCCACC(A/C) (starting with SuccessablesAP2R.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:
# negative strand, negative direction: 0.
# positive strand, negative direction: 0.
# negative strand, positive direction: 0.
# positive strand, positive direction: 0.
# inverse complement, negative strand, negative direction: 0.
# inverse complement, positive strand, negative direction: 0.
# inverse complement, negative strand, positive direction: 1, GGGTGGGG at 3542.
# inverse complement, positive strand, positive direction: 1, GGGTGGGG at 2019.
 
===AP2R positive direction (4050-1) distal promoters===
 
# Negative strand, positive direction: GGGTGGGG at 3542.
# Positive strand, positive direction: GGGTGGGG at 2019.
 
==AP2R random dataset samplings==
 
# AP2Rr0: 0.
# AP2Rr1: 1, CCCCACCC at 1922.
# AP2Rr2: 0.
# AP2Rr3: 1, CCCCACCA at 35.
# AP2Rr4: 0.
# AP2Rr5: 0.
# AP2Rr6: 0.
# AP2Rr7: 1, CCCCACCA at 1601.
# AP2Rr8: 0.
# AP2Rr9: 0.
# AP2Rr0ci: 0.
# AP2Rr1ci: 0.
# AP2Rr2ci: 0.
# AP2Rr3ci: 0.
# AP2Rr4ci: 0.
# AP2Rr5ci: 0.
# AP2Rr6ci: 1, GGGTGGGG at 3741.
# AP2Rr7ci: 2, TGGTGGGG at 3077, GGGTGGGG at 134.
# AP2Rr8ci: 1, GGGTGGGG at 100.
# AP2Rr9ci: 0.
 
===AP2Rr arbitrary (evens) (4560-2846) UTRs===
 
# AP2Rr6ci: GGGTGGGG at 3741.
 
===AP2Rr alternate (odds) (4560-2846) UTRs===
 
# AP2Rr7ci: TGGTGGGG at 3077.
 
===AP2Rr arbitrary negative direction (evens) (2596-1) distal promoters===
 
# AP2Rr8ci: GGGTGGGG at 100.
 
===AP2Rr alternate negative direction (odds) (2596-1) distal promoters===
 
# AP2Rr1: CCCCACCC at 1922.
# AP2Rr3: CCCCACCA at 35.
# AP2Rr7: CCCCACCA at 1601.
# AP2Rr7ci: GGGTGGGG at 134.


===AP2Rr arbitrary positive direction (odds) (4050-1) distal promoters===
# AP2Rr1: CCCCACCC at 1922.
# AP2Rr3: CCCCACCA at 35.
# AP2Rr7: CCCCACCA at 1601.
# AP2Rr7ci: TGGTGGGG at 3077, GGGTGGGG at 134.


===AP2Rr alternate positive direction (evens) (4050-1) distal promoters===


# AP2Rr6ci: GGGTGGGG at 3741.
# AP2Rr8ci: GGGTGGGG at 100.


==AP2R analysis and results==
{{main|Complex locus A1BG and ZNF497#AP2Rs}}
The skeletal muscle acetylcholine receptor (AchR) promoter of the mouse has four copies of CCCCACC(A/C), where the repeats of CCCCACCC have perfect homology with two known AP-2-binding sites.<ref name=Roesler/>


{|class="wikitable"
|-
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 0 || 2 || 0 || 0
|-
| Randoms || UTR || arbitrary negative || 1 || 10 || 0.1 || 0.1
|-
| Randoms || UTR || alternate negative || 1 || 10 || 0.1 || 0.1
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || positive || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate positive || 0 || 10 || 0 || 0
|-
| Reals || Proximal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Proximal || positive || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0
|-
| Randoms || Proximal || alternate positive || 0 || 10 || 0 || 0
|-
| Reals || Distal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Distal || arbitrary negative || 1 || 10 || 0.1 || 0.25
|-
| Randoms || Distal || alternate negative || 4 || 10 || 0.4 || 0.25
|-
| Reals || Distal || positive || 2 || 2 || 1 || 1 ± 0 (-+1,++1)  
|-
| Randoms || Distal || arbitrary positive || 5 || 10 || 0.5 || 0.35 ± 0.15
|-
| Randoms || Distal || alternate positive || 2 || 10 || 0.2 || 0.35 ± 0.15
|}


Comparison:


The occurrences of real AP2R distals are greater than the randoms. This suggests that the real AP2R distals are likely active or activable.


==Acknowledgements==

Latest revision as of 01:39, 8 September 2023

Associate Editor(s)-in-Chief: Henry A. Hoff

Human genes

The "four cystatin genes [GeneID: 1469 CST1, GeneID: 1470 CST2, GeneID: 1471 CST3, and GeneID: 1472 CST4] contain the ATA-box sequence (ATAAA) in their 5'-flanking regions; however, the CAT-box sequence (CAT), a binding site of the transcription factor, CTF, is found only in the 5'-flanking region of the S-type cystatin genes."[1]

Gene expressions

The "5' flanking region of the rat acetylcholine receptor (AChR) β subunit gene [with] regulatory elements that confer muscle specificity [includes] a minimal TATA-box-less promoter region containing an initiator motif. An 85-bp fragment [promotes] high muscle-specific expression of a chloramphenicol acetyltransferase (CAT) reporter construct upon transfection in primary muscle cells. This sequence can be functionally dissected in a basal muscle-specific promoter element carrying a M-CAT box that is flanked at the 5' end by an enhancer element with two binding sites for myogenic factors. Point mutations in the M-CAT box cause the loss of transcriptional activity of the basal promoter fragment. The enhancer activity depends on the presence of both E boxes that cooperate in a synergistic fashion. [The] control of muscle-specific and developmental expression of the rat AChR β subunit gene requires both regulatory elements, the M-CAT box and two adjacent E boxes, located in close proximity to each other."[2]

Interactions

The "minimal regulatory region of the 5' flanking sequence contains E box elements that are defined by the nucleotides CANNTG [26, 27]. E boxes are shown to provide binding sites for helix-loop-helix proteins of the MyoD1 family including MyoD1 [28], myogenin [29, 30], MRF4/herculin [31] and myf5 [32]."[2]

Enhancer activity

"Partial sequence of the 5' flanking region of the rat AChR β subunit gene [contains] putative E box element [CAGGTG], putative Sp1 element [GGGGCGGGT at -85 nts], putative Shue box element [CCCTGGCCTGG at -15 nts], M-CAT box element [GCGGCCTC at -8 nts]."[2]

"Within the first 140bp of the 5' flanking region the position and sequence of three other putative regulatory elements, the Sp1 [43, 44], M-CAT [34] and Shue box [45], are conserved between mouse and rat".[2]

Consensus sequences

"The M-CAT consensus sequence [is] CATTCCT".[2]

Promoter occurrences

"A CAT-box-like element, GCCATT [34], adjacent to the GC-box, is conserved in the three promoters."[2]

Hypotheses

  1. A1BG has no CAT boxes in either promoter.
  2. A1BG is not transcribed by a CAT box.
  3. CAT box does not participate in the transcription of A1BG.

CAT box samplings

Copying the CAT box consensus sequence 5'-CATTCCT-3' and pasting it into "⌘F" finds one between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can also be found with the computer programs.

The Basic programs testing consensus sequence 5'-CATTCCT-3' (starting with SuccessablesCAT.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs searched for and found the following:

  1. negative strand, negative direction, looking for 5'-CATTCCT-3', 0.
  2. negative strand, positive direction, looking for 5'-CATTCCT-3', 1, 5'-CATTCCT-3' at 2209, and complement.
  3. positive strand, negative direction, looking for 5'-CATTCCT-3', 0.
  4. positive strand, positive direction, looking for 5'-CATTCCT-3', 1, 5'-CATTCCT-3' at 2458, and complement.
  5. complement, negative strand, negative direction, looking for 5'-GTAAGGA-3', 0.
  6. complement, negative strand, positive direction, looking for 5'-GTAAGGA-3', 1, 5'-GTAAGGA-3' at 2458.
  7. complement, positive strand, negative direction, looking for 5'-GTAAGGA-3', 0.
  8. complement, positive strand, positive direction, looking for 5'-GTAAGGA-3', 1, 5'-GTAAGGA-3' at 2209.
  9. inverse complement, negative strand, negative direction, looking for 5'-AGGAATG-3', 0.
  10. inverse complement, negative strand, positive direction, looking for 5'-AGGAATG-3', 0.
  11. inverse complement, positive strand, negative direction, looking for 5'-AGGAATG-3', 1, 5'-AGGAATG-3' at 4554.
  12. inverse complement, positive strand, positive direction, looking for 5'-AGGAATG-3', 0.
  13. inverse negative strand, negative direction, looking for 5'-TCCTTAC-3', 1, 5'-TCCTTAC-3' at 4554.
  14. inverse negative strand, positive direction, looking for 5'-TCCTTAC-3', 0.
  15. inverse positive strand, negative direction, looking for 5'-TCCTTAC-3', 0.
  16. inverse positive strand, positive direction, looking for 5'-TCCTTAC-3', 0.
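
The sixteen searches above use four forms of the same consensus: the sequence itself, its complement, its inverse complement, and its inverse. The following Python sketch derives those four forms and scans a sequence for each. It is an illustration under stated assumptions, not the SuccessablesCAT.bas program: the function names and the toy sequence are hypothetical, and positions are reported as 1-based start positions, which may differ from the convention used for the numbers above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def variants(consensus):
    """Return the four search patterns derived from one consensus sequence."""
    complement = consensus.translate(COMPLEMENT)
    return {
        "direct": consensus,
        "complement": complement,
        "inverse complement": complement[::-1],
        "inverse": consensus[::-1],
    }

def find_all(sequence, pattern):
    """Return the 1-based start position of every match, overlaps included."""
    hits = []
    start = sequence.find(pattern)
    while start != -1:
        hits.append(start + 1)
        start = sequence.find(pattern, start + 1)
    return hits

if __name__ == "__main__":
    toy = "TTCATTCCTAAAGGAATGCC"  # made-up sequence, not the A1BG promoter
    for label, pattern in variants("CATTCCT").items():
        print(label, pattern, find_all(toy, pattern))
```

Running each pattern against each strand and direction of the real sequence would give the sixteen combinations listed above.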

CAT (4560-2846) UTRs

  1. Positive strand, negative direction: AGGAATG at 4554.

CAT positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CATTCCT at 2209.
  2. Positive strand, positive direction: CATTCCT at 2458.
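
Each hit is assigned to a region by its position and direction, using the ranges quoted in the headings. The following Python sketch shows one way to do that lookup. The boundary handling is an assumption: the headings share endpoints (for example 2846 ends the UTR and starts the negative-direction core), so here both endpoints are inclusive and the first listed region wins. The names `REGIONS` and `classify` are hypothetical.

```python
# Ranges copied from the section headings; positions run from 4560 down to 1.
REGIONS = {
    "negative": [("UTR", 4560, 2846), ("core", 2846, 2811),
                 ("proximal", 2811, 2596), ("distal", 2596, 1)],
    "positive": [("core", 4445, 4265), ("proximal", 4265, 4050),
                 ("distal", 4050, 1)],
}

def classify(position, direction):
    """Return the region name for a hit, or None if no heading covers it."""
    for name, upper, lower in REGIONS[direction]:
        if lower <= position <= upper:  # assumption: inclusive, first match wins
            return name
    return None

if __name__ == "__main__":
    print(classify(4554, "negative"))  # AGGAATG hit in the negative direction
    print(classify(2209, "positive"))  # CATTCCT hit in the positive direction
    print(classify(2458, "positive"))
```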

CAT box random dataset samplings

  1. CATr0: 0.
  2. CATr1: 0.
  3. CATr2: 0.
  4. CATr3: 1, CATTCCT at 3089.
  5. CATr4: 1, CATTCCT at 1553.
  6. CATr5: 1, CATTCCT at 985.
  7. CATr6: 0.
  8. CATr7: 0.
  9. CATr8: 0.
  10. CATr9: 0.
  11. CATr0ci: 0.
  12. CATr1ci: 0.
  13. CATr2ci: 1, AGGAATG at 4356.
  14. CATr3ci: 0.
  15. CATr4ci: 1, AGGAATG at 2701.
  16. CATr5ci: 0.
  17. CATr6ci: 0.
  18. CATr7ci: 0.
  19. CATr8ci: 1, AGGAATG at 157.
  20. CATr9ci: 1, AGGAATG at 3677.

CATr arbitrary (evens) (4560-2846) UTRs

  1. CATr2ci: AGGAATG at 4356.

CATr alternate (odds) (4560-2846) UTRs

  1. CATr3: CATTCCT at 3089.
  2. CATr9ci: AGGAATG at 3677.

CATr alternate positive direction (evens) (4445-4265) core promoters

  1. CATr2ci: AGGAATG at 4356.

CATr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. CATr4ci: AGGAATG at 2701.

CATr arbitrary negative direction (evens) (2596-1) distal promoters

  1. CATr4: CATTCCT at 1553.
  2. CATr8ci: AGGAATG at 157.

CATr alternate negative direction (odds) (2596-1) distal promoters

  1. CATr5: CATTCCT at 985.

CATr arbitrary positive direction (odds) (4050-1) distal promoters

  1. CATr3: CATTCCT at 3089.
  2. CATr5: CATTCCT at 985.
  3. CATr9ci: AGGAATG at 3677.

CATr alternate positive direction (evens) (4050-1) distal promoters

  1. CATr4: CATTCCT at 1553.
  2. CATr8ci: AGGAATG at 157.

CAT box analysis and results

"The M-CAT consensus sequence [is] CATTCCT".[2]

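{|class="wikitable"
|-
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 1 || 2 || 0.5 || 0.5
|-
| Randoms || UTR || arbitrary negative || 1 || 10 || 0.1 || 0.15
|-
| Randoms || UTR || alternate negative || 2 || 10 || 0.2 || 0.15
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || positive || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate positive || 1 || 10 || 0.1 || 0.05
|-
| Reals || Proximal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary negative || 1 || 10 || 0.1 || 0.05
|-
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0.05
|-
| Reals || Proximal || positive || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0
|-
| Randoms || Proximal || alternate positive || 0 || 10 || 0 || 0
|-
| Reals || Distal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Distal || arbitrary negative || 2 || 10 || 0.2 || 0.15
|-
| Randoms || Distal || alternate negative || 1 || 10 || 0.1 || 0.15
|-
| Reals || Distal || positive || 2 || 2 || 1 || 1
|-
| Randoms || Distal || arbitrary positive || 3 || 10 || 0.3 || 0.25
|-
| Randoms || Distal || alternate positive || 2 || 10 || 0.2 || 0.25
|}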

Comparison:

The occurrences of real CATs in the UTRs (0.5 versus 0.15) and in the positive direction distal promoters (1 versus 0.25) are greater than the randoms. This suggests that the real CATs are likely active or activable.
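
The Occurrences and Averages columns follow from simple arithmetic: occurrences are the number of hits divided by the number of strands or datasets, and the random average is the mean of the arbitrary and alternate occurrences. The following Python sketch reproduces that arithmetic for the UTR and positive direction distal rows. The function names are hypothetical, and the formula is inferred from the table values rather than taken from the Basic programs.

```python
def occurrences(numbers, strands):
    """Hits per strand (reals) or per random dataset (randoms)."""
    return numbers / strands

def random_average(arbitrary, alternate, strands=10):
    """Mean of the arbitrary and alternate random occurrences."""
    return (arbitrary + alternate) / (2 * strands)

if __name__ == "__main__":
    rows = {
        # region: (real hits, real strands, arbitrary random hits, alternate random hits)
        "UTR negative": (1, 2, 1, 2),
        "Distal positive": (2, 2, 3, 2),
    }
    for region, (real, strands, arbitrary, alternate) in rows.items():
        real_occurrences = occurrences(real, strands)
        average = random_average(arbitrary, alternate)
        print(region, real_occurrences, average, real_occurrences > average)
```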

CAT-box-like element samplings

Copying the CAT-box-like element consensus sequence GCCATT and pasting it into "⌘F" finds none between ZNF497 and A1BG and two between ZSCAN22 and A1BG, as can also be found with the computer programs.

The Basic programs testing consensus sequence GCCATT (starting with SuccessablesCATble.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs searched for and found the following:

  1. negative strand, negative direction, looking for GCCATT, 0.
  2. positive strand, negative direction, looking for GCCATT, 2, GCCATT at 3686, GCCATT at 3284.
  3. positive strand, positive direction, looking for GCCATT, 0.
  4. negative strand, positive direction, looking for GCCATT, 0.
  5. inverse complement, negative strand, negative direction, looking for AATGGC, 0.
  6. inverse complement, positive strand, negative direction, looking for AATGGC, 2, AATGGC at 3005, AATGGC at 1949.
  7. inverse complement, positive strand, positive direction, looking for AATGGC, 0.
  8. inverse complement, negative strand, positive direction, looking for AATGGC, 0.

CATble (4560-2846) UTRs

  1. Positive strand, negative direction: GCCATT at 3686, GCCATT at 3284, AATGGC at 3005.

CATble negative direction (2596-1) distal promoters

  1. Positive strand, negative direction: AATGGC at 1949.

CAT-box-like element random dataset samplings

  1. CATbler0: 3, GCCATT at 4001, GCCATT at 2389, GCCATT at 1184.
  2. CATbler1: 2, GCCATT at 3407, GCCATT at 1908.
  3. CATbler2: 1, GCCATT at 1471.
  4. CATbler3: 2, GCCATT at 1459, GCCATT at 401.
  5. CATbler4: 3, GCCATT at 3574, GCCATT at 3025, GCCATT at 1550.
  6. CATbler5: 3, GCCATT at 2803, GCCATT at 1202, GCCATT at 292.
  7. CATbler6: 3, GCCATT at 3997, GCCATT at 3726, GCCATT at 2292.
  8. CATbler7: 2, GCCATT at 2444, GCCATT at 2270.
  9. CATbler8: 2, GCCATT at 4021, GCCATT at 2461.
  10. CATbler9: 0.
  11. CATbler0ci: 1, AATGGC at 4239.
  12. CATbler1ci: 1, AATGGC at 3940.
  13. CATbler2ci: 1, AATGGC at 429.
  14. CATbler3ci: 2, AATGGC at 3370, AATGGC at 2694.
  15. CATbler4ci: 3, AATGGC at 3265, AATGGC at 1365, AATGGC at 1076.
  16. CATbler5ci: 0.
  17. CATbler6ci: 3, AATGGC at 1958, AATGGC at 251, AATGGC at 74.
  18. CATbler7ci: 1, AATGGC at 3389.
  19. CATbler8ci: 4, AATGGC at 3107, AATGGC at 2423, AATGGC at 640, AATGGC at 159.
  20. CATbler9ci: 4, AATGGC at 4343, AATGGC at 3788, AATGGC at 1724, AATGGC at 333.

CATbler arbitrary (evens) (4560-2846) UTRs

  1. CATbler0: GCCATT at 4001.
  2. CATbler4: GCCATT at 3574, GCCATT at 3025.
  3. CATbler6: GCCATT at 3997, GCCATT at 3726.
  4. CATbler8: GCCATT at 4021.
  5. CATbler0ci: AATGGC at 4239.
  6. CATbler4ci: AATGGC at 3265.
  7. CATbler8ci: AATGGC at 3107.

CATbler alternate (odds) (4560-2846) UTRs

  1. CATbler1: GCCATT at 3407.
  2. CATbler1ci: AATGGC at 3940.
  3. CATbler3ci: AATGGC at 3370.
  4. CATbler7ci: AATGGC at 3389.
  5. CATbler9ci: AATGGC at 4343, AATGGC at 3788.

CATbler arbitrary positive direction (odds) (4445-4265) core promoters

  1. CATbler9ci: AATGGC at 4343.

CATbler alternate negative direction (odds) (2811-2596) proximal promoters

  1. CATbler5: GCCATT at 2803.
  2. CATbler3ci: AATGGC at 2694.

CATbler alternate positive direction (evens) (4265-4050) proximal promoters

  1. CATbler0ci: AATGGC at 4239.

CATbler arbitrary negative direction (evens) (2596-1) distal promoters

  1. CATbler0: GCCATT at 2389, GCCATT at 1184.
  2. CATbler2: GCCATT at 1471.
  3. CATbler4: GCCATT at 1550.
  4. CATbler6: GCCATT at 2292.
  5. CATbler8: GCCATT at 2461.
  6. CATbler2ci: AATGGC at 429.
  7. CATbler4ci: AATGGC at 1365, AATGGC at 1076.
  8. CATbler6ci: AATGGC at 1958, AATGGC at 251, AATGGC at 74.
  9. CATbler8ci: AATGGC at 2423, AATGGC at 640, AATGGC at 159.

CATbler alternate negative direction (odds) (2596-1) distal promoters

  1. CATbler1: GCCATT at 1908.
  2. CATbler3: GCCATT at 1459, GCCATT at 401.
  3. CATbler5: GCCATT at 1202, GCCATT at 292.
  4. CATbler7: GCCATT at 2444, GCCATT at 2270.
  5. CATbler9ci: AATGGC at 1724, AATGGC at 333.

CATbler arbitrary positive direction (odds) (4050-1) distal promoters

  1. CATbler1: GCCATT at 3407, GCCATT at 1908.
  2. CATbler3: GCCATT at 1459, GCCATT at 401.
  3. CATbler5: GCCATT at 2803, GCCATT at 1202, GCCATT at 292.
  4. CATbler7: GCCATT at 2444, GCCATT at 2270.
  5. CATbler1ci: AATGGC at 3940.
  6. CATbler3ci: AATGGC at 3370, AATGGC at 2694.
  7. CATbler7ci: AATGGC at 3389.
  8. CATbler9ci: AATGGC at 3788, AATGGC at 1724, AATGGC at 333.

CATbler alternate positive direction (evens) (4050-1) distal promoters

  1. CATbler0: GCCATT at 4001, GCCATT at 2389, GCCATT at 1184.
  2. CATbler2: GCCATT at 1471.
  3. CATbler4: GCCATT at 3574, GCCATT at 3025, GCCATT at 1550.
  4. CATbler6: GCCATT at 3997, GCCATT at 3726, GCCATT at 2292.
  5. CATbler8: GCCATT at 4021, GCCATT at 2461.
  6. CATbler2ci: AATGGC at 429.
  7. CATbler4ci: AATGGC at 3265, AATGGC at 1365, AATGGC at 1076.
  8. CATbler6ci: AATGGC at 1958, AATGGC at 251, AATGGC at 74.
  9. CATbler8ci: AATGGC at 3107, AATGGC at 2423, AATGGC at 640, AATGGC at 159.

CAT box like analysis and results

"A CAT-box-like element, GCCATT [34], adjacent to the GC-box, is conserved in the three promoters."[2]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 3 | 2 | 1.5 | 1.5
Randoms | UTR | arbitrary negative | 9 | 10 | 0.9 | 0.75
Randoms | UTR | alternate negative | 6 | 10 | 0.6 | 0.75
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Core | alternate negative | 0 | 10 | 0 | 0
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05
Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0.1
Randoms | Proximal | alternate negative | 2 | 10 | 0.2 | 0.1
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0.05
Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.05
Reals | Distal | negative | 1 | 2 | 0.5 | 0.5
Randoms | Distal | arbitrary negative | 15 | 10 | 1.5 | 1.2 ± 0.3
Randoms | Distal | alternate negative | 9 | 10 | 0.9 | 1.2 ± 0.3
Reals | Distal | positive | 0 | 2 | 0 | 0
Randoms | Distal | arbitrary positive | 16 | 10 | 1.6 | 1.9 ± 0.3
Randoms | Distal | alternate positive | 22 | 10 | 2.2 | 1.9 ± 0.3

Comparison:

The occurrences of real CATbles are greater than the randoms in the UTR and less than the randoms in the distal promoters. This suggests that the real CATbles are likely active or activable.
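
The arithmetic behind the Occurrences and Averages columns can be sketched as follows. This is a reading inferred from the numbers in the table, not a description of the author's Basic programs: occurrences are taken as hits divided by strands sampled, the average as the mean of the arbitrary and alternate occurrences, and the ± value as half their difference. The function names are hypothetical.

```python
def occurrences(count, strands):
    """Occurrences per strand: total hits divided by the number of strands sampled."""
    return count / strands


def mean_and_half_range(a, b):
    """Mean of two occurrence values and half the spread between them (assumed meaning of ±)."""
    return round((a + b) / 2, 2), round(abs(a - b) / 2, 2)


# Distal promoters, negative direction, random datasets (numbers from the table above).
arbitrary = occurrences(15, 10)   # 1.5
alternate = occurrences(9, 10)    # 0.9
average, spread = mean_and_half_range(arbitrary, alternate)
print(f"{average} ± {spread}")    # 1.2 ± 0.3
```

The same two steps applied to the random UTR rows (9 and 6 hits over 10 strands each) give 0.9 and 0.6, with a mean of 0.75.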

M-CAT box samplings

The Basic programs (starting with SuccessablesMCAT.bas) test the consensus sequence GCGGCCTC by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:

  1. negative strand, negative direction: 0.
  2. positive strand, negative direction: 0.
  3. negative strand, positive direction: 0.
  4. positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 0.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.
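
A search of this kind can be sketched in Python. This is a minimal illustration, not the author's Basic code: the helper names and the short sample sequence are hypothetical, and positions are assumed to be 1-based start positions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def inverse_complement(seq):
    """Complement each base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def find_all(sequence, motif):
    """Return the 1-based start position of every match, overlapping ones included."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = sequence.find(motif, start + 1)
    return hits


consensus = "GCGGCCTC"
# Hypothetical stand-in sequence; the real promoter sequences are not reproduced here.
sample = "TTGCGGCCTCAAGAGGCCGCTT"
print(find_all(sample, consensus))                      # [3]
print(find_all(sample, inverse_complement(consensus)))  # [13]
```

Running the same two searches on each strand and in each direction would give the eight counts listed above; an empty list corresponds to a count of 0.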

Shue box samplings

The subunit homologous upstream element (SHUE) box homology is an 18-nucleotide sequence CAATCCCTGCCTGGGATC, where the sequence CCCTG(C/G) was conserved in all four subunits.[3]

The Basic programs (starting with SuccessablesShue.bas) test the consensus sequence CCCTG(C/G) by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:

  1. Negative strand, negative direction: 2, CCCTGG at 3868, CCCTGC at 1151.
  2. Positive strand, negative direction: 3, CCCTGG at 4494, CCCTGG at 3744, CCCTGC at 3579.
  3. Negative strand, positive direction: 6, CCCTGG at 4424, CCCTGC at 4231, CCCTGG at 3496, CCCTGG at 1815, CCCTGC at 1075, CCCTGC at 191.
  4. Positive strand, positive direction: 5, CCCTGG at 3545, CCCTGG at 3172, CCCTGC at 410, CCCTGC at 323, CCCTGG at 37.
  5. inverse complement, negative strand, negative direction: 0.
  6. inverse complement, positive strand, negative direction: 2, CCAGGG at 3886, CCAGGG at 3565.
  7. inverse complement, negative strand, positive direction: 13, GCAGGG at 3663, CCAGGG at 3537, GCAGGG at 3467, CCAGGG at 2781, CCAGGG at 2575, GCAGGG at 2297, CCAGGG at 1894, GCAGGG at 1789, GCAGGG at 659, CCAGGG at 516, GCAGGG at 380, GCAGGG at 319, CCAGGG at 34.
  8. inverse complement, positive strand, positive direction: 2, CCAGGG at 4421, GCAGGG at 3204.
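
A consensus with an alternative position, such as CCCTG(C/G), can be handled by expanding it into its concrete variants before searching. The sketch below is one way to do this in Python, under the assumption that alternatives are always written as (X/Y); the function names and the sample sequence are hypothetical and are not taken from the Basic programs.

```python
import itertools
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand(consensus):
    """List every concrete sequence described by a consensus such as 'CCCTG(C/G)'."""
    tokens = re.findall(r"\(([^)]+)\)|([ACGT])", consensus)
    choices = [alt.split("/") if alt else [base] for alt, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


def inverse_complement(seq):
    """Complement each base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def find_variants(sequence, variants):
    """Return (variant, 1-based start) pairs for every match, ordered by position."""
    hits = []
    for variant in variants:
        start = sequence.find(variant)
        while start != -1:
            hits.append((variant, start + 1))
            start = sequence.find(variant, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


forward = expand("CCCTG(C/G)")                       # ['CCCTGC', 'CCCTGG']
inverse = [inverse_complement(v) for v in forward]   # ['GCAGGG', 'CCAGGG']

# Hypothetical stand-in sequence, for illustration only.
sample = "AACCCTGGTTGCAGGGAACCCTGCTT"
print(find_variants(sample, forward))   # [('CCCTGG', 3), ('CCCTGC', 19)]
print(find_variants(sample, inverse))   # [('GCAGGG', 11)]
```

The two inverse-complement variants produced here, GCAGGG and CCAGGG, are the same two forms reported in the inverse complement results above.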

Shue (4560-2846) UTRs

  1. Negative strand, negative direction: CCCTGG at 3868.
  2. Positive strand, negative direction: CCCTGG at 4494, CCCTGG at 3744, CCCTGC at 3579.
  3. Inverse complement, positive strand, negative direction: CCAGGG at 3886, CCAGGG at 3565.

Shue positive direction (4445-4265) core promoters

  1. Negative strand, positive direction: CCCTGG at 4424.
  2. Inverse complement, positive strand, positive direction: CCAGGG at 4421.

Shue positive direction (4265-4050) proximal promoters

  1. Negative strand, positive direction: CCCTGC at 4231.

Shue negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: CCCTGC at 1151.

Shue positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CCCTGG at 3496, CCCTGG at 1815, CCCTGC at 1075, CCCTGC at 191.
  2. Inverse complement, negative strand, positive direction: GCAGGG at 3663, CCAGGG at 3537, GCAGGG at 3467, CCAGGG at 2781, CCAGGG at 2575, GCAGGG at 2297, CCAGGG at 1894, GCAGGG at 1789, GCAGGG at 659, CCAGGG at 516, GCAGGG at 380, GCAGGG at 319, CCAGGG at 34.
  3. Positive strand, positive direction: CCCTGG at 3545, CCCTGG at 3172, CCCTGC at 410, CCCTGC at 323, CCCTGG at 37.
  4. Inverse complement, positive strand, positive direction: GCAGGG at 3204.
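
Sorting hit positions into the regions named in the headings above can be sketched as a range lookup. The ranges are copied from those headings; the negative-direction core range is not quoted in this section and is left out. Treating both ends as inclusive and giving a shared boundary (such as 2596) to the first region listed are assumptions of this sketch, as are the names REGIONS and classify.

```python
# (high, low) position ranges as quoted in the section headings.
REGIONS = {
    "negative": {"UTR": (4560, 2846), "proximal": (2811, 2596), "distal": (2596, 1)},
    "positive": {"core": (4445, 4265), "proximal": (4265, 4050), "distal": (4050, 1)},
}


def classify(position, direction):
    """Return the first region whose range contains the position, or None if none does."""
    for name, (high, low) in REGIONS[direction].items():
        if low <= position <= high:
            return name
    return None


print(classify(3868, "negative"))  # UTR
print(classify(1151, "negative"))  # distal
print(classify(4424, "positive"))  # core
print(classify(4231, "positive"))  # proximal
print(classify(3496, "positive"))  # distal
```

The five positions are real Shue hits from the lists above, and each lands in the region under which it is listed.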

Shue box random dataset samplings

  1. Shuer0: 4, CCCTGG at 4268, CCCTGG at 3641, CCCTGG at 2435, CCCTGG at 682.
  2. Shuer1: 2, CCCTGC at 4090, CCCTGG at 1740.
  3. Shuer2: 1, CCCTGC at 2557.
  4. Shuer3: 2, CCCTGC at 3410, CCCTGC at 2138.
  5. Shuer4: 10, CCCTGG at 3231, CCCTGG at 3109, CCCTGG at 2861, CCCTGG at 2756, CCCTGG at 1481, CCCTGG at 1168, CCCTGG at 1153, CCCTGG at 970, CCCTGG at 203, CCCTGG at 84.
  6. Shuer5: 2, CCCTGG at 4384, CCCTGG at 917.
  7. Shuer6: 3, CCCTGC at 3875, CCCTGC at 821, CCCTGC at 51.
  8. Shuer7: 1, CCCTGG at 782.
  9. Shuer8: 2, CCCTGG at 3140, CCCTGG at 2353.
  10. Shuer9: 3, CCCTGG at 3000, CCCTGG at 2225, CCCTGG at 1829.
  11. Shuer0ci: 2, CCAGGG at 3744, CCAGGG at 66.
  12. Shuer1ci: 5, GCAGGG at 3932, GCAGGG at 3740, GCAGGG at 2653, GCAGGG at 1982, CCAGGG at 668.
  13. Shuer2ci: 3, CCAGGG at 4272, CCAGGG at 2280, CCAGGG at 1572.
  14. Shuer3ci: 6, GCAGGG at 3332, CCAGGG at 1878, CCAGGG at 1589, GCAGGG at 1413, GCAGGG at 817, CCAGGG at 525.
  15. Shuer4ci: 2, CCAGGG at 4559, CCAGGG at 3648.
  16. Shuer5ci: 9, CCAGGG at 4369, GCAGGG at 3907, CCAGGG at 3503, GCAGGG at 3481, CCAGGG at 3229, GCAGGG at 1514, GCAGGG at 1306, CCAGGG at 768, CCAGGG at 332.
  17. Shuer6ci: 6, CCAGGG at 3885, CCAGGG at 3515, GCAGGG at 3138, GCAGGG at 2792, CCAGGG at 1755, CCAGGG at 963.
  18. Shuer7ci: 1, CCAGGG at 2343.
  19. Shuer8ci: 4, GCAGGG at 3535, GCAGGG at 2681, GCAGGG at 1697, CCAGGG at 1591.
  20. Shuer9ci: 3, GCAGGG at 4366, GCAGGG at 2212, GCAGGG at 2173.

Shuer arbitrary (evens) (4560-2846) UTRs

  1. Shuer0: CCCTGG at 4268, CCCTGG at 3641.
  2. Shuer4: CCCTGG at 3231, CCCTGG at 3109, CCCTGG at 2861.
  3. Shuer6: CCCTGC at 3875.
  4. Shuer8: CCCTGG at 3140.
  5. Shuer0ci: CCAGGG at 3744.
  6. Shuer2ci: CCAGGG at 4272.
  7. Shuer4ci: CCAGGG at 4559, CCAGGG at 3648.
  8. Shuer6ci: CCAGGG at 3885, CCAGGG at 3515, GCAGGG at 3138.
  9. Shuer8ci: GCAGGG at 3535.

Shuer alternate (odds) (4560-2846) UTRs

  1. Shuer1: CCCTGC at 4090.
  2. Shuer3: CCCTGC at 3410.
  3. Shuer5: CCCTGG at 4384.
  4. Shuer9: CCCTGG at 3000.
  5. Shuer1ci: GCAGGG at 3932, GCAGGG at 3740.
  6. Shuer3ci: GCAGGG at 3332.
  7. Shuer5ci: CCAGGG at 4369, GCAGGG at 3907, CCAGGG at 3503, GCAGGG at 3481, CCAGGG at 3229.
  8. Shuer9ci: GCAGGG at 4366.

Shuer arbitrary positive direction (odds) (4445-4265) core promoters

  1. Shuer5: CCCTGG at 4384.
  2. Shuer5ci: CCAGGG at 4369.
  3. Shuer9ci: GCAGGG at 4366.

Shuer alternate positive direction (evens) (4445-4265) core promoters

  1. Shuer0: CCCTGG at 4268.
  2. Shuer2ci: CCAGGG at 4272.

Shuer arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. Shuer4: CCCTGG at 2756.
  2. Shuer6ci: GCAGGG at 2792.
  3. Shuer8ci: GCAGGG at 2681.

Shuer alternate negative direction (odds) (2811-2596) proximal promoters

  1. Shuer1ci: GCAGGG at 2653.

Shuer arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. Shuer1: CCCTGC at 4090.

Shuer arbitrary negative direction (evens) (2596-1) distal promoters

  1. Shuer0: CCCTGG at 2435, CCCTGG at 682.
  2. Shuer2: CCCTGC at 2557.
  3. Shuer4: CCCTGG at 1481, CCCTGG at 1168, CCCTGG at 1153, CCCTGG at 970, CCCTGG at 203, CCCTGG at 84.
  4. Shuer6: CCCTGC at 821, CCCTGC at 51.
  5. Shuer8: CCCTGG at 2353.
  6. Shuer0ci: CCAGGG at 66.
  7. Shuer2ci: CCAGGG at 2280, CCAGGG at 1572.
  8. Shuer6ci: CCAGGG at 1755, CCAGGG at 963.
  9. Shuer8ci: GCAGGG at 1697, CCAGGG at 1591.

Shuer alternate negative direction (odds) (2596-1) distal promoters

  1. Shuer1: CCCTGG at 1740.
  2. Shuer3: CCCTGC at 2138.
  3. Shuer5: CCCTGG at 917.
  4. Shuer7: CCCTGG at 782.
  5. Shuer9: CCCTGG at 2225, CCCTGG at 1829.
  6. Shuer1ci: GCAGGG at 1982, CCAGGG at 668.
  7. Shuer3ci: CCAGGG at 1878, CCAGGG at 1589, GCAGGG at 1413, GCAGGG at 817, CCAGGG at 525.
  8. Shuer5ci: GCAGGG at 1514, GCAGGG at 1306, CCAGGG at 768, CCAGGG at 332.
  9. Shuer7ci: CCAGGG at 2343.
  10. Shuer9ci: GCAGGG at 2212, GCAGGG at 2173.

Shuer arbitrary positive direction (odds) (4050-1) distal promoters

  1. Shuer1: CCCTGG at 1740.
  2. Shuer3: CCCTGC at 3410, CCCTGC at 2138.
  3. Shuer5: CCCTGG at 917.
  4. Shuer7: CCCTGG at 782.
  5. Shuer9: CCCTGG at 3000, CCCTGG at 2225, CCCTGG at 1829.
  6. Shuer1ci: GCAGGG at 3932, GCAGGG at 3740, GCAGGG at 2653, GCAGGG at 1982, CCAGGG at 668.
  7. Shuer3ci: GCAGGG at 3332, CCAGGG at 1878, CCAGGG at 1589, GCAGGG at 1413, GCAGGG at 817, CCAGGG at 525.
  8. Shuer4ci: CCAGGG at 3648.
  9. Shuer5ci: GCAGGG at 3907, CCAGGG at 3503, GCAGGG at 3481, CCAGGG at 3229, GCAGGG at 1514, GCAGGG at 1306, CCAGGG at 768, CCAGGG at 332.
  10. Shuer7ci: CCAGGG at 2343.
  11. Shuer9ci: GCAGGG at 2212, GCAGGG at 2173.

Shuer alternate positive direction (evens) (4050-1) distal promoters

  1. Shuer0: CCCTGG at 3641, CCCTGG at 2435, CCCTGG at 682.
  2. Shuer2: CCCTGC at 2557.
  3. Shuer4: CCCTGG at 3231, CCCTGG at 3109, CCCTGG at 2861, CCCTGG at 2756, CCCTGG at 1481, CCCTGG at 1168, CCCTGG at 1153, CCCTGG at 970, CCCTGG at 203, CCCTGG at 84.
  4. Shuer6: CCCTGC at 3875, CCCTGC at 821, CCCTGC at 51.
  5. Shuer8: CCCTGG at 3140, CCCTGG at 2353.
  6. Shuer0ci: CCAGGG at 3744, CCAGGG at 66.
  7. Shuer2ci: CCAGGG at 4272, CCAGGG at 2280, CCAGGG at 1572.
  8. Shuer6ci: CCAGGG at 3885, CCAGGG at 3515, GCAGGG at 3138, GCAGGG at 2792, CCAGGG at 1755, CCAGGG at 963.
  9. Shuer8ci: GCAGGG at 3535, GCAGGG at 2681, GCAGGG at 1697, CCAGGG at 1591.

Shue box analysis and results

The sequence CCCTG(C/G) was conserved in all four subunits.[3]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 6 | 2 | 3 | 3 ± 2 (--1,+-5)
Randoms | UTR | arbitrary negative | 15 | 10 | 1.5 | 1.4
Randoms | UTR | alternate negative | 13 | 10 | 1.3 | 1.4
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Core | alternate negative | 0 | 10 | 0 | 0
Reals | Core | positive | 2 | 2 | 1 | 1 ± 0 (-+1,++1)
Randoms | Core | arbitrary positive | 3 | 10 | 0.3 | 0.25
Randoms | Core | alternate positive | 2 | 10 | 0.2 | 0.25
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 3 | 10 | 0.3 | 0.2
Randoms | Proximal | alternate negative | 1 | 10 | 0.1 | 0.2
Reals | Proximal | positive | 1 | 2 | 0.5 | 0.5 ± 0.5 (-+1,++0)
Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.05
Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0.05
Reals | Distal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--1,+-0)
Randoms | Distal | arbitrary negative | 19 | 10 | 1.9 | 1.95
Randoms | Distal | alternate negative | 20 | 10 | 2.0 | 1.95
Reals | Distal | positive | 23 | 2 | 11.5 | 11.5 ± 5.5 (-+17,++6)
Randoms | Distal | arbitrary positive | 31 | 10 | 3.1 | 3.25
Randoms | Distal | alternate positive | 34 | 10 | 3.4 | 3.25

Comparison:

The occurrences of real Shue boxes in the UTRs are outside the range of the randoms; those in the cores, proximals and positive distals are greater than the randoms, and those in the negative distals are less than the randoms. This suggests that the real Shues are likely active or activable.

Sp1 element samplings

The Basic programs (starting with SuccessablesSp1B.bas) test the consensus sequence GGGGCGGGT by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:

  1. negative strand, negative direction: 0.
  2. positive strand, negative direction: 0.
  3. negative strand, positive direction: 0.
  4. positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 0.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.

AP-2 (Roesler) samplings

The skeletal muscle acetylcholine receptor (AchR) promoter of the mouse has four copies of CCCCACC(A/C), where the repeats of CCCCACCC have perfect homology with two known AP-2-binding sites.[4]

The Basic programs (starting with SuccessablesAP2R.bas) test the consensus sequence CCCCACC(A/C) by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:

  1. negative strand, negative direction: 0.
  2. positive strand, negative direction: 0.
  3. negative strand, positive direction: 0.
  4. positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 0.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 1, GGGTGGGG at 3542.
  8. inverse complement, positive strand, positive direction: 1, GGGTGGGG at 2019.

AP2R positive direction (4050-1) distal promoters

  1. Inverse complement, negative strand, positive direction: GGGTGGGG at 3542.
  2. Inverse complement, positive strand, positive direction: GGGTGGGG at 2019.

AP2R random dataset samplings

  1. AP2Rr0: 0.
  2. AP2Rr1: 1, CCCCACCC at 1922.
  3. AP2Rr2: 0.
  4. AP2Rr3: 1, CCCCACCA at 35.
  5. AP2Rr4: 0.
  6. AP2Rr5: 0.
  7. AP2Rr6: 0.
  8. AP2Rr7: 1, CCCCACCA at 1601.
  9. AP2Rr8: 0.
  10. AP2Rr9: 0.
  11. AP2Rr0ci: 0.
  12. AP2Rr1ci: 0.
  13. AP2Rr2ci: 0.
  14. AP2Rr3ci: 0.
  15. AP2Rr4ci: 0.
  16. AP2Rr5ci: 0.
  17. AP2Rr6ci: 1, GGGTGGGG at 3741.
  18. AP2Rr7ci: 2, TGGTGGGG at 3077, GGGTGGGG at 134.
  19. AP2Rr8ci: 1, GGGTGGGG at 100.
  20. AP2Rr9ci: 0.

AP2Rr arbitrary (evens) (4560-2846) UTRs

  1. AP2Rr6ci: GGGTGGGG at 3741.

AP2Rr alternate (odds) (4560-2846) UTRs

  1. AP2Rr7ci: TGGTGGGG at 3077.

AP2Rr arbitrary negative direction (evens) (2596-1) distal promoters

  1. AP2Rr8ci: GGGTGGGG at 100.

AP2Rr alternate negative direction (odds) (2596-1) distal promoters

  1. AP2Rr1: CCCCACCC at 1922.
  2. AP2Rr3: CCCCACCA at 35.
  3. AP2Rr7: CCCCACCA at 1601.
  4. AP2Rr7ci: GGGTGGGG at 134.

AP2Rr arbitrary positive direction (odds) (4050-1) distal promoters

  1. AP2Rr1: CCCCACCC at 1922.
  2. AP2Rr3: CCCCACCA at 35.
  3. AP2Rr7: CCCCACCA at 1601.
  4. AP2Rr7ci: TGGTGGGG at 3077, GGGTGGGG at 134.

AP2Rr alternate positive direction (evens) (4050-1) distal promoters

  1. AP2Rr6ci: GGGTGGGG at 3741.
  2. AP2Rr8ci: GGGTGGGG at 100.
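
The grouping used in the lists above can be checked with a short tally. The sketch below copies the AP2Rr random hits from the random dataset list, splits the datasets into evens and odds by the digit in their names, and counts hits inside a position range. The names HITS, parity and tally are hypothetical, and inclusive range ends are an assumption. In the headings, the evens are labelled "arbitrary" for the negative direction and the odds are labelled "arbitrary" for the positive direction.

```python
from collections import Counter

PREFIX = "AP2Rr"

# Hit positions copied from the AP2R random dataset samplings (datasets with 0 hits omitted).
HITS = {
    "AP2Rr1": [1922],
    "AP2Rr3": [35],
    "AP2Rr7": [1601],
    "AP2Rr6ci": [3741],
    "AP2Rr7ci": [3077, 134],
    "AP2Rr8ci": [100],
}


def parity(name):
    """'evens' or 'odds', read from the digit that follows the prefix in a dataset name."""
    digit = int(name[len(PREFIX)])
    return "evens" if digit % 2 == 0 else "odds"


def tally(high, low):
    """Count hits with low <= position <= high, grouped by dataset parity."""
    counts = Counter()
    for name, positions in HITS.items():
        counts[parity(name)] += sum(low <= p <= high for p in positions)
    return dict(counts)


print(tally(4560, 2846))  # UTRs: 1 even, 1 odd
print(tally(2596, 1))     # distal, negative direction: 1 even, 4 odd
print(tally(4050, 1))     # distal, positive direction: 2 even, 5 odd
```

Under these assumptions the tallies come out the same as the hits listed under each AP2Rr heading above.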

AP2R analysis and results

The skeletal muscle acetylcholine receptor (AchR) promoter of the mouse has four copies of CCCCACC(A/C), where the repeats of CCCCACCC have perfect homology with two known AP-2-binding sites.[4]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 0 | 2 | 0 | 0
Randoms | UTR | arbitrary negative | 1 | 10 | 0.1 | 0.1
Randoms | UTR | alternate negative | 1 | 10 | 0.1 | 0.1
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Core | alternate negative | 0 | 10 | 0 | 0
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0
Randoms | Core | alternate positive | 0 | 10 | 0 | 0
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0
Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0
Reals | Distal | negative | 0 | 2 | 0 | 0
Randoms | Distal | arbitrary negative | 1 | 10 | 0.1 | 0.25
Randoms | Distal | alternate negative | 4 | 10 | 0.4 | 0.25
Reals | Distal | positive | 2 | 2 | 1 | 1 ± 0 (-+1,++1)
Randoms | Distal | arbitrary positive | 5 | 10 | 0.5 | 0.35 ± 0.15
Randoms | Distal | alternate positive | 2 | 10 | 0.2 | 0.35 ± 0.15

Comparison:

The occurrences of real AP2R distals are greater than the randoms. This suggests that the real AP2R distals are likely active or activable.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. Eiichi Saitoh and Satoko Isemura (January 1, 1993). "Molecular Biology of Human Salivary Cysteine Proteinase Inhibitors" (PDF). Critical Reviews in Oral Biology and Medicine. 4 (3/4): 487–93. doi:10.1177/10454411930040033301. Retrieved 2013-06-28.
  2. Christof Berberich, Ingolf Dürr, Michael Koenen and Veit Witzemann (September 1993). "Two adjacent E box elements and a M‐CAT box are involved in the muscle‐specific regulation of the rat acetylcholine receptor β subunit gene". European Journal of Biochemistry. 216 (2): 395–404. doi:10.1111/j.1432-1033.1993.tb18157.x. Retrieved 27 December 2019.
  3. Crowder, C. Michael & Merlie, J. P. (December 1988) "Stepwise Activation of the Mouse Acetylcholine Receptor 𝛅- and 𝛄-Subunit Genes in Clonal Cell Lines" Molecular and Cellular Biology 8(12), 5257-5267, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC365628/pdf/molcellb00072-0209.pdf
  4. Roesler, W. J., G. R. Vandenbark, and R. W. Hanson (1988), "Cyclic AMP and the induction of eukaryotic gene transcription" J. Biol. Chem. 263:9063-9066.