D box gene transcriptions

Revision as of 22:02, 19 February 2022

Associate Editor(s)-in-Chief: Henry A. Hoff

File:RF00071.jpg
This example of a C/D box is a small nucleolar RNA 73 (snoRNA U73). Credit: Rfam database (RF00071).

For "box C/D snoRNAs, boxes C and D and an adjoining stem form a vital structure, known as the box C/D motif."[1]

In the snoRNA U73 image on the right, read from the right side, the D box is AGUCY. Read in the 5' to 3' direction, the D box is YCUGA.

Degenerate nucleotides

When an RNA sequence is written as DNA for transcription, U (in RNA) becomes T. The degenerate codes are Y = (C or T) and R = (A or G).
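As a minimal sketch of how these substitutions could be applied in practice, the Python below rewrites an RNA consensus with DNA letters and expands Y and R into regular-expression character classes. The function names (rna_to_dna, to_regex) are hypothetical and are not part of the Basic programs used on this page.

```python
import re

# Degenerate codes as given in the text: Y = C or T, R = A or G.
DEGENERATE = {"Y": "[CT]", "R": "[AG]"}

def rna_to_dna(seq: str) -> str:
    """Write an RNA sequence with DNA letters (U becomes T)."""
    return seq.upper().replace("U", "T")

def to_regex(consensus: str) -> str:
    """Expand degenerate letters into regular-expression character classes."""
    return "".join(DEGENERATE.get(base, base) for base in rna_to_dna(consensus))

pattern = to_regex("YCUGA")   # D box of snoRNA U73, read 5' to 3'
print(pattern)                                  # [CT]CTGA
print(bool(re.fullmatch(pattern, "TCTGA")))     # True
print(bool(re.fullmatch(pattern, "ACTGA")))     # False
```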

Consensus sequences

File:U14 snoRNA.png
This U14 snoRNA from Saccharomyces cerevisiae shows structure and genomic organization. Credit: Dmitry A. Samarsky, Maurille J. Fournier, Robert H. Singer and Edouard Bertrand.

Shown in the image on the right is the D box (3'-AGUCUG-5'). Substituting T for U yields D box = 3'-AGTCTG-5' in the transcription direction on the template strand.
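The substitution and the change of reading direction can be checked with a few lines of Python. This is only an illustrative sketch; the variable and function names are assumptions made for the example.

```python
def reverse_orientation(seq: str) -> str:
    """Read a sequence in the opposite direction (3'->5' becomes 5'->3')."""
    return seq[::-1]

d_box_rna_3to5 = "AGUCUG"                                # as drawn in the U14 image
d_box_dna_3to5 = d_box_rna_3to5.replace("U", "T")        # substitute T for U
d_box_rna_5to3 = reverse_orientation(d_box_rna_3to5)     # same motif, read 5' to 3'

print(d_box_dna_3to5)   # AGTCTG
print(d_box_rna_5to3)   # GUCUGA
```

Read 5' to 3', the motif is GUCUGA, which is the box D sequence reported by Samarsky et al.[1]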

"Members of the box C/D snoRNA family, which are the subject of the present report, possess characteristic sequence elements known as box C (UGAUGA) and box D (GUCUGA)."[1]

Motojima et al. give a D-box with the sequence TGAGTGG.[2]

Hypotheses

  1. The D boxes are not involved in the transcription of A1BG.
  2. The promoters of A1BG do not contain a Samarsky D box.

(Samarsky) samplings

Basic programs (starting with SuccessablesDbox.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the list gives the strand and direction searched and the matches found:

  1. Negative strand, negative direction: AGTCTG at 2947.
  2. Negative strand, positive direction: AGTCTG at 3923.
  3. Positive strand, negative direction: AGTCTG at 1355.
  4. Positive strand, positive direction: 0.
  5. Inverse complement, negative strand, negative direction: 0.
  6. Inverse complement, negative strand, positive direction: CAGACT at 1744, CAGACT at 2416.
  7. Inverse complement, positive strand, negative direction: CAGACT at 15, CAGACT at 1616.
  8. Inverse complement, positive strand, positive direction: CAGACT at 2943, CAGACT at 3006, CAGACT at 3924.
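The Basic programs themselves are not reproduced on this page. As a rough sketch only, a search of this kind can be written in Python as below. The function names and the toy sequence are hypothetical. Counting positions from 1 at the first nucleotide of each match is also an assumption, since the source does not say how its positions are counted.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def inverse_complement(seq: str) -> str:
    """Complement every base and read the result in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def find_all(sequence: str, motif: str) -> list[int]:
    """Return 1-based start positions of every occurrence, overlaps included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions

# Toy sequence invented for the example; it is not A1BG promoter sequence.
toy = "TTAGTCTGAACAGACTTT"
motif = "AGTCTG"

print(inverse_complement(motif))                  # CAGACT
print(find_all(toy, motif))                       # [3]
print(find_all(toy, inverse_complement(motif)))   # [11]
```

Running the same two searches on each strand, in each direction, gives the eight combinations listed above.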

(Samarsky) UTR promoters

  1. Negative strand, negative direction: AGTCTG at 2947.

(Samarsky) distal promoters

  1. Positive strand, negative direction: CAGACT at 1616, AGTCTG at 1355, CAGACT at 15.


  1. Negative strand, positive direction: AGTCTG at 3923, CAGACT at 2416, CAGACT at 1744.
  2. Positive strand, positive direction: CAGACT at 3924, CAGACT at 3006, CAGACT at 2943.

Samarsky random dataset samplings

  1. Dboxr0: 1, AGTCTG at 4073.
  2. Dboxr1: 0.
  3. Dboxr3: 1, AGTCTG at 1984.
  4. Dboxr4: 0.
  5. Dboxr5: 1, AGTCTG at 2334.
  6. RDr6: 0.
  7. RDr7: 0.
  8. RDr8: 0.
  9. RDr9: 0.
  10. RDr0ci: 0.
  11. RDr1ci: 0.
  12. RDr2ci: 0.
  13. RDr3ci: 0.
  14. RDr4ci: 0.
  15. RDr5ci: 0.
  16. RDr6ci: 0.
  17. RDr7ci: 0.
  18. RDr8ci: 0.
  19. RDr9ci: 0.
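How the random datasets were generated is not described on this page. Purely as an illustration, the sketch below builds uniformly random sequences and counts occurrences of AGTCTG in each. The dataset length of 4500, the uniform base frequencies, the seeds and all function names are assumptions made for the example, so the counts it prints are not expected to match the list above.

```python
import random

def random_dataset(length: int, seed: int) -> str:
    """Build a reproducible, uniformly random DNA sequence."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))

def count_hits(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, overlaps included."""
    last_start = len(sequence) - len(motif) + 1
    return sum(1 for i in range(last_start) if sequence.startswith(motif, i))

def expected_hits(length: int, motif_length: int) -> float:
    """Expected count of one fixed motif in a uniformly random sequence."""
    return (length - motif_length + 1) / 4 ** motif_length

LENGTH = 4500  # hypothetical; the source does not state the dataset length
for seed in range(10):
    hits = count_hits(random_dataset(LENGTH, seed), "AGTCTG")
    print(f"random dataset {seed}: {hits}")

print(round(expected_hits(LENGTH, 6), 2))  # 1.1
```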

RDr UTRs

RDr core promoters

RDr proximal promoters

RDr distal promoters

D boxes

The human ribosomal protein L11 gene (HRPL11) has [...] two potential snRNA-coding sequences in intron 4: [...] a D box beginning at +4237 (TCCTG), [...].[3]

(Voronina) samplings

Basic programs testing the consensus sequence 5'-TCCTG-3' (starting with SuccessablesAAA.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the list gives the strand and direction searched, the sequence looked for, the number of matches, and their positions:

  1. negative strand, negative direction, looking for 5'-TCCTG-3', 4, 5'-TCCTG-3' at 4467, 5'-TCCTG-3' at 3755, 5'-TCCTG-3' at 3639, 5'-TCCTG-3' at 3388, and complements.
  2. negative strand, positive direction, looking for 5'-TCCTG-3', 10, 5'-TCCTG-3' at 4408, 5'-TCCTG-3' at 4185, 5'-TCCTG-3' at 3621, 5'-TCCTG-3' at 3295, 5'-TCCTG-3' at 2519, 5'-TCCTG-3' at 2500, 5'-TCCTG-3' at 2210, 5'-TCCTG-3' at 1775, 5'-TCCTG-3' at 1117, 5'-TCCTG-3' at 143, and complements.
  3. positive strand, negative direction, looking for 5'-TCCTG-3', 5, 5'-TCCTG-3' at 4545, 5'-TCCTG-3' at 3905, 5'-TCCTG-3' at 1910, 5'-TCCTG-3' at 1840, 5'-TCCTG-3' at 595, and complements.
  4. positive strand, positive direction, looking for 5'-TCCTG-3', 4, 5'-TCCTG-3' at 4251, 5'-TCCTG-3' at 3130, 5'-TCCTG-3' at 2459, 5'-TCCTG-3' at 1669, and complements.
  5. inverse complement, negative strand, negative direction, looking for 5'-CAGGA-3', 0.
  6. inverse complement, negative strand, positive direction, looking for 5'-CAGGA-3', 7, 5'-CAGGA-3' at 3869, 5'-CAGGA-3' at 3572, 5'-CAGGA-3' at 3129, 5'-CAGGA-3' at 2746, 5'-CAGGA-3' at 2621, 5'-CAGGA-3' at 708, 5'-CAGGA-3' at 425, and complements.
  7. inverse complement, positive strand, negative direction, looking for 5'-CAGGA-3', 23, 5'-CAGGA-3' at 4437, 5'-CAGGA-3' at 4283, 5'-CAGGA-3' at 4171, 5'-CAGGA-3' at 4139, 5'-CAGGA-3' at 3250, 5'-CAGGA-3' at 3218, 5'-CAGGA-3' at 3111, 5'-CAGGA-3' at 2690, 5'-CAGGA-3' at 2588, 5'-CAGGA-3' at 2368, 5'-CAGGA-3' at 2251, 5'-CAGGA-3' at 2135, 5'-CAGGA-3' at 1942, 5'-CAGGA-3' at 1824, 5'-CAGGA-3' at 1289, 5'-CAGGA-3' at 1276, 5'-CAGGA-3' at 998, 5'-CAGGA-3' at 985, 5'-CAGGA-3' at 851, 5'-CAGGA-3' at 832, 5'-CAGGA-3' at 715, 5'-CAGGA-3' at 579, 5'-CAGGA-3' at 442, and complements.
  8. inverse complement, positive strand, positive direction, looking for 5'-CAGGA-3', 5, 5'-CAGGA-3' at 3864, 5'-CAGGA-3' at 3620, 5'-CAGGA-3' at 2999, 5'-CAGGA-3' at 758, 5'-CAGGA-3' at 219, and complements.

DVor UTRs

  1. Negative strand, negative direction: TCCTG at 4467.
  2. Positive strand, negative direction: TCCTG at 4545, CAGGA at 4437, CAGGA at 4283.

DVor core promoters

Negative strand, positive direction: TCCTG at 4408.

DVor proximal promoters

  1. Negative strand, positive direction: TCCTG at 4185.
  2. Positive strand, positive direction: TCCTG at 4251.

DVor distal promoters

  1. Negative strand, negative direction: TCCTG at 3755, TCCTG at 3639, TCCTG at 3388.
  2. Positive strand, negative direction: CAGGA at 4171, CAGGA at 4139, TCCTG at 3905, CAGGA at 3250, CAGGA at 3218, CAGGA at 3111, CAGGA at 2690, CAGGA at 2588, CAGGA at 2368, CAGGA at 2251, CAGGA at 2135, CAGGA at 1942, TCCTG at 1910, TCCTG at 1840, CAGGA at 1824, CAGGA at 1289, CAGGA at 1276, CAGGA at 998, CAGGA at 985, CAGGA at 851, CAGGA at 832, CAGGA at 715, TCCTG at 595, CAGGA at 579, CAGGA at 442.


  1. Negative strand, positive direction: CAGGA at 3869, TCCTG at 3621, CAGGA at 3572, TCCTG at 3295, CAGGA at 3129, CAGGA at 2746, CAGGA at 2621, TCCTG at 2519, TCCTG at 2500, TCCTG at 2210, TCCTG at 1775, TCCTG at 1117, CAGGA at 708, CAGGA at 425, TCCTG at 143.
  2. Positive strand, positive direction: CAGGA at 3864, CAGGA at 3620, TCCTG at 3130, CAGGA at 2999, TCCTG at 2459, TCCTG at 1669, CAGGA at 758, CAGGA at 219.

Voronina random dataset samplings

  1. RDr0: 0.
  2. RDr1: 0.
  3. RDr2: 0.
  4. RDr3: 0.
  5. RDr4: 0.
  6. RDr5: 0.
  7. RDr6: 0.
  8. RDr7: 0.
  9. RDr8: 0.
  10. RDr9: 0.
  11. RDr0ci: 0.
  12. RDr1ci: 0.
  13. RDr2ci: 0.
  14. RDr3ci: 0.
  15. RDr4ci: 0.
  16. RDr5ci: 0.
  17. RDr6ci: 0.
  18. RDr7ci: 0.
  19. RDr8ci: 0.
  20. RDr9ci: 0.

RDr UTRs

RDr core promoters

RDr proximal promoters

RDr distal promoters

(Johnson) samplings

TCTCACATT(A/C)AATAAGTCA is a D-box.[4]

Basic programs testing the consensus sequence 5'-TCTCACATT(A/C)AATAAGTCA-3' (starting with SuccessablesAAA.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the list gives the strand and direction searched, the sequence looked for, and the number of matches found:

  1. negative strand, negative direction, looking for 5'-TCTCACATT(A/C)AATAAGTCA-3', 0.
  2. negative strand, positive direction, looking for 5'-TCTCACATT(A/C)AATAAGTCA-3', 0.
  3. positive strand, negative direction, looking for 5'-TCTCACATT(A/C)AATAAGTCA-3', 0.
  4. positive strand, positive direction, looking for 5'-TCTCACATT(A/C)AATAAGTCA-3', 0.
  5. complement, negative strand, negative direction, looking for 5'-AGAGTGTAA(G/T)TTATTCAGT-3', 0.
  6. complement, negative strand, positive direction, looking for 5'-AGAGTGTAA(G/T)TTATTCAGT-3', 0.
  7. complement, positive strand, negative direction, looking for 5'-AGAGTGTAA(G/T)TTATTCAGT-3', 0.
  8. complement, positive strand, positive direction, looking for 5'-AGAGTGTAA(G/T)TTATTCAGT-3', 0.
  9. inverse complement, negative strand, negative direction, looking for 5'-TGACTTATT(G/T)AATGTGAGA-3', 0.
  10. inverse complement, negative strand, positive direction, looking for 5'-TGACTTATT(G/T)AATGTGAGA-3', 0.
  11. inverse complement, positive strand, negative direction, looking for 5'-TGACTTATT(G/T)AATGTGAGA-3', 0.
  12. inverse complement, positive strand, positive direction, looking for 5'-TGACTTATT(G/T)AATGTGAGA-3', 0.
  13. inverse negative strand, negative direction, looking for 5'-ACTGAATAA(A/C)TTACACTCT-3', 0.
  14. inverse negative strand, positive direction, looking for 5'-ACTGAATAA(A/C)TTACACTCT-3', 0.
  15. inverse positive strand, negative direction, looking for 5'-ACTGAATAA(A/C)TTACACTCT-3', 0.
  16. inverse positive strand, positive direction, looking for 5'-ACTGAATAA(A/C)TTACACTCT-3', 0.
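The four query sequences used above (consensus, complement, inverse complement and inverse) can be derived mechanically, including the degenerate (A/C) position. The Python below is a minimal sketch of one way to do that. The function names are hypothetical, and the alternatives inside a complemented bracket are sorted alphabetically only so that the output matches the notation used in the list.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def tokenize(consensus: str) -> list[str]:
    """Split 'TC(A/C)A' into ['T', 'C', '(A/C)', 'A']."""
    return re.findall(r"\([ACGT](?:/[ACGT])*\)|[ACGT]", consensus)

def complement_token(token: str) -> str:
    """Complement one base, or every alternative inside a bracket."""
    if token.startswith("("):
        options = sorted(COMPLEMENT[base] for base in token[1:-1].split("/"))
        return "(" + "/".join(options) + ")"
    return COMPLEMENT[token]

def complement(consensus: str) -> str:
    return "".join(complement_token(token) for token in tokenize(consensus))

def inverse(consensus: str) -> str:
    """Reverse the order of positions, keeping each bracket intact."""
    return "".join(reversed(tokenize(consensus)))

def inverse_complement(consensus: str) -> str:
    return inverse(complement(consensus))

d_box = "TCTCACATT(A/C)AATAAGTCA"
print(complement(d_box))           # AGAGTGTAA(G/T)TTATTCAGT
print(inverse_complement(d_box))   # TGACTTATT(G/T)AATGTGAGA
print(inverse(d_box))              # ACTGAATAA(A/C)TTACACTCT
```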

(Mracek) samplings

There is another promoter D box, or D-box: "Located in the region [...] is a single D-box element (5′-GTTGTATAAC-3′) with a distinct sequence from that of the functional D-box identified in the per2 promoter (5′-CTTATGTAAA-3′) [21]."[5]

(Mracek1) samplings

Basic programs testing the consensus sequence 5'-GTTGTATAAC-3' (starting with SuccessablesMra1.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the list gives the strand and direction searched, the sequence looked for, and the number of matches found:

  1. negative strand, negative direction, looking for 5'-GTTGTATAAC-3', 0.
  2. negative strand, positive direction, looking for 5'-GTTGTATAAC-3', 0.
  3. positive strand, negative direction, looking for 5'-GTTGTATAAC-3', 0.
  4. positive strand, positive direction, looking for 5'-GTTGTATAAC-3', 0.
  5. complement, negative strand, negative direction, looking for 5'-CAACATATTG-3', 0.
  6. complement, negative strand, positive direction, looking for 5'-CAACATATTG-3', 0.
  7. complement, positive strand, negative direction, looking for 5'-CAACATATTG-3', 0.
  8. complement, positive strand, positive direction, looking for 5'-CAACATATTG-3', 0.
  9. inverse complement, negative strand, negative direction, looking for 5'-GTTATACAAC-3', 0.
  10. inverse complement, negative strand, positive direction, looking for 5'-GTTATACAAC-3', 0.
  11. inverse complement, positive strand, negative direction, looking for 5'-GTTATACAAC-3', 0.
  12. inverse complement, positive strand, positive direction, looking for 5'-GTTATACAAC-3', 0.
  13. inverse negative strand, negative direction, looking for 5'-CAATATGTTG-3', 0.
  14. inverse negative strand, positive direction, looking for 5'-CAATATGTTG-3', 0.
  15. inverse positive strand, negative direction, looking for 5'-CAATATGTTG-3', 0.
  16. inverse positive strand, positive direction, looking for 5'-CAATATGTTG-3', 0.

(Mracek2) samplings

Basic programs testing the consensus sequence 5'-CTTATGTAAA-3' (starting with SuccessablesMra2.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the list gives the strand and direction searched, the sequence looked for, and the number of matches found:

  1. negative strand, negative direction, looking for 5'-CTTATGTAAA-3', 0.
  2. negative strand, positive direction, looking for 5'-CTTATGTAAA-3', 0.
  3. positive strand, negative direction, looking for 5'-CTTATGTAAA-3', 0.
  4. positive strand, positive direction, looking for 5'-CTTATGTAAA-3', 0.
  5. complement, negative strand, negative direction, looking for 5'-GAATACATTT-3', 0.
  6. complement, negative strand, positive direction, looking for 5'-GAATACATTT-3', 0.
  7. complement, positive strand, negative direction, looking for 5'-GAATACATTT-3', 0.
  8. complement, positive strand, positive direction, looking for 5'-GAATACATTT-3', 0.
  9. inverse complement, negative strand, negative direction, looking for 5'-TTTACATAAG-3', 0.
  10. inverse complement, negative strand, positive direction, looking for 5'-TTTACATAAG-3', 0.
  11. inverse complement, positive strand, negative direction, looking for 5'-TTTACATAAG-3', 0.
  12. inverse complement, positive strand, positive direction, looking for 5'-TTTACATAAG-3', 0.
  13. inverse negative strand, negative direction, looking for 5'-AAATGTATTC-3', 0.
  14. inverse negative strand, positive direction, looking for 5'-AAATGTATTC-3', 0.
  15. inverse positive strand, negative direction, looking for 5'-AAATGTATTC-3', 0.
  16. inverse positive strand, positive direction, looking for 5'-AAATGTATTC-3', 0.

Consensus sequence (Motojima)

The Motojima D-box consensus sequence is TGAGTGG.[2]

(Motojima) samplings

Copying the D-box consensus 5'-TGAGTGG-3' into a text search ("⌘F") finds no locations between ZSCAN22 and A1BG and one between ZNF497 and A1BG, as the computer programs also find.

Basic programs testing the consensus sequence 5'-TGAGTGG-3' (starting with SuccessablesMOT.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the list gives the strand and direction searched, the sequence looked for, the number of matches, and their positions:

  1. negative strand, negative direction, looking for TGAGTGG, 0.
  2. negative strand, positive direction, looking for TGAGTGG, 1, TGAGTGG at 3449.
  3. positive strand, negative direction, looking for TGAGTGG, 0.
  4. positive strand, positive direction, looking for TGAGTGG, 0.
  5. inverse complement, negative strand, negative direction, looking for CCACTCA, 1, CCACTCA at 3827.
  6. inverse complement, negative strand, positive direction, looking for CCACTCA, 0.
  7. inverse complement, positive strand, negative direction, looking for CCACTCA, 1, CCACTCA at 4487.
  8. inverse complement, positive strand, positive direction, looking for CCACTCA, 0.

MOT UTRs

  1. Positive strand, negative direction: CCACTCA at 4487.

MOT distal promoters

  1. Negative strand, negative direction: CCACTCA at 3827.


  1. Negative strand, positive direction: TGAGTGG at 3449.

Motojima random dataset samplings

  1. RDr0: 0.
  2. RDr1: 0.
  3. RDr2: 0.
  4. RDr3: 0.
  5. RDr4: 0.
  6. RDr5: 0.
  7. RDr6: 0.
  8. RDr7: 0.
  9. RDr8: 0.
  10. RDr9: 0.
  11. RDr0ci: 0.
  12. RDr1ci: 0.
  13. RDr2ci: 0.
  14. RDr3ci: 0.
  15. RDr4ci: 0.
  16. RDr5ci: 0.
  17. RDr6ci: 0.
  18. RDr7ci: 0.
  19. RDr8ci: 0.
  20. RDr9ci: 0.

RDr UTRs

RDr core promoters

RDr proximal promoters

RDr distal promoters

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

See also

References

  1. Dmitry A. Samarsky, Maurille J. Fournier, Robert H. Singer and Edouard Bertrand (1 July 1998). "The snoRNA box C/D motif directs nucleolar targeting and also couples snoRNA synthesis and localization" (PDF). The European Molecular Biology Organization (EMBO) Journal. 17 (13): 3747–3757. doi:10.1093/emboj/17.13.3747. PMID 9649444. Retrieved 2017-02-04.
  2. Masaru Motojima, Takao Ando and Toshimasa Yoshioka (10 July 2000). "Sp1-like activity mediates angiotensin-II-induced plasminogen-activator inhibitor type-1 (PAI-1) gene expression in mesangial cells" (PDF). Biochemical Journal. 349 (2): 435–441. doi:10.1042/0264-6021:3490435. PMID 10880342. Retrieved 13 August 2020.
  3. E. N. Voronina, T. D. Kolokol’tsova, E. A. Nechaeva, and M. L. Filipenko (2003). "Structural–Functional Analysis of the Human Gene for Ribosomal Protein L11" (PDF). Molecular Biology. 37 (3): 362–371. Retrieved 11 April 2019.
  4. PA Johnson, D Bunick, NB Hecht (1991). "Protein Binding Regions in the Mouse and Rat Protamine-2 Genes" (PDF). Biology of Reproduction. 44 (1): 127–134. Retrieved 6 April 2019.
  5. Philipp Mracek, Cristina Santoriello, M. Laura Idda, Cristina Pagano, Zohar Ben-Moshe, Yoav Gothilf, Daniela Vallone, Nicholas S. Foulkes (December 6, 2012). "Regulation of per and cry Genes Reveals a Central Role for the D-Box Enhancer in Light-Dependent Gene Expression". PLoS ONE. 7 (12): e51278. doi:10.1371/journal.pone.0051278. Retrieved 10 February 2019.

External links