CGCG box gene transcriptions
Editor-In-Chief: Henry A. Hoff
A. thaliana is a popular model organism in plant biology and genetics. For a complex multicellular eukaryote, A. thaliana has a relatively small genome of approximately 135 megabase pairs (Mbp).[1]
"The minimum DNA-binding elements are 6-bp CGCG box, (A/C/G)CGCG(C/G/T)."[2]
"AtSR1 [Arabidopsis thaliana signal-responsive genes] targets the nucleus and specifically recognizes a novel 6-bp CGCG box (A/C/G)CGCG(G/T/C). The multiple CGCG cis-elements are found in promoters of genes such as those involved in ethylene signaling, abscisic acid signaling, and light signal perception. The DNA-binding domain in AtSR1 is located on the N-terminal 146 bp where all AtSR1-related proteins share high similarity but have no similarity to other known DNA-binding proteins. The calmodulin-binding nuclear proteins isolated from wounded leaves exhibit specific CGCG box DNA binding activities. These results suggest that the AtSR gene family encodes a family of calmodulin-binding/DNA-binding proteins involved in multiple signal transduction pathways in plants."[2]
"Ca2+-mediated signaling is involved in the transduction of physical signals such as temperature, wind, touch, light, and gravity; oxidative signals such as those arising from pathogen attacks; and hormone signals such as ethylene, abscisic acid (ABA),1 gibberellins, and auxin (2-7). All these signals have been shown to trigger changes in amplitude or oscillation in cytosolic free Ca2+ level. Recently, the signal-induced nuclear free calcium changes were also observed (8). Free Ca2+ changes are sensed by a number of Ca2+-binding proteins that usually contain a common structural motif, the “EF-hand,” a helix-loop-helix structure (9). One of the best characterized Ca2+-binding proteins is calmodulin (CaM), a highly conserved and multifunctional regulatory protein in eukaryotes. Its regulatory activities are triggered by its ability to modulate the activity of a certain set of CaM-binding proteins after binding to Ca2+, and thereby generating physiological responses to various stimuli (10-15)."[2]
"The CaM-regulated basic helix-loop-helix family of transcription factors was reported in mammals, where CaM inhibits the protein-DNA interaction by competing with the DNA-binding domain in certain proteins (16)."[2]
"cis-acting elements ACGCGG/CCGCGT were present in the promoter regions of about 130 genes (more than two copies) in Arabidopsis genome."[2]
"The promoter regions are assumed to be within ∼1 kb upstream of the starting transcription site (for the known genes) or the first ATG (for the predicted genes). These genes are related to ethylene signaling (EIN3) and ABA signaling (a putative ABA responsive protein), light perception (phytochrome A, phyA), stress responsive such as the DNA repairing protein, heat shock protein, touch protein (TCH 4), and CaM-regulated ion channel. CaM genes (CaM2 andCaM3) and AtSR6 also contains CGCGcis-elements in their promoter regions."[2]
CGCG box samplings
The Basic programs (starting with SuccessablesCGCG.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs look for (A/C/G)CGCG(C/G/T) and found:
- Negative strand, negative direction: 2, CCGCGC at 1761, GCGCGT at 161.
- Negative strand, positive direction: 9, CCGCGC at 1650, CCGCGG at 1437, CCGCGG at 1337, GCGCGT at 1215, ACGCGG at 971, ACGCGG at 871, GCGCGC at 683, CCGCGC at 681, GCGCGT at 543.
- Positive strand, negative direction: 1, GCGCGG at 1762.
- Positive strand, positive direction: 23, CCGCGG at 1769, ACGCGG at 1656, CCGCGT at 1550, ACGCGT at 1523, ACGCGG at 1498, ACGCGG at 1454, ACGCGT at 1414, ACGCGG at 1398, ACGCGG at 1354, ACGCGT at 1314, CCGCGT at 1298, ACGCGG at 1246, CCGCGC at 1214, ACGCGG at 1162, ACGCGG at 1078, CCGCGT at 1046, CCGCGT at 976, CCGCGT at 876, GCGCGT at 684, GCGCGC at 682, CCGCGC at 542, ACGCGG at 452, CCGCGC at 161.
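The consensus (A/C/G)CGCG(C/G/T) is its own inverse complement, so searching for the inverse complements returns the same hits as searching for the direct sequences.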
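A search of this kind can be sketched in Python. This is not the Basic programs named above; the helper names, the toy sequence, and the convention of reporting the 1-based start of each hit are assumptions, since the page does not state what its positions refer to. A lookahead is used so that overlapping hits, such as CCGCGC followed two bases later by GCGCGC, are both reported.

```python
import re
from itertools import product

# Lookahead with a capture group so overlapping hexamers are all reported.
CGCG_BOX = re.compile(r"(?=([ACG]CGCG[CGT]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the inverse (reverse) complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_cgcg_boxes(seq):
    """Return (1-based start, hexamer) for every CGCG box in seq."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1)) for m in CGCG_BOX.finditer(seq)]

# All hexamers covered by the consensus, and a check that the set is
# unchanged when every member is replaced by its inverse complement.
ALL_BOXES = {a + "CGCG" + b for a, b in product("ACG", "CGT")}
CLOSED = {reverse_complement(h) for h in ALL_BOXES} == ALL_BOXES

toy = "TTGCGCGCAAACGCGGTT"  # hypothetical sequence for illustration
print(find_cgcg_boxes(toy))
print(find_cgcg_boxes(reverse_complement(toy)))
print(CLOSED)
```

Because the hexamer set is closed under inverse complementation, a sequence and its inverse complement always contain the same number of hits.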
CGCG box distal promoters
- Negative strand, negative direction: CCGCGC at 1761, GCGCGT at 161.
- Positive strand, negative direction: GCGCGG at 1762.
- Negative strand, positive direction: CCGCGC at 1650, CCGCGG at 1437, CCGCGG at 1337, GCGCGT at 1215, ACGCGG at 971, ACGCGG at 871, GCGCGC at 683, CCGCGC at 681, GCGCGT at 543.
- Positive strand, positive direction: CCGCGG at 1769, ACGCGG at 1656, CCGCGT at 1550, ACGCGT at 1523, ACGCGG at 1498, ACGCGG at 1454, ACGCGT at 1414, ACGCGG at 1398, ACGCGG at 1354, ACGCGT at 1314, CCGCGT at 1298, ACGCGG at 1246, CCGCGC at 1214, ACGCGG at 1162, ACGCGG at 1078, CCGCGT at 1046, CCGCGT at 976, CCGCGT at 876, GCGCGT at 684, GCGCGC at 682, CCGCGC at 542, ACGCGG at 452, CCGCGC at 161.
CGCG box random dataset samplings
- CGCGr0: 5, GCGCGC at 3885, GCGCGC at 3238, GCGCGG at 3116, GCGCGG at 2069, CCGCGG at 1893.
- RDr1: 0.
- RDr2: 0.
- RDr3: 0.
- RDr4: 0.
- RDr5: 0.
- RDr6: 0.
- RDr7: 0.
- RDr8: 0.
- RDr9: 0.
- RDr0ci: 0.
- RDr1ci: 0.
- RDr2ci: 0.
- RDr3ci: 0.
- RDr4ci: 0.
- RDr5ci: 0.
- RDr6ci: 0.
- RDr7ci: 0.
- RDr8ci: 0.
- RDr9ci: 0.
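As a rough baseline for reading such counts, the expected number of hits in a random sequence can be estimated. The sketch below assumes independent, equally likely bases, which may not match how the random datasets on this page were generated; the sequence length and all function names are hypothetical. Nine of the 4^6 possible hexamers match the consensus, so a sequence of n bases has n - 5 start positions, each matching with probability 9/4096.

```python
import random
import re

# Lookahead so overlapping hits are counted separately.
CGCG_BOX = re.compile(r"(?=[ACG]CGCG[CGT])")

def expected_hits(n):
    """Expected CGCG-box hits in n independent, uniformly random bases."""
    return max(n - 5, 0) * 9 / 4 ** 6

def random_sequence(n, rng):
    """Draw n bases uniformly from A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(n))

def count_hits(seq):
    """Count CGCG-box hits in seq, overlaps included."""
    return sum(1 for _ in CGCG_BOX.finditer(seq.upper()))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 4000                # hypothetical length, not taken from the datasets above
seq = random_sequence(n, rng)
print(expected_hits(n), count_hits(seq))
```

The observed count in any single random sequence will scatter around the expected value.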
RDr UTRs
RDr core promoters
RDr proximal promoters
RDr distal promoters
CGCG box analysis and results
Acknowledgements
The content on this page was first contributed by: Henry A. Hoff.
Initial content for this page in some instances came from Wikiversity.
See also
References
- [1] Genome Assembly. The Arabidopsis Information Resource. Retrieved 29 March 2016.
- [2] Tianbao Yang and B. W. Poovaiah (22 November 2002). "A calmodulin-binding/CGCG box DNA-binding protein family involved in multiple signaling pathways in plants". Journal of Biological Chemistry. 277 (47): 45049–45058. doi:10.1074/jbc.M207941200. Retrieved 5 February 2017.