Gcn4p gene transcriptions: Difference between revisions

 
(21 intermediate revisions by the same user not shown)
Line 9: Line 9:
The "transcription factor Rap1p not only depleted the nucleosome from its own binding site of the ''HIS4'' promoter, but also reduced a nearby nucleosome to increase the accessibility of other transcription factors, including Gcn4p, Bas1p, Bas2p [102]."<ref name=Tang/>
The "transcription factor Rap1p not only depleted the nucleosome from its own binding site of the ''HIS4'' promoter, but also reduced a nearby nucleosome to increase the accessibility of other transcription factors, including Gcn4p, Bas1p, Bas2p [102]."<ref name=Tang/>


==Consensus sequences==
==Consensus sequences (Tang)==


UAS Sequence for the transcription factor Gcn4p is 5'-ATGACTCTT-3'.<ref name=Tang>{{ cite journal
UAS Sequence for the transcription factor Gcn4p binding site is ATGACTCTT.<ref name=Tang>{{ cite journal
|author=Hongting Tang, Yanling Wu, Jiliang Deng, Nanzhu Chen, Zhaohui Zheng, Yongjun Wei, Xiaozhou Luo, and Jay D. Keasling
|author=Hongting Tang, Yanling Wu, Jiliang Deng, Nanzhu Chen, Zhaohui Zheng, Yongjun Wei, Xiaozhou Luo, and Jay D. Keasling
|title=Promoter Architecture and Promoter Engineering in ''Saccharomyces cerevisiae''
|title=Promoter Architecture and Promoter Engineering in ''Saccharomyces cerevisiae''
Line 25: Line 25:
|pmid=32781665
|pmid=32781665
|accessdate=18 September 2020 }}</ref>
|accessdate=18 September 2020 }}</ref>
==Gcn4p binding site (Tang) samplings==
Copying ATGACTC in "⌘F" yields none between ZNF497 and A1BG on the negative strand.
Copying ATGACTCT in "⌘F" yields none between ZNF497 and A1BG on the positive strand.
Copying ATGACTCA in "⌘F" yields none between ZNF497 and A1BG on the positive strand.
Copying ATGACTC in "⌘F" yields none between ZSCAN22 and A1BG on the negative strand.
Copying ATGACTCTT in "⌘F" yields none between ZSCAN22 and A1BG on the positive strand.
Copying ATGACTCA in "⌘F" yields none between ZSCAN22 and A1BG on the positive strand.
Copying ATGACTC(A/T)T in "⌘F" yields none between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs:
# Negative strand, negative direction: 0.
# Positive strand, negative direction: 0.
# Negative strand, positive direction: 0.
# Positive strand, positive direction: 0.
# ci Negative strand, negative direction: 0.
# ci Positive strand, negative direction: 0.
# ci Negative strand, positive direction: 0.
# ci Positive strand, positive direction: 0.
==Consensus sequences (Staschke)==


"The program DNA-Pattern was used to search for and catalogue occurrences of consensus GCRE (TGABTVW) [TGA(C/G/T)T(A/C/G)(A/T)] and GATA (GATAAG, GATAAH, GATTA) motifs in yeast promoters."<ref name=Staschke>{{ cite journal
"The program DNA-Pattern was used to search for and catalogue occurrences of consensus GCRE (TGABTVW) [TGA(C/G/T)T(A/C/G)(A/T)] and GATA (GATAAG, GATAAH, GATTA) motifs in yeast promoters."<ref name=Staschke>{{ cite journal
Line 41: Line 68:
|accessdate=4 January 2021 }}</ref>
|accessdate=4 January 2021 }}</ref>


"The predicted Gln3p and Gcn4p binding sites in the UGA3 promoter are [...] the consensus Gln3p (GATA) and Gcn4p (GCRE) [TGAGTCA] binding sites present in the minimal UGA3 promoter at -􏰉206 and -􏰉112, respectively, [...]."<ref name=Staschke/>
"The predicted Gln3p and Gcn4p binding sites in the UGA3 promoter are [...] the consensus Gln3p (GATA) and Gcn4p (GCRE) [TGAGTCA] binding sites present in the minimal UGA3 promoter at -206 and -112, respectively, [...]."<ref name=Staschke/>


==Samplings==
==GCRE samplings==
 
Copying 5'-ATGACTCTT-3' in "⌘F" yields none between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.


For the Basic programs testing consensus sequence TGA(C/G/T)T(A/C/G)(A/T) (starting with SuccessablesGcn4.bas) written to compare nucleotide sequences with the sequences on either the template strand (-), or coding strand (+), of the DNA, in the negative direction (-), or the positive direction (+), the programs are, are looking for, and found:
For the Basic programs testing consensus sequence TGA(C/G/T)T(A/C/G)(A/T) (starting with SuccessablesGcn4.bas) written to compare nucleotide sequences with the sequences on either the template strand (-), or coding strand (+), of the DNA, in the negative direction (-), or the positive direction (+), the programs are, are looking for, and found:
Line 65: Line 90:
# inverse positive strand, positive direction, looking for (A/T)(A/C/G)T(C/G/T)AGT, 4, TGTGAGT at 4336, AATTAGT at 4147, TCTCAGT at 2613, AGTCAGT at 2100.
# inverse positive strand, positive direction, looking for (A/T)(A/C/G)T(C/G/T)AGT, 4, TGTGAGT at 4336, AATTAGT at 4147, TCTCAGT at 2613, AGTCAGT at 2100.


===Gcn4 core promoters===
===GCN4 (4560-2846) UTRs===
{{main|Core promoter gene transcriptions}}
Negative strand, positive direction: ACACTCA at 4336, and complement.


Positive strand, positive direction: TGAGTGA at 4338, and complement.
# Negative strand, negative direction: ACACTCA at 4094, TGAGTCT at 3644, TGACTAT at 3544.
# Positive strand, negative direction: AGACTCA at 4055, TGAGTAT at 3829, AGAATCA at 3237.


===Gcn4 proximal promoters===
===GCN4 positive direction (4445-4265) core promoters===
{{main|Proximal promoter gene transcriptions}}
Negative strand, positive direction: TGATTAT at 4165, TGATTAT at 4158, TTAATCA at 4147, and complements.


===Gcn4 distal promoters===
# Negative strand, positive direction: ACACTCA at 4336.
{{main|Distal promoter gene transcriptions}}
# Positive strand, positive direction: TGAGTGA at 4338.
Negative strand, negative direction: ACACTCA at 4094, TGAGTCT at 3644, TGACTAT at 3544, ATACTCA at 276, and complements.


Positive strand, negative direction: AGACTCA at 4055, TGAGTAT at 3829, AGAATCA at 3237, TGACTCT at 2788, AGACTCA at 1404, AGAATCA at 293, TGAGTCT at 278, AGAATCA at 198, and complements.
===GCN4 negative direction (2811-2596) proximal promoters===


Negative strand, positive direction: TGAGTGA at 3712, TGAGTGT at 3592, TGATTGT at 2678, AGAGTCA at 2613, TCAGTCA at 2100, and complements.
# Positive strand, negative direction: TGACTCT at 2788.


Positive strand, positive direction: TTACTCA at 3447, AGACTCA at 3008, AGACTCA at 2952, TGACTAA at 2676, and complement.
===GCN4 positive direction (4265-4050) proximal promoters===


==Tang random dataset samplings==
# Negative strand, positive direction: ACACTCA at 4336, TGATTAT at 4165, TGATTAT at 4158, TTAATCA at 4147.
 
===GCN4 negative direction (2596-1) distal promoters===
 
# Negative strand, negative direction: ATACTCA at 276.
# Positive strand, negative direction: AGACTCA at 1404, AGAATCA at 293, TGAGTCT at 278, AGAATCA at 198.
 
===GCN4 positive direction (4050-1) distal promoters===
 
# Negative strand, positive direction: TGAGTGA at 3712, TGAGTGT at 3592, TGATTGT at 2678, AGAGTCA at 2613, TCAGTCA at 2100.
# Positive strand, positive direction: TTACTCA at 3447, AGACTCA at 3008, AGACTCA at 2952, TGACTAA at 2676.
 
==GCRE random dataset samplings==


# Gcn4r0: 2, TGAGTGA at 743, TGACTAT at 706.
# Gcn4r0: 2, TGAGTGA at 743, TGACTAT at 706.
Line 108: Line 141:
# Gcn4r9ci: 5, TTAATCA at 3367, ATACTCA at 3248, TGACTCA at 2082, ATACTCA at 1195, TCAGTCA at 106.
# Gcn4r9ci: 5, TTAATCA at 3367, ATACTCA at 3248, TGACTCA at 2082, ATACTCA at 1195, TCAGTCA at 106.


===Gcn4r UTRs===
===GCN4r arbitrary (evens) (4560-2846) UTRs===
{{main|UTR promoter gene transcriptions}}
 
# Gcn4r2: TGAGTAA at 3017.
# Gcn4r2: TGAGTAA at 3017.
# Gcn4r4: TGATTGT at 3795, TGAGTAA at 3403, TGACTCA at 2963.
# Gcn4r4: TGATTGT at 3795, TGAGTAA at 3403, TGACTCA at 2963.
# Gcn4r6: TGATTCT at 3178.
# Gcn4r6: TGATTCT at 3178.
# Gcn4r0ci: TTAGTCA at 3188.
# Gcn4r0ci: TTAGTCA at 3188.
# Gcn4r2ci: TTACTCA at 4306, TTAGTCA at 2899.
# Gcn4r2ci: TTACTCA at 4306, TTAGTCA at 2899.
Line 120: Line 151:
# Gcn4r6ci: AGAGTCA at 4503, AGAGTCA at 4449.
# Gcn4r6ci: AGAGTCA at 4503, AGAGTCA at 4449.


===Gcn4r core promoters===
===GCN4r alternate (odds) (4560-2846) UTRs===
{{main|Core promoter gene transcriptions}}
 
# Gcn4r1: TGATTAA at 2851.
# Gcn4r3: TGACTCT at 4481.
# Gcn4r3: TGACTCT at 4481.
# Gcn4r9: TGACTAA at 4474, TGAGTAT at 4268.
# Gcn4r5: TGAGTCA at 4249, TGATTAA at 4204.
# Gcn4r7: TGACTAA at 4046.
# Gcn4r9: TGACTAA at 4474, TGAGTAT at 4268, TGATTGA at 4264, TGACTGA at 4207, TGACTAA at 3706, TGACTCT at 3523.
# Gcn4r1ci: TTAATCA at 3460.
# Gcn4r3ci: ATACTCA at 4179.
# Gcn4r5ci: TGAGTCA at 4249, ACACTCA at 4026, ATAATCA at 3215.
# Gcn4r7ci: AGACTCA at 3443.
# Gcn4r9ci: TTAATCA at 3367, ATACTCA at 3248.
 
===GCN4r arbitrary positive direction (odds) (4445-4265) core promoters===
 
# Gcn4r9: TGAGTAT at 4268.
 
===GCN4r alternate positive direction (evens) (4445-4265) core promoters===
 
# Gcn4r2ci: TTACTCA at 4306.
 
===GCN4r arbitrary negative direction (evens) (2811-2596) proximal promoters===


===Gcn4r proximal promoters===
{{main|Proximal promoter gene transcriptions}}
# Gcn4r4ci: TTACTCA at 2650.
# Gcn4r4ci: TTACTCA at 2650.


===GCN4r arbitrary positive direction (odds) (4265-4050) proximal promoters===


# Gcn4r5: TGAGTCA at 4249, TGATTAA at 4204.
# Gcn4r5: TGAGTCA at 4249, TGATTAA at 4204.
# Gcn4r7: TGACTAA at 4046.
# Gcn4r9: TGATTGA at 4264, TGACTGA at 4207.
# Gcn4r9: TGATTGA at 4264, TGACTGA at 4207.
# Gcn4r3ci: ATACTCA at 4179.
# Gcn4r5ci: TGAGTCA at 4249, ACACTCA at 4026.


===GCN4r alternate positive direction (evens) (4265-4050) proximal promoters===


# Gcn4r3ci: ATACTCA at 4179.
# Gcn4r4ci: ATAGTCA at 4258, ATAATCA at 4118.
# Gcn4r5ci: TGAGTCA at 4249.
 
===GCN4r arbitrary negative direction (evens) (2596-1) distal promoters===


===Gcn4r distal promoters===
{{main|Distal promoter gene transcriptions}}
# Gcn4r0: TGAGTGA at 743, TGACTAT at 706.
# Gcn4r0: TGAGTGA at 743, TGACTAT at 706.
# Gcn4r2: TGATTCA at 489.
# Gcn4r2: TGATTCA at 489.
Line 145: Line 194:
# Gcn4r6: TGACTCA at 2045, TGATTCA at 283, TGATTCT at 260.
# Gcn4r6: TGACTCA at 2045, TGATTCA at 283, TGATTCT at 260.
# Gcn4r8: TGAGTAA at 1966, TGATTGA at 1177.
# Gcn4r8: TGAGTAA at 1966, TGATTGA at 1177.
# Gcn4r0ci: TCACTCA at 2589.
# Gcn4r0ci: TCACTCA at 2589.
# Gcn4r2ci: TCAATCA at 71.
# Gcn4r2ci: TCAATCA at 71.
Line 153: Line 200:
# Gcn4r8ci: TTACTCA at 2158, ATAATCA at 1749.
# Gcn4r8ci: TTACTCA at 2158, ATAATCA at 1749.


===GCN4r alternate negative direction (odds) (2596-1) distal promoters===


# Gcn4r1: 1, TGATTAA at 2851.
# Gcn4r3: TGACTCT at 2451.
# Gcn4r3: TGACTCT at 2451.
# Gcn4r7: 1, TGACTAA at 4046.
# Gcn4r9: TGACTCA at 2082.
# Gcn4r9: TGACTAA at 3706, TGACTCT at 3523, TGACTCA at 2082.
# Gcn4r3ci: TCAATCA at 1655, TGAATCA at 1555, TTAATCA at 1352, AGAGTCA at 477, AGACTCA at 216.
# Gcn4r5ci: TTAGTCA at 1354.
# Gcn4r7ci: TCAATCA at 1137, ACAGTCA at 1014, ACAGTCA at 202.
# Gcn4r9ci: TGACTCA at 2082, ATACTCA at 1195, TCAGTCA at 106.


===GCN4r arbitrary positive direction (odds) (4050-1) distal promoters===


# Gcn4r1: TGATTAA at 2851.
# Gcn4r3: TGACTCT at 2451.
# Gcn4r7: TGACTAA at 4046.
# Gcn4r9: TGACTAA at 3706, TGACTCT at 3523, TGACTCA at 2082.
# Gcn4r1ci: TTAATCA at 3460.
# Gcn4r1ci: TTAATCA at 3460.
# Gcn4r3ci: TCAATCA at 1655, TGAATCA at 1555, TTAATCA at 1352, AGAGTCA at 477, AGACTCA at 216.
# Gcn4r3ci: TCAATCA at 1655, TGAATCA at 1555, TTAATCA at 1352, AGAGTCA at 477, AGACTCA at 216.
# Gcn4r5ci: ACACTCA at 4026, ATAATCA at 3215, TTAGTCA at 1354.
# Gcn4r5ci: ACACTCA at 4026, ATAATCA at 3215, TTAGTCA at 1354.
# Gcn4r7ci: TCAATCA at 1137, ACAGTCA at 1014, ACAGTCA at 202.
# Gcn4r7ci: AGACTCA at 3443, TCAATCA at 1137, ACAGTCA at 1014, ACAGTCA at 202.
# Gcn4r9ci: TTAATCA at 3367, ATACTCA at 3248, TGACTCA at 2082, ATACTCA at 1195, TCAGTCA at 106.
# Gcn4r9ci: TTAATCA at 3367, ATACTCA at 3248, TGACTCA at 2082, ATACTCA at 1195, TCAGTCA at 106.
===GCN4r alternate positive direction (evens) (4050-1) distal promoters===
# Gcn4r0: TGAGTGA at 743, TGACTAT at 706.
# Gcn4r2: TGAGTAA at 3017, TGATTCA at 489.
# Gcn4r4: TGATTGT at 3795, TGAGTAA at 3403, TGACTCA at 2963, TGAGTGT at 2573, TGAGTCA at 2248, TGAGTGA at 1000.
# Gcn4r6: TGATTCT at 3178, TGACTCA at 2045, TGATTCA at 283, TGATTCT at 260.
# Gcn4r8: TGAGTAA at 1966, TGATTGA at 1177.
# Gcn4r0ci: TTAGTCA at 3188, TCACTCA at 2589.
# Gcn4r2ci: TTAGTCA at 2899, TCAATCA at 71.
# Gcn4r4ci: TGACTCA at 2963, TTACTCA at 2650, TGAGTCA at 2248, TTACTCA at 1137, ACACTCA at 626.
# Gcn4r6ci: TGACTCA at 2045, ACACTCA at 1842, TCAATCA at 1836, AGAGTCA at 1109, TTAATCA at 35.
# Gcn4r8ci: TTACTCA at 2158, ATAATCA at 1749.
==GCN4 analysis and results==
{{main|Complex locus A1BG and ZNF497#GCREs (Gcn4)}}
"The program DNA-Pattern was used to search for and catalogue occurrences of consensus GCRE (TGABTVW) [TGA(C/G/T)T(A/C/G)(A/T)]."<ref name=Staschke/>
{|class="wikitable"
|-
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 6 || 2 || 3 || 3
|-
| Randoms || UTR || arbitrary negative || 13 || 10 || 1.3 || 1.6
|-
| Randoms || UTR || alternate negative || 19 || 10 || 1.9 || 1.6
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || positive || 2 || 2 || 1 || 1
|-
| Randoms || Core || arbitrary positive || 1 || 10 || 0.1 || 0.1
|-
| Randoms || Core || alternate positive || 1 || 10 || 0.1 || 0.1
|-
| Reals || Proximal || negative || 1 || 2 || 0.5 || 0.5
|-
| Randoms || Proximal || arbitrary negative || 1 || 10 || 0.1 || 0.05
|-
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0.05
|-
| Reals || Proximal || positive || 4 || 2 || 2 || 2
|-
| Randoms || Proximal || arbitrary positive || 7 || 10 || 0.7 || 0.45
|-
| Randoms || Proximal || alternate positive || 2 || 10 || 0.2 || 0.45
|-
| Reals || Distal || negative || 5 || 2 || 2.5 || 2.5 ± 1.5 (--1,+-4)
|-
| Randoms || Distal || arbitrary negative || 23 || 10 || 2.3 || 1.85 ± 0.45
|-
| Randoms || Distal || alternate negative || 14 || 10 || 1.4 || 1.85 ± 0.45
|-
| Reals || Distal || positive || 9 || 2 || 4.5 || 4.5 ± 0.5 (-+5,++4)
|-
| Randoms || Distal || arbitrary positive || 24 || 10 || 2.4 || 2.8 ± 0.4
|-
| Randoms || Distal || alternate positive || 32 || 10 || 3.2 || 2.8 ± 0.4
|}
Comparison:
The occurrences of real GCN4 UTRs, cores, proximals and positive direction distals are greater than the randoms and the negative direction distals are outside the randoms. This suggests that the real GCN4s are likely active or activable.


==See also==
==See also==
{{div col|colwidth=20em}}
{{div col|colwidth=20em}}
* [[A1BG gene transcriptions]]
* [[A1BG regulatory elements and regions]]
* [[A1BG response element gene transcriptions]]
* [[A1BG response element positive results]]
* [[Complex locus A1BG and ZNF497]]
* [[Complex locus A1BG and ZNF497]]
* [[Transcription factor]]
{{Div col end}}
{{Div col end}}


Line 181: Line 307:


<!-- footer categories -->
<!-- footer categories -->
[[Category:Resources last modified in September 2020]]

Latest revision as of 03:46, 18 September 2023

Associate Editor(s)-in-Chief: Henry A. Hoff

"The saturation mutagenesis of the transcription factor [general control nonderepressible gene protein] Gcn4p’s binding site (5′-ATGACTCTT-3′) within the HIS3 promoter found that almost all mismatch mutants reduced the PHIS3 activity significantly and only one mutant with the sequence 5′-ATGACTCAT-3′ increased the binding affinity of Gcn4p and improved the PHIS3 activity [82]. It has been shown that regulatory regions containing multiple UAS or URS sites for binding the same transcription factor could enhance their activation or repression of transcription. In a test of 15 transcription factors, such as Gal4p, Gcn4p, Bas1p, increasing the number of their UAS sites improved promoter activities; similarly, promoters with multiple URS sites showed a stronger repression, such as Matα2p-Mcm1p."[1]

Human genes

Interactions

The "transcription factor Rap1p not only depleted the nucleosome from its own binding site of the HIS4 promoter, but also reduced a nearby nucleosome to increase the accessibility of other transcription factors, including Gcn4p, Bas1p, Bas2p [102]."[1]

Consensus sequences (Tang)

The UAS sequence for the transcription factor Gcn4p binding site is ATGACTCTT.[1]

Gcn4p binding site (Tang) samplings

Copying ATGACTC in "⌘F" yields none between ZNF497 and A1BG on the negative strand.

Copying ATGACTCT in "⌘F" yields none between ZNF497 and A1BG on the positive strand.

Copying ATGACTCA in "⌘F" yields none between ZNF497 and A1BG on the positive strand.

Copying ATGACTC in "⌘F" yields none between ZSCAN22 and A1BG on the negative strand.

Copying ATGACTCTT in "⌘F" yields none between ZSCAN22 and A1BG on the positive strand.

Copying ATGACTCA in "⌘F" yields none between ZSCAN22 and A1BG on the positive strand.

Copying ATGACTC(A/T)T in "⌘F" yields none between ZSCAN22 and A1BG and none between ZNF497 and A1BG, as the computer programs confirm:

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 0.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. ci Negative strand, negative direction: 0.
  6. ci Positive strand, negative direction: 0.
  7. ci Negative strand, positive direction: 0.
  8. ci Positive strand, positive direction: 0.
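
The "⌘F" checks and the eight program runs above can be approximated in a few lines. The following is a minimal Python sketch, not the Basic programs used for this page; it assumes that "ci" denotes the complement inverse, and the toy sequence and the function names are hypothetical.

```python
import re

# Tang-style site with one degenerate position: ATGACTC(A/T)T.
# The lookahead keeps overlapping hits, which a plain search would skip.
PATTERN = re.compile(r"(?=(ATGACTC[AT]T))")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def inverse_complement(seq: str) -> str:
    """Return the complement of seq read in the opposite direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for every hit in seq."""
    return [(m.start() + 1, m.group(1)) for m in PATTERN.finditer(seq.upper())]


# Hypothetical toy sequence, not taken from the A1BG locus.
toy = "GGATGACTCTTCCAAGAGTCATGG"
direct_hits = find_sites(toy)
ci_hits = find_sites(inverse_complement(toy))
print(len(direct_hits), len(ci_hits))
```

A count of 0 from both calls on a real sequence would correspond to the "0" entries listed above. Positions in ci_hits are counted along the inverse complement, not along the original sequence.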

Consensus sequences (Staschke)

"The program DNA-Pattern was used to search for and catalogue occurrences of consensus GCRE (TGABTVW) [TGA(C/G/T)T(A/C/G)(A/T)] and GATA (GATAAG, GATAAH, GATTA) motifs in yeast promoters."[2]

"The predicted Gln3p and Gcn4p binding sites in the UGA3 promoter are [...] the consensus Gln3p (GATA) and Gcn4p (GCRE) [TGAGTCA] binding sites present in the minimal UGA3 promoter at -206 and -112, respectively, [...]."[2]
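
The bracketed expansion of TGABTVW follows from the standard IUPAC nucleotide codes (B = C/G/T, V = A/C/G, W = A/T, H = A/C/T). As an illustration only, a short Python sketch can turn such a consensus into a regular expression; the function name iupac_to_regex is hypothetical and this is not the DNA-Pattern program cited above.

```python
import re

# Standard IUPAC nucleotide codes needed for the motifs quoted above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "[CGT]",  # not A
    "V": "[ACG]",  # not T
    "W": "[AT]",   # weak pairs
    "H": "[ACT]",  # not G
}


def iupac_to_regex(consensus: str) -> str:
    """Expand each IUPAC letter into the matching character class."""
    return "".join(IUPAC[base] for base in consensus.upper())


gcre = iupac_to_regex("TGABTVW")
gata = iupac_to_regex("GATAAH")

# The UGA3 promoter site quoted above fits the GCRE consensus.
uga3_site_matches = re.fullmatch(gcre, "TGAGTCA") is not None
print(gcre, gata, uga3_site_matches)
```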

GCRE samplings

The Basic programs (starting with SuccessablesGcn4.bas) test the consensus sequence TGA(C/G/T)T(A/C/G)(A/T) by comparing nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, what each is looking for, and what each found are:

  1. negative strand, negative direction, looking for TGA(C/G/T)T(A/C/G)(A/T), 2, TGAGTCT at 3644, TGACTAT at 3544, and complements.
  2. negative strand, positive direction, looking for TGA(C/G/T)T(A/C/G)(A/T), 5, TGATTAT at 4165, TGATTAT at 4158, TGAGTGA at 3712, TGAGTGT at 3592, TGATTGT at 2678, and complements.
  3. positive strand, negative direction, looking for TGA(C/G/T)T(A/C/G)(A/T), 3, TGAGTAT at 3829, TGACTCT at 2788, TGAGTCT at 278, and complements.
  4. positive strand, positive direction, looking for TGA(C/G/T)T(A/C/G)(A/T), 2, TGAGTGA at 4338, TGACTAA at 2676, and complements.
  5. complement, negative strand, negative direction, looking for ACT(A/C/G)A(C/G/T)(A/T), 3, ACTCATA at 3829, ACTGAGA at 2788, ACTCAGA at 278.
  6. complement, negative strand, positive direction, looking for ACT(A/C/G)A(C/G/T)(A/T), 2, ACTCACT at 4338, ACTGATT at 2676.
  7. complement, positive strand, negative direction, looking for ACT(A/C/G)A(C/G/T)(A/T), 2, ACTCAGA at 3644, ACTGATA at 3544.
  8. complement, positive strand, positive direction, looking for ACT(A/C/G)A(C/G/T)(A/T), 5, ACTAATA at 4165, ACTAATA at 4158, ACTCACT at 3712, ACTCACA at 3592, ACTAACA at 2678.
  9. inverse complement, negative strand, negative direction, looking for (A/T)(C/G/T)A(A/C/G)TCA, 2, ACACTCA at 4094, ATACTCA at 276, and complements.
  10. inverse complement, negative strand, positive direction, looking for (A/T)(C/G/T)A(A/C/G)TCA, 4, ACACTCA at 4336, TTAATCA at 4147, AGAGTCA at 2613, TCAGTCA at 2100, and complements.
  11. inverse complement, positive strand, negative direction, looking for (A/T)(C/G/T)A(A/C/G)TCA, 5, AGACTCA at 4055, AGAATCA at 3237, AGACTCA at 1404, AGAATCA at 293, AGAATCA at 198, and complements.
  12. inverse complement, positive strand, positive direction, looking for (A/T)(C/G/T)A(A/C/G)TCA, 3, TTACTCA at 3447, AGACTCA at 3008, AGACTCA at 2952, and complements.
  13. inverse negative strand, negative direction, looking for (A/T)(A/C/G)T(C/G/T)AGT, 5, TCTGAGT at 4055, TCTTAGT at 3237, TCTGAGT at 1404, TCTTAGT at 293, TCTTAGT at 198.
  14. inverse negative strand, positive direction, looking for (A/T)(A/C/G)T(C/G/T)AGT, 3, AATGAGT at 3447, TCTGAGT at 3008, TCTGAGT at 2952.
  15. inverse positive strand, negative direction, looking for (A/T)(A/C/G)T(C/G/T)AGT, 2, TGTGAGT at 4094, TATGAGT at 276.
  16. inverse positive strand, positive direction, looking for (A/T)(A/C/G)T(C/G/T)AGT, 4, TGTGAGT at 4336, AATTAGT at 4147, TCTCAGT at 2613, AGTCAGT at 2100.
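
The four search patterns in the list above (direct, complement, inverse complement, inverse) can all be derived from the single consensus TGABTVW. The Python sketch below shows one way to do this. It is an illustration with hypothetical function names, not a transcription of SuccessablesGcn4.bas.

```python
# Complement of each IUPAC letter used in TGABTVW.
# B (C/G/T) pairs with V (A/C/G); W (A/T) is its own complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "B": "V", "V": "B", "W": "W"}

# Notation used on this page for the degenerate letters.
NOTATION = {"B": "(C/G/T)", "V": "(A/C/G)", "W": "(A/T)"}


def complement(pattern: str) -> str:
    """Complement every position, keeping the reading order."""
    return "".join(COMPLEMENT[base] for base in pattern)


def inverse(pattern: str) -> str:
    """Read the pattern in the opposite order."""
    return pattern[::-1]


def to_page_notation(pattern: str) -> str:
    """Write B, V and W as the slash-separated alternatives used above."""
    return "".join(NOTATION.get(base, base) for base in pattern)


consensus = "TGABTVW"
variants = {
    "direct": consensus,
    "complement": complement(consensus),
    "inverse complement": inverse(complement(consensus)),
    "inverse": inverse(consensus),
}
for name, pattern in variants.items():
    print(name, to_page_notation(pattern))
```

The printed patterns agree with the "looking for" entries in items 1 to 16.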

GCN4 (4560-2846) UTRs

  1. Negative strand, negative direction: ACACTCA at 4094, TGAGTCT at 3644, TGACTAT at 3544.
  2. Positive strand, negative direction: AGACTCA at 4055, TGAGTAT at 3829, AGAATCA at 3237.

GCN4 positive direction (4445-4265) core promoters

  1. Negative strand, positive direction: ACACTCA at 4336.
  2. Positive strand, positive direction: TGAGTGA at 4338.

GCN4 negative direction (2811-2596) proximal promoters

  1. Positive strand, negative direction: TGACTCT at 2788.

GCN4 positive direction (4265-4050) proximal promoters

  1. Negative strand, positive direction: ACACTCA at 4336, TGATTAT at 4165, TGATTAT at 4158, TTAATCA at 4147.

GCN4 negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: ATACTCA at 276.
  2. Positive strand, negative direction: AGACTCA at 1404, AGAATCA at 293, TGAGTCT at 278, AGAATCA at 198.

GCN4 positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: TGAGTGA at 3712, TGAGTGT at 3592, TGATTGT at 2678, AGAGTCA at 2613, TCAGTCA at 2100.
  2. Positive strand, positive direction: TTACTCA at 3447, AGACTCA at 3008, AGACTCA at 2952, TGACTAA at 2676.
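
Sorting hits into the sections above amounts to comparing each position with the region bounds in the headings. The Python sketch below does this for the positive direction only. It assumes that the bounds are inclusive at both ends, so a hit exactly on a shared boundary such as 4265 is reported for both regions; the function name regions_for is hypothetical.

```python
# Inclusive bounds copied from the positive direction headings above.
POSITIVE_REGIONS = [
    ("core", 4265, 4445),
    ("proximal", 4050, 4265),
    ("distal", 1, 4050),
]


def regions_for(position: int) -> list[str]:
    """Return every positive direction region whose bounds contain position."""
    return [name for name, low, high in POSITIVE_REGIONS
            if low <= position <= high]


# Three hits taken from the lists above.
for motif, position in [("TGAGTGA", 4338), ("TGATTAT", 4165), ("TGACTAA", 2676)]:
    print(motif, position, regions_for(position))
```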

GCRE random dataset samplings

  1. Gcn4r0: 2, TGAGTGA at 743, TGACTAT at 706.
  2. Gcn4r1: 1, TGATTAA at 2851.
  3. Gcn4r2: 2, TGAGTAA at 3017, TGATTCA at 489.
  4. Gcn4r3: 2, TGACTCT at 4481, TGACTCT at 2451.
  5. Gcn4r4: 6, TGATTGT at 3795, TGAGTAA at 3403, TGACTCA at 2963, TGAGTGT at 2573, TGAGTCA at 2248, TGAGTGA at 1000.
  6. Gcn4r5: 2, TGAGTCA at 4249, TGATTAA at 4204.
  7. Gcn4r6: 4, TGATTCT at 3178, TGACTCA at 2045, TGATTCA at 283, TGATTCT at 260.
  8. Gcn4r7: 1, TGACTAA at 4046.
  9. Gcn4r8: 2, TGAGTAA at 1966, TGATTGA at 1177.
  10. Gcn4r9: 7, TGACTAA at 4474, TGAGTAT at 4268, TGATTGA at 4264, TGACTGA at 4207, TGACTAA at 3706, TGACTCT at 3523, TGACTCA at 2082.
  11. Gcn4r0ci: 2, TTAGTCA at 3188, TCACTCA at 2589.
  12. Gcn4r1ci: 1, TTAATCA at 3460.
  13. Gcn4r2ci: 3, TTACTCA at 4306, TTAGTCA at 2899, TCAATCA at 71.
  14. Gcn4r3ci: 6, ATACTCA at 4179, TCAATCA at 1655, TGAATCA at 1555, TTAATCA at 1352, AGAGTCA at 477, AGACTCA at 216.
  15. Gcn4r4ci: 7, ATAGTCA at 4258, ATAATCA at 4118, TGACTCA at 2963, TTACTCA at 2650, TGAGTCA at 2248, TTACTCA at 1137, ACACTCA at 626.
  16. Gcn4r5ci: 4, TGAGTCA at 4249, ACACTCA at 4026, ATAATCA at 3215, TTAGTCA at 1354.
  17. Gcn4r6ci: 7, AGAGTCA at 4503, AGAGTCA at 4449, TGACTCA at 2045, ACACTCA at 1842, TCAATCA at 1836, AGAGTCA at 1109, TTAATCA at 35.
  18. Gcn4r7ci: 4, AGACTCA at 3443, TCAATCA at 1137, ACAGTCA at 1014, ACAGTCA at 202.
  19. Gcn4r8ci: 2, TTACTCA at 2158, ATAATCA at 1749.
  20. Gcn4r9ci: 5, TTAATCA at 3367, ATACTCA at 3248, TGACTCA at 2082, ATACTCA at 1195, TCAGTCA at 106.

GCN4r arbitrary (evens) (4560-2846) UTRs

  1. Gcn4r2: TGAGTAA at 3017.
  2. Gcn4r4: TGATTGT at 3795, TGAGTAA at 3403, TGACTCA at 2963.
  3. Gcn4r6: TGATTCT at 3178.
  4. Gcn4r0ci: TTAGTCA at 3188.
  5. Gcn4r2ci: TTACTCA at 4306, TTAGTCA at 2899.
  6. Gcn4r4ci: ATAGTCA at 4258, ATAATCA at 4118, TGACTCA at 2963.
  7. Gcn4r6ci: AGAGTCA at 4503, AGAGTCA at 4449.

GCN4r alternate (odds) (4560-2846) UTRs

  1. Gcn4r1: TGATTAA at 2851.
  2. Gcn4r3: TGACTCT at 4481.
  3. Gcn4r5: TGAGTCA at 4249, TGATTAA at 4204.
  4. Gcn4r7: TGACTAA at 4046.
  5. Gcn4r9: TGACTAA at 4474, TGAGTAT at 4268, TGATTGA at 4264, TGACTGA at 4207, TGACTAA at 3706, TGACTCT at 3523.
  6. Gcn4r1ci: TTAATCA at 3460.
  7. Gcn4r3ci: ATACTCA at 4179.
  8. Gcn4r5ci: TGAGTCA at 4249, ACACTCA at 4026, ATAATCA at 3215.
  9. Gcn4r7ci: AGACTCA at 3443.
  10. Gcn4r9ci: TTAATCA at 3367, ATACTCA at 3248.

GCN4r arbitrary positive direction (odds) (4445-4265) core promoters

  1. Gcn4r9: TGAGTAT at 4268.

GCN4r alternate positive direction (evens) (4445-4265) core promoters

  1. Gcn4r2ci: TTACTCA at 4306.

GCN4r arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. Gcn4r4ci: TTACTCA at 2650.

GCN4r arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. Gcn4r5: TGAGTCA at 4249, TGATTAA at 4204.
  2. Gcn4r9: TGATTGA at 4264, TGACTGA at 4207.
  3. Gcn4r3ci: ATACTCA at 4179.
  4. Gcn4r5ci: TGAGTCA at 4249, ACACTCA at 4026.

GCN4r alternate positive direction (evens) (4265-4050) proximal promoters

  1. Gcn4r4ci: ATAGTCA at 4258, ATAATCA at 4118.

GCN4r arbitrary negative direction (evens) (2596-1) distal promoters

  1. Gcn4r0: TGAGTGA at 743, TGACTAT at 706.
  2. Gcn4r2: TGATTCA at 489.
  3. Gcn4r4: TGAGTGT at 2573, TGAGTCA at 2248, TGAGTGA at 1000.
  4. Gcn4r6: TGACTCA at 2045, TGATTCA at 283, TGATTCT at 260.
  5. Gcn4r8: TGAGTAA at 1966, TGATTGA at 1177.
  6. Gcn4r0ci: TCACTCA at 2589.
  7. Gcn4r2ci: TCAATCA at 71.
  8. Gcn4r4ci: TGAGTCA at 2248, TTACTCA at 1137, ACACTCA at 626.
  9. Gcn4r6ci: TGACTCA at 2045, ACACTCA at 1842, TCAATCA at 1836, AGAGTCA at 1109, TTAATCA at 35.
  10. Gcn4r8ci: TTACTCA at 2158, ATAATCA at 1749.

GCN4r alternate negative direction (odds) (2596-1) distal promoters

  1. Gcn4r3: TGACTCT at 2451.
  2. Gcn4r9: TGACTCA at 2082.
  3. Gcn4r3ci: TCAATCA at 1655, TGAATCA at 1555, TTAATCA at 1352, AGAGTCA at 477, AGACTCA at 216.
  4. Gcn4r5ci: TTAGTCA at 1354.
  5. Gcn4r7ci: TCAATCA at 1137, ACAGTCA at 1014, ACAGTCA at 202.
  6. Gcn4r9ci: TGACTCA at 2082, ATACTCA at 1195, TCAGTCA at 106.

GCN4r arbitrary positive direction (odds) (4050-1) distal promoters

  1. Gcn4r1: TGATTAA at 2851.
  2. Gcn4r3: TGACTCT at 2451.
  3. Gcn4r7: TGACTAA at 4046.
  4. Gcn4r9: TGACTAA at 3706, TGACTCT at 3523, TGACTCA at 2082.
  5. Gcn4r1ci: TTAATCA at 3460.
  6. Gcn4r3ci: TCAATCA at 1655, TGAATCA at 1555, TTAATCA at 1352, AGAGTCA at 477, AGACTCA at 216.
  7. Gcn4r5ci: ACACTCA at 4026, ATAATCA at 3215, TTAGTCA at 1354.
  8. Gcn4r7ci: AGACTCA at 3443, TCAATCA at 1137, ACAGTCA at 1014, ACAGTCA at 202.
  9. Gcn4r9ci: TTAATCA at 3367, ATACTCA at 3248, TGACTCA at 2082, ATACTCA at 1195, TCAGTCA at 106.

GCN4r alternate positive direction (evens) (4050-1) distal promoters

  1. Gcn4r0: TGAGTGA at 743, TGACTAT at 706.
  2. Gcn4r2: TGAGTAA at 3017, TGATTCA at 489.
  3. Gcn4r4: TGATTGT at 3795, TGAGTAA at 3403, TGACTCA at 2963, TGAGTGT at 2573, TGAGTCA at 2248, TGAGTGA at 1000.
  4. Gcn4r6: TGATTCT at 3178, TGACTCA at 2045, TGATTCA at 283, TGATTCT at 260.
  5. Gcn4r8: TGAGTAA at 1966, TGATTGA at 1177.
  6. Gcn4r0ci: TTAGTCA at 3188, TCACTCA at 2589.
  7. Gcn4r2ci: TTAGTCA at 2899, TCAATCA at 71.
  8. Gcn4r4ci: TGACTCA at 2963, TTACTCA at 2650, TGAGTCA at 2248, TTACTCA at 1137, ACACTCA at 626.
  9. Gcn4r6ci: TGACTCA at 2045, ACACTCA at 1842, TCAATCA at 1836, AGAGTCA at 1109, TTAATCA at 35.
  10. Gcn4r8ci: TTACTCA at 2158, ATAATCA at 1749.

GCN4 analysis and results

"The program DNA-Pattern was used to search for and catalogue occurrences of consensus GCRE (TGABTVW) [TGA(C/G/T)T(A/C/G)(A/T)]."[2]

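{|class="wikitable"
|-
! Reals or randoms !! Promoters !! Direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 6 || 2 || 3 || 3
|-
| Randoms || UTR || arbitrary negative || 13 || 10 || 1.3 || 1.6
|-
| Randoms || UTR || alternate negative || 19 || 10 || 1.9 || 1.6
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || positive || 2 || 2 || 1 || 1
|-
| Randoms || Core || arbitrary positive || 1 || 10 || 0.1 || 0.1
|-
| Randoms || Core || alternate positive || 1 || 10 || 0.1 || 0.1
|-
| Reals || Proximal || negative || 1 || 2 || 0.5 || 0.5
|-
| Randoms || Proximal || arbitrary negative || 1 || 10 || 0.1 || 0.05
|-
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0.05
|-
| Reals || Proximal || positive || 4 || 2 || 2 || 2
|-
| Randoms || Proximal || arbitrary positive || 7 || 10 || 0.7 || 0.45
|-
| Randoms || Proximal || alternate positive || 2 || 10 || 0.2 || 0.45
|-
| Reals || Distal || negative || 5 || 2 || 2.5 || 2.5 ± 1.5 (--1,+-4)
|-
| Randoms || Distal || arbitrary negative || 23 || 10 || 2.3 || 1.85 ± 0.45
|-
| Randoms || Distal || alternate negative || 14 || 10 || 1.4 || 1.85 ± 0.45
|-
| Reals || Distal || positive || 9 || 2 || 4.5 || 4.5 ± 0.5 (-+5,++4)
|-
| Randoms || Distal || arbitrary positive || 24 || 10 || 2.4 || 2.8 ± 0.4
|-
| Randoms || Distal || alternate positive || 32 || 10 || 3.2 || 2.8 ± 0.4
|}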

Comparison:

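Per strand, the real occurrences exceed the random averages in the UTRs (3 vs. 1.6), the positive direction cores (1 vs. 0.1), the proximals in both directions (0.5 vs. 0.05 and 2 vs. 0.45), and the positive direction distals (4.5 vs. 2.8 ± 0.4). The negative direction cores have no occurrences, real or random. The negative direction distal average (2.5) lies outside the random range (1.85 ± 0.45). This suggests that the real GCN4s are likely active or activable.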
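
The Occurrences and Averages columns can be reproduced from the Numbers and Strands columns. The Python sketch below assumes that each random average is the mean of the arbitrary and alternate per-strand values and that the ± term is half their difference, which matches the distal rows of the table; the function names are hypothetical.

```python
def per_strand(numbers: int, strands: int) -> float:
    """Occurrences column: hits divided by the number of strands searched."""
    return numbers / strands


def mean_and_half_range(first: float, second: float) -> tuple[float, float]:
    """Averages column: mean of two values and half of their difference."""
    return (first + second) / 2, abs(first - second) / 2


# Random distal promoters, negative direction: 23 and 14 hits on 10 strands each.
arbitrary = per_strand(23, 10)
alternate = per_strand(14, 10)
mean, spread = mean_and_half_range(arbitrary, alternate)
print(f"{mean:.2f} ± {spread:.2f}")

# Real distal promoters, negative direction: 1 hit on one strand, 4 on the other.
real_mean, real_spread = mean_and_half_range(1, 4)
print(f"{real_mean:.1f} ± {real_spread:.1f}")
```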

See also

References

  1. Hongting Tang, Yanling Wu, Jiliang Deng, Nanzhu Chen, Zhaohui Zheng, Yongjun Wei, Xiaozhou Luo, and Jay D. Keasling (6 August 2020). "Promoter Architecture and Promoter Engineering in Saccharomyces cerevisiae". Metabolites. 10 (8): 320–39. doi:10.3390/metabo10080320. PMID 32781665. Retrieved 18 September 2020.
  2. Kirk A. Staschke, Souvik Dey, John M. Zaborske, Lakshmi Reddy Palam, Jeanette N. McClintick, Tao Pan, Howard J. Edenberg, and Ronald C. Wek (28 May 2010). "Integration of General Amino Acid Control and Target of Rapamycin (TOR) Regulatory Pathways in Nitrogen Assimilation in Yeast" (PDF). The Journal of Biological Chemistry. 285 (22): 16893–16911. doi:10.1074/jbc.M110.121947. Retrieved 4 January 2021.

External links