Rap1 regulatory factor gene transcriptions: Difference between revisions
(14 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
{{AE}} Henry A. Hoff | {{AE}} Henry A. Hoff | ||
"Rap1 is another GRF that organizes chromatin, binds promoters of genes that encode ribosomal and glycolytic proteins, and binds telomeres (Shore 1994; Ganapathi et al. 2011; Hughes and de Boer 2013). [...] DNA shape analysis revealed that Rap1 motifs possess an intrinsically wide minor groove spanning the central degenerate region of the motif that was wider at binding-competent sites [...]. A clear trend was observed between increased width of the minor groove in the central degenerate region of the motif and increased Rap1 binding in vitro."<ref name=Rossi>{{ cite journal | "Rap1 is another [General regulatory factor] GRF that organizes chromatin, binds promoters of genes that encode ribosomal and glycolytic proteins, and binds telomeres (Shore 1994; Ganapathi et al. 2011; Hughes and de Boer 2013). [...] DNA shape analysis revealed that Rap1 motifs possess an intrinsically wide minor groove spanning the central degenerate region of the motif that was wider at binding-competent sites [...]. A clear trend was observed between increased width of the minor groove in the central degenerate region of the motif and increased Rap1 binding in vitro."<ref name=Rossi>{{ cite journal | ||
|author=Matthew J. Rossi, William K.M. Lai and B. Franklin Pugh | |author=Matthew J. Rossi, William K.M. Lai and B. Franklin Pugh | ||
|title=Genome-wide determinants of sequence-specific DNA binding of general regulatory factors | |title=Genome-wide determinants of sequence-specific DNA binding of general regulatory factors | ||
Line 67: | Line 67: | ||
# Rapr0: ACCCAGGCA at 3737. | # Rapr0: ACCCAGGCA at 3737. | ||
# Rapr6ci: TGCATGGGT at 3553. | # Rapr6ci: TGCATGGGT at 3553. | ||
===Rapr proximal promoters=== | ===Rapr proximal promoters=== | ||
Line 77: | Line 74: | ||
===Rapr distal promoters=== | ===Rapr distal promoters=== | ||
{{main|Distal promoter gene transcriptions}} | {{main|Distal promoter gene transcriptions}} | ||
# Rapr0: ACCCGGGCA at 1902. | |||
# Rapr6: ACCCGAACA at 1605, ACCCGGGCA at 1452. | |||
# Rapr4ci: TGCCTGGGT at 1378. | |||
# Rapr8ci: TGTATGGGT at 572. | |||
# Rapr9ci: TGCGTGGGT at 431. | |||
==Rap1 prevalent samplings== | ==Rap1 prevalent samplings== | ||
Line 131: | Line 135: | ||
==Rap1 full consensus samplings== | ==Rap1 full consensus samplings== | ||
Copying a responsive elements consensus sequence C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T) and putting the sequence in "⌘F" finds none between ZNF497 and A1BG or one between ZSCAN22 and A1BG as can be found by the computer programs. With all of the multiples possible the total possible is 236,196. | Copying a responsive elements consensus sequence C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T) and putting the sequence in "⌘F" finds none between ZNF497 and A1BG or one between ZSCAN22 and A1BG as can be found by the computer programs. With all of the multiples possible the total possible is 236,196. | ||
Line 141: | Line 145: | ||
# inverse complement, negative strand, negative direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 1, AGTTCGTTGACGGG at 3853. | # inverse complement, negative strand, negative direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 1, AGTTCGTTGACGGG at 3853. | ||
# inverse complement, positive strand, negative direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 2, AATGACCGGGTGCG at 2196, GTTGCCCAGGCTGG at 1464. | # inverse complement, positive strand, negative direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 2, AATGACCGGGTGCG at 2196, GTTGCCCAGGCTGG at 1464. | ||
# inverse complement, positive strand, positive direction, looking for | # inverse complement, positive strand, positive direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 6, AGTCCATTGACTCG at 3737, TGGTTCATGGTGTG at 2601, GTGGCGTGGACCGG at 2571, GTTTTGAGGACCCG at 2503, TGGTCGCGGACGTG at 1471, TGGTCGCGGACGTG at 1371. | ||
# inverse complement, negative strand, positive direction, looking for | # inverse complement, negative strand, positive direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 5, TTTCTCTTGCTGTG at 4393, AGGTACTGGCTCTG at 3122, GTGCCCCAGGTCTG at 3020, GTTGCCTAGGCGGG at 2486, AGGTCACAGGCGCG at 161. | ||
===Rap1 (4560-2846) UTRs=== | |||
# Negative strand, negative direction: CCGGTCCGTACCAC at 4109, CCGGTCCACGACAC at 3958, AGTTCGTTGACGGG at 3853, CGAGTCCTCAACCT at 3117, CCCACCTAGTGAAC at 3102, CCGACCCGCACCAC at 3049. | # Negative strand, negative direction: CCGGTCCGTACCAC at 4109, CCGGTCCACGACAC at 3958, AGTTCGTTGACGGG at 3853, CGAGTCCTCAACCT at 3117, CCCACCTAGTGAAC at 3102, CCGACCCGCACCAC at 3049. | ||
# Positive strand, negative direction: CCAGCCTGGGCAAC at 4042, CCAGCCATTTCCAC at 3691, CCAGCCTGGGCAAC at 3303, CACGCCATTGCACT at 3289. | # Positive strand, negative direction: CCAGCCTGGGCAAC at 4042, CCAGCCATTTCCAC at 3691, CCAGCCTGGGCAAC at 3303, CACGCCATTGCACT at 3289. | ||
===Rap1 core promoters=== | ===Rap1 positive direction (4445-4265) core promoters=== | ||
# Negative strand, positive direction: TTTCTCTTGCTGTG at 4393. | # Negative strand, positive direction: TTTCTCTTGCTGTG at 4393. | ||
# Positive strand, positive direction: CAGACCCAGGGACC at 4424. | |||
===Rap1 negative direction (2811-2596) proximal promoters=== | |||
# Positive strand, negative direction: CCAGCCTGGGCAAC at 2775, CCCAGCCTGGGCAA at 2774. | # Positive strand, negative direction: CCAGCCTGGGCAAC at 2775, CCCAGCCTGGGCAA at 2774. | ||
===Rap1 negative direction (2596-1) distal promoters=== | |||
# Negative strand, negative direction: CCAGTCCTCAACTT at 2594, CCGGTCCGTGCCAC at 2526, CCAGTCCTCAAACT at 2257, CGAGTCCTCAAACT at 2141, CCCGTCCTCTACCT at 1830, CCGACCCGCGCCAC at 1764, CCGGTCCGTGCCAC at 655, CCAGTCCTCTAACT at 585, CCGGCCCACGCCAC at 382. | # Negative strand, negative direction: CCAGTCCTCAACTT at 2594, CCGGTCCGTGCCAC at 2526, CCAGTCCTCAAACT at 2257, CGAGTCCTCAAACT at 2141, CCCGTCCTCTACCT at 1830, CCGACCCGCGCCAC at 1764, CCGGTCCGTGCCAC at 655, CCAGTCCTCTAACT at 585, CCGGCCCACGCCAC at 382. | ||
# Positive strand, negative direction: CCAGCCTGGGCAAC at 2440, AATGACCGGGTGCG at 2196, GTTGCCCAGGCTGG at 1464, CCAGTCTGGGCAAT at 1361, CACGCCTGTAGATC at 972, CCAGCCTGGGCAAC at 904, CAAGCCTGGGCAAC at 464. | # Positive strand, negative direction: CCAGCCTGGGCAAC at 2440, AATGACCGGGTGCG at 2196, GTTGCCCAGGCTGG at 1464, CCAGTCTGGGCAAT at 1361, CACGCCTGTAGATC at 972, CCAGCCTGGGCAAC at 904, CAAGCCTGGGCAAC at 464. | ||
===Rap1 positive direction (4050-1) distal promoters=== | |||
# Negative strand, positive direction: AGGTACTGGCTCTG at 3122, GTGCCCCAGGTCTG at 3020, CACACCCTGAGCTC at 2972, CAAGTCAGGAAATA at 2625, GTTGCCTAGGCGGG at 2486, CGAGCCTGCAGACC at 440, CGGGGCTGGGCCCA at 422, CAGGTCCTCTGCAA at 225, AGGTCACAGGCGCG at 161. | # Negative strand, positive direction: AGGTACTGGCTCTG at 3122, GTGCCCCAGGTCTG at 3020, CACACCCTGAGCTC at 2972, CAAGTCAGGAAATA at 2625, GTTGCCTAGGCGGG at 2486, CGAGCCTGCAGACC at 440, CGGGGCTGGGCCCA at 422, CAGGTCCTCTGCAA at 225, AGGTCACAGGCGCG at 161. | ||
# Positive strand, positive direction: CCCATCCGGACCCT at 3760, AGTCCATTGACTCG at 3737, CGAGTCCATTGACT at 3735, TGGTTCATGGTGTG at 2601, GTGGCGTGGACCGG at 2571, GTTTTGAGGACCCG at 2503, CACATCCAGACAAA at 2262, TGGTCGCGGACGTG at 1471, TGGTCGCGGACGTG at 1371. | # Positive strand, positive direction: CCCATCCGGACCCT at 3760, AGTCCATTGACTCG at 3737, CGAGTCCATTGACT at 3735, TGGTTCATGGTGTG at 2601, GTGGCGTGGACCGG at 2571, GTTTTGAGGACCCG at 2503, CACATCCAGACAAA at 2262, TGGTCGCGGACGTG at 1471, TGGTCGCGGACGTG at 1371. | ||
==Rap1 random dataset samplings== | |||
# Rap1r0: 8, CAAGGCCGTTCAAA at 4165, CAAGTCAAGACCTT at 4050, CCAGGCTGTGGCTC at 2846, CACGTCCTCACCAT at 2601, CAAACCAGTGAACT at 1331, CCAAGCTGGAACTA at 1303, CCAGGCTGGGCATC at 1255, CCGGGCTGGGAAAC at 831. | # Rap1r0: 8, CAAGGCCGTTCAAA at 4165, CAAGTCAAGACCTT at 4050, CCAGGCTGTGGCTC at 2846, CACGTCCTCACCAT at 2601, CAAACCAGTGAACT at 1331, CCAAGCTGGAACTA at 1303, CCAGGCTGGGCATC at 1255, CCGGGCTGGGAAAC at 831. | ||
Line 190: | Line 195: | ||
# Rap1r9ci: 6, ATTTCCCAGGCGGG at 2135, GGTTACCGGACTGG at 1712, TTTTAACTGCCTTG at 846, ATTTACTGGCCCGG at 632, TATTTACTGGCCCG at 631, ATTTTATAGCCGGG at 168. | # Rap1r9ci: 6, ATTTCCCAGGCGGG at 2135, GGTTACCGGACTGG at 1712, TTTTAACTGCCTTG at 846, ATTTACTGGCCCGG at 632, TATTTACTGGCCCG at 631, ATTTTATAGCCGGG at 168. | ||
===Rap1r UTRs=== | ===Rap1r arbitrary (evens) (4560-2846) UTRs=== | ||
# Rap1r0: CAAGGCCGTTCAAA at 4165, CAAGTCAAGACCTT at 4050, CCAGGCTGTGGCTC at 2846. | # Rap1r0: CAAGGCCGTTCAAA at 4165, CAAGTCAAGACCTT at 4050, CCAGGCTGTGGCTC at 2846. | ||
# Rap1r4: CGAACCAAGTCCTT at 3972, CCGACCTATAAACT at 3687, CGGGCCTAGGAACC at 3170. | # Rap1r4: CGAACCAAGTCCTT at 3972, CCGACCTATAAACT at 3687, CGGGCCTAGGAACC at 3170. | ||
# Rap1r6: CGCGGCCACAACAA at 4395, CAAATCCGGGAACC at 3344, CACACCTTGTCCCT at 3171, CCCAGCTGTTGAAT at 2887. | # Rap1r6: CGCGGCCACAACAA at 4395, CAAATCCGGGAACC at 3344, CACACCTTGTCCCT at 3171, CCCAGCTGTTGAAT at 2887. | ||
# Rap1r0ci: TGTTTGAAGGTCGG at 4188. | # Rap1r0ci: TGTTTGAAGGTCGG at 4188. | ||
# Rap1r2ci: GAGCACAAGGCCGG at 4095. | # Rap1r2ci: GAGCACAAGGCCGG at 4095. | ||
# Rap1r4ci: GGGCAACGGACCGG at 4237, GTGCCGAGGATGTG at 4072. | # Rap1r4ci: GGGCAACGGACCGG at 4237, GTGCCGAGGATGTG at 4072. | ||
===Rap1r core promoters=== | ===Rap1r alternate (odds) (4560-2846) UTRs=== | ||
# Rap1r1: CGAGCCAAGGACCC at 4467, CCGATCCTTGGCCC at 4087, CACGTCCTTAACTA at 3437. | |||
# Rap1r5: CCAATCCTGAAAAA at 4165, CCCACCAGGGCACC at 3507, CGAATCCGGGCAAT at 3457. | |||
# Rap1r7: CGCAGCTAGTGCAT at 4289. | |||
# Rap1r9: CCGACCTGGGAATT at 4201, CACGCCTATGAACA at 3747, CGGGGCTATGGACT at 2865. | |||
# Rap1r5ci: AATTTGTGGCTCGG at 3886, TGTTAATTGCTCGG at 3572, ATTGCCTTGCCGCG at 3324, AGTTAGTTGGTTTG at 3270, GAGCAGTGGGCTGG at 3164, AATTCGTTGCTTGG at 2998. | |||
# Rap1r7ci: ATGTTGCAGATTGG at 4180, GGGGACAGGGCTTG at 3723, ATGGCGCTGGTGGG at 3398, GGTCTGTTGCCGTG at 3131. | |||
===Rap1r arbitrary negative direction (evens) (2846-2811) core promoters=== | |||
# Rap1r0: CCAGGCTGTGGCTC at 2846. | # Rap1r0: CCAGGCTGTGGCTC at 2846. | ||
===Rap1r arbitrary positive direction (odds) (4445-4265) core promoters=== | |||
# Rap1r7: CGCAGCTAGTGCAT at 4289. | # Rap1r7: CGCAGCTAGTGCAT at 4289. | ||
===Rap1r proximal promoters=== | ===Rap1r alternate positive direction (evens) (4445-4265) core promoters=== | ||
# Rap1r6: CGCGGCCACAACAA at 4395. | |||
===Rap1r arbitrary negative direction (evens) (2811-2596) proximal promoters=== | |||
# Rap1r0: CACGTCCTCACCAT at 2601. | # Rap1r0: CACGTCCTCACCAT at 2601. | ||
# Rap1r6: CAGAGCATGTGAAT at 2333, CCCACCCAGGGCAC at 966, CCCACCAAGGACAT at 792, CGGGCCAATGGAAA at 745. | |||
# Rap1r8: CCAGCCAGTTCACT at 2796. | # Rap1r8: CCAGCCAGTTCACT at 2796. | ||
# Rap1r6ci: TTTTAACGGGTCCG at 2787. | # Rap1r6ci: TTTTAACGGGTCCG at 2787. | ||
===Rap1r alternate negative direction (odds) (2811-2596) proximal promoters=== | |||
# Rap1r1: CGGGTCAATGCCTT at 2784. | |||
# Rap1r3: CAAATCCTCAGATA at 2773. | |||
===Rap1r arbitrary positive direction (odds) (4265-4050) proximal promoters=== | |||
# Rap1r1: CCGATCCTTGGCCC at 4087. | # Rap1r1: CCGATCCTTGGCCC at 4087. | ||
Line 218: | Line 244: | ||
# Rap1r7ci: ATGTTGCAGATTGG at 4180. | # Rap1r7ci: ATGTTGCAGATTGG at 4180. | ||
===Rap1r distal promoters=== | ===Rap1r alternate positive direction (evens) (4265-4050) proximal promoters=== | ||
# Rap1r0: CAAGGCCGTTCAAA at 4165, CAAGTCAAGACCTT at 4050. | |||
# Rap1r0ci: TGTTTGAAGGTCGG at 4188. | |||
# Rap1r2ci: GAGCACAAGGCCGG at 4095. | |||
# Rap1r4ci: GGGCAACGGACCGG at 4237, GTGCCGAGGATGTG at 4072. | |||
===Rap1r arbitrary negative direction (evens) (2596-1) distal promoters=== | |||
# Rap1r0: CAAACCAGTGAACT at 1331, CCAAGCTGGAACTA at 1303, CCAGGCTGGGCATC at 1255, CCGGGCTGGGAAAC at 831. | # Rap1r0: CAAACCAGTGAACT at 1331, CCAAGCTGGAACTA at 1303, CCAGGCTGGGCATC at 1255, CCGGGCTGGGAAAC at 831. | ||
# Rap1r2: CGAAGCCTTACACA at 509. | # Rap1r2: CGAAGCCTTACACA at 509. | ||
Line 230: | Line 263: | ||
# Rap1r6ci: GGGGCCTAGACCCG at 1601, GGTTCGTGGCCGCG at 1280, TATGAATGGCTTTG at 255. | # Rap1r6ci: GGGGCCTAGACCCG at 1601, GGTTCGTGGCCGCG at 1280, TATGAATGGCTTTG at 255. | ||
# Rap1r8ci: TTGTAAATGCTTCG at 2508, AGGTAGCGGCCTTG at 2032, TAGCCAATGACTCG at 1135, GGTGAGATGGTGTG at 536, AATCTCCAGGTGCG at 294, TGGTTATGGGTGGG at 99. | # Rap1r8ci: TTGTAAATGCTTCG at 2508, AGGTAGCGGCCTTG at 2032, TAGCCAATGACTCG at 1135, GGTGAGATGGTGTG at 536, AATCTCCAGGTGCG at 294, TGGTTATGGGTGGG at 99. | ||
===Rap1r alternate negative direction (odds) (2596-1) distal promoters=== | |||
# Rap1r1: CCGACCCAGTAAAT at 2362, CCGACCATCGCAAA at 1774, CGAAGCATCAAAAA at 1440, CGGGTCTACGAACA at 611, CCAACCTTTACACT at 528. | |||
# Rap1r3: CCGGTCCAGGGCCA at 1592, CCAGCCCTCTGAAC at 1434. | |||
# Rap1r5: CAGGCCTTTAAAAT at 2372, CACACCTTCGCCTC at 1650, CCCGCCAGCTCCAC at 800, CGAAGCCTTGCCCT at 481, CCCGCCCTTGGCCC at 357. | |||
# Rap1r7: CAGACCATCTAAAA at 1275, CGGACCCGGGACAA at 61. | |||
# Rap1r9: CCAAGCCTGAGATT at 2327, CAAGGCTATTCACT at 648. | |||
# Rap1r1ci: TATTCATGGCCTTG at 2005, AGGCCACTGGTTTG at 923. | |||
# Rap1r3ci: GAGGTAAGGGTGTG at 2010, AGGGCACTGGTCGG at 1888, TGGTTCTGGATGCG at 1690. | |||
# Rap1r5ci: GTTTTACGGACGGG at 2083, ATGTAACAGCCCCG at 985. | |||
# Rap1r7ci: AGGTAAAGGGTTCG at 2380, GAGTCCAGGCCGTG at 2231, GAGCCCCGGGTTTG at 1847, GTGTTCCAGACGGG at 129. | |||
# Rap1r9ci: ATTTCCCAGGCGGG at 2135, GGTTACCGGACTGG at 1712, TTTTAACTGCCTTG at 846, ATTTACTGGCCCGG at 632, TATTTACTGGCCCG at 631, ATTTTATAGCCGGG at 168. | |||
===Rap1r arbitrary positive direction (odds) (4050-1) distal promoters=== | |||
# Rap1r1: CACGTCCTTAACTA at 3437, CGGGTCAATGCCTT at 2784, CCGACCCAGTAAAT at 2362, CCGACCATCGCAAA at 1774, CGAAGCATCAAAAA at 1440, CGGGTCTACGAACA at 611, CCAACCTTTACACT at 528. | # Rap1r1: CACGTCCTTAACTA at 3437, CGGGTCAATGCCTT at 2784, CCGACCCAGTAAAT at 2362, CCGACCATCGCAAA at 1774, CGAAGCATCAAAAA at 1440, CGGGTCTACGAACA at 611, CCAACCTTTACACT at 528. | ||
Line 241: | Line 289: | ||
# Rap1r7ci: GGGGACAGGGCTTG at 3723, ATGGCGCTGGTGGG at 3398, GGTCTGTTGCCGTG at 3131, AGGTAAAGGGTTCG at 2380, GAGTCCAGGCCGTG at 2231, GAGCCCCGGGTTTG at 1847, GTGTTCCAGACGGG at 129. | # Rap1r7ci: GGGGACAGGGCTTG at 3723, ATGGCGCTGGTGGG at 3398, GGTCTGTTGCCGTG at 3131, AGGTAAAGGGTTCG at 2380, GAGTCCAGGCCGTG at 2231, GAGCCCCGGGTTTG at 1847, GTGTTCCAGACGGG at 129. | ||
# Rap1r9ci: ATTTCCCAGGCGGG at 2135, GGTTACCGGACTGG at 1712, TTTTAACTGCCTTG at 846, ATTTACTGGCCCGG at 632, TATTTACTGGCCCG at 631, ATTTTATAGCCGGG at 168. | # Rap1r9ci: ATTTCCCAGGCGGG at 2135, GGTTACCGGACTGG at 1712, TTTTAACTGCCTTG at 846, ATTTACTGGCCCGG at 632, TATTTACTGGCCCG at 631, ATTTTATAGCCGGG at 168. | ||
===Rap1r alternate positive direction (evens) (4050-1) distal promoters=== | |||
# Rap1r0: CAAGTCAAGACCTT at 4050, CCAGGCTGTGGCTC at 2846, CACGTCCTCACCAT at 2601, CAAACCAGTGAACT at 1331, CCAAGCTGGAACTA at 1303, CCAGGCTGGGCATC at 1255, CCGGGCTGGGAAAC at 831. | |||
# Rap1r2: CGAAGCCTTACACA at 509. | |||
# Rap1r4: CGAACCAAGTCCTT at 3972, CCGACCTATAAACT at 3687, CGGGCCTAGGAACC at 3170, CGAGCCCATTGCCC at 1846, CCAATCTTGTCCAC at 1770. | |||
# Rap1r6: CAAATCCGGGAACC at 3344, CACACCTTGTCCCT at 3171, CCCAGCTGTTGAAT at 2887, CAGAGCATGTGAAT at 2333, CCCACCCAGGGCAC at 966, CCCACCAAGGACAT at 792, CGGGCCAATGGAAA at 745. | |||
# Rap1r8: CCAGCCAGTTCACT at 2796, CGCGTCCACTGAAC at 1582. | |||
# Rap1r0ci: AATGCCATGGTTCG at 781. | |||
# Rap1r2ci: TTGGCATAGCTTGG at 622, GGGGCACGGCTTTG at 551. | |||
# Rap1r4ci: GAGCAAATGGTCCG at 1814, GTTTAATTGGCGGG at 712, TATCACTTGATTGG at 651. | |||
# Rap1r6ci: TTTTAACGGGTCCG at 2787, GGGGCCTAGACCCG at 1601, GGTTCGTGGCCGCG at 1280, TATGAATGGCTTTG at 255. | |||
# Rap1r8ci: TTGTAAATGCTTCG at 2508, AGGTAGCGGCCTTG at 2032, TAGCCAATGACTCG at 1135, GGTGAGATGGTGTG at 536, AATCTCCAGGTGCG at 294, TGGTTATGGGTGGG at 99. | |||
==Discussion== | |||
When the Rap1 motif was held constant to ACCCRNRCA<ref name=Rossi/>, no real results occurred. However, testing ACCCRNRCA and its inverse complement against the ten random datasets yielded five consensus sequence results and four inverse complement results. Two were in the UTR of A1BG from the negative direction, one was in the proximal promoter from the positive direction, and the remaining six were in the distal promoters.
The reduced consensus (A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C)<ref name=Rossi/> had one result, GAACCCACACCTC, in the positive direction at 1807, less than halfway from ZNF497. Of the ten random datasets, only one had a result for the consensus, GCACCCGGGCATC at 1454, and only one had a result for the inverse complement, TATGCCTGGGTTT at 1380. Both the real and the random results were in the distal promoter, closer to the zinc finger than to A1BG. With only one random result per ten datasets, such a match rarely arises at random, so the real occurrence is likely active as a regulatory response.
The full consensus sequence C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T)<ref name=Rossi/> gave four to six results per strand in the UTR in the negative direction, one per strand in the core promoter in the positive direction, and two on the positive strand in the proximal promoter in the negative direction, with none in the proximal promoter in the positive direction. In the distal promoters, each strand had seven to nine results in either direction.
For the random datasets, the UTR contained zero to four results, the core promoter zero to one, the proximal promoter zero to one, and the distal promoter one to seven in either direction.
Comparing the two, the real UTR, proximal promoter, and distal promoter usually exceeded the random results. This suggests that some of the real results could just be due to random associations of nucleotides, but the rest are likely real. | |||
==Rap1 analysis and results== | |||
{{main|Complex locus A1BG and ZNF497#Rap1s}} | |||
Full consensus sequence: C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T).<ref name=Rossi/>
{|class="wikitable"
|-
! Reals or randoms !! Promoters !! Direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 10 || 2 || 5 || 5 ± 1 (--6,+-4)
|-
| Randoms || UTR || arbitrary negative || 14 || 10 || 1.4 || 1.7
|-
| Randoms || UTR || alternate negative || 20 || 10 || 2.0 || 1.7
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 1 || 10 || 0.1 || 0.05
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0.05
|-
| Reals || Core || positive || 2 || 2 || 1 || 1 ± 0 (-+1,++1)
|-
| Randoms || Core || arbitrary positive || 1 || 10 || 0.1 || 0.1
|-
| Randoms || Core || alternate positive || 1 || 10 || 0.1 || 0.1
|-
| Reals || Proximal || negative || 2 || 2 || 1 || 1 ± 1 (--0,+-2)
|-
| Randoms || Proximal || arbitrary negative || 7 || 10 || 0.7 || 0.45
|-
| Randoms || Proximal || alternate negative || 2 || 10 || 0.2 || 0.45
|-
| Reals || Proximal || positive || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary positive || 4 || 10 || 0.4 || 0.5
|-
| Randoms || Proximal || alternate positive || 6 || 10 || 0.6 || 0.5
|-
| Reals || Distal || negative || 16 || 2 || 8 || 8 ± 1 (--9,+-7)
|-
| Randoms || Distal || arbitrary negative || 27 || 10 || 2.7 || 3.0
|-
| Randoms || Distal || alternate negative || 33 || 10 || 3.3 || 3.0
|-
| Reals || Distal || positive || 18 || 2 || 9 || 9 (-+9,++9)
|-
| Randoms || Distal || arbitrary positive || 49 || 10 || 4.9 || 4.35
|-
| Randoms || Distal || alternate positive || 38 || 10 || 3.8 || 4.35
|}
Comparison: | |||
The occurrences of real Rap1s are greater than the randoms. This suggests that the real Rap1s are likely active or activable. | |||
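The Occurrences and Averages columns of the table appear to follow a simple rule: occurrences are the number of hits divided by the number of strands (reals) or datasets (randoms), and the random average is the mean of the arbitrary and alternate occurrences for the same region and direction. This reading is inferred from the table's numbers rather than stated in the source, and the function names below are hypothetical. A minimal Python sketch under that assumption:

```python
def occurrences(numbers, strands):
    """Hits per strand (reals) or per dataset (randoms)."""
    return numbers / strands


def paired_average(arbitrary_numbers, alternate_numbers, datasets=10):
    """Mean of the arbitrary and alternate occurrences for one region.

    Assumes both random groups are normalised by the same dataset count.
    """
    return (occurrences(arbitrary_numbers, datasets)
            + occurrences(alternate_numbers, datasets)) / 2


# Rows taken from the table: UTR, negative direction.
print(occurrences(10, 2))                # reals: 10 hits over 2 strands
print(round(paired_average(14, 20), 2))  # randoms: arbitrary 14, alternate 20
```

Run on the UTR rows, the sketch reproduces the tabulated 5 for the reals and 1.7 for the random average.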
==Acknowledgements== | ==Acknowledgements== |
Latest revision as of 16:48, 7 September 2023
Associate Editor(s)-in-Chief: Henry A. Hoff
"Rap1 is another [General regulatory factor] GRF that organizes chromatin, binds promoters of genes that encode ribosomal and glycolytic proteins, and binds telomeres (Shore 1994; Ganapathi et al. 2011; Hughes and de Boer 2013). [...] DNA shape analysis revealed that Rap1 motifs possess an intrinsically wide minor groove spanning the central degenerate region of the motif that was wider at binding-competent sites [...]. A clear trend was observed between increased width of the minor groove in the central degenerate region of the motif and increased Rap1 binding in vitro."[1]
Human genes
Consensus sequences
Consensus sequences: C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T).[1]
Reduced consensus sequence including more frequent nucleotides: (A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C).[1]
"When the core DNA sequence of the Rap1 motif [...] was held constant (ACCCRnRCA), less than half of the sites were detectably bound [...]."[1]
Rap1 samplings
Copying an apparent consensus sequence for Rap1 (CCCACCAACAAAA) and putting it in "⌘F" finds none between ZSCAN22 and A1BG and none between ZNF497 and A1BG, as can also be found by the computer programs.
The Basic programs testing consensus sequence ACCC(A/G)N(A/G)CA (starting with SuccessablesRAP.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, what each is looking for, and what each found are:
- negative strand, negative direction, looking for ACCC(A/G)N(A/G)CA, 0.
- positive strand, negative direction, looking for ACCC(A/G)N(A/G)CA, 0.
- positive strand, positive direction, looking for ACCC(A/G)N(A/G)CA, 0.
- negative strand, positive direction, looking for ACCC(A/G)N(A/G)CA, 0.
- inverse complement, negative strand, negative direction, looking for TG(C/T)N(C/T)GGGT, 0.
- inverse complement, positive strand, negative direction, looking for TG(C/T)N(C/T)GGGT, 0.
- inverse complement, positive strand, positive direction, looking for TG(C/T)N(C/T)GGGT, 0.
- inverse complement, negative strand, positive direction, looking for TG(C/T)N(C/T)GGGT, 0.
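The kind of search listed above can be sketched in Python. This is only an illustration, not the SuccessablesRAP.bas programs: the names `consensus_to_regex` and `find_sites`, the 1-based start-position convention, and the toy sequence are all assumptions made for the example.

```python
import re

# One token is either a single base (or N) or a parenthesised set such as (A/G).
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])*)\)|([ACGTN])")


def consensus_to_regex(consensus):
    """Translate notation like ACCC(A/G)N(A/G)CA into a regular expression."""
    parts = []
    pos = 0
    while pos < len(consensus):
        match = TOKEN.match(consensus, pos)
        if match is None:
            raise ValueError(f"cannot parse consensus at offset {pos}")
        if match.group(1):                  # (A/G) becomes [AG]
            parts.append("[" + match.group(1).replace("/", "") + "]")
        elif match.group(2) == "N":         # N stands for any base
            parts.append("[ACGT]")
        else:                               # fixed base
            parts.append(match.group(2))
        pos = match.end()
    return "".join(parts)


def find_sites(sequence, consensus):
    """Return (start, match) pairs; start is 1-based (an assumed convention)."""
    # The lookahead lets overlapping matches be reported as separate sites.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


toy = "TTACCCAGGCATT"  # made-up sequence, not taken from the A1BG locus
print(find_sites(toy, "ACCC(A/G)N(A/G)CA"))
```

The lookahead is a design choice: a plain `finditer` would skip a second site that overlaps the first, and overlapping sites do occur in the full consensus results.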
Random dataset samplings
- Rapr0: 2, ACCCAGGCA at 3737, ACCCGGGCA at 1902.
- Rapr1: 0.
- Rapr2: 0.
- Rapr3: 0.
- Rapr4: 0.
- Rapr5: 1, ACCCAAGCA at 4131.
- Rapr6: 2, ACCCGAACA at 1605, ACCCGGGCA at 1452.
- Rapr7: 0.
- Rapr8: 0.
- Rapr9: 0.
- Rapr0ci: 0.
- Rapr1ci: 0.
- Rapr2ci: 0.
- Rapr3ci: 0.
- Rapr4ci: 1, TGCCTGGGT at 1378.
- Rapr5ci: 0.
- Rapr6ci: 1, TGCATGGGT at 3553.
- Rapr7ci: 0.
- Rapr8ci: 1, TGTATGGGT at 572.
- Rapr9ci: 1, TGCGTGGGT at 431.
Rapr UTRs
- Rapr0: ACCCAGGCA at 3737.
- Rapr6ci: TGCATGGGT at 3553.
Rapr proximal promoters
- Rapr5: ACCCAAGCA at 4131.
Rapr distal promoters
- Rapr0: ACCCGGGCA at 1902.
- Rapr6: ACCCGAACA at 1605, ACCCGGGCA at 1452.
- Rapr4ci: TGCCTGGGT at 1378.
- Rapr8ci: TGTATGGGT at 572.
- Rapr9ci: TGCGTGGGT at 431.
Rap1 prevalent samplings
The Basic programs testing consensus sequence (A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C) (starting with SuccessablesRAPP.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, what each is looking for, and what each found are:
- negative strand, negative direction, looking for (A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C), 0.
- positive strand, negative direction, looking for (A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C), 0.
- positive strand, positive direction, looking for (A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C), 1, GAACCCACACCTC at 1807.
- negative strand, positive direction, looking for (A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C), 0.
- complement, negative strand, negative direction, looking for (C/T)(G/T)TGGG(C/T)N(C/T)G(G/T)(A/G)(G/T), 0.
- complement, positive strand, negative direction, looking for (C/T)(G/T)TGGG(C/T)N(C/T)G(G/T)(A/G)(G/T), 0.
- complement, positive strand, positive direction, looking for (C/T)(G/T)TGGG(C/T)N(C/T)G(G/T)(A/G)(G/T), 0.
- complement, negative strand, positive direction, looking for (C/T)(G/T)TGGG(C/T)N(C/T)G(G/T)(A/G)(G/T), 1, CTTGGGTGTGGAG at 1807.
- inverse complement, negative strand, negative direction, looking for (G/T)(A/G)(G/T)G(C/T)N(C/T)GGGT(G/T)(C/T), 0.
- inverse complement, positive strand, negative direction, looking for (G/T)(A/G)(G/T)G(C/T)N(C/T)GGGT(G/T)(C/T), 0.
- inverse complement, positive strand, positive direction, looking for (G/T)(A/G)(G/T)G(C/T)N(C/T)GGGT(G/T)(C/T), 0.
- inverse complement, negative strand, positive direction, looking for (G/T)(A/G)(G/T)G(C/T)N(C/T)GGGT(G/T)(C/T), 0.
- inverse negative strand, negative direction, looking for (A/C)(C/T)(A/C)C(A/G)N(A/G)CCCA(A/C)(A/G), 0.
- inverse positive strand, negative direction, looking for (A/C)(C/T)(A/C)C(A/G)N(A/G)CCCA(A/C)(A/G), 0.
- inverse positive strand, positive direction, looking for (A/C)(C/T)(A/C)C(A/G)N(A/G)CCCA(A/C)(A/G), 0.
- inverse negative strand, positive direction, looking for (A/C)(C/T)(A/C)C(A/G)N(A/G)CCCA(A/C)(A/G), 0.
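The four orientations searched above (consensus, complement, inverse complement, inverse) can be derived mechanically from the consensus notation. The following Python sketch shows one way to do it; the helper names are hypothetical and the code is not taken from the Basic programs.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def tokenize(consensus):
    """Split a consensus into per-position tokens such as 'A', 'N' or '(A/G)'."""
    return re.findall(r"\([ACGT/]+\)|[ACGTN]", consensus)


def complement_token(token):
    """Complement every base allowed at one position, listed alphabetically."""
    bases = sorted(COMPLEMENT[b] for b in token.strip("()").split("/"))
    return bases[0] if len(bases) == 1 else "(" + "/".join(bases) + ")"


def complement(consensus):
    return "".join(complement_token(t) for t in tokenize(consensus))


def inverse(consensus):
    """Reverse the order of positions without complementing them."""
    return "".join(reversed(tokenize(consensus)))


def inverse_complement(consensus):
    return inverse(complement(consensus))


reduced = "(A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C)"
for label, func in [("complement", complement),
                    ("inverse complement", inverse_complement),
                    ("inverse", inverse)]:
    print(label, func(reduced))
```

For the reduced consensus, the three printed patterns agree with the "looking for" patterns in the list above.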
RAPP distal promoters
Positive strand, positive direction: GAACCCACACCTC at 1807.
Random dataset samplings
- Rappr0: 0.
- Rappr1: 0.
- Rappr2: 0.
- Rappr3: 0.
- Rappr4: 0.
- Rappr5: 0.
- Rappr6: 1, GCACCCGGGCATC at 1454.
- Rappr7: 0.
- Rappr8: 0.
- Rappr9: 0.
- Rappr0ci: 0.
- Rappr1ci: 0.
- Rappr2ci: 0.
- Rappr3ci: 0.
- Rappr4ci: 1, TATGCCTGGGTTT at 1380.
- Rappr5ci: 0.
- Rappr6ci: 0.
- Rappr7ci: 0.
- Rappr8ci: 0.
- Rappr9ci: 0.
Rappr distal promoters
- Rappr6: 1, GCACCCGGGCATC at 1454.
- Rappr4ci: 1, TATGCCTGGGTTT at 1380.
Rap1 full consensus samplings
Copying a responsive elements consensus sequence C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T) and putting the sequence in "⌘F" finds none between ZNF497 and A1BG and one between ZSCAN22 and A1BG, as can also be found by the computer programs. Counting every alternative at each position (ten positions with three choices and two with two choices), the total number of possible sequences is 236,196.
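The number of distinct sequences a degenerate consensus admits is the product of the number of alternatives at each position. A short Python sketch of that count follows; the function name is hypothetical, and N is assumed to stand for any of the four bases.

```python
import re
from math import prod


def count_sequences(consensus):
    """Multiply the number of allowed bases at every position."""
    tokens = re.findall(r"\([ACGT/]+\)|[ACGTN]", consensus)
    return prod(4 if t == "N" else len(t.strip("()").split("/"))
                for t in tokens)


full = ("C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)"
        "(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T)")
print(count_sequences(full))
```

For the 14-position full consensus, the product is 3^10 × 2^2 = 236,196.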
The Basic programs testing consensus sequence C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T) (starting with SuccessablesRap1.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, what each is looking for, and what each found are:
- negative strand, negative direction, looking for C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T), 14, CCGGTCCGTACCAC at 4109, CCGGTCCACGACAC at 3958, CGAGTCCTCAACCT at 3117, CCCACCTAGTGAAC at 3102, CCGACCCGCACCAC at 3049, CCAGTCCTCAACTT at 2594, CCGGTCCGTGCCAC at 2526, CCAGTCCTCAAACT at 2257, CGAGTCCTCAAACT at 2141, CCCGTCCTCTACCT at 1830, CCGACCCGCGCCAC at 1764, CCGGTCCGTGCCAC at 655, CCAGTCCTCTAACT at 585, CCGGCCCACGCCAC at 382.
- positive strand, negative direction, looking for C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T), 11, CCAGCCTGGGCAAC at 4042, CCAGCCATTTCCAC at 3691, CCAGCCTGGGCAAC at 3303, CACGCCATTGCACT at 3289, CCAGCCTGGGCAAC at 2775, CCCAGCCTGGGCAA at 2774, CCAGCCTGGGCAAC at 2440, CCAGTCTGGGCAAT at 1361, CACGCCTGTAGATC at 972, CCAGCCTGGGCAAC at 904, CAAGCCTGGGCAAC at 464.
- positive strand, positive direction, looking for C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T), 4, CAGACCCAGGGACC at 4424, CCCATCCGGACCCT at 3760, CGAGTCCATTGACT at 3735, CACATCCAGACAAA at 2262.
- negative strand, positive direction, looking for C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T), 5, CACACCCTGAGCTC at 2972, CAAGTCAGGAAATA at 2625, CGAGCCTGCAGACC at 440, CGGGGCTGGGCCCA at 422, CAGGTCCTCTGCAA at 225.
- inverse complement, negative strand, negative direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 1, AGTTCGTTGACGGG at 3853.
- inverse complement, positive strand, negative direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 2, AATGACCGGGTGCG at 2196, GTTGCCCAGGCTGG at 1464.
- inverse complement, positive strand, positive direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 6, AGTCCATTGACTCG at 3737, TGGTTCATGGTGTG at 2601, GTGGCGTGGACCGG at 2571, GTTTTGAGGACCCG at 2503, TGGTCGCGGACGTG at 1471, TGGTCGCGGACGTG at 1371.
- inverse complement, negative strand, positive direction, looking for (A/G/T)(A/G/T)(G/T)(C/G/T)(A/C/T)(A/C/G)(A/C/T)(A/G/T)G(A/C/G)(C/T)(C/G/T)(C/G/T)G, 5, TTTCTCTTGCTGTG at 4393, AGGTACTGGCTCTG at 3122, GTGCCCCAGGTCTG at 3020, GTTGCCTAGGCGGG at 2486, AGGTCACAGGCGCG at 161.
Rap1 (4560-2846) UTRs
- Negative strand, negative direction: CCGGTCCGTACCAC at 4109, CCGGTCCACGACAC at 3958, AGTTCGTTGACGGG at 3853, CGAGTCCTCAACCT at 3117, CCCACCTAGTGAAC at 3102, CCGACCCGCACCAC at 3049.
- Positive strand, negative direction: CCAGCCTGGGCAAC at 4042, CCAGCCATTTCCAC at 3691, CCAGCCTGGGCAAC at 3303, CACGCCATTGCACT at 3289.
Rap1 positive direction (4445-4265) core promoters
- Negative strand, positive direction: TTTCTCTTGCTGTG at 4393.
- Positive strand, positive direction: CAGACCCAGGGACC at 4424.
Rap1 negative direction (2811-2596) proximal promoters
- Positive strand, negative direction: CCAGCCTGGGCAAC at 2775, CCCAGCCTGGGCAA at 2774.
Rap1 negative direction (2596-1) distal promoters
- Negative strand, negative direction: CCAGTCCTCAACTT at 2594, CCGGTCCGTGCCAC at 2526, CCAGTCCTCAAACT at 2257, CGAGTCCTCAAACT at 2141, CCCGTCCTCTACCT at 1830, CCGACCCGCGCCAC at 1764, CCGGTCCGTGCCAC at 655, CCAGTCCTCTAACT at 585, CCGGCCCACGCCAC at 382.
- Positive strand, negative direction: CCAGCCTGGGCAAC at 2440, AATGACCGGGTGCG at 2196, GTTGCCCAGGCTGG at 1464, CCAGTCTGGGCAAT at 1361, CACGCCTGTAGATC at 972, CCAGCCTGGGCAAC at 904, CAAGCCTGGGCAAC at 464.
Rap1 positive direction (4050-1) distal promoters
- Negative strand, positive direction: AGGTACTGGCTCTG at 3122, GTGCCCCAGGTCTG at 3020, CACACCCTGAGCTC at 2972, CAAGTCAGGAAATA at 2625, GTTGCCTAGGCGGG at 2486, CGAGCCTGCAGACC at 440, CGGGGCTGGGCCCA at 422, CAGGTCCTCTGCAA at 225, AGGTCACAGGCGCG at 161.
- Positive strand, positive direction: CCCATCCGGACCCT at 3760, AGTCCATTGACTCG at 3737, CGAGTCCATTGACT at 3735, TGGTTCATGGTGTG at 2601, GTGGCGTGGACCGG at 2571, GTTTTGAGGACCCG at 2503, CACATCCAGACAAA at 2262, TGGTCGCGGACGTG at 1471, TGGTCGCGGACGTG at 1371.
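The grouping of hits into UTR, core, proximal, and distal regions can be sketched as a lookup on position. The ranges below are the ones named in this article's section headings. Treating both ends of each range as inclusive is an assumption, inferred from boundary hits such as 2846 being listed under both the UTR and the negative-direction core promoter. The names `REGIONS` and `regions_for` are hypothetical.

```python
# (name, low, high) per direction, copied from the section headings.
REGIONS = {
    "negative": [("UTR", 2846, 4560), ("core", 2811, 2846),
                 ("proximal", 2596, 2811), ("distal", 1, 2596)],
    "positive": [("core", 4265, 4445), ("proximal", 4050, 4265),
                 ("distal", 1, 4050)],
}


def regions_for(position, direction):
    """Return every region whose inclusive range contains the position."""
    return [name for name, low, high in REGIONS[direction]
            if low <= position <= high]


print(regions_for(2775, "negative"))  # hit listed under proximal promoters
print(regions_for(4393, "positive"))  # hit listed under core promoters
print(regions_for(2846, "negative"))  # boundary position, two regions
```

Returning a list rather than a single label keeps boundary positions visible instead of silently assigning them to one region.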
Rap1 random dataset samplings
- Rap1r0: 8, CAAGGCCGTTCAAA at 4165, CAAGTCAAGACCTT at 4050, CCAGGCTGTGGCTC at 2846, CACGTCCTCACCAT at 2601, CAAACCAGTGAACT at 1331, CCAAGCTGGAACTA at 1303, CCAGGCTGGGCATC at 1255, CCGGGCTGGGAAAC at 831.
- Rap1r1: 9, CGAGCCAAGGACCC at 4467, CCGATCCTTGGCCC at 4087, CACGTCCTTAACTA at 3437, CGGGTCAATGCCTT at 2784, CCGACCCAGTAAAT at 2362, CCGACCATCGCAAA at 1774, CGAAGCATCAAAAA at 1440, CGGGTCTACGAACA at 611, CCAACCTTTACACT at 528.
- Rap1r2: 1, CGAAGCCTTACACA at 509.
- Rap1r3: 3, CAAATCCTCAGATA at 2773, CCGGTCCAGGGCCA at 1592, CCAGCCCTCTGAAC at 1434.
- Rap1r4: 5, CGAACCAAGTCCTT at 3972, CCGACCTATAAACT at 3687, CGGGCCTAGGAACC at 3170, CGAGCCCATTGCCC at 1846, CCAATCTTGTCCAC at 1770.
- Rap1r5: 8, CCAATCCTGAAAAA at 4165, CCCACCAGGGCACC at 3507, CGAATCCGGGCAAT at 3457, CAGGCCTTTAAAAT at 2372, CACACCTTCGCCTC at 1650, CCCGCCAGCTCCAC at 800, CGAAGCCTTGCCCT at 481, CCCGCCCTTGGCCC at 357.
- Rap1r6: 8, CGCGGCCACAACAA at 4395, CAAATCCGGGAACC at 3344, CACACCTTGTCCCT at 3171, CCCAGCTGTTGAAT at 2887, CAGAGCATGTGAAT at 2333, CCCACCCAGGGCAC at 966, CCCACCAAGGACAT at 792, CGGGCCAATGGAAA at 745.
- Rap1r7: 3, CGCAGCTAGTGCAT at 4289, CAGACCATCTAAAA at 1275, CGGACCCGGGACAA at 61.
- Rap1r8: 2, CCAGCCAGTTCACT at 2796, CGCGTCCACTGAAC at 1582.
- Rap1r9: 5, CCGACCTGGGAATT at 4201, CACGCCTATGAACA at 3747, CGGGGCTATGGACT at 2865, CCAAGCCTGAGATT at 2327, CAAGGCTATTCACT at 648.
- Rap1r0ci: 2, TGTTTGAAGGTCGG at 4188, AATGCCATGGTTCG at 781.
- Rap1r1ci: 2, TATTCATGGCCTTG at 2005, AGGCCACTGGTTTG at 923.
- Rap1r2ci: 3, GAGCACAAGGCCGG at 4095, TTGGCATAGCTTGG at 622, GGGGCACGGCTTTG at 551.
- Rap1r3ci: 3, GAGGTAAGGGTGTG at 2010, AGGGCACTGGTCGG at 1888, TGGTTCTGGATGCG at 1690.
- Rap1r4ci: 5, GGGCAACGGACCGG at 4237, GTGCCGAGGATGTG at 4072, GAGCAAATGGTCCG at 1814, GTTTAATTGGCGGG at 712, TATCACTTGATTGG at 651.
- Rap1r5ci: 8, AATTTGTGGCTCGG at 3886, TGTTAATTGCTCGG at 3572, ATTGCCTTGCCGCG at 3324, AGTTAGTTGGTTTG at 3270, GAGCAGTGGGCTGG at 3164, AATTCGTTGCTTGG at 2998, GTTTTACGGACGGG at 2083, ATGTAACAGCCCCG at 985.
- Rap1r6ci: 4, TTTTAACGGGTCCG at 2787, GGGGCCTAGACCCG at 1601, GGTTCGTGGCCGCG at 1280, TATGAATGGCTTTG at 255.
- Rap1r7ci: 8, ATGTTGCAGATTGG at 4180, GGGGACAGGGCTTG at 3723, ATGGCGCTGGTGGG at 3398, GGTCTGTTGCCGTG at 3131, AGGTAAAGGGTTCG at 2380, GAGTCCAGGCCGTG at 2231, GAGCCCCGGGTTTG at 1847, GTGTTCCAGACGGG at 129.
- Rap1r8ci: 6, TTGTAAATGCTTCG at 2508, AGGTAGCGGCCTTG at 2032, TAGCCAATGACTCG at 1135, GGTGAGATGGTGTG at 536, AATCTCCAGGTGCG at 294, TGGTTATGGGTGGG at 99.
- Rap1r9ci: 6, ATTTCCCAGGCGGG at 2135, GGTTACCGGACTGG at 1712, TTTTAACTGCCTTG at 846, ATTTACTGGCCCGG at 632, TATTTACTGGCCCG at 631, ATTTTATAGCCGGG at 168.
Rap1r arbitrary (evens) (4560-2846) UTRs
- Rap1r0: CAAGGCCGTTCAAA at 4165, CAAGTCAAGACCTT at 4050, CCAGGCTGTGGCTC at 2846.
- Rap1r4: CGAACCAAGTCCTT at 3972, CCGACCTATAAACT at 3687, CGGGCCTAGGAACC at 3170.
- Rap1r6: CGCGGCCACAACAA at 4395, CAAATCCGGGAACC at 3344, CACACCTTGTCCCT at 3171, CCCAGCTGTTGAAT at 2887.
- Rap1r0ci: TGTTTGAAGGTCGG at 4188.
- Rap1r2ci: GAGCACAAGGCCGG at 4095.
- Rap1r4ci: GGGCAACGGACCGG at 4237, GTGCCGAGGATGTG at 4072.
Rap1r alternate (odds) (4560-2846) UTRs
- Rap1r1: CGAGCCAAGGACCC at 4467, CCGATCCTTGGCCC at 4087, CACGTCCTTAACTA at 3437.
- Rap1r5: CCAATCCTGAAAAA at 4165, CCCACCAGGGCACC at 3507, CGAATCCGGGCAAT at 3457.
- Rap1r7: CGCAGCTAGTGCAT at 4289.
- Rap1r9: CCGACCTGGGAATT at 4201, CACGCCTATGAACA at 3747, CGGGGCTATGGACT at 2865.
- Rap1r5ci: AATTTGTGGCTCGG at 3886, TGTTAATTGCTCGG at 3572, ATTGCCTTGCCGCG at 3324, AGTTAGTTGGTTTG at 3270, GAGCAGTGGGCTGG at 3164, AATTCGTTGCTTGG at 2998.
- Rap1r7ci: ATGTTGCAGATTGG at 4180, GGGGACAGGGCTTG at 3723, ATGGCGCTGGTGGG at 3398, GGTCTGTTGCCGTG at 3131.
Rap1r arbitrary negative direction (evens) (2846-2811) core promoters
- Rap1r0: CCAGGCTGTGGCTC at 2846.
Rap1r arbitrary positive direction (odds) (4445-4265) core promoters
- Rap1r7: CGCAGCTAGTGCAT at 4289.
Rap1r alternate positive direction (evens) (4445-4265) core promoters
- Rap1r6: CGCGGCCACAACAA at 4395.
Rap1r arbitrary negative direction (evens) (2811-2596) proximal promoters
- Rap1r0: CACGTCCTCACCAT at 2601.
- Rap1r6: CAGAGCATGTGAAT at 2333, CCCACCCAGGGCAC at 966, CCCACCAAGGACAT at 792, CGGGCCAATGGAAA at 745.
- Rap1r8: CCAGCCAGTTCACT at 2796.
- Rap1r6ci: TTTTAACGGGTCCG at 2787.
Rap1r alternate negative direction (odds) (2811-2596) proximal promoters
- Rap1r1: CGGGTCAATGCCTT at 2784.
- Rap1r3: CAAATCCTCAGATA at 2773.
Rap1r arbitrary positive direction (odds) (4265-4050) proximal promoters
- Rap1r1: CCGATCCTTGGCCC at 4087.
- Rap1r5: CCAATCCTGAAAAA at 4165.
- Rap1r9: CCGACCTGGGAATT at 4201.
- Rap1r7ci: ATGTTGCAGATTGG at 4180.
Rap1r alternate positive direction (evens) (4265-4050) proximal promoters
- Rap1r0: CAAGGCCGTTCAAA at 4165, CAAGTCAAGACCTT at 4050.
- Rap1r0ci: TGTTTGAAGGTCGG at 4188.
- Rap1r2ci: GAGCACAAGGCCGG at 4095.
- Rap1r4ci: GGGCAACGGACCGG at 4237, GTGCCGAGGATGTG at 4072.
Rap1r arbitrary negative direction (evens) (2596-1) distal promoters
- Rap1r0: CAAACCAGTGAACT at 1331, CCAAGCTGGAACTA at 1303, CCAGGCTGGGCATC at 1255, CCGGGCTGGGAAAC at 831.
- Rap1r2: CGAAGCCTTACACA at 509.
- Rap1r4: CGAGCCCATTGCCC at 1846, CCAATCTTGTCCAC at 1770.
- Rap1r6: CAGAGCATGTGAAT at 2333, CCCACCCAGGGCAC at 966, CCCACCAAGGACAT at 792, CGGGCCAATGGAAA at 745.
- Rap1r8: CGCGTCCACTGAAC at 1582.
- Rap1r0ci: AATGCCATGGTTCG at 781.
- Rap1r2ci: TTGGCATAGCTTGG at 622, GGGGCACGGCTTTG at 551.
- Rap1r4ci: GAGCAAATGGTCCG at 1814, GTTTAATTGGCGGG at 712, TATCACTTGATTGG at 651.
- Rap1r6ci: GGGGCCTAGACCCG at 1601, GGTTCGTGGCCGCG at 1280, TATGAATGGCTTTG at 255.
- Rap1r8ci: TTGTAAATGCTTCG at 2508, AGGTAGCGGCCTTG at 2032, TAGCCAATGACTCG at 1135, GGTGAGATGGTGTG at 536, AATCTCCAGGTGCG at 294, TGGTTATGGGTGGG at 99.
Rap1r alternate negative direction (odds) (2596-1) distal promoters
- Rap1r1: CCGACCCAGTAAAT at 2362, CCGACCATCGCAAA at 1774, CGAAGCATCAAAAA at 1440, CGGGTCTACGAACA at 611, CCAACCTTTACACT at 528.
- Rap1r3: CCGGTCCAGGGCCA at 1592, CCAGCCCTCTGAAC at 1434.
- Rap1r5: CAGGCCTTTAAAAT at 2372, CACACCTTCGCCTC at 1650, CCCGCCAGCTCCAC at 800, CGAAGCCTTGCCCT at 481, CCCGCCCTTGGCCC at 357.
- Rap1r7: CAGACCATCTAAAA at 1275, CGGACCCGGGACAA at 61.
- Rap1r9: CCAAGCCTGAGATT at 2327, CAAGGCTATTCACT at 648.
- Rap1r1ci: TATTCATGGCCTTG at 2005, AGGCCACTGGTTTG at 923.
- Rap1r3ci: GAGGTAAGGGTGTG at 2010, AGGGCACTGGTCGG at 1888, TGGTTCTGGATGCG at 1690.
- Rap1r5ci: GTTTTACGGACGGG at 2083, ATGTAACAGCCCCG at 985.
- Rap1r7ci: AGGTAAAGGGTTCG at 2380, GAGTCCAGGCCGTG at 2231, GAGCCCCGGGTTTG at 1847, GTGTTCCAGACGGG at 129.
- Rap1r9ci: ATTTCCCAGGCGGG at 2135, GGTTACCGGACTGG at 1712, TTTTAACTGCCTTG at 846, ATTTACTGGCCCGG at 632, TATTTACTGGCCCG at 631, ATTTTATAGCCGGG at 168.
Rap1r arbitrary positive direction (odds) (4050-1) distal promoters
- Rap1r1: CACGTCCTTAACTA at 3437, CGGGTCAATGCCTT at 2784, CCGACCCAGTAAAT at 2362, CCGACCATCGCAAA at 1774, CGAAGCATCAAAAA at 1440, CGGGTCTACGAACA at 611, CCAACCTTTACACT at 528.
- Rap1r3: CAAATCCTCAGATA at 2773, CCGGTCCAGGGCCA at 1592, CCAGCCCTCTGAAC at 1434.
- Rap1r5: CCCACCAGGGCACC at 3507, CGAATCCGGGCAAT at 3457, CAGGCCTTTAAAAT at 2372, CACACCTTCGCCTC at 1650, CCCGCCAGCTCCAC at 800, CGAAGCCTTGCCCT at 481, CCCGCCCTTGGCCC at 357.
- Rap1r7: CAGACCATCTAAAA at 1275, CGGACCCGGGACAA at 61.
- Rap1r9: CACGCCTATGAACA at 3747, CGGGGCTATGGACT at 2865, CCAAGCCTGAGATT at 2327, CAAGGCTATTCACT at 648.
- Rap1r1ci: TATTCATGGCCTTG at 2005, AGGCCACTGGTTTG at 923.
- Rap1r3ci: GAGGTAAGGGTGTG at 2010, AGGGCACTGGTCGG at 1888, TGGTTCTGGATGCG at 1690.
- Rap1r5ci: AATTTGTGGCTCGG at 3886, TGTTAATTGCTCGG at 3572, ATTGCCTTGCCGCG at 3324, AGTTAGTTGGTTTG at 3270, GAGCAGTGGGCTGG at 3164, AATTCGTTGCTTGG at 2998, GTTTTACGGACGGG at 2083, ATGTAACAGCCCCG at 985.
- Rap1r7ci: GGGGACAGGGCTTG at 3723, ATGGCGCTGGTGGG at 3398, GGTCTGTTGCCGTG at 3131, AGGTAAAGGGTTCG at 2380, GAGTCCAGGCCGTG at 2231, GAGCCCCGGGTTTG at 1847, GTGTTCCAGACGGG at 129.
- Rap1r9ci: ATTTCCCAGGCGGG at 2135, GGTTACCGGACTGG at 1712, TTTTAACTGCCTTG at 846, ATTTACTGGCCCGG at 632, TATTTACTGGCCCG at 631, ATTTTATAGCCGGG at 168.
Rap1r alternate positive direction (evens) (4050-1) distal promoters
- Rap1r0: CAAGTCAAGACCTT at 4050, CCAGGCTGTGGCTC at 2846, CACGTCCTCACCAT at 2601, CAAACCAGTGAACT at 1331, CCAAGCTGGAACTA at 1303, CCAGGCTGGGCATC at 1255, CCGGGCTGGGAAAC at 831.
- Rap1r2: CGAAGCCTTACACA at 509.
- Rap1r4: CGAACCAAGTCCTT at 3972, CCGACCTATAAACT at 3687, CGGGCCTAGGAACC at 3170, CGAGCCCATTGCCC at 1846, CCAATCTTGTCCAC at 1770.
- Rap1r6: CAAATCCGGGAACC at 3344, CACACCTTGTCCCT at 3171, CCCAGCTGTTGAAT at 2887, CAGAGCATGTGAAT at 2333, CCCACCCAGGGCAC at 966, CCCACCAAGGACAT at 792, CGGGCCAATGGAAA at 745.
- Rap1r8: CCAGCCAGTTCACT at 2796, CGCGTCCACTGAAC at 1582.
- Rap1r0ci: AATGCCATGGTTCG at 781.
- Rap1r2ci: TTGGCATAGCTTGG at 622, GGGGCACGGCTTTG at 551.
- Rap1r4ci: GAGCAAATGGTCCG at 1814, GTTTAATTGGCGGG at 712, TATCACTTGATTGG at 651.
- Rap1r6ci: TTTTAACGGGTCCG at 2787, GGGGCCTAGACCCG at 1601, GGTTCGTGGCCGCG at 1280, TATGAATGGCTTTG at 255.
- Rap1r8ci: TTGTAAATGCTTCG at 2508, AGGTAGCGGCCTTG at 2032, TAGCCAATGACTCG at 1135, GGTGAGATGGTGTG at 536, AATCTCCAGGTGCG at 294, TGGTTATGGGTGGG at 99.
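The coordinate windows in the list headings above can be applied mechanically to assign a hit position to a region. The following Python sketch is only an illustration: the `REGIONS` table and the `regions_for` helper are hypothetical names, the windows are copied from the headings, the UTR window is assumed to belong to the negative direction (as in the results table), and both endpoints are assumed to be inclusive because boundary hits such as 2846 and 4050 appear in two lists.

```python
# Coordinate windows copied from the list headings (low, high), assumed inclusive.
REGIONS = {
    "negative": {
        "UTR": (2846, 4560),
        "core": (2811, 2846),
        "proximal": (2596, 2811),
        "distal": (1, 2596),
    },
    "positive": {
        "core": (4265, 4445),
        "proximal": (4050, 4265),
        "distal": (1, 4050),
    },
}


def regions_for(position, direction):
    """Return every region whose window contains the hit position."""
    return [
        name
        for name, (low, high) in REGIONS[direction].items()
        if low <= position <= high
    ]


if __name__ == "__main__":
    # Boundary hits fall into two windows, matching the duplicated list entries.
    print(regions_for(2846, "negative"))
    print(regions_for(4050, "positive"))
```

Treating the endpoints as inclusive means a hit on a boundary is counted in both neighbouring regions, which is how the lists above report positions 2846 and 4050.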
Discussion
When the Rap1 motif was held constant at ACCCRNRCA[1], no results occurred in the real sequences. However, testing ACCCRNRCA and its inverse complement against the ten random datasets yielded five consensus sequence results and four inverse complements. Two were in the UTR of A1BG from the negative direction, one was in the proximal promoter from the positive direction, and five were in the distal promoters.
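The ACCCRNRCA test can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for this page: R is expanded to A or G and N to any nucleotide (the standard IUPAC meanings), and the names `IUPAC`, `iupac_to_regex`, `inverse_complement` and `classify` are hypothetical. The two example sequences are the random-dataset results ACCCAGGCA and TGCATGGGT reported earlier on this page.

```python
import re

# Only the IUPAC codes that occur in ACCCRNRCA are listed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def iupac_to_regex(motif):
    """Expand an IUPAC motif into a compiled regular expression."""
    return re.compile("".join(IUPAC[code] for code in motif))


def inverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


MOTIF = iupac_to_regex("ACCCRNRCA")


def classify(seq):
    """Report whether seq fits the motif directly or as its inverse complement."""
    if MOTIF.fullmatch(seq):
        return "consensus"
    if MOTIF.fullmatch(inverse_complement(seq)):
        return "inverse complement"
    return None


if __name__ == "__main__":
    for candidate in ("ACCCAGGCA", "TGCATGGGT"):
        print(candidate, classify(candidate))
```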
The reduced consensus (A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C)[1] had one real result, GAACCCACACCTC, in the positive direction at 1807, less than halfway from ZNF497 to A1BG. Of the ten random datasets, only one had a result: GCACCCGGGCATC at 1454. For the inverse complement there was also only one: TATGCCTGGGTTT at 1380. In both the real and the random sequences, each result was in the distal promoter, closer to the zinc finger than to A1BG. One random result per ten datasets suggests that such a result rarely arises by chance, so the real occurrence is likely active in a regulatory response.
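As a check on the three sequences quoted above, the bracketed consensus notation can be converted to a regular expression. The sketch below assumes that each "(X/Y)" group means "any one of these nucleotides" and that N means any nucleotide; `bracket_to_regex` and `inverse_complement` are hypothetical helper names.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bracket_to_regex(consensus):
    """Turn '(A/G)' groups into '[AG]' classes and 'N' into '[ACGT]'."""
    pattern = re.sub(
        r"\(([ACGT/]+)\)",
        lambda group: "[" + group.group(1).replace("/", "") + "]",
        consensus,
    )
    return re.compile(pattern.replace("N", "[ACGT]"))


def inverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


REDUCED = bracket_to_regex("(A/G)(A/C)ACCC(A/G)N(A/G)C(A/C)(C/T)(A/C)")

if __name__ == "__main__":
    print(bool(REDUCED.fullmatch("GAACCCACACCTC")))
    print(bool(REDUCED.fullmatch("GCACCCGGGCATC")))
    # The inverse-complement hit fits only after it is turned back around.
    print(bool(REDUCED.fullmatch(inverse_complement("TATGCCTGGGTTT"))))
```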
The full consensus sequence C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T)[1] gave four to six results in the UTR in the negative direction, one in the core promoter in the positive direction, and two in the proximal promoter in the negative direction and one in the positive direction. In the distal promoters, each direction had seven to nine results per strand.
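Because hits can overlap (for example the positive-strand hits listed at 3735 and 3737), a window-by-window scan is a natural way to apply the full consensus. The Python sketch below is an assumption about how such a scan could be written, not the tool used for this page; `FULL_CONSENSUS`, `fits` and `scan` are hypothetical names, offsets are 0-based within the toy string rather than the page's coordinates, and the toy string is simply the two overlapping hits at 3735 and 3737 joined together.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Allowed nucleotides at each of the 14 positions of the full consensus.
FULL_CONSENSUS = ["C", "ACG", "ACG", "AG", "CGT", "C", "ACT",
                  "AGT", "CGT", "AGT", "ACG", "AC", "ACT", "ACT"]


def inverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def fits(window):
    """True when every base is allowed at its position of the consensus."""
    return len(window) == len(FULL_CONSENSUS) and all(
        base in allowed for base, allowed in zip(window, FULL_CONSENSUS)
    )


def scan(sequence):
    """Slide a 14-nucleotide window and test both orientations at each offset."""
    width = len(FULL_CONSENSUS)
    hits = []
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        if fits(window):
            hits.append((offset, window, "consensus"))
        if fits(inverse_complement(window)):
            hits.append((offset, window, "inverse complement"))
    return hits


# Toy input: CGAGTCCATTGACT and AGTCCATTGACTCG overlapped by 12 nucleotides.
TOY = "CGAGTCCATTGACTCG"

if __name__ == "__main__":
    for hit in scan(TOY):
        print(hit)
```

Testing every offset separately keeps overlapping hits, which a non-overlapping regular-expression search would miss.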
For the random datasets, as listed above, the UTR produced zero to six results per dataset, the core promoter zero to one, the proximal promoter zero to four, and the distal promoter one to eight in either direction.
Comparing the two, the real results in the UTR, proximal promoter, and distal promoters usually exceeded the random results. This suggests that, although some of the real results could be due to random associations of nucleotides, the rest are likely real.
Rap1 analysis and results
Reduced consensus sequence including more frequent nucleotides: C(A/C/G)(A/C/G)(A/G)(C/G/T)C(A/C/T)(A/G/T)(C/G/T)(A/G/T)(A/C/G)(A/C)(A/C/T)(A/C/T).[1]
Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 10 | 2 | 5 | 5 ± 1 (--6,+-4) |
Randoms | UTR | arbitrary negative | 14 | 10 | 1.4 | 1.7 |
Randoms | UTR | alternate negative | 20 | 10 | 2.0 | 1.7 |
Reals | Core | negative | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary negative | 1 | 10 | 0.1 | 0.05 |
Randoms | Core | alternate negative | 0 | 10 | 0 | 0.05 |
Reals | Core | positive | 2 | 2 | 1 | 1 ± 0 (-+1,++1) |
Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
Randoms | Core | alternate positive | 1 | 10 | 0.1 | 0.1 |
Reals | Proximal | negative | 2 | 2 | 1 | 1 ± 1 (--0,+-2) |
Randoms | Proximal | arbitrary negative | 7 | 10 | 0.7 | 0.45 |
Randoms | Proximal | alternate negative | 2 | 10 | 0.2 | 0.45 |
Reals | Proximal | positive | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary positive | 4 | 10 | 0.4 | 0.5 |
Randoms | Proximal | alternate positive | 6 | 10 | 0.6 | 0.5 |
Reals | Distal | negative | 16 | 2 | 8 | 8 ± 1 (--9,+-7) |
Randoms | Distal | arbitrary negative | 27 | 10 | 2.7 | 3.0 |
Randoms | Distal | alternate negative | 33 | 10 | 3.3 | 3.0 |
Reals | Distal | positive | 18 | 2 | 9 | 9 (-+9,++9) |
Randoms | Distal | arbitrary positive | 49 | 10 | 4.9 | 4.35 |
Randoms | Distal | alternate positive | 38 | 10 | 3.8 | 4.35 |
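The Occurrences and Averages columns follow from the Numbers and Strands columns by simple division. The Python sketch below reproduces that arithmetic from the table values; the dictionary and function names are hypothetical, and the divisors (2 for reals, 10 for randoms) are taken directly from the Strands column.

```python
REAL_DIVISOR = 2     # "Strands" value in the real rows
RANDOM_DIVISOR = 10  # "Strands" value in the random rows

# Totals from the "Numbers" column, keyed by (promoter, direction).
REAL_TOTALS = {
    ("UTR", "negative"): 10,
    ("Core", "negative"): 0,
    ("Core", "positive"): 2,
    ("Proximal", "negative"): 2,
    ("Proximal", "positive"): 0,
    ("Distal", "negative"): 16,
    ("Distal", "positive"): 18,
}

# (arbitrary, alternate) totals for the random samplings.
RANDOM_TOTALS = {
    ("UTR", "negative"): (14, 20),
    ("Core", "negative"): (1, 0),
    ("Core", "positive"): (1, 1),
    ("Proximal", "negative"): (7, 2),
    ("Proximal", "positive"): (4, 6),
    ("Distal", "negative"): (27, 33),
    ("Distal", "positive"): (49, 38),
}


def real_average(key):
    """Real occurrences per strand."""
    return REAL_TOTALS[key] / REAL_DIVISOR


def random_average(key):
    """Mean of the arbitrary and alternate occurrences."""
    arbitrary, alternate = RANDOM_TOTALS[key]
    return (arbitrary + alternate) / (2 * RANDOM_DIVISOR)


if __name__ == "__main__":
    for key in REAL_TOTALS:
        print(key, real_average(key), random_average(key))
```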
Comparison:
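The occurrences of real Rap1s exceed the random averages in the UTR, in the core promoter in the positive direction, in the proximal promoter in the negative direction, and in the distal promoters in both directions. No real Rap1s occurred in the core promoter in the negative direction or in the proximal promoter in the positive direction. This suggests that the real Rap1s are likely active or activatable.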
Acknowledgements
The content on this page was first contributed by: Henry A. Hoff.