Auxin response factor gene transcriptions
Associate Editor(s)-in-Chief: Henry A. Hoff
The "genome binding of two [auxin response factors] ARFs (ARF2 and ARF5/Monopteros [MP]) differ largely because these two factors have different preferred ARF binding site (ARFbs) arrangements (orientation and spacing)."[1] "ARFbs were originally defined as TGTCTC (Ulmasov et al., 1995, Guilfoyle et al., 1998), [...]. More recently, protein binding microarray (PBM) experiments suggested that TGTCGG are preferred ARFbs, [...] (Boer et al., 2014, Franco-Zorrilla et al., 2014, Liao et al., 2015)."[1]
Human genes
Gene ID: 84514 is GHDC GH3 domain containing on 17q21.2.
- NP_001136095.1 GH3 domain-containing protein isoform 3 precursor: "Transcript Variant: This variant (3) uses an alternate splice site, resulting in a frameshifted and alternate 3' coding region, compared to variant 1. The encoded isoform (3) has a shorter and distinct C-terminus, compared to isoform 1."[2] Conserved Domains (2) summary: cl21606, Location:249 → 410, PLN02247; indole-3-acetic acid-amido synthetase, pfam03321, Location:54 → 420, GH3; GH3 auxin-responsive promoter.[2]
- NP_115873.1 GH3 domain-containing protein isoform 1 precursor: "Transcript Variant: This variant (1) encodes the longer isoform (1)."[2] Conserved Domains (2) summary: cl21606, Location:89 → 530, PLN02247; indole-3-acetic acid-amido synthetase, pfam03321, Location:54 → 490, GH3; GH3 auxin-responsive promoter.[2]
Gene ID: 153201 is SLC36A2 solute carrier family 36 member 2 on 5q33.1: "This gene encodes a pH-dependent proton-coupled amino acid transporter that belongs to the amino acid auxin permease 1 protein family. The encoded protein primarily transports small amino acids such as glycine, alanine and proline. Mutations in this gene are associated with iminoglycinuria and hyperglycinuria."[3]
Gene ID: 206358 is SLC36A1 solute carrier family 36 member 1 on 5q33.1: "This gene encodes a member of the eukaryote-specific amino acid/auxin permease (AAAP) 1 transporter family. The encoded protein functions as a proton-dependent, small amino acid transporter. This gene is clustered with related family members on chromosome 5q33.1. Alternative splicing results in multiple transcript variants."[4]
Gene expressions
Interactions
Consensus sequences
A more general consensus sequence may be (C/G/T)N(G/T)G(C/T)(C/T)NNNN, where ARF2 is (C/G/T)(A/C/T)(G/T)G(C/T)(C/T)(G/T)(C/G)(A/C/T)(A/G/T) and ARF5/MP is (C/G/T)N(G/T)GTC(G/T).[1] ARF1 has G at the fourth position.[1]
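The bracketed notation above can be read as one character class per position. The following is a minimal sketch, assuming Python's `re` module is an acceptable stand-in for the Basic programs mentioned later; the function name `consensus_to_regex` is hypothetical and not from the source.

```python
import re


def consensus_to_regex(consensus: str) -> str:
    """Translate notation such as '(C/G/T)N(G/T)G' into a regular expression."""
    parts = []
    # Each token is either a parenthesised set of alternatives or a single letter.
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGTN])", consensus):
        if group:
            parts.append("[" + group.replace("/", "") + "]")
        elif single == "N":
            parts.append("[ACGT]")  # N stands for any nucleotide
        else:
            parts.append(single)
    return "".join(parts)


ARF2 = "(C/G/T)(A/C/T)(G/T)G(C/T)(C/T)(G/T)(C/G)(A/C/T)(A/G/T)"
ARF5 = "(C/G/T)N(G/T)GTC(G/T)"

arf2_pattern = re.compile(consensus_to_regex(ARF2))
print(arf2_pattern.pattern)                         # [CGT][ACT][GT]G[CT][CT][GT][CG][ACT][AGT]
print(consensus_to_regex(ARF5))                     # [CGT][ACGT][GT]GTC[GT]
print(bool(arf2_pattern.fullmatch("CAGGTTTCTG")))   # True
```

The last line checks one of the 10-mers reported in the samplings below against the ARF2 consensus.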
Binding site for
Inverse copies
Enhancer activity
Promoter occurrences
Hypotheses
- A1BG has no Auxin response factors in either promoter.
- A1BG is not transcribed by an Auxin response factor.
- Auxin response factor does not participate in the transcription of A1BG.
ARF (Stigliani) samplings
Copying the auxin response factor consensus sequence 5'-TGTCGG-3' and pasting it into "⌘F" finds no locations between ZNF497 and A1BG and one location between ZSCAN22 and A1BG, as can also be found by the computer programs.
For the Basic programs testing consensus sequence 5'-(C/G/T)(A/C/T)(G/T)G(C/T)(C/T)(G/T)(C/G)(A/C/T)(A/G/T)-3' (starting with SuccessablesARF.bas), written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the programs are looking for, and found:
- negative strand, negative direction: 12, CAGGTTTCTG at 4506, TCGGTCTGCA at 4236, TAGGTCGGTA at 3685, TTTGTCTGTA at 2881, TTTGTTTGTT at 2488, GAGGCCTCCG at 2358, TATGTCTGTA at 1570, CTTGCTTCCG at 1557, TTTGTTTGTT at 1392, GAGGTCGGAG at 1064, TTTGTCTGTA at 171, CTGGTCGGTA at 38.
- positive strand, negative direction: 17, CCTGTCTCAA at 4374, CATGCCTGTA at 4122, CAGGCTTGAG at 3402, TCTGTCTCAA at 3324, GAGGTTGCTG at 3264, CATGCCTGTA at 2673, CATGCCTGTA at 2539, GAGGTTGCAG at 2401, CATGCTGGTG at 2328, GTGGCTGGAG at 2071, TCTGTCTCAA at 2034, CTTGCCTGAA at 1624, CAGGCTGGAG at 1466, GAGGTTGCAG at 1322, TCTGTCTCAA at 1090, GAGGTTGCAG at 1031, TCTGTCTCAA at 924.
- negative strand, positive direction: 23, GAGGCCTCCT at 4407, GAGGCCTCAG at 4194, CAGGTCTCAG at 3774, CTGGCCTCCA at 3685, CATGTTTGCA at 3341, CTGGTCTCCT at 3302, GTTGTCTCTT at 3056, CAGGCCTCAG at 3040, CTTGTCTGAG at 3007, CTGGCTGCCT at 2889, CAGGCCTCTG at 2882, GAGGCTGGTG at 2812, GATGTTGCAG at 2720, CCTGCCTCAG at 2525, CATGTTGCCT at 2479, CAGGCTGGAG at 2322, TCTGTTTCAT at 2265, TAGGTCTGTT at 2261, TCTGTTGGCA at 2187, GCTGTCTGCT at 1734, GCGGCCTGAA at 726, CTTGCTGCAG at 532, GCTGCCGGTG at 486.
- positive strand, positive direction: 5, CCTGTTTGTG at 4257, GTGGCCGGTG at 1850, GAGGTTGGAT at 1282, GAGGTTGGAG at 610, CCGGTCGCCG at 332.
- inverse complement, negative strand, negative direction: 8, AACCAACCGG at 3948, TACAGACCTC at 3837, ATGAGGCCTC at 2356, CACCGACCTC at 2071, CTGAGACAGA at 2031, TACCGACCTC at 1748, CTGAGACAGA at 1087, CTGAGACAGA at 921.
- inverse complement, positive strand, negative direction: 33, CTCCAGCCTG at 4348, CTGAGGCAGG at 4282, AGGCAGCATC at 3902, AAGAGGCAGC at 3899, CTCAAGCAAC at 3848, CTCCAGCCTG at 3297, TGGAGACCAG at 3123, CTGAGGCAGG at 2568, AACAGGCCAG at 2518, AACAAACAGG at 2514, AACAAACAAA at 2489, AGCAAACAAA at 2485, CTCCAGCCTG at 2434, CTGAGGCAGG at 2367, TTGAGACCAG at 2263, TTGAGACCAA at 2147, TGGAGGCCAG at 2076, CTCCAGCCTG at 2008, CTGAGGCAGG at 1941, TGCCAGCAGA at 1614, AACAAACCTA at 1590, ATGAAACAAA at 1586, TACAGACATC at 1571, AACAAACAAA at 1393, AGGAGGCAGA at 1314, CTGAGGCAGG at 1288, TTGCGACCAG at 1193, CTCCAGCCTC at 1064, CTGAGGCAGG at 997, CTCCAGCCTG at 898, CTGAGGCAGG at 831, AGCCAGCCTG at 507, AAGAGGCCGG at 374.
- inverse complement, negative strand, positive direction: 33, CTGAGGCCTC at 4192, CACCAGCAGC at 3723, AGGCAGCAGG at 3694, CAGAAGCCAG at 3220, ATGCAGCAGG at 3147, TTCAGGCCTC at 3038, TGCAGACCTC at 2863, TTCAAACAGA at 2652, TAGAAACCAC at 2633, TTGCAGCCGC at 2355, ATGAAACCGC at 2150, AGGCAACCAC at 2122, CTCAGGCAAC at 2119, AAGCAGCCAA at 2011, CTCCGACAGG at 1966, CACCGGCCAC at 1850, TTGCAACCTC at 1618, CTGCAGCAAG at 1510, ATGCGGCAAG at 1426, ATGCGGCAAG at 1326, CTCCAACCTA at 1282, CTGCGGCAGC at 1037, CTGCGGCAAG at 1006, CTCCAACCTG at 946, CTCCAACCTG at 846, CTGCGGCAAG at 754, CAGCGGCCTG at 724, CTCCAACCTC at 610, CTGCAGCATC at 536, TGCAGACCGG at 442, AGCAAGCCAC at 342, CGGCAGCAAG at 338, CAGCGGCAGC at 335.
- inverse complement, positive strand, positive direction: 10, TGGAAACCAC at 3949, AGGAGACCGG at 2985, CGGAGACCGA at 2885, CTCCGACCAC at 2812, CGGCGGCCAC at 1761, CTCCGGCAAG at 1489, CTCCGGCAAG at 1389, AGGAAGCCGG at 764, CACAGACCTC at 272, AAGAAACATA at 114.
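The kind of search summarized in the list above can be sketched as follows. This is not the author's Basic code: it assumes the ARF2 consensus written as a regular expression, treats an "inverse complement" hit as a window whose reverse complement matches the consensus, and reports 1-based offsets into whatever string is supplied. How those offsets relate to the coordinates in the lists above is not specified here, and the toy sequence is hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
ARF2 = "[CGT][ACT][GT]G[CT][CT][GT][CG][ACT][AGT]"  # ARF2 consensus as a regex
WIDTH = 10  # length of the ARF2 consensus


def reverse_complement(seq):
    """Complement each base, then reverse the order."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hits(sequence, pattern=ARF2):
    """(match, 1-based start) for every consensus match, overlaps included."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.group(1), m.start() + 1) for m in lookahead.finditer(sequence)]


def find_inverse_complement_hits(sequence, pattern=ARF2, width=WIDTH):
    """Windows whose reverse complement matches the consensus."""
    compiled = re.compile(pattern)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if compiled.fullmatch(reverse_complement(window)):
            hits.append((window, start + 1))
    return hits


# Hypothetical toy sequence built around two 10-mers reported above.
toy = "AA" + "CAGGTTTCTG" + "AA" + "CTGAGACAGA" + "TT"
print(find_hits(toy))                     # [('CAGGTTTCTG', 3)]
print(find_inverse_complement_hits(toy))  # [('CTGAGACAGA', 15)]
```

The lookahead in `find_hits` lets overlapping matches be reported, which matters for closely spaced hits such as those at 3902 and 3899 in the list above.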
ARFS (4560-2846) UTRs
- Negative strand, negative direction: CAGGTTTCTG at 4506, TCGGTCTGCA at 4236, AACCAACCGG at 3948, TACAGACCTC at 3837, TAGGTCGGTA at 3685, TTTGTCTGTA at 2881.
- Positive strand, negative direction: CCTGTCTCAA at 4374, CTCCAGCCTG at 4348, CTGAGGCAGG at 4282, CATGCCTGTA at 4122, AGGCAGCATC at 3902, AAGAGGCAGC at 3899, CTCAAGCAAC at 3848, CAGGCTTGAG at 3402, TCTGTCTCAA at 3324, CTCCAGCCTG at 3297, GAGGTTGCTG at 3264, TGGAGACCAG at 3123.
ARFS positive direction (4445-4265) core promoters
- Negative strand, positive direction: GAGGCCTCCT at 4407.
ARFS negative direction (2811-2596) proximal promoters
- Positive strand, negative direction: CATGCCTGTA at 2673.
ARFS positive direction (4265-4050) proximal promoters
- Negative strand, positive direction: GAGGCCTCAG at 4194, CTGAGGCCTC at 4192.
- Positive strand, positive direction: CCTGTTTGTG at 4257.
ARFS negative direction (2596-1) distal promoters
- Negative strand, negative direction: TTTGTTTGTT at 2488, GAGGCCTCCG at 2358, ATGAGGCCTC at 2356, CACCGACCTC at 2071, CTGAGACAGA at 2031, TACCGACCTC at 1748, TATGTCTGTA at 1570, CTTGCTTCCG at 1557, TTTGTTTGTT at 1392, CTGAGACAGA at 1087, GAGGTCGGAG at 1064, CTGAGACAGA at 921, TTTGTCTGTA at 171, CTGGTCGGTA at 38.
- Positive strand, negative direction: CTGAGGCAGG at 2568, CATGCCTGTA at 2539, AACAGGCCAG at 2518, AACAAACAGG at 2514, AACAAACAAA at 2489, AGCAAACAAA at 2485, CTCCAGCCTG at 2434, GAGGTTGCAG at 2401, CTGAGGCAGG at 2367, CATGCTGGTG at 2328, TTGAGACCAG at 2263, TTGAGACCAA at 2147, TGGAGGCCAG at 2076, GTGGCTGGAG at 2071, TCTGTCTCAA at 2034, CTCCAGCCTG at 2008, CTGAGGCAGG at 1941, CTTGCCTGAA at 1624, TGCCAGCAGA at 1614, AACAAACCTA at 1590, ATGAAACAAA at 1586, TACAGACATC at 1571, CAGGCTGGAG at 1466, AACAAACAAA at 1393, GAGGTTGCAG at 1322, AGGAGGCAGA at 1314, CTGAGGCAGG at 1288, TTGCGACCAG at 1193, TCTGTCTCAA at 1090, CTCCAGCCTC at 1064, GAGGTTGCAG at 1031, CTGAGGCAGG at 997, TCTGTCTCAA at 924, CTCCAGCCTG at 898, CTGAGGCAGG at 831, AGCCAGCCTG at 507, AAGAGGCCGG at 374.
ARFS positive direction (4050-1) distal promoters
- Negative strand, positive direction: CAGGTCTCAG at 3774, CACCAGCAGC at 3723, AGGCAGCAGG at 3694, CTGGCCTCCA at 3685, CATGTTTGCA at 3341, CTGGTCTCCT at 3302, CAGAAGCCAG at 3220, ATGCAGCAGG at 3147, GTTGTCTCTT at 3056, CAGGCCTCAG at 3040, TTCAGGCCTC at 3038, CTTGTCTGAG at 3007, CTGGCTGCCT at 2889, CAGGCCTCTG at 2882, TGCAGACCTC at 2863, GAGGCTGGTG at 2812, GATGTTGCAG at 2720, TTCAAACAGA at 2652, TAGAAACCAC at 2633, CCTGCCTCAG at 2525, CATGTTGCCT at 2479, TTGCAGCCGC at 2355, CAGGCTGGAG at 2322, TCTGTTTCAT at 2265, TAGGTCTGTT at 2261, TCTGTTGGCA at 2187, ATGAAACCGC at 2150, AGGCAACCAC at 2122, CTCAGGCAAC at 2119, AAGCAGCCAA at 2011, CTCCGACAGG at 1966, CACCGGCCAC at 1850, GCTGTCTGCT at 1734, TTGCAACCTC at 1618, CTGCAGCAAG at 1510, ATGCGGCAAG at 1426, ATGCGGCAAG at 1326, CTCCAACCTA at 1282, CTGCGGCAGC at 1037, CTGCGGCAAG at 1006, CTCCAACCTG at 946, CTCCAACCTG at 846, CTGCGGCAAG at 754, GCGGCCTGAA at 726, CAGCGGCCTG at 724, CTCCAACCTC at 610, CTGCAGCATC at 536, CTTGCTGCAG at 532, GCTGCCGGTG at 486, TGCAGACCGG at 442, AGCAAGCCAC at 342, CGGCAGCAAG at 338, CAGCGGCAGC at 335.
- Positive strand, positive direction: TGGAAACCAC at 3949, AGGAGACCGG at 2985, CGGAGACCGA at 2885, CTCCGACCAC at 2812, GTGGCCGGTG at 1850, CGGCGGCCAC at 1761, CTCCGGCAAG at 1489, CTCCGGCAAG at 1389, GAGGTTGGAT at 1282, AGGAAGCCGG at 764, GAGGTTGGAG at 610, CCGGTCGCCG at 332, CACAGACCTC at 272, AAGAAACATA at 114.
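The grouping of hits into UTRs and core, proximal and distal promoters can be sketched as a lookup by coordinate. The interval bounds are copied from the headings above. Treating both ends of each interval as inclusive is an assumption, chosen because position 2846 is listed under both a UTR heading and a core promoter heading further below. The names `regions_for` and `tally` are hypothetical.

```python
from collections import Counter

# (low, high) coordinate bounds taken from the section headings.
NEGATIVE_REGIONS = {
    "UTR": (2846, 4560),
    "core": (2811, 2846),
    "proximal": (2596, 2811),
    "distal": (1, 2596),
}
POSITIVE_REGIONS = {
    "core": (4265, 4445),
    "proximal": (4050, 4265),
    "distal": (1, 4050),
}


def regions_for(position, regions):
    """Names of every region whose inclusive bounds contain the position."""
    return [name for name, (low, high) in regions.items() if low <= position <= high]


def tally(positions, regions):
    """Count hits per region; a boundary position counts in both neighbours."""
    counts = Counter()
    for position in positions:
        counts.update(regions_for(position, regions))
    return counts


# Direct hits reported for the negative strand, negative direction.
neg_neg = [4506, 4236, 3685, 2881, 2488, 2358, 1570, 1557, 1392, 1064, 171, 38]
print(tally(neg_neg, NEGATIVE_REGIONS))    # Counter({'distal': 8, 'UTR': 4})
print(regions_for(4407, POSITIVE_REGIONS))  # ['core']
```

The counts of 4 and 8 agree with the direct (non-inverse-complement) entries listed for that strand under the UTR and negative direction distal promoter headings.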
Stigliani random dataset samplings
- ARFSr0: 10, GAGGTTTCTA at 4222, TTTGTTTGAA at 4182, CATGCTTCAT at 4097, TCTGCCGGCG at 4079, GCGGCTTGCA at 3552, CTGGCCGCCA at 3408, CCTGCTTCAA at 3336, CCTGCCTCTA at 2717, CCGGTTTGAA at 1690, TAGGTCGGCA at 1420.
- ARFSr1: 13, CTGGCTTGAA at 4190, GTGGTCGCTT at 4158, TTTGCTTGCA at 4106, CTGGCCGGCA at 3710, GTGGTCGGTG at 3006, CAGGTTTCCG at 2814, CTTGTTTCCT at 2165, TCGGTTTGCA at 1979, GATGTCGCAA at 1454, GTGGTCGCAA at 581, TCGGCCTCCA at 517, CAGGCCTGAT at 450, TTTGTCTGCG at 10.
- ARFSr2: 9, TTGGTCTGTT at 2965, GTTGTTGGTG at 2867, GTTGTCTGCA at 2538, GTGGTTTCTT at 2465, GCTGCTGCCA at 1569, GCTGCTGCTG at 1566, TCTGTCTCCA at 1318, CCTGTTTCCG at 1101, GTGGTTTCAT at 1059.
- ARFSr3: 13, TTGGTTGGCG at 4110, CAGGTTTCCA at 3591, TCTGTTGGTT at 3550, TCGGTCTCAT at 3469, CTTGTTTCAT at 2786, CTGGCTTGAA at 2515, GTTGTTTGTT at 2269, CCGGTTGCTT at 2038, TTTGTTTCCT at 1032, TTTGCCTCCA at 933, CCGGCCTGAA at 873, GTGGTCGGCG at 647, GCGGTCTGTG at 511.
- ARFSr4: 11, TCGGTTGGCT at 4248, GAGGTTGCAG at 3364, TTTGTTGGAA at 2262, GCGGTTTCTA at 2123, CCTGTCGGTA at 2030, GCGGCTTCTA at 1587, GATGCTGCCT at 1401, GATGCTGGTG at 1294, GTGGCTTCTG at 964, CCTGTTGCCG at 806, CAGGTTTCCT at 251.
- ARFSr5: 11, CTGGCCGCTT at 4440, TATGCTTGAT at 4193, TATGTTGGCA at 4175, TTTGCTGGAG at 4144, TTGGTTTGAA at 3272, CCTGCTTCAA at 2916, GAGGTTTCAA at 1883, CATGCTTCCA at 1669, GTGGTTTCAT at 935, CATGTTTCTA at 147, CCTGTTGGAA at 26.
- ARFSr6: 10, TTTGTTGCTT at 4537, TTGGTTTGTT at 4533, GTGGTTTCTG at 4482, CCTGTTTGTG at 4364, GATGCTTGTT at 4162, TTTGCTTCTA at 3481, TTTGCCTGCG at 2085, TTTGTTGCCT at 1707, CATGTTGGTG at 1551, TATGTTTGTA at 912.
- ARFSr7: 15, GATGTTGCAG at 4175, TTTGCTTCCT at 4127, GATGTCTGCG at 4007, GTTGCCTCCG at 3838, TTGGTTTGAT at 3730, TATGCCGGCA at 3499, TTGGCCGGAT at 3303, TCTGTTGCCG at 3129, TTGGTCTGTT at 3125, GTTGTTGGAA at 2943, TCGGTTGGAG at 2526, CATGCCTGTT at 2197, GATGCCGCCT at 1771, CAGGCCTGCT at 1761, GTGGCCGCAA at 658.
- ARFSr8: 10, GCGGTTGGAA at 3165, TTTGCTTCTA at 3067, GTTGCCTCCG at 2487, CAGGTTGCCT at 2484, GTTGTTTCTA at 2385, CTTGTCTGCA at 2038, GCTGCCGGAG at 684, CCGGTTTCAG at 370, TAGGCTTCTA at 263, TTTGCCGGTT at 51.
- ARFSr9: 16, GTGGCTGCAG at 4364, GCTGCCGCTA at 4311, GAGGCCGGTT at 3984, CTGGTTGGAG at 2978, GTGGTTTGCA at 2709, CAGGTCTGCG at 2651, CATGTCGCTT at 1853, GTTGTTTCTG at 1737, TTTGTCGGTA at 1273, CTGGCTGGTA at 1214, TCTGCTGGCG at 1141, TTTGCCTCCG at 1038, TTTGCTTCTG at 713, CTGGCTTGCA at 492, CTGGTTTGTT at 95, TTTGTCTCTT at 27.
- ARFSr0ci: 7, TACCAACATC at 4415, TTGCGGCATA at 3447, CGCAGGCAAA at 3248, ATCCGACCTG at 2513, CAGAAGCAGG at 1482, CGGAGACCGA at 1317, CGCAAACCGA at 668.
- ARFSr1ci: 18, CACAAGCAAC at 4529, ATCCGGCAGG at 4308, AACAGACATG at 4198, TTGCAGCCAG at 3851, ATGAAACCGC at 3798, AACCAACCTC at 3541, AGGAAACCAA at 3537, TTCCGACAGC at 3040, ATCCAGCAGG at 2841, TTGAAACCAG at 2247, AAGCAGCAAG at 2225, AGCAAGCAGC at 2222, TTCCAACATA at 2099, AACCGACAAC at 1732, AGCCAACATC at 1202, CTGAAGCCTA at 1043, TTCCAGCAAA at 780, CACAGGCCTG at 448.
- ARFSr2ci: 10, TAGCGGCCTG at 3752, AAGAGGCAAG at 3609, AACAAACCAA at 3173, AGCAAACCAA at 2444, TGGAGGCCAA at 2263, CGCAAACCGA at 1747, ATGAAACCAA at 1632, CAGAAACAGC at 949, ATCCGGCCGG at 801, TTCCAACCGA at 122.
- ARFSr3ci: 5, CTGCGGCATC at 2720, TTCCGACCGG at 2366, CTGAAGCCGC at 1451, TTCAAACAGG at 1343, AGCCGGCCGG at 867.
- ARFSr4ci: 7, AAGAGGCAAG at 4403, TGGAGGCATC at 3583, CACCGGCAAG at 3462, TGGAAGCCTG at 1869, TTCAGGCCAA at 790, CTCCAACAAC at 420, ATGAGACAGC at 76.
- ARFSr5ci: 5, AACCGGCAAG at 1436, CAGCGGCAGA at 688, CGGCGGCCAG at 681, TAGCGGCCGC at 596, ATCAAACATG at 141.
- ARFSr6ci: 12, TGGCGGCAAG at 4437, TTCCGACAAA at 4300, TACCGACATG at 3542, TGCCAGCCGA at 3406, TGCCAGCAGG at 3259, TTGCAACCAC at 3073, CAGCGACCGG at 2995, CGCCGACCGC at 2846, CGGAAGCATA at 2583, TAGCGGCAGG at 1564, TTGAAACAGA at 1101, TTCAAACATA at 177.
- ARFSr7ci: 9, AGGCGACCAA at 4431, TTGAAACCAA at 4205, TACAAACAAA at 3788, AACAGACCAA at 3761, CGCCAACCAC at 3690, CTGCGACAGG at 2792, ATCCGGCCGC at 1910, TTCAGGCCTG at 1759, TACCAGCCAA at 482.
- ARFSr8ci: 13, AAGAAGCAAG at 4389, CGCAGACAGA at 4261, TTCAAACCTA at 4003, AGCAGGCCAA at 3897, TGCCAGCCAG at 2790, CGGAAACAAA at 2131, ATCCGACCAC at 1945, CTCAGGCAGG at 1547, CACCAACCAG at 814, TGGCAGCCGC at 760, AAGCGGCCTA at 716, TTCCAACAAA at 619, TGGCGACAAA at 211.
- ARFSr9ci: 15, AACCGACCTG at 4195, AGGAAACCGA at 4191, TGCAAACAGC at 4120, CGCCGACCTG at 4081, CTCAAACCGG at 4043, CGGCGGCAAG at 3899, AACAAACAAG at 3460, CAGAAACCTG at 3195, TACAGACATG at 2919, TGCCAACCTC at 2805, CGGCAGCCTC at 2557, TTGAGGCAGG at 2172, TTGAAACAAA at 1749, AGCCAGCAAC at 1313, AGGAAACCTC at 1282.
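As a rough plausibility check on the random counts above, the expected number of ARF2 matches in a random sequence can be worked out from the number of bases allowed at each position. This sketch assumes four equally likely, independent bases and a sequence length of 4560 nucleotides (the largest coordinate used in the headings); how the random datasets were actually generated is not stated here.

```python
from fractions import Fraction

# Bases allowed at each of the ten ARF2 consensus positions.
ARF2_CLASSES = ["CGT", "ACT", "GT", "G", "CT", "CT", "GT", "CG", "ACT", "AGT"]


def match_probability(classes):
    """Probability that a random window matches, with uniform independent bases."""
    probability = Fraction(1)
    for allowed in classes:
        probability *= Fraction(len(allowed), 4)
    return probability


def expected_hits(classes, sequence_length):
    """Expected matches over every window of the consensus length."""
    windows = sequence_length - len(classes) + 1
    return float(match_probability(classes) * windows)


# Counts reported above for ARFSr0 to ARFSr9.
observed = [10, 13, 9, 13, 11, 11, 10, 15, 10, 16]

print(match_probability(ARF2_CLASSES))             # 81/32768
print(round(expected_hits(ARF2_CLASSES, 4560), 2))  # 11.25
print(sum(observed) / len(observed))                # 11.8
```

Under these assumptions the expectation of about 11 matches per strand is close to the mean of 11.8 for the ten direct random samplings.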
ARFSr arbitrary (evens) (4560-2846) UTRs
- ARFSr0: GAGGTTTCTA at 4222, TTTGTTTGAA at 4182, CATGCTTCAT at 4097, TCTGCCGGCG at 4079, GCGGCTTGCA at 3552, CTGGCCGCCA at 3408, CCTGCTTCAA at 3336.
- ARFSr2: TTGGTCTGTT at 2965, GTTGTTGGTG at 2867.
- ARFSr4: TCGGTTGGCT at 4248, GAGGTTGCAG at 3364.
- ARFSr6: TTTGTTGCTT at 4537, TTGGTTTGTT at 4533, GTGGTTTCTG at 4482, CCTGTTTGTG at 4364, GATGCTTGTT at 4162, TTTGCTTCTA at 3481.
- ARFSr8: GCGGTTGGAA at 3165, TTTGCTTCTA at 3067.
- ARFSr0ci: TACCAACATC at 4415, TTGCGGCATA at 3447, CGCAGGCAAA at 3248.
- ARFSr2ci: TAGCGGCCTG at 3752, AAGAGGCAAG at 3609, AACAAACCAA at 3173.
- ARFSr4ci: AAGAGGCAAG at 4403, TGGAGGCATC at 3583, CACCGGCAAG at 3462.
- ARFSr6ci: TGGCGGCAAG at 4437, TTCCGACAAA at 4300, TACCGACATG at 3542, TGCCAGCCGA at 3406, TGCCAGCAGG at 3259, TTGCAACCAC at 3073, CAGCGACCGG at 2995, CGCCGACCGC at 2846.
- ARFSr8ci: AAGAAGCAAG at 4389, CGCAGACAGA at 4261, TTCAAACCTA at 4003, AGCAGGCCAA at 3897.
ARFSr alternate (odds) (4560-2846) UTRs
- ARFSr1: CTGGCTTGAA at 4190, GTGGTCGCTT at 4158, TTTGCTTGCA at 4106, CTGGCCGGCA at 3710, GTGGTCGGTG at 3006.
- ARFSr3: TTGGTTGGCG at 4110, CAGGTTTCCA at 3591, TCTGTTGGTT at 3550, TCGGTCTCAT at 3469.
- ARFSr5: CTGGCCGCTT at 4440, TATGCTTGAT at 4193, TATGTTGGCA at 4175, TTTGCTGGAG at 4144, TTGGTTTGAA at 3272, CCTGCTTCAA at 2916.
- ARFSr7: GATGTTGCAG at 4175, TTTGCTTCCT at 4127, GATGTCTGCG at 4007, GTTGCCTCCG at 3838, TTGGTTTGAT at 3730, TATGCCGGCA at 3499, TTGGCCGGAT at 3303, TCTGTTGCCG at 3129, TTGGTCTGTT at 3125, GTTGTTGGAA at 2943.
- ARFSr9: GTGGCTGCAG at 4364, GCTGCCGCTA at 4311, GAGGCCGGTT at 3984, CTGGTTGGAG at 2978.
- ARFSr1ci: CACAAGCAAC at 4529, ATCCGGCAGG at 4308, AACAGACATG at 4198, TTGCAGCCAG at 3851, ATGAAACCGC at 3798, AACCAACCTC at 3541, AGGAAACCAA at 3537, TTCCGACAGC at 3040.
- ARFSr7ci: AGGCGACCAA at 4431, TTGAAACCAA at 4205, TACAAACAAA at 3788, AACAGACCAA at 3761, CGCCAACCAC at 3690.
- ARFSr9ci: AACCGACCTG at 4195, AGGAAACCGA at 4191, TGCAAACAGC at 4120, CGCCGACCTG at 4081, CTCAAACCGG at 4043, CGGCGGCAAG at 3899, AACAAACAAG at 3460, CAGAAACCTG at 3195, TACAGACATG at 2919.
ARFSr arbitrary negative direction (evens) (2846-2811) core promoters
- ARFSr6ci: CGCCGACCGC at 2846.
ARFSr alternate negative direction (odds) (2846-2811) core promoters
- ARFSr1: CAGGTTTCCG at 2814.
- ARFSr1ci: ATCCAGCAGG at 2841.
ARFSr arbitrary positive direction (odds) (4445-4265) core promoters
- ARFSr9: GTGGCTGCAG at 4364, GCTGCCGCTA at 4311.
- ARFSr1ci: ATCCGGCAGG at 4308.
- ARFSr7ci: AGGCGACCAA at 4431.
ARFSr alternate positive direction (evens) (4445-4265) core promoters
- ARFSr6: CCTGTTTGTG at 4364.
- ARFSr0ci: TACCAACATC at 4415.
- ARFSr4ci: AAGAGGCAAG at 4403.
- ARFSr6ci: TGGCGGCAAG at 4437, TTCCGACAAA at 4300.
- ARFSr8ci: AAGAAGCAAG at 4389.
ARFSr arbitrary negative direction (evens) (2811-2596) proximal promoters
- ARFSr0: CCTGCCTCTA at 2717.
- ARFSr8ci: TGCCAGCCAG at 2790.
ARFSr alternate negative direction (odds) (2811-2596) proximal promoters
- ARFSr3: CTTGTTTCAT at 2786.
- ARFSr3ci: CTGCGGCATC at 2720.
- ARFSr7ci: CTGCGACAGG at 2792.
- ARFSr9ci: TGCCAACCTC at 2805.
ARFSr arbitrary positive direction (odds) (4265-4050) proximal promoters
- ARFSr1: CTGGCTTGAA at 4190, GTGGTCGCTT at 4158, TTTGCTTGCA at 4106.
- ARFSr3: TTGGTTGGCG at 4110.
- ARFSr5: TATGCTTGAT at 4193, TATGTTGGCA at 4175, TTTGCTGGAG at 4144.
- ARFSr7: GATGTTGCAG at 4175, TTTGCTTCCT at 4127.
- ARFSr1ci: AACAGACATG at 4198.
- ARFSr7ci: TTGAAACCAA at 4205.
- ARFSr9ci: AACCGACCTG at 4195, AGGAAACCGA at 4191, TGCAAACAGC at 4120, CGCCGACCTG at 4081.
ARFSr alternate positive direction (evens) (4265-4050) proximal promoters
- ARFSr0: GAGGTTTCTA at 4222, TTTGTTTGAA at 4182, CATGCTTCAT at 4097, TCTGCCGGCG at 4079.
- ARFSr4: TCGGTTGGCT at 4248.
- ARFSr6: GATGCTTGTT at 4162.
- ARFSr8ci: CGCAGACAGA at 4261.
ARFSr arbitrary negative direction (evens) (2596-1) distal promoters
- ARFSr0: CCGGTTTGAA at 1690, TAGGTCGGCA at 1420.
- ARFSr2: GTTGTCTGCA at 2538, GTGGTTTCTT at 2465, GCTGCTGCCA at 1569, GCTGCTGCTG at 1566, TCTGTCTCCA at 1318, CCTGTTTCCG at 1101, GTGGTTTCAT at 1059.
- ARFSr4: TTTGTTGGAA at 2262, GCGGTTTCTA at 2123, CCTGTCGGTA at 2030, GCGGCTTCTA at 1587, GATGCTGCCT at 1401, GATGCTGGTG at 1294, GTGGCTTCTG at 964, CCTGTTGCCG at 806, CAGGTTTCCT at 251.
- ARFSr6: TTTGCCTGCG at 2085, TTTGTTGCCT at 1707, CATGTTGGTG at 1551, TATGTTTGTA at 912.
- ARFSr8: GTTGCCTCCG at 2487, CAGGTTGCCT at 2484, GTTGTTTCTA at 2385, CTTGTCTGCA at 2038, GCTGCCGGAG at 684, CCGGTTTCAG at 370, TAGGCTTCTA at 263, TTTGCCGGTT at 51.
- ARFSr0ci: ATCCGACCTG at 2513, CAGAAGCAGG at 1482, CGGAGACCGA at 1317, CGCAAACCGA at 668.
- ARFSr2ci: AGCAAACCAA at 2444, TGGAGGCCAA at 2263, CGCAAACCGA at 1747, ATGAAACCAA at 1632, CAGAAACAGC at 949, ATCCGGCCGG at 801, TTCCAACCGA at 122.
- ARFSr4ci: TGGAAGCCTG at 1869, TTCAGGCCAA at 790, CTCCAACAAC at 420, ATGAGACAGC at 76.
- ARFSr6ci: CGGAAGCATA at 2583, TAGCGGCAGG at 1564, TTGAAACAGA at 1101, TTCAAACATA at 177.
- ARFSr8ci: CGGAAACAAA at 2131, ATCCGACCAC at 1945, CTCAGGCAGG at 1547, CACCAACCAG at 814, TGGCAGCCGC at 760, AAGCGGCCTA at 716, TTCCAACAAA at 619, TGGCGACAAA at 211.
ARFSr alternate negative direction (odds) (2596-1) distal promoters
- ARFSr1: CTTGTTTCCT at 2165, TCGGTTTGCA at 1979, GATGTCGCAA at 1454, GTGGTCGCAA at 581, TCGGCCTCCA at 517, CAGGCCTGAT at 450, TTTGTCTGCG at 10.
- ARFSr3: CTGGCTTGAA at 2515, GTTGTTTGTT at 2269, CCGGTTGCTT at 2038, TTTGTTTCCT at 1032, TTTGCCTCCA at 933, CCGGCCTGAA at 873, GTGGTCGGCG at 647, GCGGTCTGTG at 511.
- ARFSr5: GAGGTTTCAA at 1883, CATGCTTCCA at 1669, GTGGTTTCAT at 935, CATGTTTCTA at 147, CCTGTTGGAA at 26.
- ARFSr7: TCGGTTGGAG at 2526, CATGCCTGTT at 2197, GATGCCGCCT at 1771, CAGGCCTGCT at 1761, GTGGCCGCAA at 658.
- ARFSr9: CATGTCGCTT at 1853, GTTGTTTCTG at 1737, TTTGTCGGTA at 1273, CTGGCTGGTA at 1214, TCTGCTGGCG at 1141, TTTGCCTCCG at 1038, TTTGCTTCTG at 713, CTGGCTTGCA at 492, CTGGTTTGTT at 95, TTTGTCTCTT at 27.
- ARFSr1ci: TTGAAACCAG at 2247, AAGCAGCAAG at 2225, AGCAAGCAGC at 2222, TTCCAACATA at 2099, AACCGACAAC at 1732, AGCCAACATC at 1202, CTGAAGCCTA at 1043, TTCCAGCAAA at 780, CACAGGCCTG at 448.
- ARFSr3ci: TTCCGACCGG at 2366, CTGAAGCCGC at 1451, TTCAAACAGG at 1343, AGCCGGCCGG at 867.
- ARFSr5ci: AACCGGCAAG at 1436, CAGCGGCAGA at 688, CGGCGGCCAG at 681, TAGCGGCCGC at 596, ATCAAACATG at 141.
- ARFSr7ci: ATCCGGCCGC at 1910, TTCAGGCCTG at 1759, TACCAGCCAA at 482.
- ARFSr9ci: CGGCAGCCTC at 2557, TTGAGGCAGG at 2172, TTGAAACAAA at 1749, AGCCAGCAAC at 1313, AGGAAACCTC at 1282.
ARFSr arbitrary positive direction (odds) (4050-1) distal promoters
- ARFSr1: CTGGCCGGCA at 3710, GTGGTCGGTG at 3006, CAGGTTTCCG at 2814, CTTGTTTCCT at 2165, TCGGTTTGCA at 1979, GATGTCGCAA at 1454, GTGGTCGCAA at 581, TCGGCCTCCA at 517, CAGGCCTGAT at 450, TTTGTCTGCG at 10.
- ARFSr3: CAGGTTTCCA at 3591, TCTGTTGGTT at 3550, TCGGTCTCAT at 3469, CTTGTTTCAT at 2786, CTGGCTTGAA at 2515, GTTGTTTGTT at 2269, CCGGTTGCTT at 2038, TTTGTTTCCT at 1032, TTTGCCTCCA at 933, CCGGCCTGAA at 873, GTGGTCGGCG at 647, GCGGTCTGTG at 511.
- ARFSr5: TTGGTTTGAA at 3272, CCTGCTTCAA at 2916, GAGGTTTCAA at 1883, CATGCTTCCA at 1669, GTGGTTTCAT at 935, CATGTTTCTA at 147, CCTGTTGGAA at 26.
- ARFSr7: GATGTCTGCG at 4007, GTTGCCTCCG at 3838, TTGGTTTGAT at 3730, TATGCCGGCA at 3499, TTGGCCGGAT at 3303, TCTGTTGCCG at 3129, TTGGTCTGTT at 3125, GTTGTTGGAA at 2943, TCGGTTGGAG at 2526, CATGCCTGTT at 2197, GATGCCGCCT at 1771, CAGGCCTGCT at 1761, GTGGCCGCAA at 658.
- ARFSr9: GAGGCCGGTT at 3984, CTGGTTGGAG at 2978, GTGGTTTGCA at 2709, CAGGTCTGCG at 2651, CATGTCGCTT at 1853, GTTGTTTCTG at 1737, TTTGTCGGTA at 1273, CTGGCTGGTA at 1214, TCTGCTGGCG at 1141, TTTGCCTCCG at 1038, TTTGCTTCTG at 713, CTGGCTTGCA at 492, CTGGTTTGTT at 95, TTTGTCTCTT at 27.
- ARFSr1ci: TTGCAGCCAG at 3851, ATGAAACCGC at 3798, AACCAACCTC at 3541, AGGAAACCAA at 3537, TTCCGACAGC at 3040, ATCCAGCAGG at 2841, TTGAAACCAG at 2247, AAGCAGCAAG at 2225, AGCAAGCAGC at 2222, TTCCAACATA at 2099, AACCGACAAC at 1732, AGCCAACATC at 1202, CTGAAGCCTA at 1043, TTCCAGCAAA at 780, CACAGGCCTG at 448.
- ARFSr3ci: CTGCGGCATC at 2720, TTCCGACCGG at 2366, CTGAAGCCGC at 1451, TTCAAACAGG at 1343, AGCCGGCCGG at 867.
- ARFSr5ci: AACCGGCAAG at 1436, CAGCGGCAGA at 688, CGGCGGCCAG at 681, TAGCGGCCGC at 596, ATCAAACATG at 141.
- ARFSr7ci: TACAAACAAA at 3788, AACAGACCAA at 3761, CGCCAACCAC at 3690, CTGCGACAGG at 2792, ATCCGGCCGC at 1910, TTCAGGCCTG at 1759, TACCAGCCAA at 482.
- ARFSr9ci: CTCAAACCGG at 4043, CGGCGGCAAG at 3899, AACAAACAAG at 3460, CAGAAACCTG at 3195, TACAGACATG at 2919, TGCCAACCTC at 2805, CGGCAGCCTC at 2557, TTGAGGCAGG at 2172, TTGAAACAAA at 1749, AGCCAGCAAC at 1313, AGGAAACCTC at 1282.
ARFSr alternate positive direction (evens) (4050-1) distal promoters
- ARFSr0: GCGGCTTGCA at 3552, CTGGCCGCCA at 3408, CCTGCTTCAA at 3336, CCTGCCTCTA at 2717, CCGGTTTGAA at 1690, TAGGTCGGCA at 1420.
- ARFSr2: TTGGTCTGTT at 2965, GTTGTTGGTG at 2867, GTTGTCTGCA at 2538, GTGGTTTCTT at 2465, GCTGCTGCCA at 1569, GCTGCTGCTG at 1566, TCTGTCTCCA at 1318, CCTGTTTCCG at 1101, GTGGTTTCAT at 1059.
- ARFSr4: GAGGTTGCAG at 3364, TTTGTTGGAA at 2262, GCGGTTTCTA at 2123, CCTGTCGGTA at 2030, GCGGCTTCTA at 1587, GATGCTGCCT at 1401, GATGCTGGTG at 1294, GTGGCTTCTG at 964, CCTGTTGCCG at 806, CAGGTTTCCT at 251.
- ARFSr6: TTTGCTTCTA at 3481, TTTGCCTGCG at 2085, TTTGTTGCCT at 1707, CATGTTGGTG at 1551, TATGTTTGTA at 912.
- ARFSr8: GCGGTTGGAA at 3165, TTTGCTTCTA at 3067, GTTGCCTCCG at 2487, CAGGTTGCCT at 2484, GTTGTTTCTA at 2385, CTTGTCTGCA at 2038, GCTGCCGGAG at 684, CCGGTTTCAG at 370, TAGGCTTCTA at 263, TTTGCCGGTT at 51.
- ARFSr0ci: TTGCGGCATA at 3447, CGCAGGCAAA at 3248, ATCCGACCTG at 2513, CAGAAGCAGG at 1482, CGGAGACCGA at 1317, CGCAAACCGA at 668.
- ARFSr2ci: TAGCGGCCTG at 3752, AAGAGGCAAG at 3609, AACAAACCAA at 3173, AGCAAACCAA at 2444, TGGAGGCCAA at 2263, CGCAAACCGA at 1747, ATGAAACCAA at 1632, CAGAAACAGC at 949, ATCCGGCCGG at 801, TTCCAACCGA at 122.
- ARFSr4ci: TGGAGGCATC at 3583, CACCGGCAAG at 3462, TGGAAGCCTG at 1869, TTCAGGCCAA at 790, CTCCAACAAC at 420, ATGAGACAGC at 76.
- ARFSr6ci: TACCGACATG at 3542, TGCCAGCCGA at 3406, TGCCAGCAGG at 3259, TTGCAACCAC at 3073, CAGCGACCGG at 2995, CGCCGACCGC at 2846, CGGAAGCATA at 2583, TAGCGGCAGG at 1564, TTGAAACAGA at 1101, TTCAAACATA at 177.
- ARFSr8ci: TTCAAACCTA at 4003, AGCAGGCCAA at 3897, TGCCAGCCAG at 2790, CGGAAACAAA at 2131, ATCCGACCAC at 1945, CTCAGGCAGG at 1547, CACCAACCAG at 814, TGGCAGCCGC at 760, AAGCGGCCTA at 716, TTCCAACAAA at 619, TGGCGACAAA at 211.
Auxin response factor (Stigliani) analysis and results
ARF2 is (C/G/T)(A/C/T)(G/T)G(C/T)(C/T)(G/T)(C/G)(A/C/T)(A/G/T).[1]
Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 18 | 2 | 9 | 9 ± 3 (--6,+-12) |
Randoms | UTR | arbitrary negative | 40 | 10 | 4.0 | 4.55 ± 0.55 |
Randoms | UTR | alternate negative | 51 | 10 | 5.1 | 4.55 ± 0.55 |
Reals | Core | negative | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary negative | 1 | 10 | 0.1 | 0.15 |
Randoms | Core | alternate negative | 2 | 10 | 0.2 | 0.15 |
Reals | Core | positive | 1 | 2 | 0.5 | 0.5 ± 0.5 (-+1,++0) |
Randoms | Core | arbitrary positive | 4 | 10 | 0.4 | 0.5 |
Randoms | Core | alternate positive | 6 | 10 | 0.6 | 0.5 |
Reals | Proximal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--0,+-1) |
Randoms | Proximal | arbitrary negative | 2 | 10 | 0.2 | 0.3 |
Randoms | Proximal | alternate negative | 4 | 10 | 0.4 | 0.3 |
Reals | Proximal | positive | 3 | 2 | 1.5 | 1.5 ± 0.5 (-+2,++1) |
Randoms | Proximal | arbitrary positive | 15 | 10 | 1.5 | 1.1 |
Randoms | Proximal | alternate positive | 7 | 10 | 0.7 | 1.1 |
Reals | Distal | negative | 51 | 2 | 25.5 | 25.5 ± 11.5 (--14,+-37) |
Randoms | Distal | arbitrary negative | 57 | 10 | 5.7 | 5.9 |
Randoms | Distal | alternate negative | 61 | 10 | 6.1 | 5.9 |
Reals | Distal | positive | 67 | 2 | 33.5 | 33.5 ± 19.5 (-+53,++14) |
Randoms | Distal | arbitrary positive | 99 | 10 | 9.9 | 9.1 |
Randoms | Distal | alternate positive | 83 | 10 | 8.3 | 9.1 |
Comparison:
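The occurrences of real ARFSs in the UTRs, the negative direction proximals and the distals are greater than the randoms; the positive direction proximals overlap the high randoms, and the cores are at or below the randoms (0 versus 0.15 in the negative direction, 0.5 versus 0.5 in the positive direction). This suggests that the real ARFSs are likely active or activable.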
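The Occurrences and Averages columns of the table above appear to follow a simple rule: occurrences are the number of hits divided by the number of strands or datasets, and the average is the midpoint of two values, with the "±" equal to half their difference. This reading is inferred from the table entries (for example, 6 and 12 giving 9 ± 3) rather than stated in the text, and the function names are hypothetical.

```python
def occurrences(number, strands):
    """Hits per strand (reals) or per random dataset (randoms)."""
    return number / strands


def midpoint_and_spread(a, b):
    """Mean of two values and half their difference, rounded to two decimals."""
    return round((a + b) / 2, 2), round(abs(a - b) / 2, 2)


# Reals, UTR, negative direction: 6 hits on one strand and 12 on the other.
print(midpoint_and_spread(6, 12))    # (9.0, 3.0)

# Reals, distal, negative direction: 14 and 37 hits.
print(midpoint_and_spread(14, 37))   # (25.5, 11.5)

# Randoms, UTR: 40 hits (arbitrary) and 51 hits (alternate) over 10 datasets each.
arbitrary = occurrences(40, 10)
alternate = occurrences(51, 10)
print(midpoint_and_spread(arbitrary, alternate))  # (4.55, 0.55)
```

The three results reproduce the 9 ± 3, 25.5 ± 11.5 and 4.55 ± 0.55 entries in the table.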
TGTCTC (Ulmasov) ARFbs samplings
- Negative strand, negative direction: 7, TGTCTC at 4519, TGTCTC at 3673, TGTCTC at 2779, TGTCTC at 2444, TGTCTC at 2018, TGTCTC at 1074, TGTCTC at 908.
- Positive strand, negative direction: 6, TGTCTC at 4372, TGTCTC at 3322, TGTCTC at 2166, TGTCTC at 2032, TGTCTC at 1088, TGTCTC at 922.
- Negative strand, positive direction: 4, TGTCTC at 3054, TGTCTC at 2467, TGTCTC at 2173, TGTCTC at 2079.
- Positive strand, positive direction: 3, TGTCTC at 3180, TGTCTC at 3134, TGTCTC at 2653.
- ci, negative strand, negative direction: 4, GAGACA at 2029, GAGACA at 1452, GAGACA at 1085, GAGACA at 919.
- ci, positive strand, negative direction: 0.
- ci, negative strand, positive direction: 1, GAGACA at 712.
- ci, positive strand, positive direction: 2, GAGACA at 2308, GAGACA at 98.
AuxinU (4560-2846) UTRs
- Negative strand, negative direction: TGTCTC at 4519, TGTCTC at 3673.
- Positive strand, negative direction: TGTCTC at 4372, TGTCTC at 3322.
AuxinU negative direction (2811-2596) proximal promoters
- Negative strand, negative direction: TGTCTC at 2779.
AuxinU negative direction (2596-1) distal promoters
- Negative strand, negative direction: TGTCTC at 2444, GAGACA at 2029, TGTCTC at 2018, GAGACA at 1452, GAGACA at 1085, TGTCTC at 1074, GAGACA at 919, TGTCTC at 908.
- Positive strand, negative direction: TGTCTC at 2166, TGTCTC at 2032, TGTCTC at 1088, TGTCTC at 922.
AuxinU positive direction (4050-1) distal promoters
- Negative strand, positive direction: TGTCTC at 3054, TGTCTC at 2467, TGTCTC at 2173, TGTCTC at 2079, GAGACA at 712.
- Positive strand, positive direction: TGTCTC at 3180, TGTCTC at 3134, TGTCTC at 2653, GAGACA at 2308, GAGACA at 98.
Ulmasov random dataset samplings
- Ulmasovr0: 0.
- Ulmasovr1: 0.
- Ulmasovr2: 2, TGTCTC at 1316, TGTCTC at 599.
- Ulmasovr3: 0.
- Ulmasovr4: 1, TGTCTC at 3995.
- Ulmasovr5: 0.
- Ulmasovr6: 1, TGTCTC at 1292.
- Ulmasovr7: 0.
- Ulmasovr8: 0.
- Ulmasovr9: 1, TGTCTC at 25.
- Ulmasovr0ci: 2, GAGACA at 4234, GAGACA at 2330.
- Ulmasovr1ci: 2, GAGACA at 3256, GAGACA at 33.
- Ulmasovr2ci: 0.
- Ulmasovr3ci: 0.
- Ulmasovr4ci: 1, GAGACA at 74.
- Ulmasovr5ci: 1, GAGACA at 923.
- Ulmasovr6ci: 0.
- Ulmasovr7ci: 0.
- Ulmasovr8ci: 0.
- Ulmasovr9ci: 0.
AuxinUr arbitrary (evens) (4560-2846) UTRs
- Ulmasovr4: TGTCTC at 3995.
- Ulmasovr0ci: GAGACA at 4234.
AuxinUr alternate (odds) (4560-2846) UTRs
- Ulmasovr1ci: GAGACA at 3256.
AuxinUr alternate positive direction (evens) (4265-4050) proximal promoters
- Ulmasovr0ci: GAGACA at 4234.
AuxinUr arbitrary negative direction (evens) (2596-1) distal promoters
- Ulmasovr2: TGTCTC at 1316, TGTCTC at 599.
- Ulmasovr6: TGTCTC at 1292.
- Ulmasovr0ci: GAGACA at 2330.
- Ulmasovr4ci: GAGACA at 74.
AuxinUr alternate negative direction (odds) (2596-1) distal promoters
- Ulmasovr9: TGTCTC at 25.
- Ulmasovr1ci: GAGACA at 33.
- Ulmasovr5ci: GAGACA at 923.
AuxinUr arbitrary positive direction (odds) (4050-1) distal promoters
- Ulmasovr9: TGTCTC at 25.
- Ulmasovr1ci: GAGACA at 3256, GAGACA at 33.
- Ulmasovr5ci: GAGACA at 923.
AuxinUr alternate positive direction (evens) (4050-1) distal promoters
- Ulmasovr2: TGTCTC at 1316, TGTCTC at 599.
- Ulmasovr4: TGTCTC at 3995.
- Ulmasovr6: TGTCTC at 1292.
- Ulmasovr0ci: GAGACA at 2330.
- Ulmasovr4ci: GAGACA at 74.
Auxin response factor (Ulmasov) analysis and results
"ARFbs were originally defined as TGTCTC (Ulmasov et al., 1995, Guilfoyle et al., 1998), [...]."[1]
Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 4 | 2 | 2 | 2 ± 0 (--2,+-2) |
Randoms | UTR | arbitrary negative | 2 | 10 | 0.2 | 0.15 |
Randoms | UTR | alternate negative | 1 | 10 | 0.1 | 0.15 |
Reals | Core | negative | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
Reals | Core | positive | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0 |
Randoms | Core | alternate positive | 0 | 10 | 0 | 0 |
Reals | Proximal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--1,+-0) |
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
Reals | Proximal | positive | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0.05 |
Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.05 |
Reals | Distal | negative | 12 | 2 | 6 | 6 ± 2 (--8,+-4) |
Randoms | Distal | arbitrary negative | 5 | 10 | 0.5 | 0.4 |
Randoms | Distal | alternate negative | 3 | 10 | 0.3 | 0.4 |
Reals | Distal | positive | 10 | 2 | 5 | 5 ± 0 (-+5,++5) |
Randoms | Distal | arbitrary positive | 4 | 10 | 0.4 | 0.5 |
Randoms | Distal | alternate positive | 6 | 10 | 0.6 | 0.5 |
Comparison:
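The occurrences of real ARF(Ulmasov)s in the UTRs, the negative direction proximals and the distals are greater than the randoms; none occur in the cores or the positive direction proximals. This suggests that the real ARF(Ulmasov)s are likely active or activable.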
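The comparison can be made row by row from the Ulmasov table above. The sketch below only checks where the real average exceeds the random average; it is a plain comparison of means, not a statistical test, and the dictionary layout is an illustrative choice.

```python
# (promoter, direction): (real average, random average), copied from the table.
ULMASOV_AVERAGES = {
    ("UTR", "negative"): (2.0, 0.15),
    ("Core", "negative"): (0.0, 0.0),
    ("Core", "positive"): (0.0, 0.0),
    ("Proximal", "negative"): (0.5, 0.0),
    ("Proximal", "positive"): (0.0, 0.05),
    ("Distal", "negative"): (6.0, 0.4),
    ("Distal", "positive"): (5.0, 0.5),
}


def regions_where_reals_exceed(averages):
    """Rows in which the real average is strictly greater than the random one."""
    return [key for key, (real, random_mean) in averages.items() if real > random_mean]


for promoter, direction in regions_where_reals_exceed(ULMASOV_AVERAGES):
    print(promoter, direction)
# UTR negative
# Proximal negative
# Distal negative
# Distal positive
```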
TGTCGG (Boer) ARFbs samplings
- Negative strand, negative direction: 1, TGTCGG at 3727.
- Positive strand, negative direction: 0.
- Negative strand, positive direction: 0.
- Positive strand, positive direction: 3, TGTCGG at 3896, TGTCGG at 3101, TGTCGG at 65.
- ci, negative strand, negative direction: 0.
- ci, positive strand, negative direction: 0.
- ci, negative strand, positive direction: 1, CCGACA at 1964.
- ci, positive strand, positive direction: 3, CCGACA at 3640, CCGACA at 3349, CCGACA at 264.
ARFB (4560-2846) UTRs
- Negative strand, negative direction: TGTCGG at 3727.
ARFB positive direction (4050-1) distal promoters
- Negative strand, positive direction: CCGACA at 1964.
- Positive strand, positive direction: TGTCGG at 3896, CCGACA at 3640, CCGACA at 3349, TGTCGG at 3101, CCGACA at 264, TGTCGG at 65.
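Assigning each hit to a UTR, core, proximal, or distal promoter depends only on its position and the direction. The following Python sketch bins positions using the ranges given in the section headings. Treating both ends of each range as inclusive is an assumption, made because boundary positions such as 4265 are listed under two regions elsewhere on this page; the names `REGIONS` and `regions_for` are hypothetical.

```python
# Promoter regions as (name, high, low) position ranges, copied from the
# section headings. Both ends are treated as inclusive (an assumption).
REGIONS = {
    "negative": [
        ("UTR", 4560, 2846),
        ("core", 2846, 2811),
        ("proximal", 2811, 2596),
        ("distal", 2596, 1),
    ],
    "positive": [
        ("core", 4445, 4265),
        ("proximal", 4265, 4050),
        ("distal", 4050, 1),
    ],
}


def regions_for(position, direction):
    """Return the names of every region whose range contains the position."""
    return [
        name
        for name, high, low in REGIONS[direction]
        if low <= position <= high
    ]


print(regions_for(3727, "negative"))  # TGTCGG at 3727
print(regions_for(1964, "positive"))  # CCGACA at 1964
```

A position on a shared boundary is returned under both neighbouring regions, which matches how such hits appear in the listings.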
Boer random dataset samplings
- Boerr0: 1, TGTCGG at 867.
- Boerr1: 0.
- Boerr2: 0.
- Boerr3: 0.
- Boerr4: 1, TGTCGG at 2028.
- Boerr5: 1, TGTCGG at 2902.
- Boerr6: 1, TGTCGG at 475.
- Boerr7: 1, TGTCGG at 3352.
- Boerr8: 1, TGTCGG at 3903.
- Boerr9: 1, TGTCGG at 1271.
- Boerr0ci: 1, CCGACA at 1319.
- Boerr1ci: 3, CCGACA at 3038, CCGACA at 1730, CCGACA at 959.
- Boerr2ci: 1, CCGACA at 1719.
- Boerr3ci: 0.
- Boerr4ci: 2, CCGACA at 4357, CCGACA at 399.
- Boerr5ci: 0.
- Boerr6ci: 2, CCGACA at 4298, CCGACA at 3540.
- Boerr7ci: 1, CCGACA at 2420.
- Boerr8ci: 1, CCGACA at 431.
- Boerr9ci: 2, CCGACA at 2521, CCGACA at 392.
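How the random datasets were generated is not described here. As a minimal sketch, assume each random dataset is a sequence of 4560 nucleotides drawn independently with equal probability for A, C, G, and T. That sampling model, the length, the seed, and the function names are all assumptions for illustration and may differ from the original Basic programs.

```python
import random

MOTIF = "TGTCGG"               # Boer ARFbs
INVERSE_COMPLEMENT = "CCGACA"  # its inverse complement
LENGTH = 4560                  # assumed length, taken from the largest range


def random_sequence(length, rng):
    """Draw a uniform random nucleotide sequence (assumed sampling model)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def count_overlapping(sequence, motif):
    """Count every start position at which the motif occurs."""
    return sum(
        1
        for start in range(len(sequence) - len(motif) + 1)
        if sequence.startswith(motif, start)
    )


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for index in range(10):
    sequence = random_sequence(LENGTH, rng)
    direct = count_overlapping(sequence, MOTIF)
    inverse = count_overlapping(sequence, INVERSE_COMPLEMENT)
    print(f"random {index}: {direct} direct, {inverse} inverse complement")
```

Under this uniform model the expected count of any single six-nucleotide motif is (4560 - 5)/4^6, or about 1.1 per sequence, which is of the same order as the zero to three occurrences per dataset listed above.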
Boerr arbitrary (evens) (4560-2846) UTRs
- Boerr8: TGTCGG at 3903.
- Boerr4ci: CCGACA at 4357.
- Boerr6ci: CCGACA at 4298, CCGACA at 3540.
Boerr alternate (odds) (4560-2846) UTRs
- Boerr5: TGTCGG at 2902.
- Boerr7: TGTCGG at 3352.
- Boerr1ci: CCGACA at 3038.
Boerr alternate positive direction (evens) (4445-4265) core promoters
- Boerr4ci: CCGACA at 4357.
- Boerr6ci: CCGACA at 4298.
Boerr arbitrary negative direction (evens) (2596-1) distal promoters
- Boerr0: TGTCGG at 867.
- Boerr4: TGTCGG at 2028.
- Boerr6: TGTCGG at 475.
- Boerr0ci: CCGACA at 1319.
- Boerr2ci: CCGACA at 1719.
- Boerr4ci: CCGACA at 399.
- Boerr8ci: CCGACA at 431.
Boerr alternate negative direction (odds) (2596-1) distal promoters
- Boerr9: TGTCGG at 1271.
- Boerr1ci: CCGACA at 1730, CCGACA at 959.
- Boerr7ci: CCGACA at 2420.
- Boerr9ci: CCGACA at 2521, CCGACA at 392.
Boerr arbitrary positive direction (odds) (4050-1) distal promoters
- Boerr5: TGTCGG at 2902.
- Boerr7: TGTCGG at 3352.
- Boerr9: TGTCGG at 1271.
- Boerr1ci: CCGACA at 3038, CCGACA at 1730, CCGACA at 959.
- Boerr7ci: CCGACA at 2420.
- Boerr9ci: CCGACA at 2521, CCGACA at 392.
Boerr alternate positive direction (evens) (4050-1) distal promoters
- Boerr0: TGTCGG at 867.
- Boerr4: TGTCGG at 2028.
- Boerr6: TGTCGG at 475.
- Boerr8: TGTCGG at 3903.
- Boerr0ci: CCGACA at 1319.
- Boerr2ci: CCGACA at 1719.
- Boerr4ci: CCGACA at 399.
- Boerr6ci: CCGACA at 3540.
- Boerr8ci: CCGACA at 431.
Auxin response factor (Boer) analysis and results
"More recently, protein binding microarray (PBM) experiments suggested that TGTCGG are preferred ARFbs, [...] (Boer et al., 2014, Franco-Zorrilla et al., 2014, Liao et al., 2015)."[1]
Reals or randoms | Promoters | direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 1 | 2 | 0.5 | 0.5 |
Randoms | UTR | arbitrary negative | 4 | 10 | 0.4 | 0.35 |
Randoms | UTR | alternate negative | 3 | 10 | 0.3 | 0.35 |
Reals | Core | negative | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
Reals | Core | positive | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0.1 |
Randoms | Core | alternate positive | 2 | 10 | 0.2 | 0.1 |
Reals | Proximal | negative | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
Reals | Proximal | positive | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0 |
Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0 |
Reals | Distal | negative | 0 | 2 | 0 | 0 |
Randoms | Distal | arbitrary negative | 7 | 10 | 0.7 | 0.65 |
Randoms | Distal | alternate negative | 6 | 10 | 0.6 | 0.65 |
Reals | Distal | positive | 7 | 2 | 3.5 | 3.5 ± 2.5 (-+1,++6) |
Randoms | Distal | arbitrary positive | 9 | 10 | 0.9 | 0.9 |
Randoms | Distal | alternate positive | 9 | 10 | 0.9 | 0.9 |
Comparison:
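The occurrences of real ARFBs are greater than the randoms in the UTR (0.5 vs. 0.35) and in the positive direction distal promoters (3.5 vs. 0.9), but not in the negative direction distal promoters, where the reals have none. This suggests that the real ARFBs are likely active or activable.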
ARF5 samplings
Copying an ARF5 consensus sequence such as CAGGTCT into the browser's find function ("⌘F") finds two locations between ZNF497 and A1BG, whereas TCTGTCT finds no location between ZSCAN22 and A1BG, as can also be found by the computer programs.
The Basic programs (starting with SuccessablesARF5.bas) test the consensus sequence (C/G/T)N(G/T)GTC(G/T) by comparing nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). What each program looked for and found:
- negative strand, negative direction, looking for (C/G/T)N(G/T)GTC(G/T), 41, TTTGTCT at 4518, GGGGTCT at 4448, GAGGTCG at 4345, TCGGTCT at 4233, CTGGTCG at 4033, CGTGTCT at 3917, CGGGTCG at 3731, CTTGTCG at 3726, CAGGTCG at 3701, GGTGTCG at 3694, TAGGTCG at 3682, TGTGTCT at 3672, TTGGTCT at 3486, GAGGTCG at 3294, CGGGTCG at 3209, CTGGTCG at 3124, TTTGTCT at 2878, GTTGTCT at 2778, GGGGTCG at 2766, GTTGTCT at 2443, GAGGTCG at 2431, CCTGTCG at 2273, CTGGTCG at 2264, CCTGTCT at 2119, GCTGTCT at 2017, GAGGTCG at 2005, TATGTCT at 1567, TGGGTCT at 1518, TGGGTCT at 1411, CTGGTCG at 1194, GTTGTCT at 1073, GAGGTCG at 1061, TAGGTCG at 976, GTTGTCT at 907, GAGGTCG at 895, CTGGTCG at 737, CTGGTCG at 728, TCGGTCG at 504, CTTGTCT at 289, TTTGTCT at 168, CTGGTCG at 35.
- positive strand, negative direction, looking for (C/G/T)N(G/T)GTC(G/T), 10, CCTGTCT at 4371, CCTGTCT at 4210, TATGTCT at 3833, GTGGTCG at 3813, TCTGTCT at 3321, TATGTCT at 2986, TCTGTCT at 2031, CGTGTCT at 1222, TCTGTCT at 1087, TCTGTCT at 921.
- positive strand, positive direction, looking for (C/G/T)N(G/T)GTC(G/T), 34, GGGGTCT at 4330, TCTGTCG at 3895, GAGGTCT at 3891, GAGGTCT at 3806, GTGGTCG at 3720, CTGGTCT at 3548, CGGGTCG at 3239, TCGGTCT at 3221, GGTGTCG at 3194, TTTGTCT at 3179, CCTGTCT at 3133, TGTGTCG at 3100, TGGGTCT at 3091, TGTGTCT at 2837, TTTGTCT at 2652, TTGGTCT at 2228, CGGGTCT at 1742, GTGGTCT at 1631, GTGGTCG at 1463, GCGGTCG at 1457, GTGGTCG at 1363, GCGGTCG at 1357, TCGGTCG at 1271, GTGGTCG at 1127, CGTGTCG at 1054, TCGGTCT at 935, TCGGTCT at 835, TGTGTCG at 718, GGTGTCG at 634, GTGGTCG at 623, TCGGTCG at 617, CCGGTCG at 329, GGGGTCT at 204, GAGGTCT at 15.
- negative strand, positive direction, looking for (C/G/T)N(G/T)GTC(G/T), 19, GGGGTCT at 4414, GTGGTCT at 4380, CAGGTCT at 3771, CTGGTCT at 3299, CTGGTCT at 3245, GTTGTCT at 3053, CAGGTCT at 3019, CTTGTCT at 3004, GTGGTCT at 2941, CGGGTCT at 2489, TAGGTCT at 2258, TGTGTCT at 2078, GGGGTCT at 1958, CCTGTCT at 1862, GCTGTCT at 1731, GGGGTCT at 1711, GAGGTCG at 1687, TGTGTCT at 268, TCTGTCT at 100.
- inverse complement, negative strand, negative direction, looking for (A/C)GAC(A/C)N(A/C/G), 21, CGACACC at 3959, AGACCTC at 3837, CGACACC at 3711, AGACACA at 3558, AGACAGA at 3321, CGACCCA at 3182, CGACCCG at 3043, CGACCCC at 3037, CGACCAC at 2328, CGACCTC at 2071, AGACAGA at 2031, CGACCCG at 1893, CGACCCG at 1758, CGACCTC at 1748, CGACCTC at 1466, AGACAAC at 1454, AGACCCG at 1358, CGACCCG at 1113, AGACAGA at 1087, AGACAGA at 921, CGACCTA at 783.
- inverse complement, positive strand, negative direction, looking for (A/C)GAC(A/C)N(A/C/G), 15, AGACAAG at 4183, AGACCAG at 4032, AGACCAC at 3763, AGACCAG at 3123, AGACCAG at 2600, AGACCAG at 2263, AGACCAA at 2147, AGACCAC at 2123, CGACAGA at 2017, AGACATC at 1571, CGACCAG at 1193, CGACAAC at 1070, AGACCAG at 727, AGACAGG at 561, AGACAGG at 424.
- inverse complement, positive strand, positive direction, looking for (A/C)GAC(A/C)N(A/C/G), 30, AGACCCA at 4418, CGACACC at 4394, CGACCCG at 4179, CGACCCG at 3991, CGACACC at 3642, CGACAAG at 3351, CGACCAG at 3244, AGACCAA at 3023, AGACCGG at 2985, AGACACG at 2959, AGACCGA at 2885, CGACCAC at 2812, CGACCTC at 2772, CGACCTA at 2736, CGACCTC at 2322, AGACAAA at 2262, AGACAAC at 2184, AGACCCC at 1866, CGACCGG at 1738, CGACAGA at 1731, AGACCGC at 1478, AGACCGC at 1378, CGACCAC at 781, CGACCCG at 419, CGACCCG at 388, CGACCCC at 279, AGACCTC at 272, CGACACA at 266, AGACCAC at 104, AGACAGA at 100.
- inverse complement, negative strand, positive direction, looking for (A/C)GAC(A/C)N(A/C/G), 6, AGACAGC at 3895, AGACCTC at 3552, AGACCTC at 2863, CGACAGG at 1966, AGACACA at 714, AGACCGG at 442.
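The original Basic programs are not shown here. As a rough Python sketch of the same kind of search, a degenerate consensus can be written as a regular expression and scanned across a strand, with a lookahead so that overlapping hits are also reported. The position convention (1-based start, counted from the left end of the string as given), the toy strand, and the function names are assumptions for illustration; the original programs may number positions differently.

```python
import re

# Expansion of the ARF5/MP consensus (C/G/T)N(G/T)GTC(G/T)
CONSENSUS = "[CGT][ACGT][GT]GTC[GT]"
# and of its inverse complement (A/C)GAC(A/C)N(A/C/G)
INVERSE_COMPLEMENT = "[AC]GAC[AC][ACGT][ACG]"


def complement(sequence):
    """Return the base-by-base complement, e.g. to derive the other strand."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))


def find_sites(sequence, pattern):
    """Return (matched sequence, 1-based start) for every match, overlaps included."""
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.group(1), m.start() + 1) for m in lookahead.finditer(sequence)]


# Toy strand for illustration only; it is not taken from the A1BG promoters.
strand = "AACAGGTCTAAAGACAGAAA"
print(find_sites(strand, CONSENSUS))
print(find_sites(strand, INVERSE_COMPLEMENT))
print(find_sites(complement(strand), CONSENSUS))
```

Running the same two patterns over each strand and each direction would give the eight result lists of the kind shown above.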
ARF5 (4560-2846) UTRs
- Negative strand, negative direction: TTTGTCT at 4518, GGGGTCT at 4448, GAGGTCG at 4345, TCGGTCT at 4233, CTGGTCG at 4033, CGACACC at 3959, CGTGTCT at 3917, AGACCTC at 3837, CGGGTCG at 3731, CTTGTCG at 3726, CGACACC at 3711, CAGGTCG at 3701, GGTGTCG at 3694, TAGGTCG at 3682, TGTGTCT at 3672, AGACACA at 3558, TTGGTCT at 3486, AGACAGA at 3321, GAGGTCG at 3294, CGGGTCG at 3209, CGACCCA at 3182, CTGGTCG at 3124, CGACCCG at 3043, CGACCCC at 3037, TTTGTCT at 2878.
- Positive strand, negative direction: CCTGTCT at 4371, CCTGTCT at 4210, AGACAAG at 4183, AGACCAG at 4032, TATGTCT at 3833, GTGGTCG at 3813, AGACCAC at 3763, TCTGTCT at 3321, AGACCAG at 3123, TATGTCT at 2986.
ARF5 negative direction (2846-2811) core promoters
- Positive strand, negative direction: TCTGTCT at 2031, CGTGTCT at 1222, TCTGTCT at 1087, TCTGTCT at 921.
ARF5 positive direction (4445-4265) core promoters
- Negative strand, positive direction: GGGGTCT at 4414, GTGGTCT at 4380.
- Positive strand, positive direction: AGACCCA at 4418, CGACACC at 4394, GGGGTCT at 4330.
ARF5 negative direction (2811-2596) proximal promoters
- Negative strand, negative direction: GTTGTCT at 2778, GGGGTCG at 2766.
- Positive strand, negative direction: AGACCAG at 2600.
ARF5 positive direction (4265-4050) proximal promoters
- Positive strand, positive direction: CGACCCG at 4179.
ARF5 negative direction (2596-1) distal promoters
- Negative strand, negative direction: GTTGTCT at 2443, GAGGTCG at 2431, CGACCAC at 2328, CCTGTCG at 2273, CTGGTCG at 2264, CCTGTCT at 2119, CGACCTC at 2071, AGACAGA at 2031, GCTGTCT at 2017, GAGGTCG at 2005, CGACCCG at 1893, CGACCCG at 1758, CGACCTC at 1748, TATGTCT at 1567, TGGGTCT at 1518, CGACCTC at 1466, AGACAAC at 1454, TGGGTCT at 1411, AGACCCG at 1358, CTGGTCG at 1194, CGACCCG at 1113, AGACAGA at 1087, GTTGTCT at 1073, GAGGTCG at 1061, TAGGTCG at 976, AGACAGA at 921, GTTGTCT at 907, GAGGTCG at 895, CGACCTA at 783, CTGGTCG at 737, CTGGTCG at 728, TCGGTCG at 504, CTTGTCT at 289, TTTGTCT at 168, CTGGTCG at 35.
- Positive strand, negative direction: AGACCAG at 2263, AGACCAA at 2147, AGACCAC at 2123, TCTGTCT at 2031, CGACAGA at 2017, AGACATC at 1571, CGTGTCT at 1222, CGACCAG at 1193, TCTGTCT at 1087, CGACAAC at 1070, TCTGTCT at 921, AGACCAG at 727, AGACAGG at 561, AGACAGG at 424.
ARF5 positive direction (4050-1) distal promoters
ARF5 positive direction (4050-1) distal promoters, Negative strand
- Negative strand, positive direction: AGACAGC at 3895, CAGGTCT at 3771, AGACCTC at 3552, CTGGTCT at 3299, CTGGTCT at 3245, GTTGTCT at 3053, CAGGTCT at 3019, CTTGTCT at 3004, GTGGTCT at 2941, AGACCTC at 2863, CGGGTCT at 2489, TAGGTCT at 2258, TGTGTCT at 2078, CGACAGG at 1966, GGGGTCT at 1958, CCTGTCT at 1862, GCTGTCT at 1731, GGGGTCT at 1711, GAGGTCG at 1687, AGACACA at 714, AGACCGG at 442, TGTGTCT at 268, TCTGTCT at 100.
ARF5 positive direction (4050-1) distal promoters, Positive strand
- Positive strand, positive direction: CGACCCG at 3991, TCTGTCG at 3895, GAGGTCT at 3891, GAGGTCT at 3806, GTGGTCG at 3720, CGACACC at 3642, CTGGTCT at 3548, CGACAAG at 3351, CGACCAG at 3244, CGGGTCG at 3239, TCGGTCT at 3221, GGTGTCG at 3194, TTTGTCT at 3179, CCTGTCT at 3133, TGTGTCG at 3100, TGGGTCT at 3091, AGACCAA at 3023, AGACCGG at 2985, AGACACG at 2959, AGACCGA at 2885, TGTGTCT at 2837, CGACCAC at 2812, CGACCTC at 2772, CGACCTA at 2736, TTTGTCT at 2652, CGACCTC at 2322, AGACAAA at 2262, TTGGTCT at 2228, AGACAAC at 2184, AGACCCC at 1866, CGGGTCT at 1742, CGACCGG at 1738, CGACAGA at 1731, GTGGTCT at 1631, AGACCGC at 1478, GTGGTCG at 1463, GCGGTCG at 1457, AGACCGC at 1378, GTGGTCG at 1363, GCGGTCG at 1357, TCGGTCG at 1271, GTGGTCG at 1127, CGTGTCG at 1054, TCGGTCT at 935, TCGGTCT at 835, CGACCAC at 781, TGTGTCG at 718, GGTGTCG at 634, GTGGTCG at 623, TCGGTCG at 617, CGACCCG at 419, CGACCCG at 388, CCGGTCG at 329, CGACCCC at 279, AGACCTC at 272, CGACACA at 266, GGGGTCT at 204, AGACCAC at 104, AGACAGA at 100, GAGGTCT at 15.
ARF5 duplicates
Starting with the random occurrences among the ARF5 possibilities, 68 have duplicates among the random or real datasets. These same duplicates occur only 78 % of the time among the real datasets.
Using the real consensus sequences to look for duplicates, first among the other reals and then among the randoms, found that 97 % had duplicates among either the other reals or the randoms.
The possible variety of ARF5s within the consensus sequence (C/G/T)N(G/T)GTC(G/T) is 3*4*2*1*1*1*2 = 48, plus the complement inverses (A/C)GAC(A/C)N(A/C/G), 2*1*1*1*2*4*3 = 48, for 96 including duplicates. Subtracting some 18 duplicates suggests that up to 78 could occur if sampling were large enough.
That the random and real occurrences do not match up suggests that the reals are not randomly occurring.
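How the duplicates were counted is not spelled out. One plausible reading is that two hits are duplicates when they have the same nucleotide sequence, regardless of position. The Python sketch below compares a real list with a random list on that basis, using the first five sequences of the negative strand, negative direction reals and of ARF5r0 as sample data; the function names are hypothetical.

```python
# Small excerpts copied from the listings; the full lists would be used in practice.
real_sites = ["TTTGTCT", "GGGGTCT", "GAGGTCG", "TCGGTCT", "CTGGTCG"]
random_sites = ["CGGGTCT", "GCTGTCG", "GAGGTCG", "CGGGTCG", "TATGTCT"]


def shared_sequences(first, second):
    """Sequences that occur in both collections, ignoring positions."""
    return set(first) & set(second)


def percent_with_duplicates(sites, others):
    """Percentage of distinct sequences in `sites` that also occur in `others`."""
    distinct = set(sites)
    return 100 * len(distinct & set(others)) / len(distinct)


print(sorted(shared_sequences(real_sites, random_sites)))
print(percent_with_duplicates(real_sites, random_sites))
```

For these excerpts only GAGGTCG is shared, so one in five of the distinct real sequences has a duplicate among the randoms.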
ARF5 random dataset samplings
- ARF5r0: 9, CGGGTCT at 4553, GCTGTCG at 4145, GAGGTCG at 3559, CGGGTCG at 2311, TATGTCT at 2115, TAGGTCG at 1752, TAGGTCG at 1417, GAGGTCT at 1047, GTTGTCG at 19.
- ARF5r1: 16, GTGGTCG at 4155, CTGGTCT at 3958, GTGGTCG at 3003, CCGGTCT at 2670, GGGGTCG at 2011, CCGGTCG at 1682, GATGTCG at 1451, CTTGTCG at 1066, CGGGTCT at 604, GTGGTCG at 578, CCGGTCG at 510, TTGGTCG at 416, CTTGTCG at 354, GTGGTCT at 286, TATGTCT at 24, TTTGTCT at 7.
- ARF5r2: 13, CGTGTCG at 3553, CCGGTCT at 3400, TTTGTCT at 3099, TTGGTCT at 2962, GTTGTCT at 2535, CAGGTCG at 1499, TTGGTCT at 1338, TCTGTCT at 1315, CCGGTCT at 1214, TTTGTCG at 1122, TCTGTCT at 598, CGTGTCT at 594, CTTGTCG at 82.
- ARF5r3: 13, GATGTCT at 4388, TTGGTCT at 4012, CCGGTCG at 3478, TCGGTCT at 3466, CTTGTCG at 2983, TTGGTCG at 2945, CTTGTCT at 2808, CTGGTCG at 1887, TGTGTCT at 1137, GTGGTCG at 644, GCGGTCT at 508, CAGGTCT at 246, TTGGTCT at 26.
- ARF5r4: 10, CTTGTCG at 4290, CGGGTCG at 4241, GCGGTCG at 3552, GAGGTCG at 3342, CCTGTCT at 3075, GCGGTCG at 2469, GTTGTCT at 2163, CCTGTCG at 2027, TAGGTCT at 980, CCGGTCG at 357.
- ARF5r5: 14, TAGGTCG at 3412, TGGGTCG at 3176, CTGGTCG at 3120, GGTGTCG at 2901, GGGGTCT at 2572, TTTGTCG at 2563, TTTGTCT at 2173, CCTGTCG at 2098, TGTGTCG at 2056, TGTGTCG at 1550, GGGGTCT at 1043, TTGGTCT at 1014, TGGGTCG at 462, TAGGTCG at 306.
- ARF5r6: 4, TGTGTCT at 4367, CATGTCG at 3661, TATGTCT at 1291, TATGTCG at 474.
- ARF5r7: 13, TCGGTCT at 4410, CTGGTCT at 4356, CGTGTCT at 4304, TGGGTCT at 4231, TTTGTCG at 4189, GATGTCT at 4004, CAGGTCG at 3611, TTGGTCT at 3122, CGGGTCT at 3091, GCGGTCG at 2780, TATGTCG at 2081, TGTGTCT at 1356, GAGGTCG at 672.
- ARF5r8: 5, GCGGTCT at 3362, CTTGTCT at 2035, GGGGTCG at 1741, CAGGTCT at 1517, GGGGTCG at 246.
- ARF5r9: 11, CCTGTCG at 4084, CTTGTCG at 3649, TCGGTCG at 2775, CAGGTCT at 2648, GCGGTCG at 2612, TTTGTCG at 2032, CATGTCG at 1850, TTTGTCG at 1270, CTGGTCT at 926, GAGGTCT at 450, TTTGTCT at 24.
- ARF5r0ci: 12, CGACCCC at 4084, CGACCCC at 3906, CGACAAA at 3650, AGACCCC at 3156, CGACCCG at 2630, CGACCTG at 2513, AGACAAA at 2332, AGACAAG at 2184, CGACCCG at 1601, CGACAAA at 1321, AGACCGA at 1317, CGACAGC at 578.
- ARF5r1ci: 15, AGACATG at 4198, AGACCGC at 3993, CGACACC at 3755, CGACCTG at 3656, CGACATA at 3472, CGACAGC at 3040, AGACCCG at 2767, AGACAAC at 2750, CGACACC at 2400, CGACCCA at 2356, CGACCGA at 1764, CGACAAC at 1732, CGACAAC at 961, AGACCCG at 680, AGACAGC at 35.
- ARF5r2ci: 8, AGACAAG at 4198, AGACCCC at 3312, AGACCGA at 2234, CGACCGG at 1907, CGACAGA at 1824, AGACCTG at 684, CGACCCG at 324, CGACATG at 272.
- ARF5r3ci: 6, AGACCAA at 3739, AGACCAA at 3389, CGACCGG at 2366, AGACCCC at 1963, AGACATA at 662, CGACAGG at 243.
- ARF5r4ci: 10, CGACAGC at 4359, CGACATC at 4297, CGACCTA at 3681, CGACCTG at 3072, CGACACG at 2285, AGACACA at 1968, CGACACC at 1956, AGACCAG at 1575, AGACCTA at 261, AGACAGC at 76.
- ARF5r5ci: 11, CGACCGC at 3846, CGACCCC at 3597, CGACAGA at 3298, AGACCCC at 3241, AGACCCA at 2597, AGACCCA at 2309, AGACCAC at 1459, AGACATA at 1258, AGACACG at 925, AGACCAG at 766, CGACCTA at 311.
- ARF5r6ci: 14, CGACAAA at 4300, CGACACC at 4124, CGACCCG at 3786, AGACCCC at 3733, CGACATG at 3542, CGACCGG at 2995, AGACAGC at 2989, CGACCCC at 2956, CGACCGC at 2846, CGACAAC at 2827, AGACAGG at 2592, AGACCCG at 1601, AGACCCA at 1179, AGACCTA at 856.
- ARF5r7ci: 10, CGACCAA at 4431, AGACCAA at 3761, AGACAAC at 3417, CGACACC at 2987, CGACAGG at 2792, CGACCCG at 2536, CGACAGC at 2422, CGACCCC at 2415, AGACAGA at 2279, CGACAGA at 514.
- ARF5r8ci: 13, AGACCAA at 4294, AGACCTA at 4265, AGACAGA at 4261, CGACCAA at 3937, AGACATC at 3219, AGACCCA at 2833, AGACCAC at 2230, CGACCAC at 1945, CGACCCC at 1108, CGACACC at 492, CGACATA at 433, CGACAAA at 211, CGACCAG at 20.
- ARF5r9ci: 9, CGACCTG at 4195, CGACCTG at 4081, CGACACC at 3157, AGACATG at 2919, CGACCGG at 2150, AGACATG at 1632, CGACCTA at 1571, CGACAAA at 385, CGACCTC at 290.
ARF5r arbitrary (evens) (4560-2846) UTRs
- ARF5r0: CGGGTCT at 4553, GCTGTCG at 4145, GAGGTCG at 3559.
- ARF5r2: CGTGTCG at 3553, CCGGTCT at 3400, TTTGTCT at 3099, TTGGTCT at 2962.
- ARF5r4: CTTGTCG at 4290, CGGGTCG at 4241, GCGGTCG at 3552, GAGGTCG at 3342, CCTGTCT at 3075.
- ARF5r6: TGTGTCT at 4367, CATGTCG at 3661.
- ARF5r8: GCGGTCT at 3362.
- ARF5r0ci: CGACCCC at 4084, CGACCCC at 3906, CGACAAA at 3650, AGACCCC at 3156.
- ARF5r2ci: AGACAAG at 4198, AGACCCC at 3312.
- ARF5r4ci: CGACAGC at 4359, CGACATC at 4297, CGACCTA at 3681, CGACCTG at 3072.
- ARF5r6ci: CGACAAA at 4300, CGACACC at 4124, CGACCCG at 3786, AGACCCC at 3733, CGACATG at 3542, CGACCGG at 2995, AGACAGC at 2989, CGACCCC at 2956, CGACCGC at 2846.
- ARF5r8ci: AGACCAA at 4294, AGACCTA at 4265, AGACAGA at 4261, CGACCAA at 3937, AGACATC at 3219.
ARF5r alternate (odds) (4560-2846) UTRs
- ARF5r1: GTGGTCG at 4155, CTGGTCT at 3958, GTGGTCG at 3003.
- ARF5r3: GATGTCT at 4388, TTGGTCT at 4012, CCGGTCG at 3478, TCGGTCT at 3466, CTTGTCG at 2983, TTGGTCG at 2945.
- ARF5r5: TAGGTCG at 3412, TGGGTCG at 3176, CTGGTCG at 3120, GGTGTCG at 2901.
- ARF5r7: TCGGTCT at 4410, CTGGTCT at 4356, CGTGTCT at 4304, TGGGTCT at 4231, TTTGTCG at 4189, GATGTCT at 4004, CAGGTCG at 3611, TTGGTCT at 3122, CGGGTCT at 3091.
- ARF5r9: CCTGTCG at 4084, CTTGTCG at 3649.
- ARF5r1ci: AGACATG at 4198, AGACCGC at 3993, CGACACC at 3755, CGACCTG at 3656, CGACATA at 3472, CGACAGC at 3040.
- ARF5r3ci: AGACCAA at 3739, AGACCAA at 3389.
- ARF5r5ci: CGACCGC at 3846, CGACCCC at 3597, CGACAGA at 3298, AGACCCC at 3241.
- ARF5r7ci: CGACCAA at 4431, AGACCAA at 3761, AGACAAC at 3417, CGACACC at 2987.
- ARF5r9ci: CGACCTG at 4195, CGACCTG at 4081, CGACACC at 3157, AGACATG at 2919.
ARF5r arbitrary negative direction (evens) (2846-2811) core promoters
- ARF5r6ci: CGACCGC at 2846, CGACAAC at 2827.
- ARF5r8ci: AGACCCA at 2833.
ARF5r arbitrary positive direction (odds) (4445-4265) core promoters
- ARF5r3: GATGTCT at 4388.
- ARF5r7: TCGGTCT at 4410, CTGGTCT at 4356, CGTGTCT at 4304.
- ARF5r7ci: CGACCAA at 4431.
ARF5r alternate positive direction (evens) (4445-4265) core promoters
- ARF5r4: CTTGTCG at 4290.
- ARF5r6: TGTGTCT at 4367.
- ARF5r4ci: CGACAGC at 4359, CGACATC at 4297.
- ARF5r6ci: CGACAAA at 4300.
- ARF5r8ci: AGACCAA at 4294, AGACCTA at 4265.
ARF5r arbitrary negative direction (evens) (2811-2596) proximal promoters
- ARF5r0ci: CGACCCG at 2630.
ARF5r alternate negative direction (odds) (2811-2596) proximal promoters
- ARF5r1: CCGGTCT at 2670.
- ARF5r3: CTTGTCT at 2808.
- ARF5r7: GCGGTCG at 2780.
- ARF5r9: TCGGTCG at 2775, CAGGTCT at 2648, GCGGTCG at 2612.
- ARF5r1ci: AGACCCG at 2767, AGACAAC at 2750.
- ARF5r5ci: AGACCCA at 2597.
- ARF5r7ci: CGACAGG at 2792.
ARF5r arbitrary positive direction (odds) (4265-4050) proximal promoters
- ARF5r1: GTGGTCG at 4155.
- ARF5r7: TGGGTCT at 4231, TTTGTCG at 4189.
- ARF5r9: CCTGTCG at 4084.
- ARF5r1ci: AGACATG at 4198.
- ARF5r9ci: CGACCTG at 4195, CGACCTG at 4081.
ARF5r alternate positive direction (evens) (4265-4050) proximal promoters
- ARF5r0: GCTGTCG at 4145.
- ARF5r4: CGGGTCG at 4241.
- ARF5r0ci: CGACCCC at 4084.
- ARF5r2ci: AGACAAG at 4198.
- ARF5r6ci: CGACACC at 4124.
- ARF5r8ci: AGACCTA at 4265, AGACAGA at 4261.
ARF5r arbitrary negative direction (evens) (2596-1) distal promoters
- ARF5r0: CGGGTCG at 2311, TATGTCT at 2115, TAGGTCG at 1752, TAGGTCG at 1417, GAGGTCT at 1047, GTTGTCG at 19.
- ARF5r2: GTTGTCT at 2535, CAGGTCG at 1499, TTGGTCT at 1338, TCTGTCT at 1315, CCGGTCT at 1214, TTTGTCG at 1122, TCTGTCT at 598, CGTGTCT at 594, CTTGTCG at 82.
- ARF5r4: GCGGTCG at 2469, GTTGTCT at 2163, CCTGTCG at 2027, TAGGTCT at 980, CCGGTCG at 357.
- ARF5r6: TATGTCT at 1291, TATGTCG at 474.
- ARF5r8: CTTGTCT at 2035, GGGGTCG at 1741, CAGGTCT at 1517, GGGGTCG at 246.
- ARF5r0ci: CGACCTG at 2513, AGACAAA at 2332, AGACAAG at 2184, CGACCCG at 1601, CGACAAA at 1321, AGACCGA at 1317, CGACAGC at 578.
- ARF5r2ci: AGACCGA at 2234, CGACCGG at 1907, CGACAGA at 1824, AGACCTG at 684, CGACCCG at 324, CGACATG at 272.
- ARF5r4ci: CGACACG at 2285, AGACACA at 1968, CGACACC at 1956, AGACCAG at 1575, AGACCTA at 261, AGACAGC at 76.
- ARF5r6ci: AGACAGG at 2592, AGACCCG at 1601, AGACCCA at 1179, AGACCTA at 856.
- ARF5r8ci: AGACCAC at 2230, CGACCAC at 1945, CGACCCC at 1108, CGACACC at 492, CGACATA at 433, CGACAAA at 211, CGACCAG at 20.
ARF5r alternate negative direction (odds) (2596-1) distal promoters
- ARF5r1: GGGGTCG at 2011, CCGGTCG at 1682, GATGTCG at 1451, CTTGTCG at 1066, CGGGTCT at 604, GTGGTCG at 578, CCGGTCG at 510, TTGGTCG at 416, CTTGTCG at 354, GTGGTCT at 286, TATGTCT at 24, TTTGTCT at 7.
- ARF5r3: CTGGTCG at 1887, TGTGTCT at 1137, GTGGTCG at 644, GCGGTCT at 508, CAGGTCT at 246, TTGGTCT at 26.
- ARF5r5: GGGGTCT at 2572, TTTGTCG at 2563, TTTGTCT at 2173, CCTGTCG at 2098, TGTGTCG at 2056, TGTGTCG at 1550, GGGGTCT at 1043, TTGGTCT at 1014, TGGGTCG at 462, TAGGTCG at 306.
- ARF5r7: TATGTCG at 2081, TGTGTCT at 1356, GAGGTCG at 672.
- ARF5r9: TTTGTCG at 2032, CATGTCG at 1850, TTTGTCG at 1270, CTGGTCT at 926, GAGGTCT at 450, TTTGTCT at 24.
- ARF5r1ci: CGACACC at 2400, CGACCCA at 2356, CGACCGA at 1764, CGACAAC at 1732, CGACAAC at 961, AGACCCG at 680, AGACAGC at 35.
- ARF5r3ci: CGACCGG at 2366, AGACCCC at 1963, AGACATA at 662, CGACAGG at 243.
- ARF5r5ci: AGACCCA at 2309, AGACCAC at 1459, AGACATA at 1258, AGACACG at 925, AGACCAG at 766, CGACCTA at 311.
- ARF5r7ci: CGACCCG at 2536, CGACAGC at 2422, CGACCCC at 2415, AGACAGA at 2279, CGACAGA at 514.
- ARF5r9ci: CGACCGG at 2150, AGACATG at 1632, CGACCTA at 1571, CGACAAA at 385, CGACCTC at 290.
ARF5r arbitrary positive direction (odds) (4050-1) distal promoters
- ARF5r1: CTGGTCT at 3958, GTGGTCG at 3003, CCGGTCT at 2670, GGGGTCG at 2011, CCGGTCG at 1682, GATGTCG at 1451, CTTGTCG at 1066, CGGGTCT at 604, GTGGTCG at 578, CCGGTCG at 510, TTGGTCG at 416, CTTGTCG at 354, GTGGTCT at 286, TATGTCT at 24, TTTGTCT at 7.
- ARF5r3: TTGGTCT at 4012, CCGGTCG at 3478, TCGGTCT at 3466, CTTGTCG at 2983, TTGGTCG at 2945, CTTGTCT at 2808, CTGGTCG at 1887, TGTGTCT at 1137, GTGGTCG at 644, GCGGTCT at 508, CAGGTCT at 246, TTGGTCT at 26.
- ARF5r5: TAGGTCG at 3412, TGGGTCG at 3176, CTGGTCG at 3120, GGTGTCG at 2901, GGGGTCT at 2572, TTTGTCG at 2563, TTTGTCT at 2173, CCTGTCG at 2098, TGTGTCG at 2056, TGTGTCG at 1550, GGGGTCT at 1043, TTGGTCT at 1014, TGGGTCG at 462, TAGGTCG at 306.
- ARF5r7: GATGTCT at 4004, CAGGTCG at 3611, TTGGTCT at 3122, CGGGTCT at 3091, GCGGTCG at 2780, TATGTCG at 2081, TGTGTCT at 1356, GAGGTCG at 672.
- ARF5r9: CTTGTCG at 3649, TCGGTCG at 2775, CAGGTCT at 2648, GCGGTCG at 2612, TTTGTCG at 2032, CATGTCG at 1850, TTTGTCG at 1270, CTGGTCT at 926, GAGGTCT at 450, TTTGTCT at 24.
- ARF5r1ci: AGACCGC at 3993, CGACACC at 3755, CGACCTG at 3656, CGACATA at 3472, CGACAGC at 3040, AGACCCG at 2767, AGACAAC at 2750, CGACACC at 2400, CGACCCA at 2356, CGACCGA at 1764, CGACAAC at 1732, CGACAAC at 961, AGACCCG at 680, AGACAGC at 35.
- ARF5r3ci: AGACCAA at 3739, AGACCAA at 3389, CGACCGG at 2366, AGACCCC at 1963, AGACATA at 662, CGACAGG at 243.
- ARF5r5ci: CGACCGC at 3846, CGACCCC at 3597, CGACAGA at 3298, AGACCCC at 3241, AGACCCA at 2597, AGACCCA at 2309, AGACCAC at 1459, AGACATA at 1258, AGACACG at 925, AGACCAG at 766, CGACCTA at 311.
- ARF5r7ci: AGACCAA at 3761, AGACAAC at 3417, CGACACC at 2987, CGACAGG at 2792, CGACCCG at 2536, CGACAGC at 2422, CGACCCC at 2415, AGACAGA at 2279, CGACAGA at 514.
- ARF5r9ci: CGACACC at 3157, AGACATG at 2919, CGACCGG at 2150, AGACATG at 1632, CGACCTA at 1571, CGACAAA at 385, CGACCTC at 290.
ARF5r alternate positive direction (evens) (4050-1) distal promoters
- ARF5r0: GAGGTCG at 3559, CGGGTCG at 2311, TATGTCT at 2115, TAGGTCG at 1752, TAGGTCG at 1417, GAGGTCT at 1047, GTTGTCG at 19.
- ARF5r2: CGTGTCG at 3553, CCGGTCT at 3400, TTTGTCT at 3099, TTGGTCT at 2962, GTTGTCT at 2535, CAGGTCG at 1499, TTGGTCT at 1338, TCTGTCT at 1315, CCGGTCT at 1214, TTTGTCG at 1122, TCTGTCT at 598, CGTGTCT at 594, CTTGTCG at 82.
- ARF5r4: GCGGTCG at 3552, GAGGTCG at 3342, CCTGTCT at 3075, GCGGTCG at 2469, GTTGTCT at 2163, CCTGTCG at 2027, TAGGTCT at 980, CCGGTCG at 357.
- ARF5r6: CATGTCG at 3661, TATGTCT at 1291, TATGTCG at 474.
- ARF5r8: GCGGTCT at 3362, CTTGTCT at 2035, GGGGTCG at 1741, CAGGTCT at 1517, GGGGTCG at 246.
- ARF5r0ci: CGACCCC at 3906, CGACAAA at 3650, AGACCCC at 3156, CGACCCG at 2630, CGACCTG at 2513, AGACAAA at 2332, AGACAAG at 2184, CGACCCG at 1601, CGACAAA at 1321, AGACCGA at 1317, CGACAGC at 578.
- ARF5r2ci: AGACCCC at 3312, AGACCGA at 2234, CGACCGG at 1907, CGACAGA at 1824, AGACCTG at 684, CGACCCG at 324, CGACATG at 272.
- ARF5r4ci: CGACCTA at 3681, CGACCTG at 3072, CGACACG at 2285, AGACACA at 1968, CGACACC at 1956, AGACCAG at 1575, AGACCTA at 261, AGACAGC at 76.
- ARF5r6ci: CGACCCG at 3786, AGACCCC at 3733, CGACATG at 3542, CGACCGG at 2995, AGACAGC at 2989, CGACCCC at 2956, CGACCGC at 2846, CGACAAC at 2827, AGACAGG at 2592, AGACCCG at 1601, AGACCCA at 1179, AGACCTA at 856.
- ARF5r8ci: CGACCAA at 3937, AGACATC at 3219, AGACCCA at 2833, AGACCAC at 2230, CGACCAC at 1945, CGACCCC at 1108, CGACACC at 492, CGACATA at 433, CGACAAA at 211, CGACCAG at 20.
ARF5 analysis and results
ARF5/MP is (C/G/T)N(G/T)GTC(G/T).[1]
Reals or randoms | Promoters | direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 35 | 2 | 17.5 | 17.5 ± 7.5 (--25,+-10) |
Randoms | UTR | arbitrary negative | 39 | 10 | 3.9 | 4.15 ± 0.25 |
Randoms | UTR | alternate negative | 44 | 10 | 4.4 | 4.15 ± 0.25 |
Reals | Core | negative | 4 | 2 | 2 | 2 ± 2 (--0,+-4) |
Randoms | Core | arbitrary negative | 3 | 10 | 0.3 | 0.15 |
Randoms | Core | alternate negative | 0 | 10 | 0 | 0.15 |
Reals | Core | positive | 5 | 2 | 2.5 | 2.5 ± 0.5 (-+2,++3) |
Randoms | Core | arbitrary positive | 5 | 10 | 0.5 | 0.6 |
Randoms | Core | alternate positive | 7 | 10 | 0.7 | 0.6 |
Reals | Proximal | negative | 3 | 2 | 1.5 | 1.5 ± 0.5 (--2,+-1) |
Randoms | Proximal | arbitrary negative | 1 | 10 | 0.1 | 0.55 ± 0.45 |
Randoms | Proximal | alternate negative | 10 | 10 | 1.0 | 0.55 ± 0.45 |
Reals | Proximal | positive | 1 | 2 | 0.5 | 0.5 ± 0.5 (-+0,++1) |
Randoms | Proximal | arbitrary positive | 7 | 10 | 0.7 | 0.7 |
Randoms | Proximal | alternate positive | 7 | 10 | 0.7 | 0.7 |
Reals | Distal | negative | 49 | 2 | 24.5 | 24.5 ± 10.5 (--35,+-14) |
Randoms | Distal | arbitrary negative | 56 | 10 | 5.6 | 6.0 ± 0.4 |
Randoms | Distal | alternate negative | 64 | 10 | 6.4 | 6.0 ± 0.4 |
Reals | Distal | positive | 83 | 2 | 41.5 | 41.5 ± 18.5 (-+23,++60) |
Randoms | Distal | arbitrary positive | 106 | 10 | 10.6 | 9.5 ± 1.1 |
Randoms | Distal | alternate positive | 84 | 10 | 8.4 | 9.5 ± 1.1 |
Comparison:
The occurrences of real ARF5 UTRs, cores, and distals are greater than the randoms, and the negative direction proximals are greater than or equal to the randoms, but the positive direction proximals (0.5) are below the randoms (0.7). This suggests that the real ARF5s are likely active or activable.
Discussion
Auxin response factors (ARFs) have been identified in the UTR for A1BG between ZSCAN22 and A1BG and between ZNF497 and A1BG. ARF5s occur in this side's core promoter or proximal promoter. If these response elements are active then A1BG can be transcribed as a regulatory element in auxin signaling. The "genome binding of two ARFs (ARF2 and ARF5/Monopteros [MP]) differ largely because these two factors have different preferred ARF binding site (ARFbs) arrangements (orientation and spacing)."[1]
The position weight matrices (PWMs) used to model ARF DNA binding specificity suggest that a more general consensus sequence may be (C/G/T)N(G/T)G(C/T)(C/T), where ARF2 is (C/G/T)(A/C/T)(G/T)G(C/T)(C/T)(G/T)(C/G)(A/C/T)(A/G/T) and ARF5/MP is (C/G/T)N(G/T)GTC(G/T).[1] The likely consensus sequence for ARF2 would allow 2592 possible response elements (3*3*2*1*2*2*2*2*3*3 = 2592), and that for ARF5/MP would allow 48 (3*4*2*1*1*1*2 = 48).
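The counts of possible response elements follow from multiplying the number of allowed nucleotides at each position of a consensus sequence. The short Python sketch below parses the notation used on this page and carries out that multiplication; the parser and its function names are hypothetical helpers written for this illustration.

```python
import re
from math import prod

# One token per position: a parenthesised set such as (C/G/T), or a single letter.
TOKEN = re.compile(r"\(([ACGT/]+)\)|([ACGTN])")


def allowed_bases(consensus):
    """Return the set of allowed nucleotides at each position of the consensus."""
    positions = []
    for alternatives, single in TOKEN.findall(consensus):
        if alternatives:
            positions.append(set(alternatives.split("/")))
        elif single == "N":
            positions.append(set("ACGT"))
        else:
            positions.append({single})
    return positions


def count_sequences(consensus):
    """Number of distinct sequences the consensus allows."""
    return prod(len(bases) for bases in allowed_bases(consensus))


ARF2 = "(C/G/T)(A/C/T)(G/T)G(C/T)(C/T)(G/T)(C/G)(A/C/T)(A/G/T)"
ARF5 = "(C/G/T)N(G/T)GTC(G/T)"
print(count_sequences(ARF2))  # 2592
print(count_sequences(ARF5))  # 48
```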
Ulmasov ARFbs
"ARFbs were originally defined as TGTCTC (Ulmasov et al., 1995, Guilfoyle et al., 1998), [...]."[1]
The consensus sequence found by Ulmasov (1995), TGTCTC, occurs in the negative direction in the UTR (four), proximal promoter (one), and distal promoter (twelve), and in the positive direction only in the distal promoter (ten). By contrast, the random datasets averaged only 0.4 to 0.5 per dataset in the distal promoter for either direction, 0.15 in the UTR, and none in the negative direction proximal promoter. This suggests that the occurrences of the consensus sequences of Ulmasov in the promoters of A1BG are real and likely active or activable.
Boer ARFbs
"More recently, protein binding microarray (PBM) experiments suggested that TGTCGG are preferred ARFbs, [...] (Boer et al., 2014, Franco-Zorrilla et al., 2014, Liao et al., 2015)."[1]
The consensus sequence of Boer (2014), TGTCGG, occurs only once in the UTR in the negative direction and seven times in the positive direction in the distal promoters. The random datasets averaged about 0.35 per dataset in the UTR and 0.65 to 0.9 per dataset in the distal promoters. While the occurrence in the UTR about matches a random occurrence, the occurrences in the positive direction distal promoters are more frequent in the real promoters than in the random datasets.
Stigliani ARF2s
Random sampling (even numbered datasets for ZSCAN22 to A1BG) for UTRs ranges from two to eight ARFs, whereas the two strands have 6 (negative strand) and 12 (positive strand) in the negative direction. The negative strand, negative direction results (6) for A1BG fall within the range of the random results by number, whereas the results for the positive strand, negative direction (12) are well outside the range of the random results, suggesting they are real. None of the actual nucleotide sequences for either strand, negative direction, match any of the random results.
For ARF core promoters, four of the ten random nucleotide sequences have core promoter occurrences, ranging from one to two each. These occur in the odd numbered random sequences only (positive direction, ZNF497 to A1BG), and none match the nucleotide sequence for the real, negative strand, positive direction.
Only three of the ten random datasets had results for the proximal promoters in the negative direction, with one result each, and none matched the real result. In the positive direction, seven random datasets had results, ranging from one to four, with no nucleotide sequence matches.
The nucleotide sequences for the distal promoters do not match: the random datasets range in number from 2 to 9 and from 5 to 14, but the real sets range from 14 to 38 and from 15 to 51, respectively. The real occurrences far outnumber the random results.
The results in the promoters of A1BG have about 100 unique nucleotide sequences (Stigliani et al.) of the total possible 2592 such sequences per the PWMs, assuming a weight of one. The actual weighting is expected to reduce the number of likely sequences, resulting in few duplicates between the sequences found to occur and those found in the random datasets. Only six nucleotide sequences are common to both results.
Stigliani ARF5s
Varieties of the short consensus sequence for ARF5/MP, (C/G/T)N(G/T)GTC(G/T), have been detected in the UTR (25), proximal (3) and distal (49) promoters for A1BG between ZSCAN22 and the A1BG gene, and in the core (5), proximal (1) and distal (84) promoters for A1BG on the ZNF497 side.
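One way to detect such varieties is with a regular expression that encodes each degenerate position as a character class. The Python sketch below is an illustration under stated assumptions, not the method used to obtain the counts above: the pattern names, the function name and the example sequence are hypothetical, and a lookahead is used so that overlapping matches are reported.

```python
import re

# (C/G/T)N(G/T)GTC(G/T) written as character classes; the lookahead makes
# finditer report overlapping matches as well.
CONSENSUS = re.compile(r"(?=([CGT][ACGT][GT]GTC[GT]))")
# Inverse complement of the consensus: (A/C)GAC(A/C)N(A/C/G).
INVERSE_COMPLEMENT = re.compile(r"(?=([AC]GAC[AC][ACGT][ACG]))")

def find_arf5_sites(seq):
    """Return sorted (position, matched sequence, orientation) tuples."""
    seq = seq.upper()
    hits = [(m.start(), m.group(1), "consensus")
            for m in CONSENSUS.finditer(seq)]
    hits += [(m.start(), m.group(1), "inverse complement")
             for m in INVERSE_COMPLEMENT.finditer(seq)]
    return sorted(hits)

# Hypothetical test sequence, not A1BG promoter data.
for position, site, orientation in find_arf5_sites("AACAGGTCGAAAGACATCTT"):
    print(position, site, orientation)
```

Positions are zero-based offsets into the sequence passed in; assigning a hit to the UTR, core, proximal or distal region would require the region boundaries, which are not part of this sketch.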
Random datasets, arbitrarily chosen to represent the negative direction (even-numbered datasets) and the positive direction (odd-numbered datasets), have only one to five occurrences of the consensus sequence in the UTRs and two to nine of the inverse complement sequences. In the core promoters, two even-numbered random datasets had one or two inverse complement nucleotide sequences, whereas two positive direction random datasets had one and three nucleotide sequences. Random datasets had proximal promoter occurrences only in the positive direction (one to two). Distal promoter occurrences in the negative direction ranged from two to nine nucleotide sequences; in the positive direction, the consensus sequences ranged from eight to fifteen.
Starting with the random occurrences among the ARF5 possibilities, 68 have duplicates among the random or real datasets. These same duplicates occur only 78 % of the time among the real datasets.
Using the real consensus sequences to look for duplicates, first among the other reals and then among the randoms, found that 97 % had duplicates among either the other reals or the randoms.
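A duplicate search of this kind might be sketched in Python as follows. The sketch assumes one particular reading of "duplicate", namely that a sequence counts as duplicated if it occurs more than once in its own collection or at least once in the other collection; the function name and the example sequences are hypothetical and are not taken from the A1BG results.

```python
from collections import Counter

def fraction_with_duplicates(own, other):
    """Fraction of distinct sequences in `own` that recur in `own` or appear in `other`."""
    own_counts = Counter(own)
    if not own_counts:
        return 0.0
    other_set = set(other)
    duplicated = [seq for seq, n in own_counts.items()
                  if n > 1 or seq in other_set]
    return len(duplicated) / len(own_counts)

# Hypothetical occurrences; each matches (C/G/T)N(G/T)GTC(G/T).
real = ["CAGGTCG", "CAGGTCG", "TTTGTCT", "GATGTCT"]
random_hits = ["GATGTCT", "GCGGTCG"]

print(fraction_with_duplicates(real, random_hits))   # reals checked against reals and randoms
print(fraction_with_duplicates(random_hits, real))   # randoms checked against randoms and reals
```

Swapping the two arguments gives the two directions of comparison described above, starting from the randoms or starting from the reals.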
The possible variety of ARF5s within the consensus sequence (C/G/T)N(G/T)GTC(G/T) is 3*4*2*1*1*1*2 = 48, plus the complement inverses (A/C)GAC(A/C)N(A/C/G), 2*1*1*1*2*4*3 = 48, giving 96 with duplicates; subtracting some 18 duplicates suggests that up to 78 could occur if sampling were large enough.
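The 48 + 48 arithmetic can be checked by enumerating the sequences directly. The Python sketch below expands the consensus position by position and derives the inverse complements from it; the variable and function names are illustrative. Under this literal enumeration the two sets of 48 share no sequence, because the fourth position is always G in the consensus and always C in its inverse complement, so the "some 18" duplicates are assumed here to refer to a different notion of duplicate, which the sketch does not try to reproduce.

```python
from itertools import product

# Allowed bases at each of the seven positions of (C/G/T)N(G/T)GTC(G/T).
CONSENSUS_POSITIONS = ["CGT", "ACGT", "GT", "G", "T", "C", "GT"]
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def expand(positions):
    """Enumerate every sequence allowed by a list of per-position base choices."""
    return {"".join(bases) for bases in product(*positions)}

def reverse_complement(seq):
    """Return the inverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

forward = expand(CONSENSUS_POSITIONS)
inverse = {reverse_complement(seq) for seq in forward}

print(len(forward), len(inverse))   # sizes of the two sets
print(len(forward | inverse))       # distinct sequences across both sets
```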
That the random and real occurrences do not match up suggests that the real occurrences are not random.
Acknowledgements
The content on this page was first contributed by: Henry A. Hoff.
References
- Arnaud Stigliani, Raquel Martin-Arevalillo, Jérémy Lucas, Adrien Bessy, Thomas Vinos-Poyo, Victoria Mironova, Teva Vernoux, Renaud Dumas and François Parcy (3 June 2019). "Capturing Auxin Response Factors Syntax Using DNA Binding Models". Molecular Plant. 12 (6): 822–832. doi:10.1016/j.molp.2018.09.010. PMID 30336329. Retrieved 29 August 2020.
- HGNC (8 November 2020). "GHDC GH3 domain containing [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 11 November 2020.
- RefSeq (September 2010). "SLC36A2 solute carrier family 36 member 2 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 11 November 2020.
- RefSeq (April 2015). "SLC36A1 solute carrier family 36 member 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 11 November 2020.