GATA gene transcriptions
Associate Editor(s)-in-Chief: Henry A. Hoff
Although "the P3, P6 substitutions alter the conserved 'GATAAG' I box motif, a 'GATA' motif is present in the introduced EcoRV site. This introduced GATA sequence clearly does not serve as a functional I box [...]."[1]
"The I-box motif, 5'-GGATGAGATAAGA-3', or its shorter version 5'-GATAAG-3', has been found in the promoters of a large number of RBCS genes (Giuliano et al., 1988; Manzara and Gruissem, 1988). A related motif (the GATA box) is present in the promoters of the light-regulated chlorophyll a/b binding protein (CAB) genes of different species (Gidoni et al., 1989), and has been shown to be involved in the activation of an Arabidopsis CAB gene by light and by the circadian clock (Anderson and Kay, 1995). I-box and GATA binding factors have been identified in nuclear extracts from tobacco and tomato leaves and cotyledons (Borello et al., 1993; Giuliano et al., 1988; Manzara et al., 1991; Schindler and Cashmore, 1990). The I-box has therefore been suggested to be involved in light-regulated and/or leaf-specific gene expression of photosynthetic genes (Manzara et al., 1991), but to date no I-box binding protein has been cloned from plants."[2]
Serum response factor (SRF) is important during the development of the embryo, as it has been linked to the formation of mesoderm.[3][4]
SRF has been shown to interact with GATA4.[5][6]
Cell-specific developmental transcription factors are tightly controlled in their expression but, once expressed, require no additional activation. Examples include GATA transcription factors (GATA), hepatocyte nuclear factors (HNF), PIT-1, MyoD, Myf5, Hox, and winged-helix transcription factors.
The class of diverse Cys4 zinc fingers includes the family of GATA-factors.
In the diagram on the right, STAT5 may be involved with the erythropoietin receptor (Epo receptor, EpoR). Truncations of the murine (subfamily Murinae) Epo receptor and their known functions are included. Erythroid differentiation depends on the transcriptional regulator GATA1, whose zinc finger DNA-binding domain binds specifically to the DNA consensus sequence [AT]GATA[AG] in promoter elements. In erythropoiesis, EpoR is best known for inducing survival of progenitors.
Some "putative wound-response elements including AGC box-like sequences28, TCA motif-like sequences28, carrot extensin gene wound-response elements (AT-rich motif, TTTTTTT, TGACGT)29, constitutive PAL footprint and elicitor-inducible PAL footprint31, and proteinase inhibitor II footprint31 have been found in cabch29 promoter. Some cis-elements related to organ and tissue-specific expression such as GATA motif-like sequence, ASF-1 binding site-like elements also existed in 5′ upstream region. Meanwhile, some basic transcriptional regulatory cis-elements including G box-like and GC box-like elements are located in this region."[7]
"A computer search for transcription promoter elements [...] showed the presence of a prominent TATA box 22 nucleotides upstream of the transcription start site and an Sp1 site at position -42 to -33. The 5'-flanking sequence also contains three E boxes with CANNTG consensus sequences at positions -464 to -459, -90 to -85, and -52 to -47 that have been marked as E box, E1 box, and E2 box, respectively [...]. In addition, the 5'-flanking region contains one or more GRE, XRE, GATA-1, GCN-4, PEA-3, AP1, and AP2 consensus motifs and also three imperfect CArG sites [...]."[8]
Consensus sequences
"Upstream noncoding regulatory sequences were retrieved and analyzed using Regulatory Sequence Analysis Tools (34). The program DNA-Pattern was used to search for and catalogue occurrences of consensus GCRE (TGABTVW) and GATA (GATAAG, GATAAH, GATTA) motifs in yeast promoters."[9]
"GATA factors bind to a common upstream consensus site T/A(GATA)A/G and activate transcription in cotransfection assays."[10]
GATA1 "binds specifically to DNA consensus sequence [a 'GATA' motif][1] [the GATA box][2] [AT]GATA[AG] promoter elements".[11]
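The notations T/A(GATA)A/G and [AT]GATA[AG] describe the same six-base site, and a binding site can lie on either DNA strand. As a minimal sketch (the function names and the example sequence are hypothetical), the consensus can be searched on both strands by also scanning the reverse complement:

```python
import re

SITE_LEN = 6
GATA_SITE = re.compile(r"(?=([AT]GATA[AG]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gata_sites(seq: str):
    """Return (strand, start, site) tuples.

    `start` is the 0-based position of the leftmost base of the site on the
    forward strand; `site` is read 5' to 3' on the strand that matched.
    """
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1)) for m in GATA_SITE.finditer(seq)]
    for m in GATA_SITE.finditer(reverse_complement(seq)):
        start = len(seq) - (m.start() + SITE_LEN)
        hits.append(("-", start, m.group(1)))
    return hits


# Invented example with one site on each strand.
example = "GGCTGATAACAGCTTATCTCC"
print(gata_sites(example))
```

For this example the forward strand yields TGATAA at position 3, and the TTATCT at position 13 is reported as AGATAA on the reverse strand.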
In "response to anemia and hypoxia, erythropoietin (Epo) gene transcription is activated in the kidney and liver (reviewed in Ebert and Bunn1)."[12]
"Epo gene expression is regulated by an enhancer located 3' to the transcriptional termination site.7 This 3' enhancer contains a hypoxia response element (HRE) that has been shown to bind hypoxia-inducible transcription factors (HIFs).7 A binding sequence for nuclear receptor also resides in the enhancer.1,8 Thus, these 2 cis-acting elements may control Epo gene expression in a hypoxia-inducible manner (reviewed in Koury9)."[12]
This "GATA box actively participates in Epo gene regulation. The GATA box acts as a negative regulatory element in the hepatoma cell lines.10 During normoxic conditions, GATA transcription factors bind to the GATA box and repress Epo gene transcription, but when exposed to hypoxia, GATA binding markedly decreases, with a marked increase in Epo gene expression.10,11"[12]
"A GATA factor–binding motif (GATA box) has been identified in the core promoter region of the Epo gene, where a TATA box normally resides.10"[12]
"The wild-type GATA-box in the wt-Epo-GFP transgene" [is] cTgataac.[12]
"Since both GATA-2 and GATA-3 bind to the GATA box in distal tubular cells, both factors are likely to repress constitutively ectopic Epo gene expression in these cells. Thus, GATA-based repression is essential for the inducible and cell type–specific expression of the Epo gene."[12]
Single transcription factor transdifferentiation
Brief expression of a single transcription factor, the ELT-7 GATA factor, can convert the identity of fully differentiated, specialized non-endodermal cells of the pharynx into fully differentiated intestinal cells in intact larvae and adult roundworm Caenorhabditis elegans with no requirement for a dedifferentiated intermediate.[13]
"ELT-7, a GATA transcription factor that regulates terminal intestinal differentiation (Maduro and Rothman, 2002; Sommermann et al., 2010), activates an intestinal marker (elt-2::lacZ::GFP) in non-intestinal cells when briefly ectopically expressed via a heat-shock promoter at any embryonic, larval, or adult stage [...]."[13]
The "END-1 GATA transcription factor, which specifies the endoderm progenitor (Zhu et al., 1997), does not activate widespread intestinal gene expression after the MCT [...]."[13]
Cardiomyocytes
Cell-based in vivo therapies may provide a transformative approach to augment vascular and muscle growth and to prevent non-contractile scar formation by delivering transcription factors[14] or microRNAs[15] to the heart.[16] Cardiac fibroblasts, which represent 50% of the cells in the mammalian heart, can be reprogrammed into cardiomyocyte-like cells in vivo by local delivery of cardiac core transcription factors (GATA4, MEF2C, and TBX5, plus ESRRG, MESP1, Myocardin, and ZFPM2 for improved reprogramming) after coronary ligation.[14][17] These results point to therapies that could directly remuscularize the heart without cell transplantation. However, the efficiency of such reprogramming turned out to be very low, and the phenotype of the resulting cardiomyocyte-like cells does not resemble that of a mature normal cardiomyocyte. Furthermore, transplantation of cardiac transcription factors into injured murine hearts resulted in poor cell survival and minimal expression of cardiac genes.[18]
Blood stem cells
Definitive hematopoiesis emerges during embryogenesis via an endothelial-to-hematopoietic transition. A combination of four transcription factors, GATA2, GFI1B, c-Fos, and ETV6, is sufficient to induce in vitro development of endothelial-like precursor cells, followed by the appearance of hematopoietic cells.[19] Transient expression of six transcription factors (RUNX1T1, HLF, LMO2, Prdm5, PBX1, ZFP36), together with N-Myc and MEIS1 to improve reprogramming efficacy, is sufficient to activate the gene networks governing hematopoietic stem cell functional identity in committed blood cells. This finding marks a significant step toward one of the most sought-after goals of regenerative medicine: the ability to produce hematopoietic stem cells suitable for transplantation, using more mature or differentiated blood cells to make up the shortfall of bone marrow transplants.[20] However, the transcription factors used in this study belong to a group of proto-oncogenes, so these cells could be dangerous to humans. Still to be answered are the precise contribution of each of the eight factors to the reprogramming process and whether approaches that do not rely on viruses and transcription factors can have similar success. It is also not yet known whether the same results can be achieved using human cells or whether other, non-blood cells can be reprogrammed to iHSCs.[21]
Core promoters
Def. "a large cell, found in bone marrow, responsible for the production of platelets"[22] is called a megakaryocyte.
"The core promoters of the rat platelet factor 4 (PF4), mouse erythropoietin and chicken 𝛃 globin genes contain a GATA motif in place of the consensus TATAAA site. In the case of the PF4 gene, this site has been shown to play a critical role in restricting transcription to the megakaryocyte lineage."[10]
The GATA box in the core promoter of the chicken 𝛃 globin gene contains "GGATAA".[10] This suggests that the more general consensus sequence for the GATA box is (A/C/G)(A/G/T)(GATA)(A/G)(A/C).[10]
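The difference between the narrower [AT]GATA[AG] consensus and this more general form can be illustrated with two sequences quoted in this resource: the wild-type Epo GATA box cTgataac and the chicken fragment GGATAA. The sketch below is only an illustration, and the pattern names are hypothetical. Because the chicken fragment is six bases long, it can be compared only with the inner six positions of the eight-position general consensus.

```python
import re

# Narrow consensus from the GATA1 domain annotation.
NARROW = re.compile(r"[AT]GATA[AG]")
# General consensus (A/C/G)(A/G/T)(GATA)(A/G)(A/C) written as a regex.
BROAD = re.compile(r"[ACG][AGT]GATA[AG][AC]")
# Inner six positions of the general consensus (assumption: used here only
# because the quoted chicken fragment is six bases long).
BROAD_INNER = re.compile(r"[AGT]GATA[AG]")

EPO_GATA_BOX = "cTgataac".upper()   # wild-type Epo GATA box quoted above
CHICKEN_FRAGMENT = "GGATAA"         # chicken beta globin core promoter fragment

results = {
    "epo_narrow": NARROW.search(EPO_GATA_BOX) is not None,
    "epo_broad": BROAD.fullmatch(EPO_GATA_BOX) is not None,
    "chicken_narrow": NARROW.search(CHICKEN_FRAGMENT) is not None,
    "chicken_broad_inner": BROAD_INNER.fullmatch(CHICKEN_FRAGMENT) is not None,
}
print(results)
```

The Epo GATA box satisfies both patterns, whereas GGATAA fails the narrow consensus because the base preceding GATA is G. It fits once that position is widened to (A/G/T), which is the point of the more general consensus.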
The chicken β globin gene is Gene ID: 396485 HBBA hemoglobin beta, subunit A.[23] "Homologs of the HBBA gene: The HBBA gene is conserved in human, chimpanzee, Rhesus monkey, dog, cow, mouse, rat, and zebrafish."[23]
"The epsilon globin gene (HBE) is normally expressed in the embryonic yolk sac: two epsilon chains together with two zeta chains (an alpha-like globin) constitute the embryonic hemoglobin Hb Gower I; two epsilon chains together with two alpha chains form the embryonic Hb Gower II. Both of these embryonic hemoglobins are normally supplanted by fetal, and later, adult hemoglobin. The five beta-like globin genes are found within a 45 kb cluster on chromosome 11 in the following order: 5'-epsilon - G-gamma - A-gamma - delta - beta-3'"[24] The human ortholog is Gene ID: 3046 HBE1 hemoglobin subunit epsilon 1.[24]
The "TATA-binding protein of TFIID is required for initiation of transcription from the GATA box-containing promoters."[10]
"GATA-1 interacts with the core promoter GATA motif and inhibits generation of preinitiation complexes."[10]
"GATA-2 inhibits initiation of transcription from the PF4 core promoter."[10]
Human genes
Human erythropoietin genes
Gene ID: 2056 is EPO erythropoietin. "This gene encodes a secreted, glycosylated cytokine composed of four alpha helical bundles. The encoded protein is mainly synthesized in the kidney, secreted into the blood plasma, and binds to the erythropoietin receptor to promote red blood cell production, or erythropoiesis, in the bone marrow. Expression of this gene is upregulated under hypoxic conditions, in turn leading to increased erythropoiesis and enhanced oxygen-carrying capacity of the blood. Expression of this gene has also been observed in brain and in the eye, and elevated expression levels have been observed in diabetic retinopathy and ocular hypertension. Recombinant forms of the encoded protein exhibit neuroprotective activity against a variety of potential brain injuries, as well as antiapoptotic functions in several tissue types, and have been used in the treatment of anemia and to enhance the efficacy of cancer therapies."[25]
- NP_000790.2 erythropoietin precursor, pfam00758 Location:31 → 192, EPO_TPO; Erythropoietin/thrombopoietin.[25]
Human GATA genes
Gene ID: 2623 is GATA1 GATA binding protein 1. "This gene encodes a protein which belongs to the GATA family of transcription factors. The protein plays an important role in erythroid development by regulating the switch of fetal hemoglobin to adult hemoglobin. Mutations in this gene have been associated with X-linked dyserythropoietic anemia and thrombocytopenia."[11]
- NP_002040.1 erythroid transcription factor, smart00401 Location:202 → 247, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:203 → 247, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[11]
- XP_011542199.1 erythroid transcription factor isoform X1, smart00401 Location:202 → 247, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:203 → 247, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[11]
- XP_011542200.1 erythroid transcription factor isoform X2, smart00401 Location:119 → 164, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:120 → 164, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[11]
- XP_024308131.1 erythroid transcription factor isoform X3, cd00202 Location:120 → 164, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[11]
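The zinc-finger consensus topology C-X(2)-C-X(17)-C-X(2)-C in these records reads as four cysteines separated by 2, 17 and 2 residues of any kind. A minimal sketch of how such a topology string could be turned into a regular expression and located in a protein sequence follows. The function name `topology_to_regex` is hypothetical, and the peptide is synthetic, not a real GATA1 sequence.

```python
import re


def topology_to_regex(topology: str) -> str:
    """Convert a topology such as C-X(2)-C-X(17)-C-X(2)-C into a regex.

    X(n) becomes "any n residues"; other tokens are kept as literal residues.
    """
    parts = []
    for token in topology.split("-"):
        gap = re.fullmatch(r"X\((\d+)\)", token)
        if gap:
            parts.append("[A-Z]{%s}" % gap.group(1))
        else:
            parts.append(token)
    return "".join(parts)


ZNF_GATA = topology_to_regex("C-X(2)-C-X(17)-C-X(2)-C")

# Synthetic peptide (illustration only): two leading residues, the
# 25-residue Cys4 pattern, then two trailing residues.
synthetic = "MS" + "C" + "AA" + "C" + "G" * 17 + "C" + "AA" + "C" + "KK"

match = re.search(ZNF_GATA, synthetic)
span = (match.start(), match.end())
print(ZNF_GATA, span)
```

The pattern spans 25 residues, so the match on the synthetic peptide covers the half-open interval (2, 27). The domain locations listed in the records above are wider than this because the conserved-domain models extend beyond the four cysteines.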
Gene ID: 2624 is GATA2 GATA binding protein 2. "This gene encodes a member of the GATA family of zinc-finger transcription factors that are named for the consensus nucleotide sequence they bind in the promoter regions of target genes. The encoded protein plays an essential role in regulating transcription of genes involved in the development and proliferation of hematopoietic and endocrine cell lineages. Alternative splicing results in multiple transcript variants."[26]
- NP_001139133.1 endothelial transcription factor GATA-2 isoform 1. Transcript Variant: This variant (1) represents the longest transcript. Both variants 1 and 2 encode the same isoform (1). cd00202 Location:349 → 398, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[26]
- NP_001139134.1 endothelial transcription factor GATA-2 isoform 2. Transcript Variant: This variant (3) differs in the 5' UTR and uses an alternate splice site in the CDS but maintains the reading frame, compared to variant 1. This variant encodes isoform 2, which is shorter than isoform 1. cd00202 Location:294 → 336, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[26]
- NP_116027.2 endothelial transcription factor GATA-2 isoform 1. Transcript Variant: This variant (2) differs in the 5' UTR compared to variant 1. Both variants 1 and 2 encode the same isoform (1). cd00202 Location:349 → 398, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[26]
Gene ID: 2625 is GATA3 GATA binding protein 3. "This gene encodes a protein which belongs to the GATA family of transcription factors. The protein contains two GATA-type zinc fingers and is an important regulator of T-cell development and plays an important role in endothelial cell biology. Defects in this gene are the cause of hypoparathyroidism with sensorineural deafness and renal dysplasia."[27]
- NP_001002295.1 trans-acting T-cell-specific transcription factor GATA-3 isoform 1. Transcript Variant: This variant (1) represents the longer transcript and encodes the longer isoform (1). smart00401 Location:313 → 362, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:317 → 367, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[27]
- NP_002042.1 trans-acting T-cell-specific transcription factor GATA-3 isoform 2. Transcript Variant: This variant (2) uses an alternate in-frame splice site in the mid-coding region, compared to variant 1, resulting in an isoform (2) that is 1 aa shorter than isoform 1. smart00401 Location:312 → 361, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:316 → 366, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[27]
- XP_005252500.1 trans-acting T-cell-specific transcription factor GATA-3 isoform X1. smart00401 Location:313 → 362, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:317 → 367, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[27]
- XP_005252499.1 trans-acting T-cell-specific transcription factor GATA-3 isoform X1. smart00401 Location:313 → 362, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:317 → 367, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[27]
Gene ID: 2626 is GATA4 GATA binding protein 4. "This gene encodes a member of the GATA family of zinc-finger transcription factors. Members of this family recognize the GATA motif which is present in the promoters of many genes. This protein is thought to regulate genes involved in embryogenesis and in myocardial differentiation and function, and is necessary for normal testicular development. Mutations in this gene have been associated with cardiac septal defects. Additionally, alterations in gene expression have been associated with several cancer types. Alternative splicing results in multiple transcript variants."[28]
- NP_001295022.1 transcription factor GATA-4 isoform 1. cd00202 Location:271 → 322, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C. pfam05349 Location:1 → 205, GATA-N; GATA-type transcription activator, N-terminal.[28]
- NP_001295023.1 transcription factor GATA-4 isoform 3. cd00202 Location:64 → 115, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[28]
- NP_001361202.1 transcription factor GATA-4 isoform 3.[28]
- NP_001361203.1 transcription factor GATA-4 isoform 4.[28]
- NP_002043.2 transcription factor GATA-4 isoform 2. cd00202 Location:270 → 321, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C. pfam05349 Location:1 → 205, GATA-N; GATA-type transcription activator, N-terminal.[28]
- XP_011542119.1 transcription factor GATA-4 isoform X1. cd00202 Location:271 → 322, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C. pfam05349 Location:1 → 205, GATA-N; GATA-type transcription activator, N-terminal.[28]
- XP_011542120.1 transcription factor GATA-4 isoform X1. cd00202 Location:271 → 322, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C. pfam05349 Location:1 → 205, GATA-N; GATA-type transcription activator, N-terminal.[28]
- XP_016868801.1 transcription factor GATA-4 isoform X1. cd00202 Location:271 → 322, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C. pfam05349 Location:1 → 205, GATA-N; GATA-type transcription activator, N-terminal.[28]
- XP_005272442.1 transcription factor GATA-4 isoform X1. cd00202 Location:271 → 322, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C. pfam05349 Location:1 → 205, GATA-N; GATA-type transcription activator, N-terminal.[28]
Gene ID: 7227 is TRPS1 transcriptional repressor GATA binding 1. "This gene encodes a transcription factor that represses GATA-regulated genes and binds to a dynein light chain protein. Binding of the encoded protein to the dynein light chain protein affects binding to GATA consensus sequences and suppresses its transcriptional activity. Defects in this gene are a cause of tricho-rhino-phalangeal syndrome (TRPS) types I-III."[29]
- NP_001269831.1 zinc finger transcription factor Trps1 isoform 2. Transcript Variant: This variant (2) uses an alternate 5' structure which results in the use of an alternate start codon, compared to variant 1. The encoded isoform (2) is shorter and has a distinct N-terminus, compared to isoform 1. smart00401 Location:895 → 938, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:899 → 955, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[29]
- NP_001269832.1 zinc finger transcription factor Trps1 isoform 3. Transcript Variant: This variant (3) uses an alternate splice in the coding region which results in the use of an alternate start codon, compared to variant 1. The encoded isoform (3) is shorter and has a distinct N-terminus, compared to isoform 1. smart00401 Location:897 → 940, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:901 → 957, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[29]
- NP_001317528.1 zinc finger transcription factor Trps1 isoform 4.[29]
- NP_054831.2 zinc finger transcription factor Trps1 isoform 1. Transcript Variant: This variant (1) represents the longest transcript and encodes the longest isoform (1). smart00401 Location:904 → 947, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:908 → 964, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C.[29]
Gene ID: 54815 is GATAD2A GATA zinc finger domain containing 2A.[30]
Gene ID: 57459 is GATAD2B GATA zinc finger domain containing 2B. "This gene encodes a zinc finger protein transcriptional repressor. The encoded protein is part of the methyl-CpG-binding protein-1 complex, which represses gene expression by deacetylating methylated nucleosomes. Mutations in this gene are linked to intellectual disability and dysmorphic features associated with cognitive disability."[31]
- NP_065750.1 transcriptional repressor p66-beta, pfam00320 Location:420 → 454, GATA; GATA zinc finger, pfam16563 Location:159 → 194, P66_CC; Coiled-coil and interaction region of P66A and P66B with MBD2.[31]
- XP_024304389.1 transcriptional repressor p66-beta isoform X1, pfam00320 Location:420 → 454, GATA; GATA zinc finger, pfam16563 Location:159 → 194, P66_CC; Coiled-coil and interaction region of P66A and P66B with MBD2.[31]
- XP_005245421.1 transcriptional repressor p66-beta isoform X1, pfam00320 Location:420 → 454, GATA; GATA zinc finger, pfam16563 Location:159 → 194, P66_CC; Coiled-coil and interaction region of P66A and P66B with MBD2.[31]
Gene ID: 57798 is GATAD1 GATA zinc finger domain containing 1. "The protein encoded by this gene contains a zinc finger at the N-terminus, and is thought to bind to a histone modification site that regulates gene expression. Mutations in this gene have been associated with autosomal recessive dilated cardiomyopathy. Alternatively spliced transcript variants have been found for this gene."[32]
Gene ID: 140628 is GATA5 GATA binding protein 5. "The protein encoded by this gene is a transcription factor that contains two GATA-type zinc fingers. The encoded protein is known to bind to hepatocyte nuclear factor-1alpha (HNF-1alpha), and this interaction is essential for cooperative activation of the intestinal lactase-phlorizin hydrolase promoter. In other organisms, similar proteins may be involved in the establishment of cardiac smooth muscle cell diversity."[33]
- NP_536721.1 transcription factor GATA-5, smart00401 Location:184 → 229, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:188 → 232, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C, pfam05349 Location:1 → 170, GATA-N; GATA-type transcription activator, N-terminal.[33]
- XP_006723762.1 transcription factor GATA-5 isoform X1, smart00401 Location:184 → 229, ZnF_GATA; zinc finger binding to DNA consensus sequence [AT]GATA[AG], cd00202 Location:188 → 232, ZnF_GATA; Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C, pfam05349 Location:1 → 170, GATA-N; GATA-type transcription activator, N-terminal.[33]
Human hemoglobin genes
Gene ID: 3046 is HBE1 hemoglobin subunit epsilon 1. "The epsilon globin gene (HBE) is normally expressed in the embryonic yolk sac: two epsilon chains together with two zeta chains (an alpha-like globin) constitute the embryonic hemoglobin Hb Gower I; two epsilon chains together with two alpha chains form the embryonic Hb Gower II. Both of these embryonic hemoglobins are normally supplanted by fetal, and later, adult hemoglobin. The five beta-like globin genes are found within a 45 kb cluster on chromosome 11 in the following order: 5'-epsilon - G-gamma - A-gamma - delta - beta-3'"[24]
- NP_005321.1 hemoglobin subunit epsilon, cd08925 Location:7 → 146, Hb-beta_like; Hemoglobin beta, gamma, delta, epsilon, and related Hb subunits.[24]
Human platelet factor genes
Gene ID: 5196 is PF4 platelet factor 4. "This gene encodes a member of the CXC chemokine family. This chemokine is released from the alpha granules of activated platelets in the form of a homotetramer which has high affinity for heparin and is involved in platelet aggregation. This protein is chemotactic for numerous other cell type and also functions as an inhibitor of hematopoiesis, angiogenesis and T-cell function. The protein also exhibits antimicrobial activity against Plasmodium falciparum."[34]
- NP_001350281.1 platelet factor 4 isoform 2, "Transcript Variant: This variant (2) uses an alternate splice site which results in the use of an alternate start codon, compared to variant 1. It encodes a longer protein (isoform 2) with a distinct N-terminus, compared to isoform 1.", cd00273 Location:47 → 109, "Chemokine_CXC: 1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteine residues; includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; many members contain an RCxC motif which may be a general requirement for binding to CXC chemokine receptors; those with the ELR motif are chemotatic for neutrophils and have been shown to be angiogenic, while those without the motif act on T and B cells, and are typically angiostatic; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates and a few viruses. See CDs: Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CC (cd00272), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups.".[34]
- NP_002610.1 platelet factor 4 isoform 1 precursor, "Transcript Variant: This variant (1) represents the shorter transcript and encodes the shorter isoform (1).", cd00273 Location:38 → 100, "Chemokine_CXC: 1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteine residues; includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; many members contain an RCxC motif which may be a general requirement for binding to CXC chemokine receptors; those with the ELR motif are chemotatic for neutrophils and have been shown to be angiogenic, while those without the motif act on T and B cells, and are typically angiostatic; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates and a few viruses. See CDs: Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CC (cd00272), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups.".[34]
Human STAT genes
GeneID: 6772 is STAT1 signal transducer and activator of transcription 1 (aka STAT91). "The protein encoded by this gene is a member of the STAT protein family. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein can be activated by various ligands including interferon-alpha, interferon-gamma, EGF, PDGF and IL6. This protein mediates the expression of a variety of genes, which is thought to be important for cell viability in response to different cell stimuli and pathogens. Two alternatively spliced transcript variants encoding distinct isoforms have been described."[35]
- NP_009330.1 signal transducer and activator of transcription 1-alpha/beta isoform alpha.
- NP_644671.1 signal transducer and activator of transcription 1-alpha/beta isoform beta.
- XP_016860272.1 signal transducer and activator of transcription 1-alpha/beta isoform X1.
- XP_006712781.1 signal transducer and activator of transcription 1-alpha/beta isoform X2.
- XR_001738914.2 Homo sapiens signal transducer and activator of transcription 1 (STAT1), transcript variant X3, misc_RNA.
- XR_001738915.2 Homo sapiens signal transducer and activator of transcription 1 (STAT1), transcript variant X4, misc_RNA.
GeneID: 6773 is STAT2 signal transducer and activator of transcription 2 (aka STAT113). "The protein encoded by this gene is a member of the STAT protein family. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. In response to interferon (IFN), this protein forms a complex with STAT1 and IFN regulatory factor family protein p48 (ISGF3G), in which this protein acts as a transactivator, but lacks the ability to bind DNA directly. Transcription adaptor P300/CBP (EP300/CREBBP) has been shown to interact specifically with this protein, which is thought to be involved in the process of blocking IFN-alpha response by adenovirus. Multiple transcript variants encoding different isoforms have been found for this gene."[36]
- NP_005410.1 signal transducer and activator of transcription 2 isoform 1.
- NP_938146.1 signal transducer and activator of transcription 2 isoform 2.
- XP_011536999.1 signal transducer and activator of transcription 2 isoform X1.
- XP_011537000.1 signal transducer and activator of transcription 2 isoform X2.
- XP_011537001.1 signal transducer and activator of transcription 2 isoform X3.
- XP_011537002.1 signal transducer and activator of transcription 2 isoform X4.
- XP_016875393.1 signal transducer and activator of transcription 2 isoform X5.
- XR_001748856.1 Homo sapiens signal transducer and activator of transcription 2 (STAT2), transcript variant X6, misc_RNA.
- XR_002957375.1 Homo sapiens signal transducer and activator of transcription 2 (STAT2), transcript variant X7, misc_RNA.
- XR_001748857.1 Homo sapiens signal transducer and activator of transcription 2 (STAT2), transcript variant X8, misc_RNA.
- XR_245953.3 Homo sapiens signal transducer and activator of transcription 2 (STAT2), transcript variant X9, misc_RNA.
- XR_001748858.2 Homo sapiens signal transducer and activator of transcription 2 (STAT2), transcript variant X10, misc_RNA.
- XR_002957376.1 Homo sapiens signal transducer and activator of transcription 2 (STAT2), transcript variant X11, misc_RNA.
GeneID: 6774 is STAT3 signal transducer and activator of transcription 3. "The protein encoded by this gene is a member of the STAT protein family. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein is activated through phosphorylation in response to various cytokines and growth factors including IFNs, EGF, IL5, IL6, HGF, LIF and BMP2. This protein mediates the expression of a variety of genes in response to cell stimuli, and thus plays a key role in many cellular processes such as cell growth and apoptosis. The small GTPase Rac1 has been shown to bind and regulate the activity of this protein. PIAS3 protein is a specific inhibitor of this protein. Mutations in this gene are associated with infantile-onset multisystem autoimmune disease and hyper-immunoglobulin E syndrome. Alternative splicing results in multiple transcript variants encoding distinct isoforms."[28]
- NP_644805.1 signal transducer and activator of transcription 3 isoform 1.
- NP_003141.2 signal transducer and activator of transcription 3 isoform 2.
- NP_998827.1 signal transducer and activator of transcription 3 isoform 3.
- XP_011523447.1 signal transducer and activator of transcription 3 isoform X1.
- XP_005257674.2 signal transducer and activator of transcription 3 isoform X1.
- XP_005257673.2 signal transducer and activator of transcription 3 isoform X2.
- XP_016880464.1 signal transducer and activator of transcription 3 isoform X2.
- XP_011523448.1 signal transducer and activator of transcription 3 isoform X3.
- XP_016880461.1 signal transducer and activator of transcription 3 isoform X3.
- XP_016880462.1 signal transducer and activator of transcription 3 isoform X4.
- XP_016880463.1 signal transducer and activator of transcription 3 isoform X4.
- XP_016880465.1 signal transducer and activator of transcription 3 isoform X4.
- XP_024306664.1 signal transducer and activator of transcription 3 isoform X5.
GeneID: 6775 is STAT4 signal transducer and activator of transcription 4. "The protein encoded by this gene is a member of the STAT family of transcription factors. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein is essential for mediating responses to IL12 in lymphocytes, and regulating the differentiation of T helper cells. Mutations in this gene may be associated with systemic lupus erythematosus and rheumatoid arthritis. Alternate splicing results in multiple transcript variants that encode the same protein."[37]
- NP_003142.1 signal transducer and activator of transcription 4 (variant 1).
- NP_001230764.1 signal transducer and activator of transcription 4 (variant 2).
- XP_011510007.1 signal transducer and activator of transcription 4 isoform X1.
- XP_006712782.1 signal transducer and activator of transcription 4 isoform X1.
- XP_016860273.1 signal transducer and activator of transcription 4 isoform X2.
GeneID: 6776 is STAT5A signal transducer and activator of transcription 5A (aka STAT5). "The protein encoded by this gene is a member of the STAT family of transcription factors. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein is activated by, and mediates the responses of many cell ligands, such as IL2, IL3, IL7 GM-CSF, erythropoietin, thrombopoietin, and different growth hormones. Activation of this protein in myeloma and lymphoma associated with a TEL/JAK2 gene fusion is independent of cell stimulus and has been shown to be essential for tumorigenesis. The mouse counterpart of this gene is found to induce the expression of BCL2L1/BCL-X(L), which suggests the antiapoptotic function of this gene in cells. Alternatively spliced transcript variants have been found for this gene."[38]
- NP_001275647.1 signal transducer and activator of transcription 5A isoform 1 (variant 1).
- NP_003143.2 signal transducer and activator of transcription 5A isoform 1 (variant 2).
- NP_001275648.1 signal transducer and activator of transcription 5A isoform 2 (variant 3).
- NP_001275649.1 signal transducer and activator of transcription 5A isoform 3 (variant 4).
- XP_005257681.1 signal transducer and activator of transcription 5A isoform X1.
GeneID: 6777 is STAT5B signal transducer and activator of transcription 5B (aka STAT5). "The protein encoded by this gene is a member of the STAT family of transcription factors. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein mediates the signal transduction triggered by various cell ligands, such as IL2, IL4, CSF1, and different growth hormones. It has been shown to be involved in diverse biological processes, such as TCR signaling, apoptosis, adult mammary gland development, and sexual dimorphism of liver gene expression. This gene was found to fuse to retinoic acid receptor-alpha (RARA) gene in a small subset of acute promyelocytic leukemias (APLL). The dysregulation of the signaling pathways mediated by this protein may be the cause of the APLL."[39]
- NP_036580.2 signal transducer and activator of transcription 5B.
- XP_024306665.1 signal transducer and activator of transcription 5B isoform X1.
- XP_024306666.1 signal transducer and activator of transcription 5B isoform X1.
- XP_016880466.1 signal transducer and activator of transcription 5B isoform X2.
- XP_005257683.1 signal transducer and activator of transcription 5B isoform X3.
Signal transducer and activator of transcription 5 (STAT5) comprises two genes: STAT5A (GeneID: 6776) and STAT5B (GeneID: 6777).
GeneID: 6778 is STAT6 signal transducer and activator of transcription 6 (aka STAT6B, STAT6C). "The protein encoded by this gene is a member of the STAT family of transcription factors. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein plays a central role in exerting IL4 mediated biological responses. It is found to induce the expression of BCL2L1/BCL-X(L), which is responsible for the anti-apoptotic activity of IL4. Knockout studies in mice suggested the roles of this gene in differentiation of T helper 2 (Th2) cells, expression of cell surface markers, and class switch of immunoglobulins. Alternative splicing results in multiple transcript variants."[40]
- NP_001171549.1 signal transducer and activator of transcription 6 isoform 1 (variant 1).
- NP_003144.3 signal transducer and activator of transcription 6 isoform 1 (variant 2).
- NP_001171550.1 signal transducer and activator of transcription 6 isoform 1 (variant 3).
- NP_001171551.1 signal transducer and activator of transcription 6 isoform 2 (variant 4).
- NP_001171552.1 signal transducer and activator of transcription 6 isoform 2 (variant 5).
- NR_033659.1 Homo sapiens signal transducer and activator of transcription 6 (STAT6), transcript variant 6, non-coding RNA.
- XP_011537005.1 signal transducer and activator of transcription 6 isoform X1.
- XP_011537007.1 signal transducer and activator of transcription 6 isoform X1.
- XP_011537009.1 signal transducer and activator of transcription 6 isoform X1.
- XP_011537010.1 signal transducer and activator of transcription 6 isoform X2.
- XP_011537006.1 signal transducer and activator of transcription 6 isoform X1.
GATA samplings
Copying the response element consensus sequence GATA and pasting it into "⌘F" finds three occurrences between ZNF497 and A1BG and seven between ZSCAN22 and A1BG, as can also be found by the computer programs.
The Basic programs (starting with SuccessablesGATA.bas) test the consensus sequence 5'-GATA-3' by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:
- negative strand, negative direction: 7, GATA at 3526, GATA at 2996, GATA at 2975, GATA at 2898, GATA at 1525, GATA at 234, GATA at 210.
- positive strand, negative direction: 15, GATA at 3655, GATA at 3544, GATA at 3535, GATA at 3465, GATA at 3360, GATA at 2981, GATA at 2177, GATA at 1702, GATA at 1665, GATA at 1595, GATA at 353, GATA at 108, GATA at 98, GATA at 74, GATA at 57.
- negative strand, positive direction: 3, GATA at 2737, GATA at 2659, GATA at 1837.
- positive strand, positive direction: 4, GATA at 3698, GATA at 3258, GATA at 2157, GATA at 1975.
- inverse complement, negative strand, negative direction: 12, TATC at 4078, TATC at 4046, TATC at 3446, TATC at 3421, TATC at 2902, TATC at 2498, TATC at 1709, TATC at 1703, TATC at 468, TATC at 354, TATC at 250, TATC at 99.
- inverse complement, positive strand, negative direction: 2, TATC at 1730, TATC at 1528.
- inverse complement, negative strand, positive direction: 2, TATC at 4123, TATC at 3383.
- inverse complement, positive strand, positive direction: 4, TATC at 3700, TATC at 2626, TATC at 2549, TATC at 1838.
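The search summarized above can be sketched in Python. This is a minimal illustration, not the SuccessablesGATA.bas program itself: the function names `reverse_complement` and `find_motif` are hypothetical, the input is a single plain string of A, C, G and T, and strand and direction handling is left out. The reported position is assumed to be the 1-based position of the last nucleotide of each match, an inference from the listed hits (for example, GATA at 3526 and GATAA at 3527 on the same strand).

```python
def reverse_complement(seq):
    """Return the inverse complement of a DNA string (e.g. GATA -> TATC)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_motif(seq, motif):
    """Return the 1-based end positions of every occurrence of motif in seq.

    Overlapping occurrences are included. Positions are listed from
    highest to lowest, matching the order used in the lists above.
    Assumption: a "position" is the last nucleotide of the match.
    """
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start + len(motif))  # 0-based start -> 1-based end
        start = seq.find(motif, start + 1)
    return sorted(hits, reverse=True)


# Toy sequence for illustration only; it is not taken from the A1BG region.
toy = "ACGATAGGTATCGATA"
gata_hits = find_motif(toy, "GATA")                      # [16, 6]
tatc_hits = find_motif(toy, reverse_complement("GATA"))  # [12]
```

Running the same two calls on each strand and in each direction would give the eight lists above, assuming the four sequences are available as strings.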
GATA (4560-2846) UTRs
- Negative strand, negative direction: TATC at 4078, TATC at 4046, GATA at 3526, TATC at 3446, TATC at 3421, GATA at 2996, GATA at 2975, TATC at 2902, GATA at 2898.
- Positive strand, negative direction: GATA at 3655, GATA at 3544, GATA at 3535, GATA at 3465, GATA at 3360, GATA at 2981.
GATA positive direction (4265-4050) proximal promoters
- Negative strand, positive direction: TATC at 4123.
GATA negative direction (2596-1) distal promoters
- Negative strand, negative direction: TATC at 2498, TATC at 1709, TATC at 1703, GATA at 1525, TATC at 468, TATC at 354, TATC at 250, GATA at 234, GATA at 210, TATC at 99.
- Positive strand, negative direction: GATA at 2177, TATC at 1730, GATA at 1702, GATA at 1665, GATA at 1595, TATC at 1528, GATA at 353, GATA at 108, GATA at 98, GATA at 74, GATA at 57.
GATA positive direction (4050-1) distal promoters
- Negative strand, positive direction: TATC at 3383, GATA at 2737, GATA at 2659, GATA at 1837.
- Positive strand, positive direction: TATC at 3700, GATA at 3698, GATA at 3258, TATC at 2626, TATC at 2549, GATA at 2157, GATA at 1975, TATC at 1838.
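Sorting hits into UTRs, core, proximal and distal promoters by position can be sketched as follows. The position ranges are the ones given in the headings of this article; how a hit exactly on a boundary is assigned is not stated, so the sketch assumes each range excludes its lower bound and includes its upper bound. The names `REGIONS` and `classify` are hypothetical.

```python
# Position ranges taken from the headings in this article.
# Assumption: a range (low, high) covers low < position <= high.
REGIONS = {
    "negative": [
        ("UTR", 2846, 4560),
        ("core", 2811, 2846),
        ("proximal", 2596, 2811),
        ("distal", 0, 2596),
    ],
    "positive": [
        ("core", 4265, 4445),
        ("proximal", 4050, 4265),
        ("distal", 0, 4050),
    ],
}


def classify(position, direction):
    """Return the region name for a hit, or None if no listed range contains it."""
    for name, low, high in REGIONS[direction]:
        if low < position <= high:
            return name
    return None


# Hits taken from the lists above.
examples = [
    (3526, "negative"),  # GATA at 3526 -> UTR
    (2498, "negative"),  # TATC at 2498 -> distal
    (4123, "positive"),  # TATC at 4123 -> proximal
    (3698, "positive"),  # GATA at 3698 -> distal
]
labels = [classify(pos, direction) for pos, direction in examples]
```

Positions above 4445 in the positive direction return `None` here because no range is listed for them.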
GATA random dataset samplings
- GATAr0: 13, GATA at 4491, GATA at 4477, GATA at 3537, GATA at 3218, GATA at 2990, GATA at 2796, GATA at 2423, GATA at 2419, GATA at 1537, GATA at 1465, GATA at 1405, GATA at 1239, GATA at 849.
- GATAr1: 16, GATA at 4162, GATA at 3935, GATA at 3444, GATA at 3382, GATA at 3018, GATA at 2682, GATA at 2173, GATA at 2154, GATA at 1685, GATA at 1302, GATA at 1106, GATA at 751, GATA at 674, GATA at 492, GATA at 162, GATA at 14.
- GATAr2: 15, GATA at 4470, GATA at 4445, GATA at 4342, GATA at 3943, GATA at 3534, GATA at 3516, GATA at 3129, GATA at 3040, GATA at 2113, GATA at 1429, GATA at 1137, GATA at 1017, GATA at 987, GATA at 656, GATA at 306.
- GATAr3: 12, GATA at 3900, GATA at 3864, GATA at 3229, GATA at 3120, GATA at 2960, GATA at 2773, GATA at 2142, GATA at 1775, GATA at 1708, GATA at 1287, GATA at 940, GATA at 305.
- GATAr4: 10, GATA at 4275, GATA at 4264, GATA at 2227, GATA at 1002, GATA at 941, GATA at 896, GATA at 510, GATA at 179, GATA at 161, GATA at 113.
- GATAr5: 5, GATA at 4501, GATA at 3737, GATA at 3084, GATA at 1142, GATA at 668.
- GATAr6: 12, GATA at 4556, GATA at 4488, GATA at 3789, GATA at 3216, GATA at 2166, GATA at 1999, GATA at 1912, GATA at 1814, GATA at 1241, GATA at 1199, GATA at 1069, GATA at 150.
- GATAr7: 21, GATA at 4530, GATA at 4441, GATA at 4390, GATA at 4075, GATA at 4064, GATA at 3437, GATA at 3423, GATA at 3304, GATA at 3104, GATA at 3053, GATA at 2825, GATA at 2719, GATA at 2314, GATA at 1859, GATA at 1816, GATA at 1549, GATA at 1468, GATA at 1039, GATA at 791, GATA at 526, GATA at 151.
- GATAr8: 17, GATA at 4519, GATA at 4490, GATA at 4439, GATA at 4344, GATA at 4195, GATA at 3979, GATA at 3577, GATA at 3554, GATA at 3467, GATA at 2998, GATA at 1795, GATA at 1700, GATA at 1491, GATA at 1150, GATA at 1029, GATA at 272, GATA at 9.
- GATAr9: 9, GATA at 4439, GATA at 4225, GATA at 4167, GATA at 4019, GATA at 2993, GATA at 2465, GATA at 2257, GATA at 1663, GATA at 670.
- GATAr0ci: 15, TATC at 4101, TATC at 4015, TATC at 4006, TATC at 3963, TATC at 3859, TATC at 3617, TATC at 2895, TATC at 2785, TATC at 2188, TATC at 1544, TATC at 1539, TATC at 707, TATC at 334, TATC at 277, TATC at 156.
- GATAr1ci: 16, TATC at 3944, TATC at 3523, TATC at 3377, TATC at 3146, TATC at 3098, TATC at 2971, TATC at 2689, TATC at 2335, TATC at 2312, TATC at 1473, TATC at 1028, TATC at 739, TATC at 570, TATC at 430, TATC at 377, TATC at 58.
- GATAr2ci: 13, TATC at 4246, TATC at 4217, TATC at 3575, TATC at 3536, TATC at 3478, TATC at 2853, TATC at 2640, TATC at 2480, TATC at 2414, TATC at 2356, TATC at 2322, TATC at 2179, TATC at 1388.
- GATAr3ci: 9, TATC at 4255, TATC at 3354, TATC at 3122, TATC at 2284, TATC at 1873, TATC at 942, TATC at 770, TATC at 290, TATC at 79.
- GATAr4ci: 14, TATC at 4506, TATC at 4489, TATC at 4474, TATC at 4302, TATC at 4094, TATC at 4027, TATC at 3800, TATC at 3282, TATC at 2558, TATC at 1678, TATC at 668, TATC at 641, TATC at 166, TATC at 115.
- GATAr5ci: 12, TATC at 3647, TATC at 3234, TATC at 3141, TATC at 2180, TATC at 1895, TATC at 1678, TATC at 1274, TATC at 950, TATC at 649, TATC at 510, TATC at 386, TATC at 91.
- GATAr6ci: 17, TATC at 4544, TATC at 4208, TATC at 3756, TATC at 3365, TATC at 3181, TATC at 3009, TATC at 2855, TATC at 2807, TATC at 2351, TATC at 2053, TATC at 2037, TATC at 1928, TATC at 1760, TATC at 1750, TATC at 1720, TATC at 626, TATC at 435.
- GATAr7ci: 9, TATC at 3588, TATC at 3432, TATC at 3331, TATC at 2773, TATC at 2347, TATC at 2020, TATC at 1566, TATC at 1489, TATC at 549.
- GATAr8ci: 11, TATC at 4230, TATC at 3742, TATC at 2416, TATC at 2325, TATC at 1916, TATC at 1365, TATC at 1336, TATC at 1199, TATC at 1076, TATC at 361, TATC at 78.
- GATAr9ci: 17, TATC at 4457, TATC at 4328, TATC at 4227, TATC at 3832, TATC at 3774, TATC at 3665, TATC at 2995, TATC at 2894, TATC at 2816, TATC at 2581, TATC at 2096, TATC at 1990, TATC at 1984, TATC at 1949, TATC at 1604, TATC at 554, TATC at 475.
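How ten random datasets might be produced and sampled can be sketched as below. The article does not say how the random datasets were generated, so everything here is an assumption: each dataset is taken to be 4560 nucleotides long (the upper end of the position range used above), the four bases are drawn with equal probability, and the names `random_sequence` and `count_motif` are hypothetical. The counts this sketch produces are not expected to reproduce the numbers listed above.

```python
import random


def random_sequence(length, seed):
    """Return a random DNA string; equal base probabilities are an assumption."""
    rng = random.Random(seed)  # seeded so that each dataset is reproducible
    return "".join(rng.choice("ACGT") for _ in range(length))


def count_motif(seq, motif):
    """Count occurrences of motif in seq, including overlapping ones."""
    last_start = len(seq) - len(motif)
    return sum(1 for i in range(last_start + 1) if seq.startswith(motif, i))


# Ten datasets named r0 to r9, mirroring the GATAr0 to GATAr9 labels.
datasets = {f"r{i}": random_sequence(4560, seed=i) for i in range(10)}

# For each dataset: (count of GATA, count of its inverse complement TATC).
counts = {
    name: (count_motif(seq, "GATA"), count_motif(seq, "TATC"))
    for name, seq in datasets.items()
}
```

Splitting the ten datasets into evens and odds, as in the lists that follow, then amounts to selecting r0, r2, r4, r6, r8 or r1, r3, r5, r7, r9 from `counts`.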
GATAr arbitrary (evens) (4560-2846) UTRs
- GATAr0: GATA at 4491, GATA at 4477, GATA at 3537, GATA at 3218, GATA at 2990.
- GATAr2: GATA at 4470, GATA at 4445, GATA at 4342, GATA at 3943, GATA at 3534, GATA at 3516, GATA at 3129, GATA at 3040.
- GATAr4: GATA at 4275, GATA at 4264.
- GATAr6: GATA at 4556, GATA at 4488, GATA at 3789, GATA at 3216.
- GATAr8: GATA at 4519, GATA at 4490, GATA at 4439, GATA at 4344, GATA at 4195, GATA at 3979, GATA at 3577, GATA at 3554, GATA at 3467, GATA at 2998.
- GATAr0ci: TATC at 4101, TATC at 4015, TATC at 4006, TATC at 3963, TATC at 3859, TATC at 3617, TATC at 2895.
- GATAr2ci: TATC at 4246, TATC at 4217, TATC at 3575, TATC at 3536, TATC at 3478, TATC at 2853.
- GATAr4ci: TATC at 4506, TATC at 4489, TATC at 4474, TATC at 4302, TATC at 4094, TATC at 4027, TATC at 3800, TATC at 3282.
- GATAr6ci: TATC at 4544, TATC at 4208, TATC at 3756, TATC at 3365, TATC at 3181, TATC at 3009, TATC at 2855.
- GATAr8ci: TATC at 4230, TATC at 3742.
GATAr alternate (odds) (4560-2846) UTRs
- GATAr1: GATA at 4162, GATA at 3935, GATA at 3444, GATA at 3382, GATA at 3018.
- GATAr3: GATA at 3900, GATA at 3864, GATA at 3229, GATA at 3120, GATA at 2960.
- GATAr5: GATA at 4501, GATA at 3737, GATA at 3084.
- GATAr7: GATA at 4530, GATA at 4441, GATA at 4390, GATA at 4075, GATA at 4064, GATA at 3437, GATA at 3423, GATA at 3304, GATA at 3104, GATA at 3053.
- GATAr9: GATA at 4439, GATA at 4225, GATA at 4167, GATA at 4019, GATA at 2993.
- GATAr1ci: TATC at 3944, TATC at 3523, TATC at 3377, TATC at 3146, TATC at 3098, TATC at 2971.
- GATAr3ci: TATC at 4255, TATC at 3354, TATC at 3122.
- GATAr5ci: TATC at 3647, TATC at 3234, TATC at 3141.
- GATAr7ci: TATC at 3588, TATC at 3432, TATC at 3331.
- GATAr9ci: TATC at 4457, TATC at 4328, TATC at 4227, TATC at 3832, TATC at 3774, TATC at 3665, TATC at 2995, TATC at 2894.
GATAr alternate negative direction (odds) (2846-2811) core promoters
- GATAr7: GATA at 2825.
- GATAr9ci: TATC at 2816.
GATAr arbitrary positive direction (odds) (4445-4265) core promoters
- GATAr7: GATA at 4441, GATA at 4390.
- GATAr9: GATA at 4439.
- GATAr9ci: TATC at 4328.
GATAr alternate positive direction (evens) (4445-4265) core promoters
- GATAr2: GATA at 4445, GATA at 4342.
- GATAr4: GATA at 4275.
- GATAr8: GATA at 4439, GATA at 4344.
- GATAr4ci: TATC at 4302.
GATAr arbitrary negative direction (evens) (2811-2596) proximal promoters
- GATAr0: GATA at 2796.
- GATAr0ci: TATC at 2785.
- GATAr2ci: TATC at 2640.
- GATAr6ci: TATC at 2807.
GATAr alternate negative direction (odds) (2811-2596) proximal promoters
- GATAr1: GATA at 2682.
- GATAr3: GATA at 2773.
- GATAr7: GATA at 2719.
- GATAr1ci: TATC at 2689.
- GATAr7ci: TATC at 2773.
GATAr arbitrary positive direction (odds) (4265-4050) proximal promoters
- GATAr1: GATA at 4162.
- GATAr7: GATA at 4075, GATA at 4064.
- GATAr9: GATA at 4225, GATA at 4167.
- GATAr3ci: TATC at 4255.
- GATAr9ci: TATC at 4227.
GATAr alternate positive direction (evens) (4265-4050) proximal promoters
- GATAr4: GATA at 4264.
- GATAr8: GATA at 4195.
- GATAr2ci: TATC at 4246, TATC at 4217.
- GATAr4ci: TATC at 4094.
- GATAr6ci: TATC at 4208.
- GATAr8ci: TATC at 4230.
GATAr arbitrary negative direction (evens) (2596-1) distal promoters
- GATAr0: GATA at 2423, GATA at 2419, GATA at 1537, GATA at 1465, GATA at 1405, GATA at 1239, GATA at 849.
- GATAr2: GATA at 2113, GATA at 1429, GATA at 1137, GATA at 1017, GATA at 987, GATA at 656, GATA at 306.
- GATAr4: GATA at 2227, GATA at 1002, GATA at 941, GATA at 896, GATA at 510, GATA at 179, GATA at 161, GATA at 113.
- GATAr6: GATA at 2166, GATA at 1999, GATA at 1912, GATA at 1814, GATA at 1241, GATA at 1199, GATA at 1069, GATA at 150.
- GATAr8: GATA at 1795, GATA at 1700, GATA at 1491, GATA at 1150, GATA at 1029, GATA at 272, GATA at 9.
- GATAr0ci: TATC at 2188, TATC at 1544, TATC at 1539, TATC at 707, TATC at 334, TATC at 277, TATC at 156.
- GATAr2ci: TATC at 2480, TATC at 2414, TATC at 2356, TATC at 2322, TATC at 2179, TATC at 1388.
- GATAr4ci: TATC at 2558, TATC at 1678, TATC at 668, TATC at 641, TATC at 166, TATC at 115.
- GATAr6ci: TATC at 2351, TATC at 2053, TATC at 2037, TATC at 1928, TATC at 1760, TATC at 1750, TATC at 1720, TATC at 626, TATC at 435.
- GATAr8ci: TATC at 2416, TATC at 2325, TATC at 1916, TATC at 1365, TATC at 1336, TATC at 1199, TATC at 1076, TATC at 361, TATC at 78.
GATAr alternate negative direction (odds) (2596-1) distal promoters
- GATAr1: GATA at 2173, GATA at 2154, GATA at 1685, GATA at 1302, GATA at 1106, GATA at 751, GATA at 674, GATA at 492, GATA at 162, GATA at 14.
- GATAr3: GATA at 2142, GATA at 1775, GATA at 1708, GATA at 1287, GATA at 940, GATA at 305.
- GATAr5: GATA at 1142, GATA at 668.
- GATAr7: GATA at 2314, GATA at 1859, GATA at 1816, GATA at 1549, GATA at 1468, GATA at 1039, GATA at 791, GATA at 526, GATA at 151.
- GATAr9: GATA at 2465, GATA at 2257, GATA at 1663, GATA at 670.
- GATAr1ci: TATC at 2689, TATC at 2335, TATC at 2312, TATC at 1473, TATC at 1028, TATC at 739, TATC at 570, TATC at 430, TATC at 377, TATC at 58.
- GATAr3ci: TATC at 2284, TATC at 1873, TATC at 942, TATC at 770, TATC at 290, TATC at 79.
- GATAr5ci: TATC at 2180, TATC at 1895, TATC at 1678, TATC at 1274, TATC at 950, TATC at 649, TATC at 510, TATC at 386, TATC at 91.
- GATAr7ci: TATC at 2347, TATC at 2020, TATC at 1566, TATC at 1489, TATC at 549.
- GATAr9ci: TATC at 2581, TATC at 2096, TATC at 1990, TATC at 1984, TATC at 1949, TATC at 1604, TATC at 554, TATC at 475.
GATAr arbitrary positive direction (odds) (4050-1) distal promoters
- GATAr1: GATA at 3935, GATA at 3444, GATA at 3382, GATA at 3018, GATA at 2682, GATA at 2173, GATA at 2154, GATA at 1685, GATA at 1302, GATA at 1106, GATA at 751, GATA at 674, GATA at 492, GATA at 162, GATA at 14.
- GATAr3: GATA at 3900, GATA at 3864, GATA at 3229, GATA at 3120, GATA at 2960, GATA at 2773, GATA at 2142, GATA at 1775, GATA at 1708, GATA at 1287, GATA at 940, GATA at 305.
- GATAr5: GATA at 3737, GATA at 3084, GATA at 1142, GATA at 668.
- GATAr7: GATA at 3437, GATA at 3423, GATA at 3304, GATA at 3104, GATA at 3053, GATA at 2825, GATA at 2719, GATA at 2314, GATA at 1859, GATA at 1816, GATA at 1549, GATA at 1468, GATA at 1039, GATA at 791, GATA at 526, GATA at 151.
- GATAr9: GATA at 4019, GATA at 2993, GATA at 2465, GATA at 2257, GATA at 1663, GATA at 670.
- GATAr1ci: TATC at 3944, TATC at 3523, TATC at 3377, TATC at 3146, TATC at 3098, TATC at 2971, TATC at 2689, TATC at 2335, TATC at 2312, TATC at 1473, TATC at 1028, TATC at 739, TATC at 570, TATC at 430, TATC at 377, TATC at 58.
- GATAr3ci: TATC at 3354, TATC at 3122, TATC at 2284, TATC at 1873, TATC at 942, TATC at 770, TATC at 290, TATC at 79.
- GATAr5ci: TATC at 3647, TATC at 3234, TATC at 3141, TATC at 2180, TATC at 1895, TATC at 1678, TATC at 1274, TATC at 950, TATC at 649, TATC at 510, TATC at 386, TATC at 91.
- GATAr7ci: TATC at 3588, TATC at 3432, TATC at 3331, TATC at 2773, TATC at 2347, TATC at 2020, TATC at 1566, TATC at 1489, TATC at 549.
- GATAr9ci: TATC at 3832, TATC at 3774, TATC at 3665, TATC at 2995, TATC at 2894, TATC at 2816, TATC at 2581, TATC at 2096, TATC at 1990, TATC at 1984, TATC at 1949, TATC at 1604, TATC at 554, TATC at 475.
GATAr alternate positive direction (evens) (4050-1) distal promoters
- GATAr0: GATA at 3537, GATA at 3218, GATA at 2990, GATA at 2796, GATA at 2423, GATA at 2419, GATA at 1537, GATA at 1465, GATA at 1405, GATA at 1239, GATA at 849.
- GATAr2: GATA at 3943, GATA at 3534, GATA at 3516, GATA at 3129, GATA at 3040, GATA at 2113, GATA at 1429, GATA at 1137, GATA at 1017, GATA at 987, GATA at 656, GATA at 306.
- GATAr4: GATA at 2227, GATA at 1002, GATA at 941, GATA at 896, GATA at 510, GATA at 179, GATA at 161, GATA at 113.
- GATAr6: GATA at 3789, GATA at 3216, GATA at 2166, GATA at 1999, GATA at 1912, GATA at 1814, GATA at 1241, GATA at 1199, GATA at 1069, GATA at 150.
- GATAr8: GATA at 3979, GATA at 3577, GATA at 3554, GATA at 3467, GATA at 2998, GATA at 1795, GATA at 1700, GATA at 1491, GATA at 1150, GATA at 1029, GATA at 272, GATA at 9.
- GATAr0ci: TATC at 4015, TATC at 4006, TATC at 3963, TATC at 3859, TATC at 3617, TATC at 2895, TATC at 2785, TATC at 2188, TATC at 1544, TATC at 1539, TATC at 707, TATC at 334, TATC at 277, TATC at 156.
- GATAr2ci: TATC at 3575, TATC at 3536, TATC at 3478, TATC at 2853, TATC at 2640, TATC at 2480, TATC at 2414, TATC at 2356, TATC at 2322, TATC at 2179, TATC at 1388.
- GATAr4ci: TATC at 4027, TATC at 3800, TATC at 3282, TATC at 2558, TATC at 1678, TATC at 668, TATC at 641, TATC at 166, TATC at 115.
- GATAr6ci: TATC at 3756, TATC at 3365, TATC at 3181, TATC at 3009, TATC at 2855, TATC at 2807, TATC at 2351, TATC at 2053, TATC at 2037, TATC at 1928, TATC at 1760, TATC at 1750, TATC at 1720, TATC at 626, TATC at 435.
- GATAr8ci: TATC at 3742, TATC at 2416, TATC at 2325, TATC at 1916, TATC at 1365, TATC at 1336, TATC at 1199, TATC at 1076, TATC at 361, TATC at 78.
GATA analysis and results
Although "the P3, P6 substitutions alter the conserved 'GATAAG' I box motif, a 'GATA' motif is present in the introduced EcoRV site. This introduced GATA sequence clearly does not serve as a functional I box [...]."[1]
The GATA box is part of the following elements:
- DAF-16-associated elements: TGATAAG.
- DNA replication-related elements: TATCGATA.
- F boxes: TGATAAG.
- I boxes: GATAAG.
- Shoot specific elements: GATAATGATG.
- Cytokinin response regulators (ARR10s): (A/G)GATA(A/C)G.
- Cytokinin response regulators (ARR12s): (A/G)AGATA.
- HNF6s: (A/G/T)(A/T)(A/G)T(C/T)(A/C/G)AT(A/C/G/T)(A/G/T); positive strand, negative direction: TAGTTGATAA at 3527.
- TAT boxes: TATCCAT; negative strand, negative direction: ATGGATA at 2996.
Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 15 | 2 | 7.5 | 7.5 ± 1.5 (--9,+-6) |
Randoms | UTR | arbitrary negative | 59 | 10 | 5.9 | 5.5 |
Randoms | UTR | alternate negative | 51 | 10 | 5.1 | 5.5 |
Reals | Core | negative | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0.1 |
Randoms | Core | alternate negative | 2 | 10 | 0.2 | 0.1 |
Reals | Core | positive | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary positive | 4 | 10 | 0.4 | 0.5 |
Randoms | Core | alternate positive | 6 | 10 | 0.6 | 0.5 |
Reals | Proximal | negative | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary negative | 4 | 10 | 0.4 | 0.45 |
Randoms | Proximal | alternate negative | 5 | 10 | 0.5 | 0.45 |
Reals | Proximal | positive | 1 | 2 | 0.5 | 0.5 |
Randoms | Proximal | arbitrary positive | 7 | 10 | 0.7 | 0.7 |
Randoms | Proximal | alternate positive | 7 | 10 | 0.7 | 0.7 |
Reals | Distal | negative | 21 | 2 | 10.5 | 10.5 ± 0.5 (--10,+-11) |
Randoms | Distal | arbitrary negative | 74 | 10 | 7.4 | 7.15 |
Randoms | Distal | alternate negative | 69 | 10 | 6.9 | 7.15 |
Reals | Distal | positive | 12 | 2 | 6.0 | 6.0 ± 2.0 (-+4,++8) |
Randoms | Distal | arbitrary positive | 112 | 10 | 11.2 | 11.2 |
Randoms | Distal | alternate positive | 112 | 10 | 11.2 | 11.2 |
Comparison:
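The occurrences of real GATAs in the UTRs (7.5) are greater than those of the randoms (5.5). In the positive direction proximal promoters, the real occurrences (0.5) are less than the randoms (0.7). In the distal promoters, the real occurrences fall outside the randoms in both directions: greater in the negative direction (10.5 versus 7.15) and less in the positive direction (6.0 versus 11.2). This suggests that the real GATAs are likely active or activable.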
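The arithmetic behind the table can be sketched as follows. The reading used here is an inference that agrees with the table values rather than a stated definition: "Occurrences" is the number of hits divided by the number of strands or datasets, the random "Averages" are the mean of the arbitrary and alternate occurrences, and the ± value for the reals is half the difference between the two strand counts. The function names are hypothetical.

```python
def occurrences(number, strands):
    """Hits per strand (reals) or per random dataset (randoms)."""
    return number / strands


def mean_and_half_range(a, b):
    """Return the mean of two values and half of their absolute difference."""
    return (a + b) / 2, abs(a - b) / 2


# Reals, UTR, negative direction: 9 hits on one strand and 6 on the other.
real_utr = mean_and_half_range(9, 6)  # (7.5, 1.5)

# Randoms, UTR: 59 hits over 10 arbitrary datasets, 51 over 10 alternate ones.
arbitrary = occurrences(59, 10)
alternate = occurrences(51, 10)
random_utr_average, _ = mean_and_half_range(arbitrary, alternate)

# Round for display, since floating-point division may not be exact.
summary = (round(arbitrary, 2), round(alternate, 2), round(random_utr_average, 2))
```

The same two functions reproduce the other rows, for example 10 and 11 hits giving 10.5 ± 0.5 for the real distal promoters in the negative direction.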
Staschke Gln3 samplings
Copying the response element consensus sequence GAT(A/T)A and pasting it into "⌘F" finds none between ZNF497 and A1BG and none between ZSCAN22 and A1BG; the computer programs report the occurrences listed below.
The Basic programs (starting with SuccessablesGln3.bas) test the consensus sequence GAT(A/T)A by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs found:
- negative strand, negative direction, looking for GAT(A/T)A, 6, GATAA at 3527, GATAA at 2997, GATAA at 2976, GATTA at 980, GATTA at 601, GATAA at 235.
- positive strand, negative direction, looking for GAT(A/T)A, 3, GATAA at 3656, GATAA at 3536, GATAA at 3361.
- positive strand, positive direction, looking for GAT(A/T)A, 2, GATTA at 2438, GATAA at 1976.
- negative strand, positive direction, looking for GAT(A/T)A, 3, GATTA at 4164, GATTA at 4157, GATTA at 2546.
- inverse complement, negative strand, negative direction, looking for T(A/T)ATC, 15, TAATC at 4228, TTATC at 4078, TAATC at 3455, TAATC at 3176, TAATC at 2999, TAATC at 2978, TTATC at 2498, TAATC at 1887, TTATC at 1709, TAATC at 1234, TAATC at 777, TAATC at 499, TTATC at 250, TAATC at 237, TAATC at 154.
- inverse complement, positive strand, negative direction, looking for T(A/T)ATC, 13, TAATC at 3974, TAATC at 3065, TAATC at 2676, TAATC at 2089, TAATC at 1915, TAATC at 1780, TTATC at 1730, TAATC at 1262, TAATC at 1135, TAATC at 981, TAATC at 805, TAATC at 671, TAATC at 398.
- inverse complement, positive strand, positive direction, looking for T(A/T)ATC, 1, TTATC at 2626.
- inverse complement, negative strand, positive direction, looking for T(A/T)ATC, 3, TAATC at 4146, TTATC at 4123, TTATC at 3383.
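A consensus with alternatives at one position, such as GAT(A/T)A, cannot be found with a literal text search, which treats the parentheses as ordinary characters. One way to handle it, sketched below, is to translate the `(A/T)` notation into a regular-expression character class. This is an illustration only and not the SuccessablesGln3.bas program: the function names are hypothetical, and positions are again assumed to be the 1-based position of the last nucleotide of each match.

```python
import re


def consensus_to_regex(consensus):
    """Translate the "(A/T)" notation into a regex class, e.g. GAT(A/T)A -> GAT[AT]A."""
    return re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )


def find_degenerate(seq, consensus):
    """Return (matched text, 1-based end position) for every match in seq.

    A lookahead is used so that overlapping matches are not skipped.
    """
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [
        (m.group(1), m.start() + len(m.group(1)))
        for m in pattern.finditer(seq)
    ]


# Toy sequence for illustration only; it is not taken from the A1BG region.
toy = "CCGATAATTGATTAGG"
hits = find_degenerate(toy, "GAT(A/T)A")  # [("GATAA", 7), ("GATTA", 14)]
```

The inverse complement search uses the same function with the consensus written as T(A/T)ATC.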
Gln3 (4560-2846) UTRs
- Negative strand, negative direction: TAATC at 4228, TTATC at 4078, GATAA at 3527, TAATC at 3455, TAATC at 3176, TAATC at 2999, GATAA at 2997, TAATC at 2978, GATAA at 2976.
- Positive strand, negative direction: TAATC at 3974, GATAA at 3656, GATAA at 3536, GATAA at 3361, TAATC at 3065.
Gln3 negative direction (2811-2596) proximal promoters
- Positive strand, negative direction: TAATC at 2676.
Gln3 positive direction (4265-4050) proximal promoters
- Negative strand, positive direction: GATTA at 4164, GATTA at 4157, TAATC at 4146, TTATC at 4123.
Gln3 negative direction (2596-1) distal promoters
- Negative strand, negative direction: TTATC at 2498, TAATC at 1887, TTATC at 1709, TAATC at 1234, GATTA at 980, TAATC at 777, GATTA at 601, TAATC at 499, TTATC at 250, TAATC at 237, GATAA at 235, TAATC at 154.
- Positive strand, negative direction: TAATC at 2089, TAATC at 1915, TAATC at 1780, TTATC at 1730, TAATC at 1262, TAATC at 1135, TAATC at 981, TAATC at 805, TAATC at 671, TAATC at 398.
Gln3 positive direction (4050-1) distal promoters
- Negative strand, positive direction: TTATC at 3383, GATTA at 2546.
- Positive strand, positive direction: TTATC at 2626, GATTA at 2438, GATAA at 1976.
Gln3 random dataset samplings
- Gln3r0: 5, GATTA at 4463, GATTA at 3974, GATAA at 3538, GATAA at 1406, GATAA at 1240.
- Gln3r1: 13, GATTA at 4042, GATAA at 3936, GATAA at 3445, GATAA at 3019, GATTA at 2850, GATAA at 2155, GATTA at 1701, GATAA at 1686, GATAA at 1303, GATTA at 1260, GATTA at 1231, GATTA at 362, GATAA at 163.
- Gln3r2: 10, GATAA at 4471, GATAA at 3944, GATTA at 3613, GATTA at 1581, GATAA at 1138, GATAA at 1018, GATAA at 988, GATAA at 657, GATTA at 337, GATAA at 307.
- Gln3r3: 7, GATAA at 3901, GATAA at 3865, GATAA at 3230, GATAA at 2774, GATAA at 1709, GATAA at 1288, GATTA at 709.
- Gln3r4: 5, GATTA at 3752, GATTA at 2646, GATTA at 946, GATTA at 617, GATAA at 180.
- Gln3r5: 5, GATTA at 4203, GATTA at 4062, GATAA at 1143, GATTA at 892, GATTA at 691.
- Gln3r6: 8, GATAA at 4557, GATAA at 3790, GATTA at 3754, GATTA at 3610, GATTA at 3392, GATAA at 2167, GATAA at 2000, GATAA at 1815.
- Gln3r7: 17, GATTA at 4516, GATAA at 4442, GATAA at 4391, GATTA at 4147, GATAA at 4076, GATAA at 4065, GATTA at 3596, GATAA at 3424, GATTA at 3375, GATAA at 3305, GATAA at 3105, GATAA at 2826, GATAA at 2720, GATTA at 2403, GATAA at 1860, GATAA at 792, GATTA at 242.
- Gln3r8: 10, GATAA at 4520, GATTA at 4361, GATAA at 3578, GATTA at 3482, GATTA at 2433, GATTA at 2090, GATTA at 1819, GATTA at 1074, GATAA at 1030, GATAA at 273.
- Gln3r9: 5, GATTA at 4277, GATAA at 4168, GATAA at 2466, GATAA at 1664, GATTA at 240.
- Gln3r0ci: 9, TAATC at 4152, TTATC at 4101, TTATC at 4006, TTATC at 3617, TAATC at 3126, TTATC at 2895, TAATC at 2579, TAATC at 2506, TTATC at 277.
- Gln3r1ci: 9, TAATC at 3459, TTATC at 3377, TTATC at 3146, TTATC at 3098, TAATC at 2376, TTATC at 2312, TTATC at 1473, TTATC at 1028, TAATC at 539.
- Gln3r2ci: 10, TTATC at 4217, TTATC at 3575, TAATC at 3448, TAATC at 3230, TTATC at 2853, TTATC at 2640, TAATC at 2511, TTATC at 2179, TAATC at 827, TAATC at 794.
- Gln3r3ci: 7, TAATC at 3867, TAATC at 2776, TTATC at 1873, TAATC at 1351, TAATC at 1218, TTATC at 770, TTATC at 290.
- Gln3r4ci: 8, TTATC at 4302, TAATC at 4117, TAATC at 3983, TAATC at 3893, TAATC at 1697, TTATC at 1678, TAATC at 776, TTATC at 166.
- Gln3r5ci: 14, TAATC at 4183, TAATC at 3852, TTATC at 3234, TAATC at 3214, TAATC at 3204, TAATC at 2647, TAATC at 2518, TAATC at 2061, TAATC at 2012, TAATC at 1316, TTATC at 1274, TTATC at 950, TTATC at 649, TAATC at 468.
- Gln3r6ci: 8, TTATC at 3756, TTATC at 3365, TAATC at 3128, TTATC at 2807, TAATC at 2317, TTATC at 1750, TAATC at 1186, TAATC at 34.
- Gln3r7ci: 4, TAATC at 3843, TAATC at 1737, TAATC at 210, TAATC at 30.
- Gln3r8ci: 9, TAATC at 3072, TAATC at 2926, TAATC at 2819, TTATC at 2416, TAATC at 1748, TTATC at 1365, TTATC at 1336, TTATC at 1076, TTATC at 78.
- Gln3r9ci: 8, TAATC at 3557, TAATC at 3366, TAATC at 3085, TTATC at 2894, TTATC at 2581, TAATC at 2393, TAATC at 2044, TTATC at 1990.
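The listings above report GAT(A/T)A hits in ten random datasets and T(A/T)ATC hits in their complement inverses ("ci"). The following is a minimal sketch of how such a listing could be produced. It is not the procedure used for this page: the uniform random generator, the seed, the sequence length of 4560, the 1-based start-position convention, and the names `find_sites` and `random_sequence` are all assumptions made for illustration.

```python
import random
import re

# GAT(A/T)A read on the given strand; its complement inverse, T(A/T)ATC,
# marks the same element lying on the opposite strand.
# Lookaheads are used so that overlapping matches would not be skipped.
FORWARD = re.compile(r"(?=(GAT[AT]A))")
INVERSE = re.compile(r"(?=(T[AT]ATC))")


def find_sites(sequence, pattern):
    """Return (motif, position) pairs, highest position first.

    Positions are 1-based start coordinates (an assumed convention).
    """
    hits = [(m.group(1), m.start() + 1) for m in pattern.finditer(sequence)]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def random_sequence(length, seed):
    """Hypothetical stand-in for a random dataset: uniform A/C/G/T."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


seq = random_sequence(4560, seed=0)
forward_hits = find_sites(seq, FORWARD)
inverse_hits = find_sites(seq, INVERSE)

# Print in the same style as the listings: count, then "MOTIF at position".
print(len(forward_hits), ", ".join(f"{m} at {p}" for m, p in forward_hits))
print(len(inverse_hits), ", ".join(f"{m} at {p}" for m, p in inverse_hits))
```

Under the uniform assumption, each orientation matches two of the 1024 possible 5-mers, so a 4560-nucleotide sequence would be expected to carry roughly nine hits per orientation, which is in the range of the counts listed above (4 to 17).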
Gln3r arbitrary (evens) (4560-2846) UTRs
- Gln3r0: GATTA at 4463, GATTA at 3974, GATAA at 3538.
- Gln3r2: GATAA at 4471, GATAA at 3944, GATTA at 3613.
- Gln3r4: GATTA at 3752.
- Gln3r6: GATAA at 4557, GATAA at 3790, GATTA at 3754, GATTA at 3610, GATTA at 3392.
- Gln3r8: GATAA at 4520, GATTA at 4361, GATAA at 3578, GATTA at 3482.
- Gln3r0ci: TAATC at 4152, TTATC at 4101, TTATC at 4006, TTATC at 3617, TAATC at 3126, TTATC at 2895.
- Gln3r2ci: TTATC at 4217, TTATC at 3575, TAATC at 3448, TAATC at 3230, TTATC at 2853.
- Gln3r4ci: TTATC at 4302, TAATC at 4117, TAATC at 3983, TAATC at 3893.
- Gln3r6ci: TTATC at 3756, TTATC at 3365, TAATC at 3128.
- Gln3r8ci: TAATC at 3072, TAATC at 2926.
Gln3r alternate (odds) (4560-2846) UTRs
- Gln3r1: GATTA at 4042, GATAA at 3936, GATAA at 3445, GATAA at 3019, GATTA at 2850.
- Gln3r3: GATAA at 3901, GATAA at 3865, GATAA at 3230.
- Gln3r5: GATTA at 4203, GATTA at 4062.
- Gln3r7: GATTA at 4516, GATAA at 4442, GATAA at 4391, GATTA at 4147, GATAA at 4076, GATAA at 4065, GATTA at 3596, GATAA at 3424, GATTA at 3375, GATAA at 3305, GATAA at 3105.
- Gln3r9: GATTA at 4277, GATAA at 4168.
- Gln3r1ci: TAATC at 3459, TTATC at 3377, TTATC at 3146, TTATC at 3098.
- Gln3r3ci: TAATC at 3867.
- Gln3r5ci: TAATC at 4183, TAATC at 3852, TTATC at 3234, TAATC at 3214, TAATC at 3204.
- Gln3r7ci: TAATC at 3843.
- Gln3r9ci: TAATC at 3557, TAATC at 3366, TAATC at 3085, TTATC at 2894.
Gln3r arbitrary negative direction (evens) (2846-2811) core promoters
- Gln3r8ci: TAATC at 2819.
Gln3r alternate negative direction (odds) (2846-2811) core promoters
- Gln3r7: GATAA at 2826.
Gln3r arbitrary positive direction (odds) (4445-4265) core promoters
- Gln3r7: GATAA at 4442, GATAA at 4391.
- Gln3r9: GATTA at 4277.
Gln3r alternate positive direction (evens) (4445-4265) core promoters
- Gln3r8: GATTA at 4361.
- Gln3r4ci: TTATC at 4302.
Gln3r arbitrary negative direction (evens) (2811-2596) proximal promoters
- Gln3r4: GATTA at 2646.
- Gln3r0ci: TAATC at 2579.
- Gln3r2ci: TTATC at 2640.
- Gln3r6ci: TTATC at 2807.
Gln3r alternate negative direction (odds) (2811-2596) proximal promoters
- Gln3r3: GATAA at 2774.
- Gln3r7: GATAA at 2720.
- Gln3r3ci: TAATC at 2776.
- Gln3r5ci: TAATC at 2647.
Gln3r arbitrary positive direction (odds) (4265-4050) proximal promoters
- Gln3r5: GATTA at 4203, GATTA at 4062.
- Gln3r7: GATTA at 4147, GATAA at 4076, GATAA at 4065.
- Gln3r9: GATAA at 4168.
- Gln3r5ci: TAATC at 4183.
Gln3r alternate positive direction (evens) (4265-4050) proximal promoters
- Gln3r0ci: TAATC at 4152, TTATC at 4101.
- Gln3r2ci: TTATC at 4217.
- Gln3r4ci: TAATC at 4117.
Gln3r arbitrary negative direction (evens) (2596-1) distal promoters
- Gln3r0: GATAA at 1406, GATAA at 1240.
- Gln3r2: GATTA at 1581, GATAA at 1138, GATAA at 1018, GATAA at 988, GATAA at 657, GATTA at 337, GATAA at 307.
- Gln3r4: GATTA at 946, GATTA at 617, GATAA at 180.
- Gln3r6: GATAA at 2167, GATAA at 2000, GATAA at 1815.
- Gln3r8: GATTA at 2433, GATTA at 2090, GATTA at 1819, GATTA at 1074, GATAA at 1030, GATAA at 273.
- Gln3r0ci: TAATC at 2579, TAATC at 2506, TTATC at 277.
- Gln3r2ci: TAATC at 2511, TTATC at 2179, TAATC at 827, TAATC at 794.
- Gln3r4ci: TAATC at 1697, TTATC at 1678, TAATC at 776, TTATC at 166.
- Gln3r6ci: TAATC at 2317, TTATC at 1750, TAATC at 1186, TAATC at 34.
- Gln3r8ci: TTATC at 2416, TAATC at 1748, TTATC at 1365, TTATC at 1336, TTATC at 1076, TTATC at 78.
Gln3r alternate negative direction (odds) (2596-1) distal promoters
- Gln3r1: GATAA at 2155, GATTA at 1701, GATAA at 1686, GATAA at 1303, GATTA at 1260, GATTA at 1231, GATTA at 362, GATAA at 163.
- Gln3r3: GATAA at 1709, GATAA at 1288, GATTA at 709.
- Gln3r5: GATAA at 1143, GATTA at 892, GATTA at 691.
- Gln3r7: GATTA at 2403, GATAA at 1860, GATAA at 792, GATTA at 242.
- Gln3r9: GATAA at 2466, GATAA at 1664, GATTA at 240.
- Gln3r1ci: TAATC at 2376, TTATC at 2312, TTATC at 1473, TTATC at 1028, TAATC at 539.
- Gln3r3ci: TTATC at 1873, TAATC at 1351, TAATC at 1218, TTATC at 770, TTATC at 290.
- Gln3r5ci: TAATC at 2518, TAATC at 2061, TAATC at 2012, TAATC at 1316, TTATC at 1274, TTATC at 950, TTATC at 649, TAATC at 468.
- Gln3r7ci: TAATC at 1737, TAATC at 210, TAATC at 30.
- Gln3r9ci: TTATC at 2581, TAATC at 2393, TAATC at 2044, TTATC at 1990.
Gln3r arbitrary positive direction (odds) (4050-1) distal promoters
- Gln3r1: GATTA at 4042, GATAA at 3936, GATAA at 3445, GATAA at 3019, GATTA at 2850, GATAA at 2155, GATTA at 1701, GATAA at 1686, GATAA at 1303, GATTA at 1260, GATTA at 1231, GATTA at 362, GATAA at 163.
- Gln3r3: GATAA at 3901, GATAA at 3865, GATAA at 3230, GATAA at 2774, GATAA at 1709, GATAA at 1288, GATTA at 709.
- Gln3r5: GATAA at 1143, GATTA at 892, GATTA at 691.
- Gln3r7: GATTA at 3596, GATAA at 3424, GATTA at 3375, GATAA at 3305, GATAA at 3105, GATAA at 2826, GATAA at 2720, GATTA at 2403, GATAA at 1860, GATAA at 792, GATTA at 242.
- Gln3r9: GATAA at 2466, GATAA at 1664, GATTA at 240.
- Gln3r1ci: TAATC at 3459, TTATC at 3377, TTATC at 3146, TTATC at 3098, TAATC at 2376, TTATC at 2312, TTATC at 1473, TTATC at 1028, TAATC at 539.
- Gln3r3ci: TAATC at 3867, TAATC at 2776, TTATC at 1873, TAATC at 1351, TAATC at 1218, TTATC at 770, TTATC at 290.
- Gln3r5ci: TAATC at 3852, TTATC at 3234, TAATC at 3214, TAATC at 3204, TAATC at 2647, TAATC at 2518, TAATC at 2061, TAATC at 2012, TAATC at 1316, TTATC at 1274, TTATC at 950, TTATC at 649, TAATC at 468.
- Gln3r7ci: TAATC at 3843, TAATC at 1737, TAATC at 210, TAATC at 30.
- Gln3r9ci: TAATC at 3557, TAATC at 3366, TAATC at 3085, TTATC at 2894, TTATC at 2581, TAATC at 2393, TAATC at 2044, TTATC at 1990.
Gln3r alternate positive direction (evens) (4050-1) distal promoters
- Gln3r0: GATTA at 3974, GATAA at 3538, GATAA at 1406, GATAA at 1240.
- Gln3r2: GATAA at 3944, GATTA at 3613, GATTA at 1581, GATAA at 1138, GATAA at 1018, GATAA at 988, GATAA at 657, GATTA at 337, GATAA at 307.
- Gln3r4: GATTA at 3752, GATTA at 2646, GATTA at 946, GATTA at 617, GATAA at 180.
- Gln3r6: GATAA at 3790, GATTA at 3754, GATTA at 3610, GATTA at 3392, GATAA at 2167, GATAA at 2000, GATAA at 1815.
- Gln3r8: GATAA at 3578, GATTA at 3482, GATTA at 2433, GATTA at 2090, GATTA at 1819, GATTA at 1074, GATAA at 1030, GATAA at 273.
- Gln3r0ci: TTATC at 4006, TTATC at 3617, TAATC at 3126, TTATC at 2895, TAATC at 2579, TAATC at 2506, TTATC at 277.
- Gln3r2ci: TTATC at 3575, TAATC at 3448, TAATC at 3230, TTATC at 2853, TTATC at 2640, TAATC at 2511, TTATC at 2179, TAATC at 827, TAATC at 794.
- Gln3r4ci: TAATC at 3983, TAATC at 3893, TAATC at 1697, TTATC at 1678, TAATC at 776, TTATC at 166.
- Gln3r6ci: TTATC at 3756, TTATC at 3365, TAATC at 3128, TTATC at 2807, TAATC at 2317, TTATC at 1750, TAATC at 1186, TAATC at 34.
- Gln3r8ci: TAATC at 3072, TAATC at 2926, TAATC at 2819, TTATC at 2416, TAATC at 1748, TTATC at 1365, TTATC at 1336, TTATC at 1076, TTATC at 78.
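The region listings above sort each hit by position into UTR, core, proximal, or distal ranges, with different boundaries for the negative and positive directions. The sketch below reproduces that sorting for the Gln3r7 positions given earlier. The boundary values come from the section headings; treating each range as "greater than the lower bound, up to and including the upper bound" is an assumption, as are the names `NEGATIVE_REGIONS`, `POSITIVE_REGIONS`, and `bin_positions`.

```python
# (name, lower bound, upper bound) taken from the section headings.
# Assumed convention: a position belongs to a region if lower < position <= upper.
NEGATIVE_REGIONS = [
    ("UTR", 2846, 4560),
    ("core", 2811, 2846),
    ("proximal", 2596, 2811),
    ("distal", 0, 2596),
]
POSITIVE_REGIONS = [
    ("core", 4265, 4445),
    ("proximal", 4050, 4265),
    ("distal", 0, 4050),
]

# Gln3r7 positions as listed under "Gln3 Random dataset samplings".
GLN3R7 = [4516, 4442, 4391, 4147, 4076, 4065, 3596, 3424, 3375,
          3305, 3105, 2826, 2720, 2403, 1860, 792, 242]


def bin_positions(positions, regions):
    """Group positions by region; anything outside all regions goes to 'outside'."""
    binned = {name: [] for name, _, _ in regions}
    binned["outside"] = []
    for pos in positions:
        for name, low, high in regions:
            if low < pos <= high:
                binned[name].append(pos)
                break
        else:
            binned["outside"].append(pos)
    return binned


negative = bin_positions(GLN3R7, NEGATIVE_REGIONS)
positive = bin_positions(GLN3R7, POSITIVE_REGIONS)

for label, binned in (("negative", negative), ("positive", positive)):
    for name, hits in binned.items():
        print(f"Gln3r7 {label} direction {name}: {len(hits)} {hits}")
```

With these assumed boundaries, the Gln3r7 counts agree with the listings: 11 UTR, 1 core, 1 proximal, and 4 distal hits in the negative direction, and 2 core, 3 proximal, and 11 distal hits in the positive direction.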
Gln3 analysis and results
The consensus sequence searched is GAT(A/T)A: "GATA (GATAAG, GATAAH, GATTA) motifs in yeast promoters."[9]
Reals or randoms | Promoters | direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 14 | 2 | 7 | 7 ± 2 (negative strand 9, positive strand 5) |
Randoms | UTR | arbitrary negative | 36 | 10 | 3.6 | 3.7 |
Randoms | UTR | alternate negative | 38 | 10 | 3.8 | 3.7 |
Reals | Core | negative | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary negative | 1 | 10 | 0.1 | 0.1 |
Randoms | Core | alternate negative | 1 | 10 | 0.1 | 0.1 |
Reals | Core | positive | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary positive | 3 | 10 | 0.3 | 0.25 |
Randoms | Core | alternate positive | 2 | 10 | 0.2 | 0.25 |
Reals | Proximal | negative | 1 | 2 | 0.5 | 0.5 |
Randoms | Proximal | arbitrary negative | 4 | 10 | 0.4 | 0.4 |
Randoms | Proximal | alternate negative | 4 | 10 | 0.4 | 0.4 |
Reals | Proximal | positive | 4 | 2 | 2 | 2 |
Randoms | Proximal | arbitrary positive | 7 | 10 | 0.7 | 0.55 |
Randoms | Proximal | alternate positive | 4 | 10 | 0.4 | 0.55 |
Reals | Distal | negative | 22 | 2 | 11 | 11 ± 1 (negative strand 12, positive strand 10) |
Randoms | Distal | arbitrary negative | 42 | 10 | 4.2 | 4.4 |
Randoms | Distal | alternate negative | 46 | 10 | 4.6 | 4.4 |
Reals | Distal | positive | 5 | 2 | 2.5 | 2.5 |
Randoms | Distal | arbitrary positive | 78 | 10 | 7.8 | 7.5 |
Randoms | Distal | alternate positive | 72 | 10 | 7.2 | 7.5 |
Comparison:
The occurrences of real Gln3 elements in the UTRs, the proximal promoters, and the negative direction distal promoters are greater than those of the randoms, whereas the occurrences in the positive direction distal promoters are less than those of the randoms (2.5 versus 7.5). This suggests that the real Gln3s are likely active or activable.
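The per-strand occurrences and the random averages in the table can be recomputed from the raw totals. The sketch below does so under two assumptions: that "Occurrences" is the total divided by the number of strands or datasets, and that the random average is the mean of the arbitrary and alternate rows. The names `TABLE` and `summarize` are illustrative only.

```python
# (region, direction) ->
#   (real total, real strands, arbitrary total, alternate total, random datasets)
# All numbers are copied from the table above.
TABLE = {
    ("UTR", "negative"): (14, 2, 36, 38, 10),
    ("Core", "negative"): (0, 2, 1, 1, 10),
    ("Core", "positive"): (0, 2, 3, 2, 10),
    ("Proximal", "negative"): (1, 2, 4, 4, 10),
    ("Proximal", "positive"): (4, 2, 7, 4, 10),
    ("Distal", "negative"): (22, 2, 42, 46, 10),
    ("Distal", "positive"): (5, 2, 78, 72, 10),
}


def summarize(table):
    """Return {(region, direction): (real per strand, random average)}."""
    summary = {}
    for key, (real, strands, arbitrary, alternate, datasets) in table.items():
        real_rate = real / strands
        # Assumed definition: mean of the arbitrary and alternate occurrences.
        random_rate = (arbitrary / datasets + alternate / datasets) / 2
        summary[key] = (round(real_rate, 2), round(random_rate, 2))
    return summary


summary = summarize(TABLE)
for (region, direction), (real_rate, random_rate) in summary.items():
    if real_rate > random_rate:
        relation = "greater than"
    elif real_rate < random_rate:
        relation = "less than"
    else:
        relation = "equal to"
    print(f"{region} {direction}: reals {real_rate} are {relation} randoms {random_rate}")
```

On these numbers the UTRs, both proximal directions, and the negative direction distals come out greater than the randoms, while the positive direction distals and the core promoters, which have no real occurrences, come out less.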
Acknowledgements
The content on this page was first contributed by: Henry A. Hoff.
References
- ↑ 1.0 1.1 1.2 Robert G. K. Donald and Anthony R. Cashmore (1990). "Mutation of either G box or I box sequences profoundly affects expression from the Arabidopsis rbcS‐1A promoter". The EMBO Journal. 9 (6): 1717–1726. doi:10.1002/j.1460-2075.1990.tb08295.x. Retrieved 8 November 2018.
- ↑ 2.0 2.1 Annkatrin Rose, Iris Meier and Udo Wienand (28 October 1999). "The tomato I-box binding factor LeMYBI is a member of a novel class of Myb-like proteins". The Plant Journal. 20 (6): 641–652. doi:10.1046/j.1365-313X.1999.00638.x. Retrieved 8 November 2018.
- ↑ Sepulveda JL, Vlahopoulos S, Iyer D, Belaguli N, Schwartz RJ (July 2002). "Combinatorial expression of GATA4, Nkx2-5, and serum response factor directs early cardiac gene activity". J. Biol. Chem. 277 (28): 25775–82. doi:10.1074/jbc.M203122200. PMID 11983708.
- ↑ Barron MR, Belaguli NS, Zhang SX, Trinh M, Iyer D, Merlo X, Lough JW, Parmacek MS, Bruneau BG, Schwartz RJ (March 2005). "Serum response factor, an enriched cardiac mesoderm obligatory factor, is a downstream gene target for Tbx genes". J. Biol. Chem. 280 (12): 11816–28. doi:10.1074/jbc.M412408200. PMID 15591049.
- ↑ Belaguli NS, Sepulveda JL, Nigam V, Charron F, Nemer M, Schwartz RJ (October 2000). "Cardiac tissue enriched factors serum response factor and GATA-4 are mutual coregulators". Mol. Cell. Biol. 20 (20): 7550–8. doi:10.1128/mcb.20.20.7550-7558.2000. PMC 86307. PMID 11003651.
- ↑ Morin S, Paradis P, Aries A, Nemer M (February 2001). "Serum response factor-GATA ternary complex required for nuclear signaling by a G-protein-coupled receptor". Mol. Cell. Biol. 21 (4): 1036–44. doi:10.1128/MCB.21.4.1036-1044.2001. PMC 99558. PMID 11158291.
- ↑ Guo Qing Tang, Yong Yan Bai & Shi Wei Loo (1 June 1996). "Functional properties of a cabbage chitinase promoter from cabbage (Brassica oleracea var. capitata)". Cell Research. 6 (9): 75–84. Retrieved 24 March 2019.
- ↑ Nibedita Lenka, Aruna Basu, Jayati Mullick, and Narayan G. Avadhani (22 November 1996). "The role of an E box binding basic helix loop helix protein in the cardiac muscle-specific expression of the rat cytochrome oxidase subunit VIII gene" (PDF). The Journal of Biological Chemistry. 271 (47): 30281–30289. doi:10.1074/jbc.271.47.30281. Retrieved 7 February 2019.
- ↑ 9.0 9.1 Kirk A. Staschke, Souvik Dey, John M. Zaborske, Lakshmi Reddy Palam, Jeanette N. McClintick, Tao Pan, Howard J. Edenberg, and Ronald C. Wek (May 28, 2010). "Integration of General Amino Acid Control and Target of Rapamycin (TOR) Regulatory Pathways in Nitrogen Assimilation in Yeast" (PDF). The Journal of Biological Chemistry. 285 (22): 16893–16911. doi:10.1074/jbc.M110.121947. Retrieved 4 January 2021.
- ↑ 10.0 10.1 10.2 10.3 10.4 10.5 10.6 William C. Aird, Jeffrey D. Parvin, Phillip A. Sharp, and Robert D. Rosenberg (14 January 1994). "The Interaction of GATA-binding Proteins and Basal Transcription Factors with GATA Box-containing Core Promoters" (PDF). The Journal of Biological Chemistry. 269 (2): 883–9. Retrieved 2 January 2020.
- ↑ 11.0 11.1 11.2 11.3 11.4 11.5 RefSeq (July 2008). "GATA1 GATA binding protein 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 30 December 2019.
- ↑ 12.0 12.1 12.2 12.3 12.4 12.5 Naoshi Obara, Norio Suzuki, Kibom Kim, Toshiro Nagasawa, Shigehiko Imagawa, and Masayuki Yamamoto (15 May 2008). "Repression via the GATA box is essential for tissue-specific erythropoietin gene expression" (PDF). Blood. 111 (10): 5223–32. doi:10.1182/blood-2007-10-115857. Retrieved 1 January 2020.
- ↑ 13.0 13.1 13.2 Misty R. Riddle, Abraham Weintraub, Ken C. Q. Nguyen, David H. Hall, and Joel H. Rothman (2013). "Transdifferentiation and remodeling of post-embryonic C. elegans cells by a single transcription factor". Development. 140 (24): 4844–9. doi:10.1242/dev.103010. Retrieved 29 December 2019.
- ↑ 14.0 14.1 Li Qian, Yu Huang, C. Ian Spencer, Amy Foley, Vasanth Vedantham, Lei Liu, Simon J. Conway, Ji-dong Fu & Deepak Srivastava (18 April 2012). "In vivo reprogramming of murine cardiac fibroblasts into induced cardiomyocytes". Nature. 485: 593–8. doi:10.1038/nature11044. Retrieved 30 December 2019.
- ↑ Tilanthi M. Jayawardena, Bakytbek Egemnazarov, Elizabeth A. Finch, Lunan Zhang, J. Alan Payne, Kumar Pandya, Zhiping Zhang, Paul Rosenberg, Maria Mirotsou, and Victor J. Dzau. "MicroRNA-Mediated In Vitro and In Vivo Direct Reprogramming of Cardiac Fibroblasts to Cardiomyocytes". Circulation Research. 110 (11). Retrieved 30 December 2019.
- ↑ Chunhui Xu (1 October 2012). "Turning cardiac fibroblasts into cardiomyocytes in vivo". Trends in Molecular Medicine. 18 (10): P575–6. doi:10.1016/j.molmed.2012.06.009. Retrieved 31 December 2019.
- ↑ Ji-Dong Fu, Nicole R. Stone, Lei Liu, C. Ian Spencer, Li Qian, Yohei Hayashi, Paul Delgado-Olguin, Sheng Ding, Benoit G. Bruneau and Deepak Srivastava (10 September 2013). "Direct Reprogramming of Human Fibroblasts toward a Cardiomyocyte-like State". Stem Cell Reports. 1 (3): P235–47. doi:10.1016/j.stemcr.2013.07.005. Retrieved 31 December 2019.
- ↑ Jenny X. Chen, Markus Krane, Marcus-Andre Deutsch, Li Wang, Moshe Rav-Acha, Serge Gregoire, Marc C. Engels, Kuppusamy Rajarajan, Ravi Karra, E. Dale Abel, Joe C. Wu, David Milan, and Sean M. Wu (22 June 2012). "Inefficient Reprogramming of Fibroblasts into Cardiomyocytes Using Gata4, Mef2c, and Tbx5". Circulation Research. 111 (1). doi:10.1161/CIRCRESAHA.112.270264. Retrieved 31 December 2019.
- ↑ Pereira, C. F., Chang, B., Qiu, J., Niu, X., Papatsenko, D., Hendry, C. E., ... & Moore, K. (2013). Induction of a hemogenic program in mouse fibroblasts. Cell stem cell, 13(2), 205-218. DOI: http://dx.doi.org/10.1016/j.stem.2013.05.024
- ↑ Jonah Riddell, Roi Gazit, Brian S. Garrison, et al., & Derrick J. Rossi (2014). Reprogramming Committed Murine Blood Cells to Induced Hematopoietic Stem Cells with Defined Factors. Cell, 157(3), 549–564. DOI: http://dx.doi.org/10.1016/j.cell.2014.04.006
- ↑ Boston Children's Hospital. "Blood cells reprogrammed into blood stem cells in mice". ScienceDaily, 24 April 2014.
- ↑ SemperBlotto (23 January 2006). "megakaryocyte". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 January 2020.
- ↑ 23.0 23.1 CGNC (18 April 2019). "HBBA hemoglobin beta, subunit A [ Gallus gallus (chicken) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2 January 2020.
- ↑ 24.0 24.1 24.2 24.3 RefSeq (July 2008). "HBE1 - hemoglobin subunit epsilon 1". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2 January 2020.
- ↑ 25.0 25.1 RefSeq (August 2017). "EPO erythropoietin [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2 January 2020.
- ↑ 26.0 26.1 26.2 26.3 RefSeq (March 2009). "GATA2 GATA binding protein 2 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 30 December 2019.
- ↑ 27.0 27.1 27.2 27.3 27.4 RefSeq (November 2009). "GATA3 GATA binding protein 3 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 30 December 2019.
- ↑ 28.00 28.01 28.02 28.03 28.04 28.05 28.06 28.07 28.08 28.09 28.10 RefSeq (April 2015). "GATA4 GATA binding protein 4 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 30 December 2019.
- ↑ 29.0 29.1 29.2 29.3 29.4 RefSeq (July 2008). "TRPS1 transcriptional repressor GATA binding 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 31 December 2019.
- ↑ HGNC (21 December 2019). "GATAD2A GATA zinc finger domain containing 2A [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 31 December 2019.
- ↑ 31.0 31.1 31.2 31.3 RefSeq (June 2016). "GATAD2B GATA zinc finger domain containing 2B [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 31 December 2019.
- ↑ RefSeq (June 2012). "GATA1 GATA binding protein 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 31 December 2019.
- ↑ 33.0 33.1 33.2 RefSeq (July 2008). "GATA5 GATA binding protein 5 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 30 December 2019.
- ↑ 34.0 34.1 34.2 RefSeq (October 2014). "PF4 platelet factor 4 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2 January 2020.
- ↑ RefSeq (July 2008). STAT1 signal transducer and activator of transcription 1 [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 14 November 2018.
- ↑ RefSeq (March 2010). STAT2 signal transducer and activator of transcription 2 [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 14 November 2018.
- ↑ RefSeq (August 2011). STAT4 signal transducer and activator of transcription 4 [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 14 November 2018.
- ↑ RefSeq (December 2013). STAT5A signal transducer and activator of transcription 5A [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 14 November 2018.
- ↑ RefSeq (July 2008). STAT5B signal transducer and activator of transcription 5B [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 14 November 2018.
- ↑ RefSeq (May 2010). STAT6 signal transducer and activator of transcription 6 [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 14 November 2018.