GATA box gene transcription laboratory

Associate Editor(s)-in-Chief: Henry A. Hoff

EpoR is thought to contribute to differentiation via multiple signaling pathways including the STAT5 pathway. Credit: Monkeyontheloose.

A laboratory is a specialized activity, a construct you create, where you as a student, teacher, or researcher can have hands-on, or as close to hands-on as possible, experience actively analyzing an entity, source, or object of interest. Usually there is more to do than just analysis. The construct is often a room, building, or institution equipped for scientific research, experimentation, and analysis.

This laboratory is a continuation of the previous laboratory.

In this laboratory the general DNA maintained at NCBI for Gene ID:1 A1BG is examined to confirm, especially with the extended data between ZNF497 and A1BG, the presence or absence of GATA boxes primarily in the promoter regions regarding the possible expression of alpha-1-B glycoprotein.

Consensus sequences

"GATA factors bind to a common upstream consensus site T/A(GATA)A/G and activate transcription in cotransfection assays."[1]

GATA1 "binds specifically to DNA consensus sequence [a 'GATA' motif][2] [the GATA box][3] [AT]GATA[AG] promoter elements".[4]

In "response to anemia and hypoxia, erythropoietin (Epo) gene transcription is activated in the kidney and liver (reviewed in Ebert and Bunn1)."[5]

"Epo gene expression is regulated by an enhancer located 3' to the transcriptional termination site.7 This 3' enhancer contains a hypoxia response element (HRE) that has been shown to bind hypoxia-inducible transcription factors (HIFs).7 A binding sequence for nuclear receptor also resides in the enhancer.1,8 Thus, these 2 cis-acting elements may control Epo gene expression in a hypoxia-inducible manner (reviewed in Koury9)."[5]

This "GATA box actively participates in Epo gene regulation. The GATA box acts as a negative regulatory element in the hepatoma cell lines.10 During normoxic conditions, GATA transcription factors bind to the GATA box and repress Epo gene transcription, but when exposed to hypoxia, GATA binding markedly decreases, with a marked increase in Epo gene expression.10,11"[5]

"A GATA factor–binding motif (GATA box) has been identified in the core promoter region of the Epo gene, where a TATA box normally resides.10"[5]

"The wild-type GATA-box in the wt-Epo-GFP transgene" [is] cTgataac.[5]
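The quoted consensus [AT]GATA[AG] can be checked against the wild-type box with a regular expression. The following is a minimal sketch in Python, not one of the laboratory's Basic programs; the function name find_gata_sites is hypothetical, and matching is assumed to be case-insensitive because the quoted box mixes upper and lower case.

```python
import re

# Consensus [AT]GATA[AG], as quoted above from the RefSeq summary for GATA1.
# Case-insensitive matching is an assumption made for the mixed-case box.
GATA_CONSENSUS = re.compile(r"[AT]GATA[AG]", re.IGNORECASE)


def find_gata_sites(sequence):
    """Return (1-based start, matched text) for each consensus match."""
    return [(m.start() + 1, m.group()) for m in GATA_CONSENSUS.finditer(sequence)]


wild_type_box = "cTgataac"  # wild-type GATA box quoted above
print(find_gata_sites(wild_type_box))  # [(2, 'Tgataa')]
```

The six-nucleotide consensus sits inside the eight-nucleotide wild-type box, starting at its second nucleotide.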

"Since both GATA-2 and GATA-3 bind to the GATA box in distal tubular cells, both factors are likely to repress constitutively ectopic Epo gene expression in these cells. Thus, GATA-based repression is essential for the inducible and cell type–specific expression of the Epo gene."[5]

Nucleotides

DNA mapping has been performed. The DNA for A1BG promoters can be found at A1BG gene transcription#Nucleotides.

Programming

Sample programs for preparing test programs are available at A1BG gene transcription programming.

Hypotheses

  1. A1BG is not transcribed by a GATA box.
  2. If a GATA box is present, at least one transcription factor uses the GATA box to affect A1BG transcription.

Core promoters

File:Core promoter elements.svg
The diagram shows an overview of the four core promoter elements B recognition element (BRE), TATA box, initiator element (Inr), and downstream promoter element (DPE), with their respective consensus sequences and their distance from the transcription start site.[6] Credit: Jennifer E.F. Butler & James T. Kadonaga.

The core promoter extends approximately 34 nts upstream from the TSS (to about -34).

From the first nucleotide just after ZSCAN22 to the first nucleotide just before A1BG are 4460 nucleotides. The core promoter on this side of A1BG extends from approximately 4425 to the possible transcription start site at nucleotide number 4460.

To extend the analysis from inside and just on the other side of ZNF497 some 3340 nts have been added to the data. This would place the core promoter some 3340 nts further away from the other side of ZNF497. The TSS would be at about 4300 nts with the core promoter starting at 4266.

Def. "the factors, including RNA polymerase II itself, that are minimally essential for transcription in vitro from an isolated core promoter" is called the basal machinery, or basal transcription machinery.[7]

"The core promoter in human genes is the region from −40 to +40 and flanks the transcription start site (TSS) at +1. Although no single core promoter element is contained in all human promoters, many contain one or more of the following core elements [...]: the TATA box, initiator (Inr), TFIIB recognition elements (BREu and BREd), polypyrimidine initiator (TCT), motif ten element (MTE), and downstream core promoter element (DPE) [...]. Of these, the Inr element encompasses the TSS and is thought to be the most common core promoter element, with previous studies estimating that ∼50% of human core promoters contain an Inr (Gershenzon and Ioshikhes 2005; Yang et al. 2007). The commonly used consensus sequence for the human Inr, which was derived from mutational analyses, is YYANWYY from −2 to +5 (where, Y = C/T, W = A/T, N=A/C/G/T, and +1 is [A]) (Javahery et al. 1994; Lo and Smale 1996)."[8]

"Kadonaga and colleagues (Vo ngoc et al. 2017) devised and implemented a novel multistep approach that combines experimental and computational methods to reinvestigate the human Inr consensus sequence. First, they generated two 5′-GRO-seq (5′ end-selected global run-on followed by sequencing) libraries with human MCF-7 cells to identify the 5′ ends of nascent capped transcripts. Second, they developed a peak-calling algorithm named FocusTSS to find transcripts in the 5′-GRO-seq data sets that were initiated at a focused position on the genome, hence identifying clear TSSs to enable analysis of Inr sequences. FocusTSS identified 7678 TSSs that were in both data sets. Third, to identify sequence motifs enriched among the focused TSSs, they used the HOMER motif discovery tool (Heinz et al. 2010), which yielded an Inr-like consensus sequence of BBCABW from −3 to +3 (where, B = C/G/T, W = A/T, and +1 is [A]). Forty percent of the focused TSSs contained a perfect match to the BBCABW consensus Inr."[8]
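The two Inr consensus sequences quoted above use single-letter IUPAC codes (Y, W, B, N). A minimal sketch of how such a consensus might be tested against a candidate window follows; the function names and the two example windows are hypothetical and are not taken from the A1BG data.

```python
import re

# IUPAC nucleotide codes appearing in the quoted consensus sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT",    # pyrimidine
    "W": "AT",    # weak pair
    "B": "CGT",   # not A
    "N": "ACGT",  # any base
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus such as 'BBCABW' into a compiled regex."""
    return re.compile("".join("[" + IUPAC[code] + "]" for code in consensus))


def matches_consensus(window, consensus):
    """True if the window matches the consensus at every position."""
    return consensus_to_regex(consensus).fullmatch(window) is not None


# Hypothetical windows, chosen only to illustrate the two patterns.
print(matches_consensus("TCCAGT", "BBCABW"))    # True
print(matches_consensus("TCAGTCC", "YYANWYY"))  # True
print(matches_consensus("ACCAGT", "BBCABW"))    # False: A is excluded by B
```

Each window has to span the same positions as the consensus (−3 to +3 for BBCABW, −2 to +5 for YYANWYY), so the A of the consensus falls at +1.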

Proximal promoters

Def. a "promoter region [juxtaposed to the core promoter that] binds transcription factors that modify the affinity of the core promoter for RNA polymerase.[12][13]"[9] is called a proximal promoter.

The proximal sequence upstream of the gene that tends to contain primary regulatory elements is a proximal promoter.

It extends approximately 250 base pairs, or nucleotides (nts), upstream of the transcription start site.

The proximal promoter begins about nucleotide number 4210 in the negative direction.

The proximal promoter begins about nucleotide number 4195 in the positive direction.

Distal promoters

The "upstream regions of the human [cytochrome P450 family 11 subfamily A] CYP11A and bovine CYP11B genes [have] a distal promoter in each gene. The distal promoters are located at −1.8 to −1.5 kb in the upstream region of the CYP11A gene and −1.5 to −1.1 kb in the upstream region of the CYP11B gene."[10]

"Using cloned chicken βA-globin genes, either individually or within the natural chromosomal locus, enhancer-dependent transcription is achieved in vitro at a distance of 2 kb with developmentally staged erythroid extracts. This occurs by promoter derepression and is critically dependent upon DNA topology. In the presence of the enhancer, genes must exist in a supercoiled conformation to be actively transcribed, whereas relaxed or linear templates are inactive. Distal protein–protein interactions in vitro may be favored on supercoiled DNA because of topological constraints."[11]

Distal promoter regions may be a relatively small number of nucleotides, fairly close to the TSS such as (-253 to -54)[12] or several regions of different lengths, many nucleotides away, such as (-2732 to -2600) and (-2830 to -2800).[13]

The "[d]istal promoter is not a spacer element."[14]

Using an estimate of 2 knts, a distal promoter to A1BG would be expected after nucleotide number 2460.

From the direction of ZNF497, any transcription factors before A1BG may be out to 2300 nts.
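The boundary numbers used above follow from subtracting a region length from the TSS position. The sketch below reproduces that arithmetic under stated assumptions: region lengths of 34, 250, and 2000 nts for the core, proximal, and distal promoters, and the hypothetical names PromoterWindows and promoter_windows. The text rounds the negative-direction core start to 4425.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromoterWindows:
    core_start: int
    proximal_start: int
    distal_start: int
    tss: int


def promoter_windows(tss, core_len=34, proximal_len=250, distal_len=2000):
    """Estimate where each promoter region begins, counting back from the TSS."""
    return PromoterWindows(
        core_start=tss - core_len,
        proximal_start=tss - proximal_len,
        distal_start=tss - distal_len,
        tss=tss,
    )


# Negative direction (ZSCAN22 to A1BG): TSS at nucleotide number 4460.
negative = promoter_windows(4460)
print(negative.core_start, negative.proximal_start, negative.distal_start)
# 4426 4210 2460

# Positive direction (ZNF497 to A1BG) with the extended data: TSS at 4300.
positive = promoter_windows(4300)
print(positive.core_start, positive.distal_start)
# 4266 2300
```

These are estimates only; the lengths are rules of thumb, and a promoter's real extent depends on the gene.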

Samplings

Regarding hypothesis 1

Hypothesis 1: A1BG is not transcribed by a GATA box.

The Basic programs (starting with SuccessablesGATAbox.bas) test consensus sequence 3'-(A/C/G)(A/G/T)(GATA)(A/G)(A/C)-5'. They were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the program, the sequence it looks for, the number of matches found, and each match followed by its nucleotide position:

  1. negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesGATA--.bas, looking for 3'-(A/C/G)(A/G/T)GATA(A/G)(A/C)-5', 0.
  2. negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesGATA-+.bas, looking for 3'-(A/C/G)(A/G/T)GATA(A/G)(A/C)-5', 0.
  3. positive strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesGATA+-.bas, looking for 3'-(A/C/G)(A/G/T)GATA(A/G)(A/C)-5', 2, 3'-GGGATAGA-5', 100, 3'-ATGATAGA-5', 355.
  4. positive strand in the positive direction (from ZNF497 to A1BG) is SuccessablesGATA++.bas, looking for 3'-(A/C/G)(A/G/T)GATA(A/G)(A/C)-5', 0.
  5. complement, negative strand, negative direction is SuccessablesGATAc--.bas, looking for 3'-(C/G/T)(A/C/T)CTAT(C/T)(G/T)-5', 2, 3'-CCCTATCT-5', 100, 3'-TACTATCT-5', 355.
  6. complement, negative strand, positive direction is SuccessablesGATAc-+.bas, looking for 3'-(C/G/T)(A/C/T)CTAT(C/T)(G/T)-5', 0.
  7. complement, positive strand, negative direction is SuccessablesGATAc+-.bas, looking for 3'-(C/G/T)(A/C/T)CTAT(C/T)(G/T)-5', 0.
  8. complement, positive strand, positive direction is SuccessablesGATAc++.bas, looking for 3'-(C/G/T)(A/C/T)CTAT(C/T)(G/T)-5', 0.
  9. inverse complement, negative strand, negative direction is SuccessablesGATAci--.bas, looking for 3'-(G/T)(C/T)TATC(A/C/T)(C/G/T)-5', 1, 3'-GTTATCAT-5', 2500.
  10. inverse complement, negative strand, positive direction is SuccessablesGATAci-+.bas, looking for 3'-(G/T)(C/T)TATC(A/C/T)(C/G/T)-5', 2, 3'-GTTATCCC-5', 3385, 3'-TTTATCAC-5', 4125.
  11. inverse complement, positive strand, negative direction is SuccessablesGATAci+-.bas, looking for 3'-(G/T)(C/T)TATC(A/C/T)(C/G/T)-5', 1, 3'-TTTATCTT-5', 1732.
  12. inverse complement, positive strand, positive direction is SuccessablesGATAci++.bas, looking for 3'-(G/T)(C/T)TATC(A/C/T)(C/G/T)-5', 2, 3'-GCTATCAG-5', 1840, 3'-TTTATCTT-5', 2628.
  13. inverse negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesGATAi--.bas, looking for 3'-(A/C)(A/G)ATAG(A/G/T)(A/C/G)-5', 1, 3'-AAATAGAA-5', 1732.
  14. inverse negative strand, positive direction is SuccessablesGATAi-+.bas, looking for 3'-(A/C)(A/G)ATAG(A/G/T)(A/C/G)-5', 2, 3'-CGATAGTC-5', 1840, 3'-AAATAGAA-5', 2628.
  15. inverse positive strand, negative direction is SuccessablesGATAi+-.bas, looking for 3'-(A/C)(A/G)ATAG(A/G/T)(A/C/G)-5', 1, 3'-CAATAGTA-5', 2500.
  16. inverse positive strand, positive direction is SuccessablesGATAi++.bas, looking for 3'-(A/C)(A/G)ATAG(A/G/T)(A/C/G)-5', 2, 3'-CAATAGGG-5', 3385, 3'-AAATAGTG-5', 4125.
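The four search patterns in the list above (direct, complement, inverse, and inverse complement) can all be derived from the one consensus. The following is a minimal Python sketch of that derivation and of an overlapping search; it is not a translation of the Basic programs. The function names and the toy sequence are hypothetical, and every sequence is read left to right exactly as written in the list.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def parse_consensus(text):
    """Split '(A/C/G)(A/G/T)GATA(A/G)(A/C)' into a list of allowed-base sets."""
    positions = []
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGT])", text):
        positions.append(set(group.split("/")) if group else {single})
    return positions


def complement(positions):
    """Complement every allowed base at every position."""
    return [{COMPLEMENT[base] for base in pos} for pos in positions]


def inverse(positions):
    """Reverse the order of the positions."""
    return positions[::-1]


def to_pattern(positions):
    """Render the positions as a regex string, e.g. '[ACG][AGT][G]...'."""
    return "".join("[" + "".join(sorted(pos)) + "]" for pos in positions)


def search(sequence, positions):
    """Return (1-based start, match) pairs, including overlapping matches."""
    lookahead = re.compile("(?=(" + to_pattern(positions) + "))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]


base = parse_consensus("(A/C/G)(A/G/T)GATA(A/G)(A/C)")
variants = {
    "direct": base,
    "complement": complement(base),
    "inverse": inverse(base),
    "inverse complement": inverse(complement(base)),
}

# Hypothetical toy sequence, not A1BG data.
toy = "TTGGGATAGATTGTTATCATTT"
for name, positions in variants.items():
    print(name, to_pattern(positions), search(toy, positions))
```

On the toy sequence only the direct pattern (GGGATAGA at 3) and the inverse complement pattern (GTTATCAT at 13) produce hits. Whether a reported position refers to the first or the last nucleotide of a match in the Basic programs is not stated in the source; the sketch reports the first.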

Regarding hypothesis 2

Hypothesis 2: If a GATA box is present, at least one transcription factor uses the GATA box to affect A1BG transcription.

GATA box and A1BG

A Google Scholar search using "GATA box" and A1BG did not match any articles.

GATA box and transcription factors present in A1BG

The PLAnt Cis-acting Regulatory DNA Elements Database (PLACE), where the "data files of PLACE are: place.dat and place.seq (version 30.0, 469 entries, Jan. 8th, 2007)",[15][16] currently has 512 entries. Some of these appear to correspond to transcription factors found so far that may occur in the promoters of A1BG. More recent databases are the HumanTFDB, AnimalTFDB2, and AnimalTFDB3.

More recent still is "The 26th annual Nucleic Acids Research database issue and Molecular Biology Database Collection".[17]

Verifications

To verify that your sampling has explored something, you may need a control group. A place, a time, or a condition without your entity, source, or object may serve.

Another verifier is reproducibility. Can you replicate something about your entity in your laboratory more than 3 times? Five times is usually a beginning number to provide statistics (data) about it.

For an apparent one-time or perception event, document or record as much coincident information as possible. Was there a butterfly nearby?

Has anyone else perceived the entity and recorded something about it?

Gene ID: 1 includes the nucleotides between neighboring genes and A1BG. These nucleotides can be loaded into files from either gene toward A1BG, and from template and coding strands. These nucleotide sequences can be found in Gene transcriptions/A1BG. Copying the GATA boxes discovered above and pasting the sequences into "⌘F" locates these sequences at the same nucleotide positions as found by the computer programs.

"In humans, telomerase is composed of a reverse transcriptase (hTERT), which uses the RNA component (hTERC) to dock onto the 3′ single-stranded telomere end. hTERT may then processively synthesise telomeric repeats from the template provided by hTERC, before dissociating7–9. All telomerase RNAs possess a 3′ end element necessary for its stability10. In hTERC, this is two stem-loop structures separated by an H-box (ANANNA) and ACA motif (H/ACA). The binding of telomerase factors dyskerin, NOP10, and NHP2 at the H/ACA motif form the so-called ‘pre-ribonucleoprotein complex’, before GAR1 binds in transition to the mature RNP11,12. hTERC then binds to chaperone TCAB1, which assists its trafficking to the Cajal bodies where the functional telomerase complex localises13. Recruitment to the telomeres in S-phase is mediated by the protective complex shelterin14,15. Correct assembly of the telomerase complex, with appropriate co-factors for maturation, stability, and subcellular localisation, is necessary for its function and thus telomere maintenance."[18]

Core promoter GATA boxes

From the first nucleotide just after ZSCAN22 to the first nucleotide just before A1BG are 4460 nucleotides. The core promoter on this side of A1BG extends from approximately 4425 to the possible transcription start site at nucleotide number 4460.

From the first nucleotide just after ZNF497 to the first nucleotide just before A1BG are 858 nucleotides. The core promoter on this side of A1BG extends from approximately 824 to the possible transcription start site at nucleotide number 858. Nucleotides (nts) have been added from ZNF497 to A1BG. The TSS for A1BG is now at 4300 nts from just on the other side of ZNF497. The core promoter should now be from 4266 to 4300.

Proximal promoter GATA boxes

The proximal promoter begins about nucleotide number 4210 in the negative direction.

Distal promoter GATA boxes

Using an estimate of 2 knts, a distal promoter to A1BG would be expected after nucleotide number 2460 in the negative direction.
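The boundaries quoted in the three sections above can be used to assign a match position to a promoter region. The sketch below assumes negative-direction numbering (ZSCAN22 to A1BG) with the TSS at 4460, the core promoter from 4425, the proximal promoter from 4210, and the distal promoter from 2460. The function name classify_position and the label "upstream of distal" are hypothetical, and a position beyond the TSS is assumed to lie in the transcribed region.

```python
def classify_position(position, tss=4460, core_start=4425,
                      proximal_start=4210, distal_start=2460):
    """Name the promoter region holding a 1-based position (negative direction)."""
    if position > tss:
        return "transcribed"  # assumption: past the TSS means inside the gene
    if position >= core_start:
        return "core"
    if position >= proximal_start:
        return "proximal"
    if position >= distal_start:
        return "distal"
    return "upstream of distal"


# Negative-direction positions reported in the samplings above.
for position in (100, 355, 1732, 2500):
    print(position, classify_position(position))
```

Under these boundary estimates only position 2500 falls inside the distal promoter; positions 100, 355, and 1732 lie further upstream. The assignment is only as reliable as the boundary estimates themselves.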

Transcribed GATA boxes

Laboratory reports

Below is an outline for sections of a report, paper, manuscript, log book entry, or lab book entry.

Abstract

Introduction

Many transcription factors (TFs) may occur upstream and occasionally downstream of the transcription start site (TSS), in this gene's promoter. The following have been examined so far: (1) AGC boxes (GCC boxes), (2) ATA boxes, (3) CAAT boxes, (4) C and D boxes, (5) CAREs (GA responsive complexes), (6) CArG boxes, (7) CENP-B boxes, (8) CGCG boxes, (9) CRE boxes, (10) DREB boxes, (11) EIF4E basal elements (4EBEs), (12) enhancer boxes (E boxes), (13) E2 boxes, (14) Factor II B recognition elements, (15) GAREs (GA responsive complexes), (16) G boxes, (17) GC boxes, (18) GLM boxes, (19) H boxes, (20) HNF6s, (21) HY boxes, (22) Metal responsive elements (MREs), (23) Motif ten elements (MTEs), (24) Pyrimidine boxes (GA responsive complexes), (25) STAT5s, (26) TACTAAC boxes, (27) TATA boxes, (28) TAT boxes (GA responsive complexes), (29) TATCCAC boxes, (30) W boxes (GA responsive complexes), (31) X boxes, and (32) Y boxes.

However, no (3) CAAT box, (7) CENP-B box, (10) DREB box, (11) EIF4E basal element, (13) E2 box, (16) G box, (18) GLM box, (23) MTE, (26) TACTAAC box, (28) TAT box, (29) TATCCAC box, (31) X box, or (32) Y box occurs, and the (8) CGCG boxes and (15) GAREs are too close to ZSCAN22.

Interactions may occur with (1) an AGC (GCC) box, (2) an ATA box, (4) C boxes, a D box, but the other C-box and D-box have not been tested, (5) CAREs, (6) CArG boxes, (9) a CRE box, (12) enhancer boxes, (14) a BREu, (17) GC boxes, (19) H box, (20) HNF6s, (21) HY boxes, (22) an MRE, (24) pyrimidine boxes, (25) STAT5s, (27) TATA boxes outside the core promoter, or (30) W boxes.

Experiments

Regarding hypothesis 1: A1BG is not transcribed by a GATA box, if a GATA box is not present in the promoter of A1BG.

The Basic programs (starting with SuccessablesGATAbox.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), including the extended number of nts from 958 to 4445, looking for GATA boxes, their possible complements and inverses, to test the hypothesis that consensus sequence 5'-(A/C/G)(A/G/T)(GATA)(A/G)(A/C)-3' is not present in the promoter of A1BG.

Results

Hypothesis 1

A1BG is not transcribed by a GATA box.

ZSCAN22 and A1BG
ZNF497 and A1BG

Discussions

If GATA boxes can occur at additional TSS locations, then A1BG can have multiple TSSs.

Hypothesis 1 discussion

Hypothesis 2 discussion

Conclusions

Laboratory evaluations

No wet chemistry experiments were performed to confirm that Gene ID: 1 may be transcribed from either side using transcription factors in the core, proximal or distal promoters. The NCBI Gene database is generalized, whereas individual human genome testing could demonstrate that A1BG is transcribed from either side using known transcription factors. Sufficient nucleotides have been added to the data sets for the ZNF497 side to confirm likely transcription of A1BG by these known transcription factors.

See also

References

  1. William C. Aird, Jeffrey D. Parvin, Phillip A. Sharp, and Robert D. Rosenberg (14 January 1994). "The Interaction of GATA-binding Proteins and Basal Transcription Factors with GATA Box-containing Core Promoters" (PDF). The Journal of Biological Chemistry. 269 (2): 883–9. Retrieved 2 January 2020.
  2. Robert G. K. Donald and Anthony R. Cashmore (1990). "Mutation of either G box or I box sequences profoundly affects expression from the Arabidopsis rbcS‐1A promoter". The EMBO Journal. 9 (6): 1717–1726. doi:10.1002/j.1460-2075.1990.tb08295.x. Retrieved 8 November 2018.
  3. Annkatrin Rose, Iris Meier and Udo Wienand (28 October 1999). "The tomato I-box binding factor LeMYBI is a member of a novel class of Myb-like proteins". The Plant Journal. 20 (6): 641–652. doi:10.1046/j.1365-313X.1999.00638.x. Retrieved 8 November 2018.
  4. RefSeq (July 2008). "GATA1 GATA binding protein 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 30 December 2019.
  5. Naoshi Obara, Norio Suzuki, Kibom Kim, Toshiro Nagasawa, Shigehiko Imagawa, and Masayuki Yamamoto (15 May 2008). "Repression via the GATA box is essential for tissue-specific erythropoietin gene expression" (PDF). Blood. 111 (10): 5223–32. doi:10.1182/blood-2007-10-115857. Retrieved 1 January 2020.
  6. Jennifer E.F. Butler, James T. Kadonaga (15 October 2002). "The RNA polymerase II core promoter: a key component in the regulation of gene expression". Genes & Development. 16 (20): 2583–2592. doi:10.1101/gad.1026202. PMID 12381658.
  7. Stephen T. Smale and James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter" (PDF). Annual Review of Biochemistry. 72 (1): 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. Retrieved 2012-05-07.
  8. Jennifer F. Kugel and James A. Goodrich (2017). "Finding the start site: redefining the human initiator element" (PDF). Genes & Development. 31 (1–2): 1. doi:10.1101/gad.295980.117. Retrieved 9 May 2019.
  9. Thomas Shafee and Rohan Lowe (9 March 2017). "Eukaryotic and prokaryotic gene structure" (PDF). WikiJournal of Medicine. 4 (1): 2. doi:10.15347/wjm/2017.002. Retrieved 2017-04-06.
  10. Koichi Takayama, Ken-ichirou Morohashi, Shin-ichiro Honda, Nobuyuki Hara and Tsuneo Omura (1 July 1994). "Contribution of Ad4BP, a Steroidogenic Cell-Specific Transcription Factor, to Regulation of the Human CYP11A and Bovine CYP11B Genes through Their Distal Promoters". The Journal of Biochemistry. 116 (1): 193–203. doi:10.1093/oxfordjournals.jbchem.a124493. Retrieved 2017-08-16.
  11. Michelle Craig Barton, Navid Madani, and Beverly M. Emerson (8 July 1997). "Distal enhancer regulation by promoter derepression in topologically constrained DNA in vitro". Proceedings of the National Academy of Sciences of the United States of America. 94 (14): 7257–62. Retrieved 2017-08-16.
  12. A Aoyama, T Tamura, K Mikoshiba (March 1990). "Regulation of brain-specific transcription of the mouse myelin basic protein gene: function of the NFI-binding site in the distal promoter". Biochemical and Biophysical Research Communications. 167 (2): 648–53. doi:10.1016/0006-291X(90)92074-A. Retrieved 2012-12-13.
  13. J Gao and L Tseng (June 1996). "Distal Sp3 binding sites in the hIGBP-1 gene promoter suppress transcriptional repression in decidualized human endometrial stromal cells: identification of a novel Sp3 form in decidual cells". Molecular Endocrinology. 10 (6): 613–21. doi:10.1210/me.10.6.613. Retrieved 2012-12-13.
  14. Peter Pasceri, Dylan Pannell, Xiumei Wu, and James Ellis (July 15, 1998). "Full activity from human β-globin locus control region transgenes requires 5′ HS1, distal β-globin promoter, and 3′ β-globin sequences". Blood. 92 (2): 653–63. Retrieved 2012-12-13.
  15. Kenichi Higo, Yoshihiro Ugawa, Masao Iwamoto, and Tomoko Korenaga (8 January 2007). "NARO DNA Bank". Japan: National Agriculture and Food Research Organization. Retrieved 3 January 2020.
  16. Kenichi Higo, Yoshihiro Ugawa, Masao Iwamoto, and Tomoko Korenaga (1 January 1999). "Plant cis-acting regulatory DNA elements (PLACE) database: 1999". Nucleic Acids Research. 27 (1): 297–300. doi:10.1093/nar/27.1.297. Retrieved 3 January 2020.
  17. Daniel J Rigden, Xosé M Fernández (8 January 2019). "The 26th annual Nucleic Acids Research database issue and Molecular Biology Database Collection". Nucleic Acids Research. 47 (D1): D1–D1101. Retrieved 3 January 2020.
  18. Laura C. Collopy, Tracy L. Ware, Tomas Goncalves, Sunnvør í Kongsstovu, Qian Yang, Hanna Amelina, Corinne Pinder, Ala Alenazi, Vera Moiseeva, Siân R. Pearson, Christine A. Armstrong & Kazunori Tomita (2018). "LARP7 family proteins have conserved function in telomerase assembly" (PDF). Nature Communications. 9 (557): 1–8. doi:10.1038/s41467-017-02296-4. Retrieved 1 August 2019.
