TATA box heterogeneous nuclear ribonucleoprotein family

The TATA box (also called Goldberg-Hogness box)[1] is a DNA sequence (cis-regulatory element) found in the promoter region of genes in archaea and eukaryotes;[2] approximately 24% of human genes contain a TATA box within the core promoter.[3]

Human genes

"TATA-containing genes are more often highly regulated, such as by biotic or stress stimuli."[4] Only "∼10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR)."[4]

"SRF-regulated genes of the actin/cytoskeleton/contractile family tend to have a TATA box."[5]

Different "TATA box sequences have different abilities to convey the activating signals of certain enhancers and activators in mammalian cells [...] and in yeast [...]."[5]

"SRF is a well established master regulator of the specific family of genes encoding the actin cytoskeleton and contractile apparatus [...], and we found that ~40% of the core promoters for these genes contain a TATA box, which is a significant enrichment compared to the low overall frequency of TATA-containing promoters in human and mouse genomes (...)."[5] "Global frequencies of core promoter types for human [9010 orthologous mouse-human promoter pairs with 1848 TATA-containing or 7162 TATA-less][6] genes with experimentally validated transcription start sites [are known from 2006]."[5] "The TATA box [...] has a consensus sequence of TATAWAAR [...]."[5] W = A or T and R = A or G. We "estimate that ~17% of promoters contain a TATA box".[6]

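As an illustration of how the consensus quoted above can be read, the following minimal sketch expands the degenerate symbols (W = A or T, R = A or G) into a regular expression and scans a sequence for matches. The function names and the example sequence are hypothetical and are not taken from the cited sources; this is one simple way to test a sequence against TATAWAAR, not the method used in the cited analyses.

```python
import re

# Degenerate symbols as defined in the text: W = A or T, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "R": "[AG]"}


def consensus_to_regex(consensus):
    """Translate a consensus string such as TATAWAAR into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def find_matches(consensus, sequence):
    """Return (0-based start, matched text) for each non-overlapping match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


# Hypothetical example sequence, for illustration only.
example = "GGCTATAAAAGGC"
print(find_matches("TATAWAAR", example))  # [(3, 'TATAAAAG')]
```

Only the four bases and the two degenerate symbols used in the quoted consensus are handled here; any other symbol would raise a KeyError.
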
Gene ID: 2521

"This gene encodes a multifunctional protein component of the heterogeneous nuclear ribonucleoprotein (hnRNP) complex. The hnRNP complex is involved in pre-mRNA splicing and the export of fully processed mRNA to the cytoplasm. This protein belongs to the FET family of RNA-binding proteins which have been implicated in cellular processes that include regulation of gene expression, maintenance of genomic integrity and mRNA/microRNA processing. Alternative splicing results in multiple transcript variants. Defects in this gene result in amyotrophic lateral sclerosis type 6."[7]

This gene has an apparent TATA box, -33 "GCGTACT" -27, at a likely TATA box location, with a BRE- upstream and an INR+ and MTE+ downstream.[6]
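
The position labels used in these entries (for example -33 to -27) span seven nucleotides when both ends are counted. The sketch below shows one way such a window could be pulled from an upstream sequence, assuming positions are counted from the transcription start site with the base immediately 5' of it at -1. That convention, the function name and the N-padded sequence are assumptions made for illustration and are not stated in the cited source.

```python
def promoter_window(upstream, start, end):
    """Return the bases from position start to end, inclusive.

    upstream is the sequence immediately 5' of the transcription start site,
    written 5'->3', so its last base is taken to be position -1 (assumed).
    Both start and end are negative, with start <= end.
    """
    n = len(upstream)
    if not (-n <= start <= end <= -1):
        raise ValueError("need -len(upstream) <= start <= end <= -1")
    return upstream[n + start : n + end + 1]


# Hypothetical 40-nt upstream region: N padding around the quoted heptamer.
upstream = "N" * 7 + "GCGTACT" + "N" * 26
window = promoter_window(upstream, -33, -27)
print(window, len(window))  # GCGTACT 7
```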

Gene ID: 3182

"This gene belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are produced by RNA polymerase II and are components of the heterogeneous nuclear RNA (hnRNA) complexes. They are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene, which binds to one of the components of the multiprotein editosome complex, has two repeats of quasi-RRM (RNA recognition motif) domains that bind to RNAs. Two alternatively spliced transcript variants encoding different isoforms have been described for this gene."[8]

Gene ID: 3183

"This gene belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA binding proteins and they complex with heterogeneous nuclear RNA (hnRNA). These proteins are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene can act as a tetramer and is involved in the assembly of 40S hnRNP particles. Multiple transcript variants encoding at least two different isoforms have been described for this gene."[9]

This gene does not have a TATA box in its core promoter but does have a BRE- and a DPE+.[6]

Gene ID: 3184

"This gene belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are nucleic acid binding proteins and they complex with heterogeneous nuclear RNA (hnRNA). These proteins are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene has two repeats of quasi-RRM domains that bind to RNAs. It localizes to both the nucleus and the cytoplasm. This protein is implicated in the regulation of mRNA stability. Alternative splicing of this gene results in four transcript variants."[10]

At the approximate location of the TATA box this gene has -32 "TGGTGCT" -26, and an INR- and a MTE+ in its core promoter.[6]

Gene ID: 3185

"This gene belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA binding proteins that complex with heterogeneous nuclear RNA (hnRNA). These proteins are associated with pre-mRNAs in the nucleus and regulate alternative splicing, polyadenylation, and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene has three repeats of quasi-RRM domains that bind to RNAs which have guanosine-rich sequences. This protein is very similar to the family member hnRPH. Multiple alternatively spliced variants, encoding the same protein, have been identified."[11]

There is no TATA box in the core promoter of this gene but there is a BRE+.[6]

Gene ID: 3187

"This gene encodes a member of a subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA binding proteins that complex with heterogeneous nuclear RNA. These proteins are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some may shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene has three repeats of quasi-RRM domains that bind to RNA and is very similar to the family member HNRPF. This gene may be associated with hereditary lymphedema type I. Alternatively spliced transcript variants have been described."[12]

This gene has -30 "TATTTAG" -24 at an apparent TATA+ box position with an INR+ and DPE- downstream.[6]

Gene ID: 3188

"This gene belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA binding proteins and they complex with heterogeneous nuclear RNA (hnRNA). These proteins are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene has three repeats of quasi-RRM domains that binds to RNAs. It is very similar to the family member HNRPH1. This gene is thought to be involved in Fabray disease and X-linked agammaglobulinemia phenotype. Alternative splicing results in multiple transcript variants encoding the same protein. Read-through transcription between this locus and the ribosomal protein L36a gene has been observed."[13]

Of two versions of this gene, NM_019597 and U78027, neither has a TATA box in its core promoter, but both contain an INR+ and U78027 also contains a MTE+.[6]

Gene ID: 3190

"This gene belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA binding proteins and they complex with heterogeneous nuclear RNA (hnRNA). These proteins are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene is located in the nucleoplasm and has three repeats of KH domains that binds to RNAs. It is distinct among other hnRNP proteins in its binding preference; it binds tenaciously to poly(C). This protein is also thought to have a role during cell cycle progession. Several alternatively spliced transcript variants have been described for this gene, however, not all of them are fully characterized."[14]

Neither version of this gene, CR590155 or X72727, has a TATA box in its core promoter; CR590155 has an INR+ and MTE-, and X72727 has a BRE+, INR- and MTE+.[6]

Gene ID: 3191

"Heterogeneous nuclear RNAs (hnRNAs) which include mRNA precursors and mature mRNAs are associated with specific proteins to form heterogenous ribonucleoprotein (hnRNP) complexes. Heterogeneous nuclear ribonucleoprotein L is among the proteins that are stably associated with hnRNP complexes and along with other hnRNP proteins is likely to play a major role in the formation, packaging, processing, and function of mRNA. Heterogeneous nuclear ribonucleoprotein L is present in the nucleoplasm as part of the HNRP complex. HNRP proteins have also been identified outside of the nucleoplasm. Exchange of hnRNP for mRNA-binding proteins accompanies transport of mRNA from the nucleus to the cytoplasm. Since HNRP proteins have been shown to shuttle between the nucleus and the cytoplasm, it is possible that they also have cytoplasmic functions. Two transcript variants encoding different isoforms have been found for this gene."[15]

At the approximate location of a TATA box, the core promoter of this gene has the nucleotide sequence -36 "TTTAAGG" -30, with an INR- and a MTE+ further downstream.[6]

Gene ID: 3192

"This gene encodes a member of a family of proteins that bind nucleic acids and function in the formation of ribonucleoprotein complexes in the nucleus with heterogeneous nuclear RNA (hnRNA). The encoded protein has affinity for both RNA and DNA, and binds scaffold-attached region (SAR) DNA. Mutations in this gene have been associated with epileptic encephalopathy, early infantile, 54. A pseudogene of this gene has been identified on chromosome 14."[16]

This gene has no TATA box in its core promoter but it does have a BRE+, an INR- and a MTE-.[6]

Gene ID: 5093

"This intronless gene is thought to have been generated by retrotransposition of a fully processed PCBP-2 mRNA. This gene and PCBP-2 have paralogues (PCBP3 and PCBP4) which are thought to have arisen as a result of duplication events of entire genes. The protein encoded by this gene appears to be multifunctional. It along with PCBP-2 and hnRNPK corresponds to the major cellular poly(rC)-binding protein. It contains three K-homologous (KH) domains which may be involved in RNA binding. This encoded protein together with PCBP-2 also functions as translational coactivators of poliovirus RNA via a sequence-specific interaction with stem-loop IV of the IRES and promote poliovirus RNA replication by binding to its 5'-terminal cloverleaf structure. It has also been implicated in translational control of the 15-lipoxygenase mRNA, human Papillomavirus type 16 L2 mRNA, and hepatitis A virus RNA. The encoded protein is also suggested to play a part in formation of a sequence-specific alpha-globin mRNP complex which is associated with alpha-globin mRNA stability."[17] Also known as HNRPX; HNRPE1; hnRNP-X; and hnRNP-E1.[17]

This gene has no TATA box in its core promoter but there is a MTE- downstream.[6]

Gene ID: 9987

"This gene belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA binding proteins and they complex with heterogeneous nuclear RNA (hnRNA). These proteins are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene has two RRM domains that bind to RNAs. Three alternatively spliced transcript variants have been described for this gene. One of the variants is probably not translated because the transcript is a candidate for nonsense-mediated mRNA decay. The protein isoforms encoded by this gene are similar to its family member HNRPD."[18]

This gene does not have a TATA box in its core promoter but does have a BRE+ and an INR-.[6]

Gene ID: 10236

"This gene encodes an RNA-binding protein that is a member of the spliceosome C complex, which functions in pre-mRNA processing and transport. The encoded protein also promotes transcription at the c-fos gene. Alternative splicing results in multiple transcript variants. There are pseudogenes for this gene on chromosomes 4, 11, and 10."[19]

This gene has no TATA box in its core promoter but has an INR+ and a MTE+ downstream.[6]

Gene ID: 10492

"This gene encodes a member of the cellular heterogeneous nuclear ribonucleoprotein (hnRNP) family. hnRNPs are RNA binding proteins that complex with heterogeneous nuclear RNA (hnRNA) and regulate alternative splicing, polyadenylation, and other aspects of mRNA metabolism and transport. The encoded protein plays a role in multiple aspects of mRNA maturation and is associated with several multiprotein complexes including the apoB RNA editing-complex and survival of motor neurons (SMN) complex. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene, and a pseudogene of this gene is located on the short arm of chromosome 20."[20] Also known as HNRNPQ; HNRPQ1; and hnRNP-Q.[20]

This gene has no TATA box in its core promoter but does have a MTE- downstream.[6]

Gene ID: 11100

"This gene encodes a nuclear RNA-binding protein of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. This protein binds specifically to adenovirus early-1B-55kDa oncoprotein. It may play an important role in nucleocytoplasmic RNA transport, and its function is modulated by early-1B-55kDa in adenovirus-infected cells."[21]

This gene does not have a TATA box in its core promoter but includes an INR+ and a MTE+.[6]

Gene ID: 22913

"This gene encodes a member of the heterogeneous nuclear ribonucleoprotein (hnRNP) gene family. This protein may play a role in pre-mRNA splicing and in embryonic development. Alternate splicing results in multiple transcript variants."[22] Also known as HNRPCL2.[22]

This gene does not have a TATA box in its core promoter but it does have an INR- and a DPE+.[6]

Gene ID: 27316

"This gene belongs to the RBMY gene family which includes candidate Y chromosome spermatogenesis genes. This gene, an active X chromosome homolog of the Y chromosome RBMY gene, is widely expressed whereas the RBMY gene evolved a male-specific function in spermatogenesis. Pseudogenes of this gene, found on chromosomes 1, 4, 9, 11, and 6, were likely derived by retrotransposition from the original gene. Alternatively spliced transcript variants encoding different isoforms have been identified. A snoRNA gene (SNORD61) is found in one of its introns."[23] Also known as HNRNPG.[23]

This gene does not have a TATA box in its core promoter but does have a BRE+, an INR+ and a MTE+.[6]

Gene ID: 92906

"HNRNPLL is a master regulator of activation-induced alternative splicing in T cells. In particular, it alters splicing of CD45 (PTPRC; MIM 151460), a tyrosine phosphatase essential for T-cell development and activation (Oberdoerffer et al., 2008 [PubMed 18669861])."[24]

This gene has no TATA box in its core promoter but has a MTE-.[6]

Gene ID: 138046

"Enables identical protein binding activity. Located in nucleoplasm."[25] Also known as HNRPCL3.[25]

This gene has an apparent TATA box of -34 "AAAAAAA" -28 in its core promoter and has an INR+ and a MTE+.[6]

Families of TATA box genes

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

References

  1. R. P. Lifton, M. L. Goldberg, R. W. Karp, and D. S. Hogness (1978). "The organization of the histone genes in Drosophila melanogaster: functional and evolutionary implications". Cold Spring Harbor Symposia on Quantitative Biology. 42: 1047–51. doi:10.1101/SQB.1978.042.01.105. PMID 98262.
  2. Stephen T. Smale and James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter" (PDF). Annual Review of Biochemistry. 72 (1): 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. Retrieved 2012-05-07.
  3. C Yang, E Bolotin, T Jiang, FM Sladek, E Martinez (March 2007). "Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMID 17123746.
  4. Chuhu Yang, Eugene Bolotin, Tao Jiang, Frances M. Sladek, and Ernest Martinez (10 October 2006). "Prevalence of the Initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMID 17123746. Retrieved 2024-06-07.
  5. Muyu Xu, Elsie Gonzalez-Hurtado, and Ernest Martinez (April 2016). "Core promoter-specific gene regulation: TATA box selectivity and Initiator-dependent bi-directionality of serum response factor-activated transcription". Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms. 1859 (4): 553–563. doi:10.1016/j.bbagrm.2016.01.005. Retrieved 2024-06-08.
  6. Victor X Jin, Gregory AC Singer, Francisco J Agosto-Pérez, Sandya Liyanarachchi, and Ramana V Davuluri (2006). "Genome-wide analysis of core promoter elements from conserved human and mouse orthologous pairs". BMC Bioinformatics. 7: 114. doi:10.1186/1471-2105-7-114. Retrieved 2024-06-09.
  7. RefSeq (September 2009). "FUS FUS RNA binding protein [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  8. RefSeq (July 2008). "HNRNPAB heterogeneous nuclear ribonucleoprotein A/B [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-26.
  9. RefSeq (July 2008). "HNRNPC heterogeneous nuclear ribonucleoprotein C [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  10. RefSeq (July 2008). "HNRNPD heterogeneous nuclear ribonucleoprotein D [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  11. RefSeq (July 2008). "HNRNPF heterogeneous nuclear ribonucleoprotein F [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  12. RefSeq (March 2012). "HNRNPH1 heterogeneous nuclear ribonucleoprotein H1 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  13. RefSeq (January 2011). "HNRNPH2 heterogeneous nuclear ribonucleoprotein H2 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  14. RefSeq (July 2008). "HNRNPK heterogeneous nuclear ribonucleoprotein K [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  15. RefSeq (July 2008). "HNRNPL heterogeneous nuclear ribonucleoprotein L [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  16. RefSeq (June 2017). "HNRNPU heterogeneous nuclear ribonucleoprotein U [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  17. RefSeq (July 2008). "PCBP1 poly(rC) binding protein 1 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  18. RefSeq (May 2011). "HNRNPDL heterogeneous nuclear ribonucleoprotein D like [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  19. RefSeq (July 2014). "HNRNPR heterogeneous nuclear ribonucleoprotein R [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  20. RefSeq (December 2011). "SYNCRIP synaptotagmin binding cytoplasmic RNA interacting protein [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  21. RefSeq (March 2016). "HNRNPUL1 heterogeneous nuclear ribonucleoprotein U like 1 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-26.
  22. RefSeq (September 2011). "RALY RALY heterogeneous nuclear ribonucleoprotein [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  23. RefSeq (September 2009). "RBMX RNA binding motif protein X-linked [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  24. OMIM (August 2008). "HNRNPLL heterogeneous nuclear ribonucleoprotein L like [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.
  25. Alliance of Genome Resources (April 2022). "RALYL RALY RNA binding protein like [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-07-25.

External links