TATA box gene transcriptions

Editor-In-Chief: Henry A. Hoff

This image is a drawing of Haloquadratum walsbyi. Credit: Rotational.

The TATA box (also called Goldberg-Hogness box)[1] is a DNA sequence (cis-regulatory element) found in the promoter region of genes in archaea and eukaryotes;[2] approximately 24% of human genes contain a TATA box within the core promoter.[3]

The TATA box is a binding site for either general transcription factors or histones.

Consensus sequences

In the direction of transcription, the TATA box has the core DNA sequence 5'-TATAAA-3' or a variant on the coding (non-template) strand, usually followed by three or more adenine (A) bases, for example 5'-TATAAA(A)AAA-3'; the template strand carries the complementary sequence.

"[M]ost of the diversity within metazoan core promoters appears to involve the variable occurrence of consensus or near-consensus TATA, Inr, and DPE elements."[4]

The TATA box can be an AT-rich sequence "located at a fixed distance upstream of the transcription start site".[2]
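The idea of a motif sitting at a roughly fixed distance upstream of the transcription start site (TSS) can be sketched in a few lines of Python. This is only an illustration: the function name `find_tata`, the default motif `TATAAA`, and the 40-nt example sequence are assumptions made up here, not data from the cited sources. The sketch assumes the upstream sequence is written so that its last base is position -1 and the motif reads TATAAA left to right.

```python
def find_tata(upstream, motif="TATAAA"):
    """Return (start, end) positions, relative to the TSS, of each motif hit.

    `upstream` is the sequence immediately upstream of the TSS, written so
    that its last base is position -1 (there is no position 0).
    """
    n = len(upstream)
    hits = []
    i = upstream.find(motif)
    while i != -1:
        start = i - n                  # position of the first base of the hit
        end = start + len(motif) - 1   # position of the last base of the hit
        hits.append((start, end))
        i = upstream.find(motif, i + 1)  # continue, allowing overlapping hits
    return hits


# Hypothetical 40-nt upstream region, invented for illustration only.
example = "GC" * 5 + "TATAAAA" + "GC" * 11 + "G"
print(find_tata(example))
```

For this invented sequence the single hit spans -30 to -25, which falls in the same window as the positions reported for the human genes listed below.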

Histones

The binding of a transcription factor blocks the binding of a histone and vice versa.

Gene expressions

Although it is harder to regulate the transcription of genes with multiple transcription start sites, "variations in the expression of a constitutive gene would be minimized by the use of multiple start sites."[5]

Earlier "studies led to the design of a super core promoter (SCP) that contains a TATA, Inr, MTE, and DPE in a single promoter (Juven-Gershon et al., 2006b). The SCP is the strongest core promoter observed in vitro and in cultured cells and yields high levels of transcription in conjunction with transcriptional enhancers. These findings indicate that gene expression levels can be modulated via the core promoter."[5]

Gene ID: 60

The "natural serum-responsive upstream activating sequence (UAS) of the human β-actin gene (ACTB) selectively activates TATA box-dependent but not INR-dependent transcription via a mechanism that involves the serum response factor (SRF) activator and the TATA-binding/bending activity of TBP in live human cells."[6]

Human genes

"TATA-containing genes are more often highly regulated, such as by biotic or stress stimuli."[3] Only "∼10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR)."[3]

"SRF-regulated genes of the actin/cytoskeleton/contractile family tend to have a TATA box."[6]

Different "TATA box sequences have different abilities to convey the activating signals of certain enhancers and activators in mammalian cells [...] and in yeast [...]."[6]

"SRF is a well established master regulator of the specific family of genes encoding the actin cytoskeleton and contractile apparatus [...], and we found that ~40% of the core promoters for these genes contain a TATA box, which is a significant enrichment compared to the low overall frequency of TATA-containing promoters in human and mouse genomes (...)."[6] "Global frequencies of core promoter types for human [9010 orthologous mouse-human promoter pairs with 1848 TATA-containing or 7162 TATA-less][7] genes with experimentally validated transcription start sites [are known from 2006]."[6] "The TATA box [...] has a consensus sequence of TATAWAAR [...]."[6] W = A or T and R = A or G. We "estimate that ~17% of promoters contain a TATA box".[7]
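As a rough illustration of how the ambiguity codes work, the consensus TATAWAAR can be turned into a regular expression and tested against candidate sequences. The sketch below is an assumption-laden example, not the method used in the cited studies: the names `IUPAC`, `consensus_to_regex`, and `matches_consensus` are hypothetical, and only the two ambiguity codes defined above (W and R) are handled.

```python
import re

# Only the codes needed for the consensus in the text: W = A or T, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "R": "[AG]"}


def consensus_to_regex(consensus):
    """Translate a consensus string with W/R codes into a compiled regex."""
    return re.compile("".join(IUPAC[c] for c in consensus))


def matches_consensus(seq, consensus="TATAWAAR"):
    """True if `seq` matches the consensus exactly, position by position."""
    return consensus_to_regex(consensus).fullmatch(seq) is not None


print(matches_consensus("TATAAAAG"))  # W = A, R = G
print(matches_consensus("TATATAAA"))  # W = T, R = A
print(matches_consensus("TATAAAAC"))  # C is not a purine, so R fails
```

Because W and R each stand for two bases, TATAWAAR expands to four concrete sequences; any other 8-mer fails the exact match.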

Gene ID: 2

"The protein encoded by this gene is a protease inhibitor and cytokine transporter. It uses a bait-and-trap mechanism to inhibit a broad spectrum of proteases, including trypsin, thrombin and collagenase. It can also inhibit inflammatory cytokines, and it thus disrupts inflammatory cascades. Mutations in this gene are a cause of alpha-2-macroglobulin deficiency. This gene is implicated in Alzheimer's disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. A related pseudogene, which is also located on the p arm of chromosome 12, has been identified."[8]

Gene ID: 58

"The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause a variety of myopathies, including nemaline myopathy, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects with manifestations such as hypotonia."[9] It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
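Positions such as "-28 to -22" are inclusive, so the range covers (-22) - (-28) + 1 = 7 nucleotides, which matches the 7-nt sequence TATAAAA. A minimal sketch of that arithmetic follows; the helper names `span_length` and `is_consistent` are hypothetical and are introduced only to make the check explicit for entries of this form.

```python
def span_length(start, end):
    """Number of bases from `start` to `end`, inclusive.

    Both positions are upstream of the TSS, so both must be negative.
    """
    if not (start <= end < 0):
        raise ValueError("expected start <= end < 0")
    return end - start + 1


def is_consistent(seq, start, end):
    """True if the sequence length equals the length of the quoted range."""
    return len(seq) == span_length(start, end)


# Values taken from the entry above: TATAAAA from -28 to -22.
print(span_length(-28, -22))
print(is_consistent("TATAAAA", -28, -22))
```

The same check can be applied to any other entry that gives a TATA box sequence together with its start and end positions.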

Gene ID: 70

"Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC)."[10]

Gene ID: 133

"The protein encoded by this gene is a preprohormone which is cleaved to form two biologically active peptides, adrenomedullin and proadrenomedullin N-terminal 20 peptide. Adrenomedullin is a 52 aa peptide with several functions, including vasodilation, regulation of hormone secretion, promotion of angiogenesis, and antimicrobial activity. The antimicrobial activity is antibacterial, as the peptide has been shown to kill E. coli and S. aureus at low concentration."[11]

Gene ID: 183

"The protein encoded by this gene, pre-angiotensinogen or angiotensinogen precursor, is expressed in the liver and is cleaved by the enzyme renin in response to lowered blood pressure. The resulting product, angiotensin I, is then cleaved by angiotensin converting enzyme (ACE) to generate the physiologically active enzyme angiotensin II. The protein is involved in maintaining blood pressure, body fluid and electrolyte homeostasis, and in the pathogenesis of essential hypertension and preeclampsia. Mutations in this gene are associated with susceptibility to essential hypertension, and can cause renal tubular dysgenesis, a severe disorder of renal tubular development. Defects in this gene have also been associated with non-familial structural atrial fibrillation, and inflammatory bowel disease."[12] It has a TATA box (TATAAAT) from -32 to -25 nts from the TSS.[7]

Gene ID: 185

"Angiotensin II is a potent vasopressor hormone and a primary regulator of aldosterone secretion. It is an important effector controlling blood pressure and volume in the cardiovascular system. It acts through at least two types of receptors. This gene encodes the type 1 receptor which is thought to mediate the major cardiovascular effects of angiotensin II. This gene may play a role in the generation of reperfusion arrhythmias following restoration of blood flow to ischemic or infarcted myocardium. It was previously thought that a related gene, denoted as AGTR1B, existed; however, it is now believed that there is only one type 1 receptor gene in humans. Alternative splicing of this gene results in multiple transcript variants."[13]

Gene ID: 203

"This gene encodes an adenylate kinase enzyme involved in energy metabolism and homeostasis of cellular adenine nucleotide ratios in different intracellular compartments. This gene is highly expressed in skeletal muscle, brain and erythrocytes. Certain mutations in this gene resulting in a functionally inadequate enzyme are associated with a rare genetic disorder causing nonspherocytic hemolytic anemia. Alternative splicing of this gene results in multiple transcript variants encoding different isoforms. This gene shares readthrough transcripts with the upstream ST6GALNAC6 gene."[14]

Gene ID: 249

"This gene encodes a member of the alkaline phosphatase family of proteins. There are at least four distinct but related alkaline phosphatases: intestinal, placental, placental-like, and liver/bone/kidney (tissue non-specific). The first three are located together on chromosome 2, while the tissue non-specific form is located on chromosome 1. The product of this gene is a membrane bound glycosylated enzyme that is not expressed in any particular tissue and is, therefore, referred to as the tissue-nonspecific form of the enzyme. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature enzyme. This enzyme may play a role in bone mineralization. Mutations in this gene have been linked to hypophosphatasia, a disorder that is characterized by hypercalcemia and skeletal defects."[15] It has a TATA box (TATAAGG) from -31 to -25 nts from the TSS.[7]

Gene ID: 265

"This gene encodes a member of the amelogenin family of extracellular matrix proteins. Amelogenins are involved in biomineralization during tooth enamel development. Mutations in this gene cause X-linked amelogenesis imperfecta. Alternative splicing results in multiple transcript variants encoding different isoforms."[16]

Gene ID: 279

"This gene encodes a member of the alpha-amylase family of proteins. Amylases are secreted proteins that hydrolyze 1,4-alpha-glucoside bonds in oligosaccharides and polysaccharides, catalyzing the first step in digestion of dietary starch and glycogen. This gene and several family members are present in a gene cluster on chromosome 1. This gene encodes an amylase isoenzyme produced by the pancreas."[17] It has a TATA box (TATAAAT) from -27 to -21 nts from the TSS.[7]

Gene ID: 280

"Amylases are secreted proteins that hydrolyze 1,4-alpha-glucoside bonds in oligosaccharides and polysaccharides, and thus catalyze the first step in digestion of dietary starch and glycogen. The human genome has a cluster of several amylase genes that are expressed at high levels in either salivary gland or pancreas. This gene encodes an amylase isoenzyme produced by the pancreas."[18] It has a TATA box (TATAAAT) from -30 to -24 nts from the TSS.[7]

Gene ID: 292

"This gene is a member of the mitochondrial carrier subfamily of solute carrier protein genes. The product of this gene functions as a gated pore that translocates ADP from the cytoplasm into the mitochondrial matrix and ATP from the mitochondrial matrix into the cytoplasm. The protein forms a homodimer embedded in the inner mitochondria membrane. Suppressed expression of this gene has been shown to induce apoptosis and inhibit tumor growth. The human genome contains several non-transcribed pseudogenes of this gene."[19]

Gene ID: 314

"Copper amine oxidases catalyze the oxidative conversion of amines to aldehydes and ammonia in the presence of copper and quinone cofactor. This gene shows high sequence similarity to copper amine oxidases from various species ranging from bacteria to mammals. The protein contains several conserved motifs including the active site of amine oxidases and the histidine residues that likely bind copper. It may be a critical modulator of signal transmission in retina, possibly by degrading the biogenic amines dopamine, histamine, and putrescine. This gene may be a candidate gene for hereditary ocular diseases. Alternate splicing results in multiple transcript variants."[20]

Gene ID: 336

"This gene encodes apolipoprotein (apo-) A-II, which is the second most abundant protein of the high density lipoprotein particles. The protein is found in plasma as a monomer, homodimer, or heterodimer with apolipoprotein D. Defects in this gene may result in apolipoprotein A-II deficiency or hypercholesterolemia."[21] It has a TATA box (TATATAG) from -28 to -22 nts from the TSS.[7]

Gene ID: 338

"This gene product is the main apolipoprotein of chylomicrons and low density lipoproteins (LDL), and is the ligand for the LDL receptor. It occurs in plasma as two main isoforms, apoB-48 and apoB-100: the former is synthesized exclusively in the gut and the latter in the liver. The intestinal and the hepatic forms of apoB are encoded by a single gene from a single, very long mRNA. The two isoforms share a common N-terminal sequence. The shorter apoB-48 protein is produced after RNA editing of the apoB-100 transcript at residue 2180 (CAA->UAA), resulting in the creation of a stop codon, and early translation termination. Mutations in this gene or its regulatory region cause hypobetalipoproteinemia, normotriglyceridemic hypobetalipoproteinemia, and hypercholesterolemia due to ligand-defective apoB, diseases affecting plasma cholesterol and apoB levels."[22]

Gene ID: 358

"This gene encodes a small integral membrane protein with six bilayer spanning domains that functions as a water channel protein. This protein permits passive transport of water along an osmotic gradient. This gene is a possible candidate for disorders involving imbalance in ocular fluid movement."[23]

Gene ID: 359

"This gene encodes a water channel protein located in the kidney collecting tubule. It belongs to the MIP/aquaporin family, some members of which are clustered together on chromosome 12q13. Mutations in this gene have been linked to autosomal dominant and recessive forms of nephrogenic diabetes insipidus."[24]

Gene ID: 467

"This gene encodes a member of the mammalian activation transcription factor/cAMP responsive element-binding (CREB) protein family of transcription factors. This gene is induced by a variety of signals, including many of those encountered by cancer cells, and is involved in the complex process of cellular stress response. Multiple transcript variants encoding different isoforms have been found for this gene. It is possible that alternative splicing of this gene may be physiologically important in the regulation of target genes."[25] It has a TATA box (TATAAAA) from -33 to -27 nts from the TSS.[7]

Gene ID: 481

"The protein encoded by this gene belongs to the family of Na+/K+ and H+/K+ ATPases beta chain proteins, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The beta subunit regulates, through assembly of alpha/beta heterodimers, the number of sodium pumps transported to the plasma membrane. The glycoprotein subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes a beta 1 subunit. Alternatively spliced transcript variants encoding different isoforms have been described, but their biological validity is not known."[26] It has a TATA box (TATATAG) from -28 to -22 nts from the TSS.[7]

Gene ID: 515

"This gene encodes a subunit of mitochondrial ATP synthase. Mitochondrial ATP synthase catalyzes ATP synthesis, utilizing an electrochemical gradient of protons across the inner membrane during oxidative phosphorylation. ATP synthase is composed of two linked multi-subunit complexes: the soluble catalytic core, F1, and the membrane-spanning component, Fo, comprising the proton channel. The catalytic portion of mitochondrial ATP synthase consists of 5 different subunits (alpha, beta, gamma, delta, and epsilon) assembled with a stoichiometry of 3 alpha, 3 beta, and a single representative of the other 3. The proton channel seems to have nine subunits (a, b, c, d, e, f, g, F6 and 8). This gene encodes the b subunit of the proton channel."[27] ATP5PB is also known as ATP5F1.[27] It has a TATA box (TTTAAAA) from -34 to -28 nts from the TSS.[7]

Gene ID: 676

"BRDT is similar to the RING3 protein family. It possesses 2 bromodomain motifs and a PEST sequence (a cluster of proline, glutamic acid, serine, and threonine residues), characteristic of proteins that undergo rapid intracellular degradation. The bromodomain is found in proteins that regulate transcription. Several transcript variants encoding multiple isoforms have been found for this gene."[28] It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]

Gene ID: 919

"The protein encoded by this gene is T-cell receptor zeta, which together with T-cell receptor alpha/beta and gamma/delta heterodimers, and with CD3-gamma, -delta and -epsilon, forms the T-cell receptor-CD3 complex. The zeta chain plays an important role in coupling antigen recognition to several intracellular signal-transduction pathways. Low expression of the antigen results in impaired immune response. Two alternatively spliced transcript variants encoding distinct isoforms have been found for this gene."[29] It has a TATA box (AATAAAA) from -31 to -25 nts from the TSS.[7]

Gene ID: 1116

"Chitinases catalyze the hydrolysis of chitin, which is an abundant glycopolymer found in insect exoskeletons and fungal cell walls. The glycoside hydrolase 18 family of chitinases includes eight human family members. This gene encodes a glycoprotein member of the glycosyl hydrolase 18 family. The protein lacks chitinase activity and is secreted by activated macrophages, chondrocytes, neutrophils and synovial cells. The protein is thought to play a role in the process of inflammation and tissue remodeling."[30] It has a TATA box (CATAAAA) from -30 to -24 nts from the TSS.[7]

Gene ID: 1188

"The protein encoded by this gene is a member of the family of voltage-gated chloride channels. Chloride channels have several functions, including the regulation of cell volume, membrane potential stabilization, signal transduction and transepithelial transport. This gene is expressed predominantly in the kidney and may be important for renal salt reabsorption. Mutations in this gene are associated with autosomal recessive Bartter syndrome type 3 (BS3). Alternatively spliced transcript variants encoding different isoforms have been found for this gene."[31] It has a TATA box (CATAAAC) from -30 to -24 nts from the TSS.[7]

Gene ID: 1382

"This gene encodes a member of the retinoic acid (RA, a form of vitamin A) binding protein family and lipocalin/cytosolic fatty-acid binding protein family. The protein is a cytosol-to-nuclear shuttling protein, which facilitates RA binding to its cognate receptor complex and transfer to the nucleus. It is involved in the retinoid signaling pathway, and is associated with increased circulating low-density lipoprotein cholesterol. Alternatively spliced transcript variants encoding the same protein have been found for this gene."[32] It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]

Gene ID: 1602

"Nine elements were tested, representing a sampling of elements present in the two gene deserts and DACH introns, spread over a 1530-kb region surrounding the human DACH's TATA box."[33]

Gene ID: 1602 is the human gene DACH1 (dachshund homolog 1), also known as DACH.[34] DACH1 has three isoforms: a, b, and c.

"This gene encodes a chromatin-associated protein that associates with other DNA-binding transcription factors to regulate gene expression and cell fate determination during development. The protein contains a Ski domain that is highly conserved from Drosophila to human. Expression of this gene is lost in some forms of metastatic cancer, and is correlated with poor prognosis."[34]

Gene ID: 1805

"Dermatopontin is an extracellular matrix protein with possible functions in cell-matrix interactions and matrix assembly. The protein is found in various tissues and many of its tyrosine residues are sulphated. Dermatopontin is postulated to modify the behavior of TGF-beta through interaction with decorin."[35] It has a TATA box (TATAAAA) from -26 to -20 nts from the TSS.[7]

Gene ID: 1811

The DRA gene (colon mucosa-associated gene) has a TATA box.[36]

Gene ID: 1999

"Enables DNA-binding transcription activator activity, RNA polymerase II-specific and sequence-specific double-stranded DNA binding activity. Involved in inflammatory response; negative regulation of transcription, DNA-templated; and positive regulation of transcription by RNA polymerase II. Located in Golgi apparatus; cytosol; and nucleoplasm." It has a TATA box (TATAAAG) from -31 to -25 nts from the TSS.[7]

Gene ID: 2494

"The protein encoded by this gene is a DNA-binding zinc finger transcription factor and is a member of the fushi tarazu factor-1 subfamily of orphan nuclear receptors. The encoded protein is involved in the expression of genes for hepatitis B virus and cholesterol biosynthesis, and may be an important regulator of embryonic development."[37] It has a TATA box (TATAACA) from -28 to -21 nts from the TSS.[7]

Gene ID: 2752

"The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants."[38] It has a TATA box (GATAAAG) from -30 to -24 nts from the TSS.[7]

Gene ID: 2780

"Transducin is a 3-subunit guanine nucleotide-binding protein (G protein) which stimulates the coupling of rhodopsin and cGMP-phosphodiesterase during visual impulses. The transducin alpha subunits in rods and cones are encoded by separate genes. This gene encodes the alpha subunit in cones."[39] It has a TATA box (TATAAAG) from -30 to -23 nts from the TSS.[7]

Gene ID: 2980

"Predicted to enable guanylate cyclase activator activity. Predicted to be involved in positive regulation of guanylate cyclase activity and signal transduction. Predicted to be located in extracellular region."[40] It has a TATA box (TTTAAAA) from -33 to -27 nts from the TSS.[7]

Gene ID: 2981

"Predicted to enable guanylate cyclase activator activity. Predicted to be involved in positive regulation of guanylate cyclase activity and signal transduction. Predicted to be located in extracellular region."[41] It has a TATA box (TATAAGG) from -30 to -24 nts from the TSS.[7]

Gene ID: 3158

"The protein encoded by this gene belongs to the HMG-CoA synthase family. It is a mitochondrial enzyme that catalyzes the first reaction of ketogenesis, a metabolic pathway that provides lipid-derived energy for various organs during times of carbohydrate deprivation, such as fasting. Mutations in this gene are associated with HMG-CoA synthase deficiency. Alternatively spliced transcript variants encoding different isoforms have been found for this gene."[42] It has a TATA box (TATAAAG) from -30 to -24 nts from the TSS.[7]

Gene ID: 3283

"The protein encoded by this gene is an enzyme that catalyzes the oxidative conversion of delta-5-3-beta-hydroxysteroid precursors into delta-4-ketosteroids, which leads to the production of all classes of steroid hormones. The encoded protein also catalyzes the interconversion of 3-beta-hydroxy- and 3-keto-5-alpha-androstane steroids."[43] It has a TATA box (CATAAAG) from -30 to -24 nts from the TSS.[7]

Gene ID: 3284

"The protein encoded by this gene is a bifunctional enzyme that catalyzes the oxidative conversion of delta(5)-ene-3-beta-hydroxy steroid, and the oxidative conversion of ketosteroids. It plays a crucial role in the biosynthesis of all classes of hormonal steroids. This gene is predominantly expressed in the adrenals and the gonads. Mutations in this gene are associated with 3-beta-hydroxysteroid dehydrogenase, type II, deficiency. Alternatively spliced transcript variants have been found for this gene."[44] It has a TATA box (CATAAAG) from -30 to -24 nts from the TSS.[7]

Gene ID: 3308

The Drosophila hsp70 gene has a TATA box-containing promoter.[45] This suggests that Gene ID: 3308, HSPA4 (heat shock 70kDa protein 4 [Homo sapiens]), also known as hsp70,[46] has a TATA box in its core promoter.

Gene ID: 3491

"The secreted protein encoded by this gene is growth factor-inducible and promotes the adhesion of endothelial cells. The encoded protein interacts with several integrins and with heparan sulfate proteoglycan. This protein also plays a role in cell proliferation, differentiation, angiogenesis, apoptosis, and extracellular matrix formation."[47] It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]

Gene ID: 3918

"Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins, composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively), have a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. Several isoforms of each chain have been described. Different alpha, beta and gamma chain isomers combine to give rise to different heterotrimeric laminin isoforms which are designated by Arabic numerals in the order of their discovery, i.e. alpha1beta1gamma1 heterotrimer is laminin 1. The biological functions of the different chains and trimer molecules are largely unknown, but some of the chains have been shown to differ with respect to their tissue distribution, presumably reflecting diverse functions in vivo. This gene encodes the gamma chain isoform laminin, gamma 2. The gamma 2 chain, formerly thought to be a truncated version of beta chain (B2t), is highly homologous to the gamma 1 chain; however, it lacks domain VI, and domains V, IV and III are shorter. It is expressed in several fetal tissues but differently from gamma 1, and is specifically localized to epithelial cells in skin, lung and kidney. The gamma 2 chain together with alpha 3 and beta 3 chains constitute laminin 5 (earlier known as kalinin), which is an integral part of the anchoring filaments that connect epithelial cells to the underlying basement membrane. 
The epithelium-specific expression of the gamma 2 chain implied its role as an epithelium attachment molecule, and mutations in this gene have been associated with junctional epidermolysis bullosa, a skin disease characterized by blisters due to disruption of the epidermal-dermal junction. Two transcript variants resulting from alternative splicing of the 3' terminal exon, and encoding different isoforms of gamma 2 chain, have been described. The two variants are differentially expressed in embryonic tissues, however, the biological significance of the two forms is not known. Transcript variants utilizing alternative polyA_signal have also been noted in literature."[48] It has a TATA box (GATAAAA) from -33 to -27 nts from the TSS.[7]

Gene ID: 3995

The TATA box is TATAA.[36]

The protein encoded by this FADS3 fatty acid desaturase 3 gene is "a member of the fatty acid desaturase (FADS) gene family. Desaturase enzymes regulate unsaturation of fatty acids through the introduction of double bonds between defined carbons of the fatty acyl chain. FADS family members are considered fusion products composed of an N-terminal cytochrome b5-like domain and a C-terminal multiple membrane-spanning desaturase portion, both of which are characterized by conserved histidine motifs. This gene is clustered with family members FADS1 and FADS2 at 11q12-q13.1; this cluster is thought to have arisen evolutionarily from gene duplication based on its similar exon/intron organization."[49]

Gene ID: 4014

"This gene encodes loricrin, a major protein component of the cornified cell envelope found in terminally differentiated epidermal cells. Mutations in this gene are associated with Vohwinkel's syndrome and progressive symmetric erythrokeratoderma, both inherited skin diseases."[50] It has a TATA box (TATATATAA) from -40 to -32 nts from the TSS.[7]

Gene ID: 4582

"This gene encodes a membrane-bound protein that is a member of the mucin family. Mucins are O-glycosylated proteins that play an essential role in forming protective mucous barriers on epithelial surfaces. These proteins also play a role in intracellular signaling. This protein is expressed on the apical surface of epithelial cells that line the mucosal surfaces of many different tissues including lung, breast stomach and pancreas. This protein is proteolytically cleaved into alpha and beta subunits that form a heterodimeric complex. The N-terminal alpha subunit functions in cell-adhesion and the C-terminal beta subunit is involved in cell signaling. Overexpression, aberrant intracellular localization, and changes in glycosylation of this protein have been associated with carcinomas. This gene is known to contain a highly polymorphic variable number tandem repeats (VNTR) domain. Alternate splicing results in multiple transcript variants."[51] It has a TATA box (TATAAAG) from -24 to -18 nts from the TSS.[7]

Gene ID: 4653

"MYOC encodes the protein myocilin, which is believed to have a role in cytoskeletal function. MYOC is expressed in many ocular tissues, including the trabecular meshwork, and was revealed to be the trabecular meshwork glucocorticoid-inducible response protein (TIGR). The trabecular meshwork is a specialized eye tissue essential in regulating intraocular pressure, and mutations in MYOC have been identified as the cause of hereditary juvenile-onset open-angle glaucoma."[52] It has a TATA box (TATATATAAAC) from -31 to -21 nts from the TSS.[7]

Gene ID: 4878

"The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1."[53] It has a TATA box (TATAAAAAG) from -30 to -22 nts from the TSS.[7]

Gene ID: 5016

"This gene encodes a large, carbohydrate-rich, epithelial glycoprotein with numerous O-glycosylation sites located within threonine, serine, and proline-rich tandem repeats. The gene is similar to members of the mucin and the glycosyl hydrolase 18 gene families. Regulation of expression may be estrogen-dependent. Gene expression and protein secretion occur during late follicular development through early cleavage-stage embryonic development. The protein is secreted from non-ciliated oviductal epithelial cells and associates with ovulated oocytes, blastomeres, and spermatozoan acrosomal regions."[54] It has a TATA box (TATAAAG) from -25 to -19 nts from the TSS.[7]

Gene ID: 5052

"This gene encodes a member of the peroxiredoxin family of antioxidant enzymes, which reduce hydrogen peroxide and alkyl hydroperoxides. The encoded protein may play an antioxidant protective role in cells, and may contribute to the antiviral activity of CD8(+) T-cells. This protein may have a proliferative effect and play a role in cancer development or progression. Four transcript variants encoding the same protein have been identified for this gene."[55] It has a TATA box (TATAAAG) from -31 to -25 nts from the TSS.[7]

Gene ID: 5132

"This gene encodes a phosphoprotein, which is located in the outer and inner segments of the rod cells in the retina. This protein may participate in the regulation of visual phototransduction or in the integration of photoreceptor metabolism. It modulates the phototransduction cascade by interacting with the beta and gamma subunits of the retinal G-protein transducin. This gene is a potential candidate gene for retinitis pigmentosa and Usher syndrome type II. Alternatively spliced transcript variants encoding different isoforms have been identified."[56] It has a TATA box (TTTAAAT) from -32 to -26 nts from the TSS.[7]

Gene ID: 5743

"Prostaglandin-endoperoxide synthase (PTGS), also known as cyclooxygenase, is the key enzyme in prostaglandin biosynthesis, and acts both as a dioxygenase and as a peroxidase. There are two isozymes of PTGS: a constitutive PTGS1 and an inducible PTGS2, which differ in their regulation of expression and tissue distribution. This gene encodes the inducible isozyme. It is regulated by specific stimulatory events, suggesting that it is responsible for the prostanoid biosynthesis involved in inflammation and mitogenesis."[57] It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]

"[T]he human ... prostaglandin-endoperoxide-synthase-2 [gene contains] a canonical TATA box (nucleotide residues at positions -31 to -25 for the human gene)."[58] This is Gene ID: 5743.

Gene ID: 5996

"This gene encodes a member of the regulator of G-protein signalling family. This protein is located on the cytosolic side of the plasma membrane and contains a conserved, 120 amino acid motif called the RGS domain. The protein attenuates the signalling activity of G-proteins by binding to activated, GTP-bound G alpha subunits and acting as a GTPase activating protein (GAP), increasing the rate of conversion of the GTP to GDP. This hydrolysis allows the G alpha subunits to bind G beta/gamma subunit heterodimers, forming inactive G-protein heterotrimers, thereby terminating the signal."[59] It has a TATA box (TATAAAG) from -28 to -22 nts from the TSS.[7]

Gene ID: 5997

"Regulator of G protein signaling (RGS) family members are regulatory molecules that act as GTPase activating proteins (GAPs) for G alpha subunits of heterotrimeric G proteins. RGS proteins are able to deactivate G protein subunits of the Gi alpha, Go alpha and Gq alpha subtypes. They drive G proteins into their inactive GDP-bound forms. Regulator of G protein signaling 2 belongs to this family. The protein acts as a mediator of myeloid differentiation and may play a role in leukemogenesis."[60] It has a TATA box (CATAAAT) from -28 to -22 nts from the TSS.[7]

Gene ID: 6121

"The protein encoded by this gene is a component of the vitamin A visual cycle of the retina which supplies the 11-cis retinal chromophore of the photoreceptors opsin visual pigments. It is a member of the carotenoid cleavage oxygenase superfamily. All members of this superfamily are non-heme iron oxygenases with a seven-bladed propeller fold and oxidatively cleave carotenoid carbon:carbon double bonds. However, the protein encoded by this gene has acquired a divergent function that involves the concerted O-alkyl ester cleavage of its all-trans retinyl ester substrate and all-trans to 11-cis double bond isomerization of the retinyl moiety. As such, it performs the essential enzymatic isomerization step in the synthesis of 11-cis retinal. Mutations in this gene are associated with early-onset severe blinding disorders such as Leber congenital."[61] It has a TATA box (CATAAAA) from -27 to -21 nts from the TSS.[7]

Gene ID: 6232

"Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of four RNA species and approximately 80 structurally distinct proteins. This gene encodes a member of the S27e family of ribosomal proteins and component of the 40S subunit. The encoded protein contains a C4-type zinc finger domain that can bind to zinc and may bind to nucleic acid. Mutations in this gene have been identified in numerous melanoma patients and in at least one patient with Diamond-Blackfan anemia (DBA). Elevated expression of this gene has been observed in various human cancers. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome."[62] It has a TATA box (TATATAA) from -29 to -23 nts from the TSS.[7]

Gene ID: 6279

"The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene."[63] It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS, Code score: 1.00; Matrix score: 0.93.[7]

Gene ID: 6518

"The protein encoded by this gene is a fructose transporter responsible for fructose uptake by the small intestine. The encoded protein also is necessary for the increase in blood pressure due to high dietary fructose consumption."[64] It has a TATA box (TATAAAA) from -33 to -27 nts from the TSS, Code score: 1.00; Matrix score: 0.92.[7]

Gene ID: 6548

"This gene encodes a Na+/H+ antiporter that is a member of the solute carrier family 9. The encoded protein is a plasma membrane transporter that is expressed in the kidney and intestine. This protein plays a central role in regulating pH homeostasis, cell migration and cell volume. This protein may also be involved in tumor growth."[65] It has a TATA box (TATAAGT) from -32 to -26 nts from the TSS, Code score: 0.91; Matrix score: 0.84.[7]

Gene ID: 6566

"The protein encoded by this gene is a proton-linked monocarboxylate transporter that catalyzes the movement of many monocarboxylates, such as lactate and pyruvate, across the plasma membrane. Mutations in this gene are associated with erythrocyte lactate transporter defect. Alternatively spliced transcript variants have been found for this gene."[66] It has a TATA box (TATAAGG) from -31 to -25 nts from the TSS, Code score: 0.91; Matrix score: 0.78.[7]

Gene ID: 6698

"A structural constituent of skin epidermis. Involved in keratinocyte differentiation and peptide cross-linking. Located in cornified envelope."[67] It has a TATA box (TATAAAAG) from -30 to -23 nts from the TSS, Code score: 1.00; Matrix score: 0.91.[7]

Gene ID: 26827

RNU6-1 RNA, U6 small nuclear 1 [Homo sapiens (human)] is also known as U6 or U6-1.[68] "In the flanking region 5' to the gene [U6], there is a Hogness box sequence TATAAAT beginning at position -31 which is boxed in ...."[69]

Gene transcriptions

"From a teleological standpoint, this arrangement [of focused promoters] is consistent with the notion that it would be easier to regulate the transcription of a gene with a single transcription start site than one with multiple start sites."[5]

The TATA box is involved in the process of transcription by RNA polymerase.

Approximately "76% of human core promoters lack TATA-like elements, have a high GC content, and are enriched in Sp1 binding sites."[3]

"[T]wo motifs - M3 (SCGGAAGY) and M22 (TGCGCANK) - ... occur preferentially in human TATA-less core promoters."[3]

"About 24% of human genes have a TATA-like element and their promoters are generally AT-rich; however, only ~10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR). In contrast, ~46% of human core promoters contain the consensus INR (YYANWYY) and ~30% are INR-containing TATA-less genes."[3] W = A or T, Y = C or T, N = G, A, T, or C, and R = A or G.

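By subtraction, the remaining ~46% of human core promoters (100% less the ~24% with a TATA-like element and the ~30% that are INR-containing but TATA-less) lack both TATA-like and consensus INR elements.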
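
The degenerate symbols defined above (W, Y, N, R) can be turned into a pattern search. The following is a minimal Python sketch, not the method used by the cited authors: the function names and the test sequence are hypothetical, and only the four degenerate codes that appear in the quoted consensus sequences are handled.

```python
import re

# IUPAC codes that appear in the quoted consensus sequences, plus the plain bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}

def consensus_to_pattern(consensus):
    """Translate a consensus such as TATAWAWR into a regular-expression string."""
    return "".join(IUPAC[code] for code in consensus)

def find_motif(sequence, consensus):
    """Return 0-based start indices of every match, including overlapping ones."""
    # The lookahead lets matches overlap, which matters in AT-rich regions.
    lookahead = re.compile("(?=" + consensus_to_pattern(consensus) + ")")
    return [m.start() for m in lookahead.finditer(sequence.upper())]

# Hypothetical 20-nt sequence, used only to exercise the functions.
example = "GGCTATAAAAGGCTCAGTCC"
print(find_motif(example, "TATAWAWR"))  # canonical TATA box consensus
print(find_motif(example, "YYANWYY"))   # consensus INR
```

Indices are 0-based offsets into the supplied string; converting them to positions relative to a TSS requires knowing where the TSS lies in that string.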

Transcription start sites

The consensus sequence is usually located about 25 base pairs (bps), or nucleotides (nts), upstream (-) of the transcription start site (TSS).

Focused promoters

"In focused transcription, there is either a single major transcription start site or several start sites within a narrow region of several nucleotides. Focused transcription is the predominant mode of transcription in simpler organisms."[5]

"Focused transcription initiation occurs in all organisms, and appears to be the predominant or exclusive mode of transcription in simpler organisms."[5]

"In vertebrates, focused transcription tends to be associated with regulated promoters".[5]

"The analysis of focused core promoters has led to the discovery of sequence motifs such as the TATA box, BREu (upstream TFIIBrecognition element), Inr (initiator), MTE (motif ten element), DPE (downstream promoter element), DCE (downstream core element), and XCPE1 (Xcore promoter element 1) [...]."[5]

Dispersed promoters

"In dispersed transcription, there are several weak transcription start sites over a broad region of about 50 to 100 nucleotides. Dispersed transcription is the most common mode of transcription in vertebrates. For instance, dispersed transcription is observed in about two-thirds of human genes."[5]

In vertebrates, "dispersed transcription is typically observed in constitutive promoters in CpG islands."[5]

Core promoters

"Focused transcription typically initiates within the Inr, and the A nucleotide in the Inr consensus is usually designed as the “+ 1” position, whether or not transcription actually initiates at that particular nucleotide. This convention is useful because other core promoter motifs, such as the MTE and DPE, function with the Inr in a manner that exhibits a strict spacing dependence with the Inr consensus sequence (and hence, the A + 1 nucleotide) rather than the actual transcription start site (Burke and Kadonaga, 1997, Kutach and Kadonaga, 2000 and Lim et al., 2004)."[5]

"With TATA-driven core promoters, transcription can be achieved in vitro with purified RNA polymerase II, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH."[5]

"NC2 (negative cofactor 2; also known as Dr1-Drap1) [...] was identified as repressor of TATA-dependent transcription [...]."[5]

"TBP (TATA box-binding protein) activates TATA transcription [...] The TBP subunit binds to the TATA box [...] TFIIA appears to promote the binding of TBP to the TATA box."[5]

TATA boxes

"The TATA box is the first core promoter motif that was discovered (Goldberg, 1979) as well as the best known core promoter element. The metazoan TATA box consensus is TATAWAAR, where the upstream T is usually located at − 31 or − 30 relative to the A + 1 (or G + 1) position in the Inr (Carninci et al., 2006 and Ponjavic et al., 2006). [The] TATA box is recognized and bound by the TBP subunit of the TFIID complex. Both the TATA box and TBP are conserved from archaebacteria to humans (Reeve, 2003). The TATA box is also present in plants (Molina and Grotewold, 2005, Yamamoto et al., 2007a and Yamamoto et al., 2007b). Although the TATA box is a well known core promoter motif, it is present in only about 10%–15% of mammalian core promoters (Carninci et al., 2006, Kim et al., 2005 and Cooper et al., 2006)."[5]

"The BRE (TFIIBrecognition element) was initially identified as a TFIIB binding sequence that is immediately upstream of a subset (∼ 10%–30%) of TATA box elements (Lagrange et al., 1998). In addition, a second TFIIB recognition site, the BREd (downstream TFIIB recognition element), was found immediately downstream of the TATA box (Deng and Roberts, 2005). The discovery of the BREd led to the renaming of the original BRE as BREu for upstream BRE (reviewed in Deng and Roberts, 2007). Both the BREu and BREd function in conjunction with a TATA box and have been found to increase as well as to decrease the levels of basal transcription ( Lagrange et al., 1998, Evans et al., 2001 and Deng and Roberts, 2005). More recent studies suggest a distinct role for the BREu in transcriptional regulation (Juven-Gershon et al., 2008a; [...])."[5]

"TRF3 (also known as TBP2 and TBPL2) appears to be present only in vertebrates and is the TRF that is most closely related to TBP. TRF3 can bind to TATA boxes and support TATA-dependent transcription (Bártfai et al., 2004 and Jallow et al., 2004). TRF3 was found to be important for embryonic development (Bártfai et al., 2004 and Jallow et al., 2004). In addition, zebrafish embryos that are depleted of TRF3 exhibit multiple developmental defects and fail to undergo hematopoiesis (Hart et al., 2007)."[5]

"The top six sequences form a cohort of the eight-member consensus TATA(A/T)A(A/T)(A/G). Two missing members are TATA(A/T)ATG. This sequence ends in an ATG translational start codon, and thus is expected to be underrepresented in promoters. Since it is nevertheless part of the larger consensus that avidly binds TBP, this sequence was included in the TATA consensus, although it is rarely used."[70]

Hypotheses

  1. A1BG is not transcribed using a TATA box.

TATA box (Butler 2002) samplings

The diagram shows an overview of the four core promoter elements B recognition element (BRE), TATA box, initiator element (Inr), and downstream promoter element (DPE), with their respective consensus sequences and their distance from the transcription start site. Credit: Jennifer E.F. Butler & James T. Kadonaga.

The Basic programs testing the consensus sequence TATAAA (starting with SuccessablesTATAB.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand and direction a program examines, the number of matches found, and the sequence it is looking for at each position found:

  1. Negative strand, negative direction: 2, TATAAA at 2852, TATAAA at 1602.
  2. Positive strand, negative direction: 3, TATAAA at 2874, TATAAA at 221, TATAAA at 182.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. Inverse complement, negative strand, negative direction: 3, TTTATA at 2869, TTTATA at 2638, TTTATA at 1740.
  6. Inverse complement, positive strand, negative direction: 1, TTTATA at 219.
  7. Inverse complement, negative strand, positive direction: 1, TTTATA at 2588.
  8. Inverse complement, positive strand, positive direction: 0.
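
The kind of search listed above can be sketched in a few lines. The Basic programs themselves are not reproduced here, so the following Python is only an assumed equivalent: it looks for a motif and its inverse complement in one sequence and reports 1-based start positions. The function names and the test sequence are hypothetical, and the strand and direction bookkeeping of the original programs is not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def inverse_complement(seq):
    """Complement each base, then reverse the result (TATAAA -> TTTATA)."""
    return seq.translate(COMPLEMENT)[::-1]

def find_all(seq, motif):
    """Return the 1-based start position of every occurrence of motif in seq."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = seq.find(motif, start + 1)  # step by one so overlaps are kept
    return hits

def scan(seq, motif="TATAAA"):
    """Search for the motif and for its inverse complement in one sequence."""
    inverse = inverse_complement(motif)
    return {motif: find_all(seq, motif), inverse: find_all(seq, inverse)}

# Hypothetical 18-nt sequence, not taken from the A1BG data.
print(scan("CCTATAAAGGTTTATACC"))
```

Running the same scan on the complementary strand, or on sequences read in the opposite direction, would give the remaining combinations in the list.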

TATAB (4560-2846) UTRs

  1. Negative strand, negative direction: TATAAA at 2852.
  2. Negative strand, negative direction: TTTATA at 2869.
  3. Positive strand, negative direction: TATAAA at 2874.

TATAB negative direction (2811-2596) proximal promoters

  1. Negative strand, negative direction: TTTATA at 2638.

TATAB negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TATAAA at 1602.
  2. Negative strand, negative direction: TTTATA at 1740.
  3. Positive strand, negative direction: TATAAA at 221, TATAAA at 182.
  4. Positive strand, negative direction: TTTATA at 219.

TATAB positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: TTTATA at 2588.

TATA box (Butler 2002) random dataset samplings

  1. TATABr0: 2, TATAAA at 3565, TATAAA at 499.
  2. TATABr1: 0.
  3. TATABr2: 1, TATAAA at 3856.
  4. TATABr3: 1, TATAAA at 4444.
  5. TATABr4: 2, TATAAA at 3685, TATAAA at 733.
  6. TATABr5: 1, TATAAA at 1563.
  7. TATABr6: 0.
  8. TATABr7: 1, TATAAA at 3629.
  9. TATABr8: 2, TATAAA at 706, TATAAA at 555.
  10. TATABr9: 3, TATAAA at 4219, TATAAA at 3150, TATAAA at 2621.
  11. TATABr0ci: 2, TTTATA at 1139, TTTATA at 497.
  12. TATABr1ci: 0.
  13. TATABr2ci: 0.
  14. TATABr3ci: 0.
  15. TATABr4ci: 1, TTTATA at 4527.
  16. TATABr5ci: 1, TTTATA at 2178.
  17. TATABr6ci: 0.
  18. TATABr7ci: 2, TTTATA at 4252, TTTATA at 2452.
  19. TATABr8ci: 1, TTTATA at 4222.
  20. TATABr9ci: 2, TTTATA at 3988, TTTATA at 162.

TATABr arbitrary (evens) (4560-2846) UTRs

  1. TATABr0: TATAAA at 3565.
  2. TATABr2: TATAAA at 3856.
  3. TATABr4: TATAAA at 3685.
  4. TATABr4ci: TTTATA at 4527.
  5. TATABr8ci: TTTATA at 4222.

TATABr alternate (odds) (4560-2846) UTRs

  1. TATABr3: TATAAA at 4444.
  2. TATABr9: TATAAA at 4219, TATAAA at 3150.
  3. TATABr7ci: TTTATA at 4252, TTTATA at 2452.
  4. TATABr9ci: TTTATA at 3988.

TATABr arbitrary positive direction (odds) (4445-4265) core promoters

  1. TATABr3: TATAAA at 4444.

TATABr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. TATABr9: TATAAA at 4219.
  2. TATABr7ci: TTTATA at 4252.

TATABr alternate positive direction (evens) (4265-4050) proximal promoters

  1. TATABr8ci: TTTATA at 4222.

TATABr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATABr0: TATAAA at 499.
  2. TATABr4: TATAAA at 733.
  3. TATABr8: TATAAA at 706, TATAAA at 555.
  4. TATABr0ci: TTTATA at 1139, TTTATA at 497.

TATABr alternate negative direction (odds) (2596-1) distal promoters

  1. TATABr5: TATAAA at 1563.
  2. TATABr5ci: TTTATA at 2178.
  3. TATABr7ci: TTTATA at 2452.
  4. TATABr9ci: TTTATA at 162.

TATABr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATABr5: TATAAA at 1563.
  2. TATABr9: TATAAA at 3150, TATAAA at 2621.
  3. TATABr5ci: TTTATA at 2178.
  4. TATABr7ci: TTTATA at 2452.
  5. TATABr9ci: TTTATA at 3988, TTTATA at 162.

TATABr alternate positive direction (evens) (4050-1) distal promoters

  1. TATABr0: TATAAA at 3565, TATAAA at 499.
  2. TATABr2: TATAAA at 3856.
  3. TATABr4: TATAAA at 3685, TATAAA at 733.
  4. TATABr8: TATAAA at 706, TATAAA at 555.
  5. TATABr0ci: TTTATA at 1139, TTTATA at 497.

TATA box (Butler 2002) analysis and results

TATA box consensus sequence TATAAA.[71]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 3 | 2 | 1.5 | 1.5 ± 0.5 (--2,+-1) |
| Randoms | UTR | arbitrary negative | 5 | 10 | 0.5 | 0.55 |
| Randoms | UTR | alternate negative | 6 | 10 | 0.6 | 0.55 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05 |
| Reals | Proximal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--1,+-0) |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 2 | 10 | 0.2 | 0.15 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.15 |
| Reals | Distal | negative | 5 | 2 | 2.5 | 2.5 ± 0.5 (--2,+-3) |
| Randoms | Distal | arbitrary negative | 6 | 10 | 0.6 | 0.5 |
| Randoms | Distal | alternate negative | 4 | 10 | 0.4 | 0.5 |
| Reals | Distal | positive | 1 | 2 | 0.5 | 0.5 ± 0.5 (-+1,++0) |
| Randoms | Distal | arbitrary positive | 7 | 10 | 0.7 | 0.8 |
| Randoms | Distal | alternate positive | 9 | 10 | 0.9 | 0.8 |

Comparison:

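In the negative direction, the occurrences of real TATAB UTRs, proximals, and distals (1.5, 0.5, and 2.5) are greater than the random averages (0.55, 0, and 0.5). In the positive direction the reals are not greater: the real distals (0.5) fall below the random average (0.8), and there are no real core or proximal occurrences. This suggests that the real TATABs in the negative direction are likely active or activable.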
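
The Occurrences and Averages columns appear to follow from simple arithmetic: each count is divided by the number of strands (reals) or datasets (randoms), and the random average is the mean of the arbitrary and alternate values. The Python sketch below recomputes a few rows under that assumption; the dictionary layout and function names are hypothetical, and the counts are copied from the table.

```python
def per_strand(count, strands):
    """Occurrences = number of hits divided by the strands (or datasets) examined."""
    return count / strands

# (count, strands) pairs copied from the analysis table for selected rows.
rows = {
    ("UTR", "negative"): {"reals": (3, 2), "arbitrary": (5, 10), "alternate": (6, 10)},
    ("Proximal", "negative"): {"reals": (1, 2), "arbitrary": (0, 10), "alternate": (0, 10)},
    ("Distal", "negative"): {"reals": (5, 2), "arbitrary": (6, 10), "alternate": (4, 10)},
    ("Distal", "positive"): {"reals": (1, 2), "arbitrary": (7, 10), "alternate": (9, 10)},
}

def compare(row):
    """Return (real occurrences, mean random occurrences, real > random)."""
    real = per_strand(*row["reals"])
    random_mean = (per_strand(*row["arbitrary"]) + per_strand(*row["alternate"])) / 2
    return real, random_mean, real > random_mean

for key, row in rows.items():
    real, random_mean, greater = compare(row)
    print(key, real, round(random_mean, 2), greater)
```

With so few hits per category this is a descriptive comparison only; it does not test statistical significance.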

Butler TATA box genes

  1. Gene ID: 2. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  2. Gene ID: 70. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  3. Gene ID: 133. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  4. Gene ID: 183. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  5. Gene ID: 185. It has a TATA box (TATAAA) from -6 to -1 nts from the TSS.[7]
  6. Gene ID: 203. It has a TATA box (TATAAA) from -4 to 2 nts from the TSS.[7]
  7. Gene ID: 265. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  8. Gene ID: 279. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  9. Gene ID: 280. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  10. Gene ID: 292. It has a TATA box (TATAAA) from 3 to 8 nts from the TSS.[7]
  11. Gene ID: 314. It has a TATA box (TATAAA) from -41 to -36 nts from the TSS.[7]
  12. Gene ID: 338. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  13. Gene ID: 358. It has a TATA box (TATAAA) from 23 to 28 nts from the TSS.[7]
  14. Gene ID: 358. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  15. Gene ID: 359. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  16. Gene ID: 360. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  17. Gene ID: 374. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  18. Gene ID: 383. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  19. Gene ID: 390. It has a TATA box (TATAAA) from -33 to -28 nts from the TSS.[7]
  20. Gene ID: 468. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  21. Gene ID: 482. It has a TATA box (TATAAA) from -35 to -30 nts from the TSS.[7]
  22. Gene ID: 496. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  23. Gene ID: 677. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  24. Gene ID: 694. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  25. Gene ID: 846. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  26. Gene ID: 1051. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  27. Gene ID: 1101. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  28. Gene ID: 1116. It has a TATA box (TATAAA) from 18 to 23 nts from the TSS.[7]
  29. Gene ID: 1152. It has a TATA box (TATAAA) from -41 to -36 nts from the TSS.[7]
  30. Gene ID: 1160. It has a TATA box (TATAAA) from -37 to -32 nts from the TSS.[7]
  31. Gene ID: 1180. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  32. Gene ID: 1191. It has a TATA box (TATAAA) from -23 to -18 nts from the TSS.[7]
  33. Gene ID: 1231. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  34. Gene ID: 1278. It has a TATA box (TATAAA) from -34 to -29 nts from the TSS.[7]
  35. Gene ID: 1401. It has a TATA box (TATAAA) from -45 to -40 nts from the TSS.[7]
  36. Gene ID: 1411. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  37. Gene ID: 1427. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  38. Gene ID: 1440. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  39. Gene ID: 1543. It has a TATA box (TATAAA) from -25 to -20 nts from the TSS.[7]
  40. Gene ID: 4327. It has a TATA box (TATAAA) from -10 to -5 nts from the TSS.[7]
  41. Gene ID: 1548. It has a TATA box (TATAAA) from -43 to -38 nts from the TSS.[7]
  42. Gene ID: 1553. It has a TATA box (TATAAA) from -45 to -40 nts from the TSS.[7]
  43. Gene ID: 1576. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  44. Gene ID: 1735. It has a TATA box (TATAAA) from -33 to -28 nts from the TSS.[7]
  45. Gene ID: 1831. It has a TATA box (TATAAA) from -26 to -21 nts from the TSS.[7]
  46. Gene ID: 1833. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  47. Gene ID: 1974. It has a TATA box (TATAAA) from 25 to 30 nts from the TSS.[7]
  48. Gene ID: 1990. It has a TATA box (TATAAA) from -23 to -18 nts from the TSS.[7]
  49. Gene ID: 1999. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  50. Gene ID: 2206. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  51. Gene ID: 2250. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  52. Gene ID: 2321. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  53. Gene ID: 2542. It has a TATA box (TATAAA) from 28 to 33 nts from the TSS.[7]
  54. Gene ID: 2597. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  55. Gene ID: 2652. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  56. Gene ID: 2669. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  57. Gene ID: 2780. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  58. Gene ID: 2922. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  59. Gene ID: 2947. It has a TATA box (TATAAA) from -43 to -38 nts from the TSS.[7]
  60. Gene ID: 3008. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  61. Gene ID: 3015. It has a TATA box (TATAAA) from -8 to -3 nts from the TSS.[7]
  62. Gene ID: 3133. It has a TATA box (TATAAA) from -33 to -28 nts from the TSS.[7]
  63. Gene ID: 3158. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  64. Gene ID: 3164. It has a TATA box (TATAAA) from -40 to -35 nts from the TSS.[7]
  65. Gene ID: 3171. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  66. Gene ID: 3182. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  67. Gene ID: 3206. It has a TATA box (TATAAA) from -39 to -34 nts from the TSS.[7]
  68. Gene ID: 3222. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  69. Gene ID: 3232. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  70. Gene ID: 3383. It has a TATA box (TATAAA) from -14 to -9 nts from the TSS.[7]
  71. Gene ID: 3458. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  72. Gene ID: 3487. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  73. Gene ID: 3558. It has a TATA box (TATAAA) from -25 to -20 nts from the TSS.[7]
  74. Gene ID: 3630. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  75. Gene ID: 3640. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  76. Gene ID: 3662. It has a TATA box (TATAAA) from -20 to -15 nts from the TSS.[7]
  77. Gene ID: 3726. It has a TATA box (TATAAA) from -42 to -37 nts from the TSS.[7]
  78. Gene ID: 3758. It has a TATA box (TATAAA) from -33 to -28 nts from the TSS.[7]
  79. Gene ID: 3858. It has a TATA box (TATAAA) from -34 to -29 nts from the TSS.[7]
  80. Gene ID: 3861. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  81. Gene ID: 3872. It has a TATA box (TATAAA) from -33 to -28 nts from the TSS.[7]
  82. Gene ID: 3906. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  83. Gene ID: 3938. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  84. Gene ID: 3976. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  85. Gene ID: 4049. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  86. Gene ID: 4144. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  87. Gene ID: 4222. It has a TATA box (TATAAA) from -21 to -16 nts from the TSS.[7]
  88. Gene ID: 4225. It has a TATA box (TATAAA) from -33 to -28 nts from the TSS.[7]
  89. Gene ID: 4284. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  90. Gene ID: 4316. It has a TATA box (TATAAA) from -25 to -20 nts from the TSS.[7]
  91. Gene ID: 4321. It has a TATA box (TATAAA) from 5 to 10 nts from the TSS.[7]
  92. Gene ID: 4327. It has a TATA box (TATAAA) from -10 to -5 nts from the TSS.[7]
  93. Gene ID: 4357. It has a TATA box (TATAAA) from -17 to -12 nts from the TSS.[7]
  94. Gene ID: 4501. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  95. Gene ID: 4582. It has a TATA box (TATAAA) from -24 to -19 nts from the TSS.[7]
  96. Gene ID: 4618. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  97. Gene ID: 4624. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  98. Gene ID: 4629. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  99. Gene ID: 4638. It has a TATA box (TATAAA) from 30 to 35 nts from the TSS.[7]
  100. Gene ID: 4653. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  101. Gene ID: 4741. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  102. Gene ID: 4747. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  103. Gene ID: 4843. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  104. Gene ID: 4946. It has a TATA box (TATAAA) from -26 to -21 nts from the TSS.[7]
  105. Gene ID: 5004. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  106. Gene ID: 5005. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  107. Gene ID: 5016. It has a TATA box (TATAAA) from -25 to -20 nts from the TSS.[7]
  108. Gene ID: 5034. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  109. Gene ID: 5052. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  110. Gene ID: 5069. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  111. Gene ID: 5132. It has a TATA box (TTTAAA) from -32 to -27 nts from the TSS.[7]
  112. Gene ID: 5155. It has a TATA box (TATAAA) from -25 to -20 nts from the TSS.[7]
  113. Gene ID: 5224. It has a TATA box (TTTAAA) from -26 to -21 nts from the TSS.[7]
  114. Gene ID: 5275. It has a TATA box (TATAAA) from -9 to -4 nts from the TSS.[7]
  115. Gene ID: 5360. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  116. Gene ID: 5449. It has a TATA box (TATAAA) from 8 to 13 nts from the TSS.[7]
  117. Gene ID: 5478. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  118. Gene ID: 5610. It has a TATA box (TATAAA) from 3 to 8 nts from the TSS.[7]
  119. Gene ID: 5617. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  120. Gene ID: 5645. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  121. Gene ID: 5650. It has a TATA box (TATAAA) from -2 to 4 nts from the TSS.[7]
  122. Gene ID: 5950. It has a TATA box (TATAAA) from -2 to 4 nts from the TSS.[7]
  123. Gene ID: 5956. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  124. Gene ID: 5967. It has a TATA box (TATAAA) from 4 to 9 nts from the TSS.[7]
  125. Gene ID: 5996. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  126. Gene ID: 6046. It has a TATA box (TATAAA) from -33 to -28 nts from the TSS.[7]
  127. Gene ID: 6280. It has a TATA box (TATAAA) from -23 to -18 nts from the TSS.[7]
  128. Gene ID: 6288. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  129. Gene ID: 6289. It has a TATA box (TATAAA) from -17 to -12 nts from the TSS.[7]
  130. Gene ID: 6349. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  131. Gene ID: 6351. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  132. Gene ID: 6352. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  133. Gene ID: 6364. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  134. Gene ID: 6414. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  135. Gene ID: 6427. It has a TATA box (TATAAA) from 11 to 16 nts from the TSS.[7]
  136. Gene ID: 6428. It has a TATA box (TATAAA) from 19 to 24 nts from the TSS.[7]
  137. Gene ID: 6435. It has a TATA box (TATAAA) from -42 to -37 nts from the TSS.[7]
  138. Gene ID: 6436. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  139. Gene ID: 6446. It has a TATA box (TATAAA) from -17 to -12 nts from the TSS.[7]
  140. Gene ID: 6500. It has a TATA box (TATAAA) from 1 to 6 nts from the TSS.[7]
  141. Gene ID: 6513. It has a TATA box (TATAAA) from 18 to 23 nts from the TSS.[7]
  142. Gene ID: 6519. It has a TATA box (TATAAA) from -26 to -21 nts from the TSS.[7]
  143. Gene ID: 6647. It has a TATA box (TATAAA) from -34 to -29 nts from the TSS.[7]
  144. Gene ID: 6707. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  145. Gene ID: 6783. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  146. Gene ID: 6870. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  147. Gene ID: 6916. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  148. Gene ID: 7037. It has a TATA box (TATAAA) from 4 to 9 nts from the TSS.[7]
  149. Gene ID: 7042. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  150. Gene ID: 7056. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  151. Gene ID: 7124. It has a TATA box (TATAAA) from -26 to -21 nts from the TSS.[7]
  152. Gene ID: 7252. It has a TATA box (TATAAA) from -35 to -30 nts from the TSS.[7]
  153. Gene ID: 7262. It has a TATA box (TATAAA) from -31 to -16 nts from the TSS.[7]
  154. Gene ID: 7295. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  155. Gene ID: 7295. It has a TATA box (TATAAA) from 25 to 30 nts from the TSS.[7]
  156. Gene ID: 7432. It has a TATA box (TATAAA) from -45 to -40 nts from the TSS.[7]
  157. Gene ID: 7803. It has a TATA box (TATAAA) from -25 to -20 nts from the TSS.[7]
  158. Gene ID: 7850. It has a TATA box (TATAAA) from 23 to -28 nts from the TSS.[7]
  159. Gene ID: 8000. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  160. Gene ID: 8431. It has a TATA box (TATAAA) from -35 to -30 nts from the TSS.[7]
  161. Gene ID: 8942. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  162. Gene ID: 8991. It has a TATA box (TATAAA) from -26 to -21 nts from the TSS.[7]
  163. Gene ID: 4327. It has a TATA box (TATAAA) from -10 to -5 nts from the TSS.[7]
  164. Gene ID: 9001. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  165. Gene ID: 9518. It has a TATA box (TATAAA) from -24 to -19 nts from the TSS.[7]
  166. Gene ID: 9643. It has a TATA box (TATAAA) from 18 to 23 nts from the TSS.[7]
  167. Gene ID: 9768. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  168. Gene ID: 10057. It has a TATA box (TATAAA) from 8 to 13 nts from the TSS.[7]
  169. Gene ID: 10284. It has a TATA box (TATAAA) from -22 to -17 nts from the TSS.[7]
  170. Gene ID: 10397. It has a TATA box (TATAAA) from -21 to -16 nts from the TSS.[7]
  171. Gene ID: 10458. It has a TATA box (TATAAA) from -34 to -29 nts from the TSS.[7]
  172. Gene ID: 10482. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  173. Gene ID: 10563. It has a TATA box (TATAAA) from 33 to 38 nts from the TSS.[7]
  174. Gene ID: 10631. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  175. Gene ID: 10761. It has a TATA box (TATAAA) from -43 to -38 nts from the TSS.[7]
  176. Gene ID: 10761. It has a TATA box (TATAAA) from -2 to 4 nts from the TSS.[7]
  177. Gene ID: 10930. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  178. Gene ID: 10938. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  179. Gene ID: 11081. It has a TATA box (TATAAA) from -38 to -33 nts from the TSS.[7]
  180. Gene ID: 22928. It has a TATA box (TATAAA) from -33 to -28 nts from the TSS.[7]
  181. Gene ID: 22943. It has a TATA box (TATAAA) from 42 to 47 nts from the TSS.[7]
  182. Gene ID: 25928. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  183. Gene ID: 26287. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  184. Gene ID: 26576. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  185. Gene ID: 26827. There is a Hogness box sequence TATAAA beginning at position -31.
  186. Gene ID: 27129. It has a TATA box (TATAAA) from -15 to -10 nts from the TSS.[7]
  187. Gene ID: 27316. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  188. Gene ID: 51155. It has a TATA box (TATAAA) from -20 to -15 nts from the TSS.[7]
  189. Gene ID: 51203. It has a TATA box (TATAAA) from -20 to -15 nts from the TSS.[7]
  190. Gene ID: 51297. It has a TATA box (TATAAA) from -45 to -40 nts from the TSS.[7]
  191. Gene ID: 51313. It has a TATA box (TATAAA) from 2 to 7 nts from the TSS.[7]
  192. Gene ID: 51582. It has a TATA box (TATAAA) from -26 to -21 nts from the TSS.[7]
  193. Gene ID: 55118. It has a TATA box (TATAAA) from 1 to 6 nts from the TSS.[7]
  194. Gene ID: 55504. It has a TATA box (TATAAA) from -26 to -21 nts from the TSS.[7]
  195. Gene ID: 56642. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  196. Gene ID: 56895. It has a TATA box (TATAAA) from 40 to 45 nts from the TSS.[7]
  197. Gene ID: 57126. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  198. Gene ID: 57152. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  199. Gene ID: 57579. It has a TATA box (TATAAA) from -16 to -11 nts from the TSS.[7]
  200. Gene ID: 80740. It has a TATA box (TATAAA) from -44 to -39 nts from the TSS.[7]
  201. Gene ID: 81606. It has a TATA box (TATAAA) from -29 to -24 nts from the TSS.[7]
  202. Gene ID: 83638. It has a TATA box (TATAAA) from -6 to -1 nts from the TSS.[7]
  203. Gene ID: 84223. It has a TATA box (TATAAA) from 26 to 31 nts from the TSS.[7]
  204. Gene ID: 92736. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  205. Gene ID: 115265. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]
  206. Gene ID: 117156. It has a TATA box (TATAAA) from -31 to -26 nts from the TSS.[7]
  207. Gene ID: 126364. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  208. Gene ID: 145957. It has a TATA box (TATAAA) from -30 to -25 nts from the TSS.[7]
  209. Gene ID: 153020. It has a TATA box (TATAAA) from 8 to 13 nts from the TSS.[7]
  210. Gene ID: 200504. It has a TATA box (TATAAA) from -27 to -22 nts from the TSS.[7]
  211. Gene ID: 200539. It has a TATA box (TATAAA) from -32 to -27 nts from the TSS.[7]
  212. Gene ID: 342574. It has a TATA box (TATAAA) from -28 to -23 nts from the TSS.[7]

TATA boxes (Carninci 2006) samplings

For the Basic programs testing the consensus sequence TATAAAA (starting with SuccessablesTATACA--.bas.bas), written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the programs looked for and found:

  1. negative strand, negative direction: 1, TATAAAA at 2853.
  2. positive strand, negative direction: 2, TATAAAA at 222, TATAAAA at 183.
  3. negative strand, positive direction: 0.
  4. positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 2, TTTTATA at 2869, TTTTATA at 1740.
  6. inverse complement, positive strand, negative direction: 1, TTTTATA at 219.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.
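
The Basic programs themselves are not reproduced here. As a minimal sketch of the kind of search the list above reports, the following Python scans one sequence for a motif and for its inverse complement. The function names, the toy sequence, and the choice to report 1-based start positions are assumptions for illustration; the numbering convention the original programs use is not stated. Strand and direction are not modelled: they would depend on which sequence is supplied as input.

```python
def reverse_complement(seq: str) -> str:
    """Return the inverse complement of a DNA sequence (A<->T, C<->G, then reversed)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(sequence: str, motif: str) -> list:
    """Return the 1-based start position of every occurrence of motif, overlaps included."""
    sequence, motif = sequence.upper(), motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)              # convert 0-based index to 1-based position
        start = sequence.find(motif, start + 1)  # step by one so overlapping hits are kept
    return positions


def scan(sequence: str, motif: str = "TATAAAA") -> dict:
    """Search for the motif as written and for its inverse complement."""
    return {
        "direct": find_all(sequence, motif),
        "inverse complement": find_all(sequence, reverse_complement(motif)),
    }


# Hypothetical toy sequence, not taken from the datasets sampled above.
toy = "GGTATAAAACCTTTTATAGG"
hits = scan(toy)
print(hits)  # {'direct': [3], 'inverse complement': [12]}
```

Running the same scan on each strand and in each direction would yield the eight categories listed above.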

TATAAAA (4560-2846) UTRs

  1. Negative strand, negative direction: TATAAAA at 2853.
  2. Negative strand, negative direction: TTTTATA at 2869.

TATAAAA negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TTTTATA at 1740.
  2. Positive strand, negative direction: TATAAAA at 222, TATAAAA at 183.
  3. Positive strand, negative direction: TTTTATA at 219.

TATA boxes (Carninci 2006) random dataset samplings

  1. TATACr0: 0.
  2. TATACr1: 0.
  3. TATACr2: 0.
  4. TATACr3: 1, TATAAAA at 4445.
  5. TATACr4: 0.
  6. TATACr5: 0.
  7. TATACr6: 0.
  8. TATACr7: 0.
  9. TATACr8: 1, TATAAAA at 707.
  10. TATACr9: 1, TATAAAA at 2622.
  11. TATACr0ci: 1, TTTTATA at 497.
  12. TATACr1ci: 0.
  13. TATACr2ci: 0.
  14. TATACr3ci: 0.
  15. TATACr4ci: 1, TTTTATA at 4527.
  16. TATACr5ci: 0.
  17. TATACr6ci: 0.
  18. TATACr7ci: 0.
  19. TATACr8ci: 0.
  20. TATACr9ci: 1, TTTTATA at 162.
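
How the random datasets were generated is not stated here. As a rough check under loudly stated assumptions (each dataset is 4560 nts long, the largest coordinate in the range headings, and the four bases are independent and equally likely), the expected number of exact matches to a 7-nt motif per dataset can be compared with the counts listed above. The function name and both assumptions are illustrative only.

```python
def expected_hits(sequence_length: int, motif_length: int) -> float:
    """Expected exact matches to one fixed motif in a uniform random sequence."""
    windows = sequence_length - motif_length + 1  # number of possible start positions
    return windows / 4 ** motif_length            # each window matches with probability 4**-k


# Assumed dataset length; the document does not state it explicitly.
expected = expected_hits(4560, 7)

# Counts copied from the list above (TATACr0..TATACr9 and TATACr0ci..TATACr9ci).
direct_counts = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1]
inverse_counts = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]

observed_direct = sum(direct_counts) / len(direct_counts)
observed_inverse = sum(inverse_counts) / len(inverse_counts)

print(round(expected, 2), observed_direct, observed_inverse)
```

Under these assumptions the expectation is about 0.28 hits per dataset, close to the 0.3 observed for both the direct and the inverse-complement searches.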

TATACr arbitrary (evens) (4560-2846) UTRs

  1. TATACr4ci: TTTTATA at 4527.

TATACr alternate (odds) (4560-2846) UTRs

  1. TATACr3: TATAAAA at 4445.

TATACr arbitrary positive direction (odds) (4445-4265) core promoters

  1. TATACr3: TATAAAA at 4445.

TATACr alternate negative direction (odds) (2811-2596) proximal promoters

  1. TATACr9: TATAAAA at 2622.

TATACr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATACr8: TATAAAA at 707.
  2. TATACr0ci: TTTTATA at 497.

TATACr alternate negative direction (odds) (2596-1) distal promoters

  1. TATACr9ci: TTTTATA at 162.

TATACr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATACr9: TATAAAA at 2622.
  2. TATACr9ci: TTTTATA at 162.

TATACr alternate positive direction (evens) (4050-1) distal promoters

  1. TATACr8: TATAAAA at 707.
  2. TATACr0ci: TTTTATA at 497.

TATAAAA (Carninci 2006) analysis and results

A genome-wide study put the fraction of TATAAAA-dependent human promoters at ~10%.[72]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 2 | 2 | 1 | 1 ± 1 (--2,+-0) |
| Randoms | UTR | arbitrary negative | 1 | 10 | 0.1 | 0.1 |
| Randoms | UTR | alternate negative | 1 | 10 | 0.1 | 0.1 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05 |
| Reals | Proximal | negative | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0.05 |
| Randoms | Proximal | alternate negative | 1 | 10 | 0.1 | 0.05 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0 |
| Reals | Distal | negative | 4 | 2 | 2 | 2 ± 1 (--1,+-3) |
| Randoms | Distal | arbitrary negative | 2 | 10 | 0.2 | 0.15 |
| Randoms | Distal | alternate negative | 1 | 10 | 0.1 | 0.15 |
| Reals | Distal | positive | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary positive | 2 | 10 | 0.2 | 0.2 |
| Randoms | Distal | alternate positive | 2 | 10 | 0.2 | 0.2 |

Comparison:

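The occurrences of real TATACs in the UTRs and in the negative-direction distal promoters are greater than the randoms; no real TATACs occur in the core promoters, the proximal promoters, or the positive-direction distal promoters. This suggests that the real TATACs are likely active or activable.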
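
The arithmetic behind the Occurrences and Averages columns can be sketched as follows. The reading of the ± value as half the difference between the two strand counts is an assumption inferred from the tabulated numbers (for example, strand counts of 2 and 0 give 1 ± 1); the function names are hypothetical.

```python
def occurrences(hits: int, datasets: int) -> float:
    """Hits divided by the number of strands (reals) or random datasets searched."""
    return hits / datasets


def mean_and_half_range(strand_counts: list) -> tuple:
    """Mean of the per-strand counts and half the spread between them (assumed meaning of the ± value)."""
    mean = sum(strand_counts) / len(strand_counts)
    half_range = (max(strand_counts) - min(strand_counts)) / 2
    return mean, half_range


# Reals, UTR, negative direction: 2 hits on the negative strand, 0 on the positive strand.
reals_utr = mean_and_half_range([2, 0])     # (1.0, 1.0), tabulated as 1 ± 1

# Reals, distal, negative direction: 1 hit on the negative strand, 3 on the positive strand.
reals_distal = mean_and_half_range([1, 3])  # (2.0, 1.0), tabulated as 2 ± 1

# Randoms, distal, negative direction: 2 hits in 10 arbitrary datasets, 1 hit in 10 alternate datasets.
arbitrary = occurrences(2, 10)
alternate = occurrences(1, 10)
randoms_distal = round((arbitrary + alternate) / 2, 2)  # 0.15

print(reals_utr, reals_distal, randoms_distal)
```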

Carninci TATA box genes

  1. Gene ID: 19. It has a TATA box (TATAAAA) from -34 to -28 nts from the TSS.[7]
  2. Gene ID: 58. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  3. Gene ID: 60. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  4. Gene ID: 173. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  5. Gene ID: 174. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  6. Gene ID: 229. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  7. Gene ID: 229. It has a TATA box (TATAAAA) from 23 to 29 nts from the TSS.[7]
  8. Gene ID: 230. It has a TATA box (TATAAAA) from -19 to -13 nts from the TSS.[7]
  9. Gene ID: 301. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  10. Gene ID: 302. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  11. Gene ID: 345. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  12. Gene ID: 467. It has a TATA box (TATAAAA) from -33 to -27 nts from the TSS.[7]
  13. Gene ID: 676. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  14. Gene ID: 759. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  15. Gene ID: 760. It has a TATA box (TATAAAA) from -33 to -27 nts from the TSS.[7]
  16. Gene ID: 762. It has a TATA box (TATAAAA) from 2 to 8 nts from the TSS.[7]
  17. Gene ID: 767. It has a TATA box (TATAAAA) from -34 to -28 nts from the TSS.[7]
  18. Gene ID: 811. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  19. Gene ID: 1044. It has a TATA box (TATAAAA) from -34 to -28 nts from the TSS.[7]
  20. Gene ID: 1081. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  21. Gene ID: 1215. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  22. Gene ID: 1277. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  23. Gene ID: 1356. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  24. Gene ID: 1356. It has a TATA box (TATAAAA) from 27 to 33 nts from the TSS.[7]
  25. Gene ID: 1382. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  26. Gene ID: 1392. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  27. Gene ID: 1393. It has a TATA box (TATAAAA) from 36 to 42 nts from the TSS.[7]
  28. Gene ID: 1393. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  29. Gene ID: 1490. It has a TATA box (TATAAAA) from -27 to -21 nts from the TSS.[7]
  30. Gene ID: 1544. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  31. Gene ID: 1571. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  32. Gene ID: 1581. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  33. Gene ID: 1655. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  34. Gene ID: 1805. It has a TATA box (TATAAAA) from -26 to -20 nts from the TSS.[7]
  35. Gene ID: 1893. It has a TATA box (TATAAAA) from 17 to 23 nts from the TSS.[7]
  36. Gene ID: 1906. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  37. Gene ID: 1938. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  38. Gene ID: 1974. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  39. Gene ID: 2168. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  40. Gene ID: 2353. It has a TATA box (TATAAAA) from -22 to -16 nts from the TSS.[7]
  41. Gene ID: 2538. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  42. Gene ID: 2641. It has a TATA box (TATAAAA) from -25 to -19 nts from the TSS.[7]
  43. Gene ID: 2688. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  44. Gene ID: 3273. It has a TATA box (TATAAAA) from -27 to -21 nts from the TSS.[7]
  45. Gene ID: 3304. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  46. Gene ID: 3397. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  47. Gene ID: 3491. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  48. Gene ID: 3596. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  49. Gene ID: 3605. It has a TATA box (TATAAAA) from -44 to -38 nts from the TSS.[7]
  50. Gene ID: 3624. It has a TATA box (TATAAAA) from 7 to 13 nts from the TSS.[7]
  51. Gene ID: 3860. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  52. Gene ID: 4025. It has a TATA box (TATAAAA) from 11 to 17 nts from the TSS.[7]
  53. Gene ID: 4254. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  54. Gene ID: 4256. It has a TATA box (TATAAAA) from 2 to 8 nts from the TSS.[7]
  55. Gene ID: 4319. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  56. Gene ID: 4322. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  57. Gene ID: 4435. It has a TATA box (TATAAAA) from -2 to 5 nts from the TSS.[7]
  58. Gene ID: 4504. It has a TATA box (TATAAAA) from -24 to -18 nts from the TSS.[7]
  59. Gene ID: 4609. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  60. Gene ID: 4616. It has a TATA box (TATAAAA) from -27 to -21 nts from the TSS.[7]
  61. Gene ID: 4744. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  62. Gene ID: 4842. It has a TATA box (TATAAAA) from -27 to -21 nts from the TSS.[7]
  63. Gene ID: 4856. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  64. Gene ID: 4878. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  65. Gene ID: 5033. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  66. Gene ID: 5054. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  67. Gene ID: 5055. It has a TATA box (TATAAAA) from -26 to -20 nts from the TSS.[7]
  68. Gene ID: 5068. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  69. Gene ID: 5079. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  70. Gene ID: 5225. It has a TATA box (TATAAAA) from -28 to -21 nts from the TSS.[7]
  71. Gene ID: 5406. It has a TATA box (TATAAAA) from -24 to -18 nts from the TSS.[7]
  72. Gene ID: 5408. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  73. Gene ID: 5478. It has a TATA box (TATAAAA) from 8 to 14 nts from the TSS.[7]
  74. Gene ID: 5514. It has a TATA box (TATAAAA) from -33 to -27 nts from the TSS.[7]
  75. Gene ID: 5741. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  76. Gene ID: 5743. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  77. Gene ID: 6175. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  78. Gene ID: 6187. It has a TATA box (TATAAAA) from -25 to -19 nts from the TSS.[7]
  79. Gene ID: 6279. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  80. Gene ID: 6354. It has a TATA box (TATAAAA) from -17 to -11 nts from the TSS.[7]
  81. Gene ID: 6356. It has a TATA box (TATAAAA) from -44 to -38 nts from the TSS.[7]
  82. Gene ID: 6357. It has a TATA box (TATAAAA) from -36 to -30 nts from the TSS.[7]
  83. Gene ID: 6376. It has a TATA box (TATAAAA) from -12 to -6 nts from the TSS.[7]
  84. Gene ID: 6518. It has a TATA box (TATAAAA) from -33 to -27 nts from the TSS.[7]
  85. Gene ID: 6559. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  86. Gene ID: 6624. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  87. Gene ID: 6698. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  88. Gene ID: 6781. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  89. Gene ID: 6906. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  90. Gene ID: 6988. It has a TATA box (TATAAAA) from -26 to -20 nts from the TSS.[7]
  91. Gene ID: 7031. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  92. Gene ID: 7032. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  93. Gene ID: 7038. It has a TATA box (TATAAAA) from -35 to -29 nts from the TSS.[7]
  94. Gene ID: 7043. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  95. Gene ID: 7252. It has a TATA box (TATAAAA) from 2 to 8 nts from the TSS.[7]
  96. Gene ID: 7276. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  97. Gene ID: 7280. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  98. Gene ID: 7306. It has a TATA box (TATAAAA) from -27 to -21 nts from the TSS.[7]
  99. Gene ID: 7432. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  100. Gene ID: 7803. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  101. Gene ID: 7852. It has a TATA box (TATAAAA) from -26 to -20 nts from the TSS.[7]
  102. Gene ID: 8288. It has a TATA box (TATAAAA) from -33 to -27 nts from the TSS.[7]
  103. Gene ID: 8339. It has a TATA box (TATAAAA) from -33 to -27 nts from the TSS.[7]
  104. Gene ID: 8564. It has a TATA box (TATAAAA) from -14 to -8 nts from the TSS.[7]
  105. Gene ID: 8832. It has a TATA box (TATAAAA) from -12 to -6 nts from the TSS.[7]
  106. Gene ID: 8969. It has a TATA box (TATAAAA) from -27 to -21 nts from the TSS.[7]
  107. Gene ID: 8970. It has a TATA box (TATAAAA) from -28 to -22 nts from the TSS.[7]
  108. Gene ID: 9510. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  109. Gene ID: 9607. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  110. Gene ID: 9709. It has a TATA box (TATAAAA) from 1 to 7 nts from the TSS.[7]
  111. Gene ID: 10215. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  112. Gene ID: 10350. It has a TATA box (TATAAAA) from 31 to 37 nts from the TSS.[7]
  113. Gene ID: 11067. It has a TATA box (TATAAAA) from 14 to 20 nts from the TSS.[7]
  114. Gene ID: 11082. It has a TATA box (TATAAAA) from -11 to -5 nts from the TSS.[7]
  115. Gene ID: 11169. It has a TATA box (TATAAAA) from -16 to -10 nts from the TSS.[7]
  116. Gene ID: 23645. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  117. Gene ID: 27106. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  118. Gene ID: 50943. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  119. Gene ID: 51050. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  120. Gene ID: 51129. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  121. Gene ID: 51297. It has a TATA box (TATAAAA) from -45 to -39 nts from the TSS.[7]
  122. Gene ID: 54106. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  123. Gene ID: 55603. It has a TATA box (TATAAAA) from -2 to 5 nts from the TSS.[7]
  124. Gene ID: 56675. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  125. Gene ID: 56829. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  126. Gene ID: 56987. It has a TATA box (TATAAAA) from -35 to -29 nts from the TSS.[7]
  127. Gene ID: 57626. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  128. Gene ID: 57823. It has a TATA box (TATAAAA) from -15 to -9 nts from the TSS.[7]
  129. Gene ID: 64111. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  130. Gene ID: 80177. It has a TATA box (TATAAAA) from -34 to -28 nts from the TSS.[7]
  131. Gene ID: 81285. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  132. Gene ID: 83998. It has a TATA box (TATAAAA) from -29 to -23 nts from the TSS.[7]
  133. Gene ID: 84107. It has a TATA box (TATAAAA) from 29 to 35 nts from the TSS.[7]
  134. Gene ID: 85235. It has a TATA box (TATAAAA) from -31 to -25 nts from the TSS.[7]
  135. Gene ID: 114899. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
  136. Gene ID: 116842. It has a TATA box (TATAAAA) from -32 to -26 nts from the TSS.[7]
  137. Gene ID: 130120. It has a TATA box (TATAAAA) from 17 to 23 nts from the TSS.[7]
  138. Gene ID: 145957. It has a TATA box (TATAAAA) from -7 to -1 nts from the TSS.[7]
  139. Gene ID: 147183. It has a TATA box (TATAAAA) from -30 to -24 nts from the TSS.[7]
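
The coordinates in the list above are consistent with a numbering that has no position 0: entries such as "-2 to 5" span exactly the 7 nts of TATAAAA only if +1 directly follows -1. A minimal sketch of that check, assuming this convention (it is not stated explicitly in the list) and using a hypothetical helper name:

```python
def span_length(start: int, end: int) -> int:
    """Nucleotides covered from start to end inclusive, with no position 0 relative to the TSS."""
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the span crosses the TSS, so drop the nonexistent position 0
    return length


motif = "TATAAAA"
for start, end in [(-34, -28), (-2, 5), (2, 8)]:
    print(start, end, span_length(start, end) == len(motif))  # True in each case
```

By the same arithmetic, a span such as -28 to -21 covers 8 nts and so would not match a 7-nt motif.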

TATA box (Watson 2014) samplings

For the Basic programs testing the consensus sequence TATA(A/T)A(A/T) (starting with SuccessablesTATAW.bas), written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the programs looked for and found:

  1. Negative strand, negative direction: 4, TATATAT at 2872, TATAAAA at 2853, TATATAA at 1601, TATATAT at 1599.
  2. Positive strand, negative direction: 4, TATATAA at 2873, TATATAT at 1600, TATAAAA at 222, TATAAAA at 183.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 4, TTATATA at 2871, TTTTATA at 2869, ATTTATA at 2638, TTTTATA at 1740.
  6. inverse complement, positive strand, negative direction: 2, ATATATA at 1599, TTTTATA at 219.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.
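
Because TATA(A/T)A(A/T) is degenerate, neighbouring matches can overlap, as with TATATAA at 1601 and TATATAT at 1599 above. A minimal Python sketch of a search that keeps overlapping matches uses a regular-expression lookahead; the pattern, function name, and toy sequence are assumptions for illustration, not the original Basic code, and positions are reported as 1-based starts.

```python
import re

# The lookahead (?=...) matches at every start position without consuming characters,
# so overlapping occurrences of the 7-nt consensus are all reported.
TATAW = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tataw(sequence: str) -> list:
    """Return (1-based start, matched 7-mer) for every TATA(A/T)A(A/T) occurrence."""
    return [(m.start() + 1, m.group(1)) for m in TATAW.finditer(sequence.upper())]


# Hypothetical toy sequence containing two matches that start two nts apart.
toy = "CCTATATATAAGG"
print(find_tataw(toy))  # [(3, 'TATATAT'), (5, 'TATATAA')]
```

A plain non-overlapping search would report only the first of the two hits in this toy sequence.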

TATAW (4560-2846) UTRs

  1. Negative strand, negative direction: TATATAT at 2872, TATAAAA at 2853.
  2. Negative strand, negative direction: TTATATA at 2871, TTTTATA at 2869.
  3. Positive strand, negative direction: TATATAA at 2873.

TATAW negative direction (2811-2596) proximal promoters

  1. Negative strand, negative direction: ATTTATA at 2638.

TATAW negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TATATAA at 1601, TATATAT at 1599.
  2. Negative strand, negative direction: TTTTATA at 1740.
  3. Positive strand, negative direction: TATATAT at 1600, TATAAAA at 222, TATAAAA at 183.
  4. Positive strand, negative direction: TTTTATA at 219.
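
The grouping of hits by region can be sketched from the coordinate ranges given in the headings above for the negative direction. The region table below copies only those stated ranges; assigning the shared boundary 2596 to the proximal promoter and returning "unassigned" for positions outside the stated ranges are assumptions made for this sketch.

```python
# (name, lowest position, highest position), taken from the negative-direction headings above.
NEGATIVE_DIRECTION_REGIONS = [
    ("UTR", 2846, 4560),
    ("proximal promoter", 2596, 2811),
    ("distal promoter", 1, 2596),
]


def classify(position: int) -> str:
    """Return the first listed region whose range contains the position."""
    for name, low, high in NEGATIVE_DIRECTION_REGIONS:
        if low <= position <= high:
            return name
    return "unassigned"  # no range for this position is stated in the headings used here


# Hit positions copied from the lists above.
for position, motif in [(2873, "TATATAA"), (2638, "ATTTATA"), (1601, "TATATAA"), (219, "TTTTATA")]:
    print(position, motif, classify(position))
```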

TATA box (Watson 2014) random dataset samplings

  1. TATAWr0: 1, TATAAAT at 3566.
  2. TATAWr1: 1, TATATAT at 139.
  3. TATAWr2: 0.
  4. TATAWr3: 1, TATAAAA at 4445.
  5. TATAWr4: 0.
  6. TATAWr5: 1, TATAAAT at 1564.
  7. TATAWr6: 0.
  8. TATAWr7: 1, TATAAAT at 3630.
  9. TATAWr8: 1, TATAAAA at 707.
  10. TATAWr9: 2, TATATAA at 3149, TATAAAA at 2622.
  11. TATAWr0ci: 2, ATATATA at 2701, TTTTATA at 497.
  12. TATAWr1ci: 1, TTATATA at 1267.
  13. TATAWr2ci: 0.
  14. TATAWr3ci: 1, ATATATA at 3803.
  15. TATAWr4ci: 2, TTTTATA at 4527, TTATATA at 4254.
  16. TATAWr5ci: 0.
  17. TATAWr6ci: 1, TTATATA at 833.
  18. TATAWr7ci: 1, TTATATA at 4254.
  19. TATAWr8ci: 0.
  20. TATAWr9ci: 1, TTTTATA at 162.

TATAWr arbitrary (evens) (4560-2846) UTRs

  1. TATAWr0: TATAAAT at 3566.
  2. TATAWr4ci: TTTTATA at 4527, TTATATA at 4254.

TATAWr alternate (odds) (4560-2846) UTRs

  1. TATAWr3: TATAAAA at 4445.
  2. TATAWr7: TATAAAT at 3630.
  3. TATAWr9: TATATAA at 3149.
  4. TATAWr3ci: ATATATA at 3803.
  5. TATAWr7ci: TTATATA at 4254.

TATAWr arbitrary positive direction (odds) (4445-4265) core promoters

  1. TATAWr3: TATAAAA at 4445.

TATAWr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. TATAWr0ci: ATATATA at 2701.

TATAWr alternate negative direction (odds) (2811-2596) proximal promoters

  1. TATAWr9: TATAAAA at 2622.

TATAWr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. TATAWr7ci: TTATATA at 4254.

TATAWr alternate positive direction (evens) (4265-4050) proximal promoters

  1. TATAWr4ci: TTATATA at 4254.

TATAWr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATAWr8: TATAAAA at 707.
  2. TATAWr0ci: TTTTATA at 497.
  3. TATAWr6ci: TTATATA at 833.

TATAWr alternate negative direction (odds) (2596-1) distal promoters

  1. TATAWr1: TATATAT at 139.
  2. TATAWr5: TATAAAT at 1564.
  3. TATAWr1ci: TTATATA at 1267.
  4. TATAWr9ci: TTTTATA at 162.

TATAWr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATAWr1: TATATAT at 139.
  2. TATAWr5: TATAAAT at 1564.
  3. TATAWr7: TATAAAT at 3630.
  4. TATAWr9: TATATAA at 3149, TATAAAA at 2622.
  5. TATAWr1ci: TTATATA at 1267.
  6. TATAWr3ci: ATATATA at 3803.
  7. TATAWr9ci: TTTTATA at 162.

TATAWr alternate positive direction (evens) (4050-1) distal promoters

  1. TATAWr0: TATAAAT at 3566.
  2. TATAWr8: TATAAAA at 707.
  3. TATAWr0ci: ATATATA at 2701, TTTTATA at 497.
  4. TATAWr6ci: TTATATA at 833.

TATA box (Watson 2014) analysis and results

The TATA box is a component of the eukaryotic core promoter and generally contains the consensus sequence TATA(A/T)A(A/T).[73]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 5 | 2 | 2.5 | 2.5 ± 1.5 (--4,+-1) |
| Randoms | UTR | arbitrary negative | 3 | 10 | 0.3 | 0.4 |
| Randoms | UTR | alternate negative | 5 | 10 | 0.5 | 0.4 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05 |
| Reals | Proximal | negative | 1 | 2 | 0.5 | 0.5 |
| Randoms | Proximal | arbitrary negative | 1 | 10 | 0.1 | 0.1 |
| Randoms | Proximal | alternate negative | 1 | 10 | 0.1 | 0.1 |
| Reals | Proximal | positive | 1 | 2 | 0.5 | 0.5 ± 0.5 (-+1,++0) |
| Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.1 |
| Reals | Distal | negative | 7 | 2 | 3.5 | 3.5 ± 0.5 (--3,+-4) |
| Randoms | Distal | arbitrary negative | 3 | 10 | 0.3 | 0.35 |
| Randoms | Distal | alternate negative | 4 | 10 | 0.4 | 0.35 |
| Reals | Distal | positive | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary positive | 8 | 10 | 0.8 | 0.65 |
| Randoms | Distal | alternate positive | 5 | 10 | 0.5 | 0.65 |

Comparison:

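The occurrences of real TATAW UTRs, proximals and negative-direction distals are greater than the randoms; no real TATAWs occur in the core promoters or the positive-direction distal promoters. This suggests that the real TATAWs are likely active or activable.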

Watson TATA box genes

  1. Gene ID: 262. It has a TATA box (TATATAAG) from 10 to 17 nts from the TSS.[7]
  2. Gene ID: 292. It has a TATA box (TATATAAA) from 1 to 8 nts from the TSS.[7]
  3. Gene ID: 604. It has a TATA box (TATATATA) from -22 to -15 nts from the TSS.[7]
  4. Gene ID: 794. It has a TATA box (TATATAAG) from -30 to -23 nts from the TSS.[7]
  5. Gene ID: 1116. It has a TATA box (TATATAAA) from 16 to 23 nts from the TSS.[7]
  6. Gene ID: 1153. It has a TATA box (TATATAAG) from -32 to -25 nts from the TSS.[7]
  7. Gene ID: 1158. It has a TATA box (TATATAAG) from -12 to -5 nts from the TSS.[7]
  8. Gene ID: 1410. It has a TATA box (TATATAAG) from -26 to -19 nts from the TSS.[7]
  9. Gene ID: 1581. It has a TATA box (TATATAAA) from -33 to -26 nts from the TSS.[7]
  10. Gene ID: 1728. It has a TATA box (TATATAAG) from -12 to -5 nts from the TSS.[7]
  11. Gene ID: 1805. It has a TATA box (TATATAAA) from -28 to -21 nts from the TSS.[7]
  12. Gene ID: 1811. It has a TATA box (TATATATA) from -19 to -12 nts from the TSS.[7]
  13. Gene ID: 1831. It has a TATA box (TATATAAA) from -28 to -21 nts from the TSS.[7]
  14. Gene ID: 1915. It has a TATA box (TATATAAG) from -31 to -24 nts from the TSS.[7]
  15. Gene ID: 1961. It has a TATA box (TATATAAG) from -31 to -24 nts from the TSS.[7]
  16. Gene ID: 2244. It has a TATA box (TATATATA) from -29 to -22 nts from the TSS.[7]
  17. Gene ID: 2641. It has a TATA box (TATATAAA) from -27 to -20 nts from the TSS.[7]
  18. Gene ID: 2814. It has a TATA box (TATATATG) from -30 to -23 nts from the TSS.[7]
  19. Gene ID: 3006. It has a TATA box (TATATATA) from -37 to -28 nts from the TSS.[7]
  20. Gene ID: 3008. It has a TATA box (TATATAAA) from -31 to -24 nts from the TSS.[7]
  21. Gene ID: 3009. It has a TATA box (TATATAAG) from -38 to -31 nts from the TSS.[7]
  22. Gene ID: 3009. It has a TATA box (TATATAAG) from 29 to 36 nts from the TSS.[7]
  23. Gene ID: 3010. It has a TATA box (TATATAAG) from -32 to -25 nts from the TSS.[7]
  24. Gene ID: 3050. It has a TATA box (TATATAAG) from -31 to -24 nts from the TSS.[7]
  25. Gene ID: 3164. It has a TATA box (TATATAAA) from -42 to -35 nts from the TSS.[7]
  26. Gene ID: 3232. It has a TATA box (TATATAAA) from -33 to -26 nts from the TSS.[7]
  27. Gene ID: 3280. It has a TATA box (TATATATA) from -28 to -21 nts from the TSS.[7]
  28. Gene ID: 3309. It has a TATA box (TATATAAG) from -31 to -24 nts from the TSS.[7]
  29. Gene ID: 3320. It has a TATA box (TATATAAG) from -30 to -23 nts from the TSS.[7]
  30. Gene ID: 3371. It has a TATA box (TATATAAG) from -34 to -27 nts from the TSS.[7]
  31. Gene ID: 3375. It has a TATA box (TATATAAG) from -30 to -23 nts from the TSS.[7]
  32. Gene ID: 3565. It has a TATA box (TATATATA) from -28 to -21 nts from the TSS.[7]
  33. Gene ID: 3593. It has a TATA box (TATATATA) from -28 to -21 nts from the TSS.[7]
  34. Gene ID: 3708. It has a TATA box (TATATAAG) from -30 to -23 nts from the TSS.[7]
  35. Gene ID: 3848. It has a TATA box (TATATAAG) from -43 to -36 nts from the TSS.[7]
  36. Gene ID: 3858. It has a TATA box (TATATAAA) from -36 to -29 nts from the TSS.[7]
  37. Gene ID: 3859. It has a TATA box (TATATAAG) from -45 to -38 nts from the TSS.[7]
  38. Gene ID: 3976. It has a TATA box (TATATAAA) from -32 to -25 nts from the TSS.[7]
  39. Gene ID: 3977. It has a TATA box (TATATATA) from -31 to -24 nts from the TSS.[7]
  40. Gene ID: 4014. It has a TATA box (TATATATA) from -40 to -33 nts from the TSS.[7]
  41. Gene ID: 4316. It has a TATA box (TATATAAA) from -27 to -20 nts from the TSS.[7]
  42. Gene ID: 4321. It has a TATA box (TATATAAA) from 3 to 10 nts from the TSS.[7]
  43. Gene ID: 4618. It has a TATA box (TATATATA) from -32 to -25 nts from the TSS.[7]
  44. Gene ID: 4632. It has a TATA box (TATATATG) from -27 to -20 nts from the TSS.[7]
  45. Gene ID: 4638. It has a TATA box (TATATATG) from 28 to 35 nts from the TSS.[7]
  46. Gene ID: 4653. It has a TATA box (TATATATA) from -31 to -24 nts from the TSS.[7]
  47. Gene ID: 4842. It has a TATA box (TATATAAA) from -29 to -22 nts from the TSS.[7]
  48. Gene ID: 4869. It has a TATA box (TATATATA) from -28 to -21 nts from the TSS.[7]
  49. Gene ID: 4922. It has a TATA box (TATATATA) from -33 to -26 nts from the TSS.[7]
  50. Gene ID: 4982. It has a TATA box (TATATATA) from -28 to -21 nts from the TSS.[7]
  51. Gene ID: 5443. It has a TATA box (TATATAAG) from -28 to -21 nts from the TSS.[7]
  52. Gene ID: 5449. It has a TATA box (TATATATG) from 18 to 25 nts from the TSS.[7]
  53. Gene ID: 5553. It has a TATA box (TATATAAG) from -34 to -27 nts from the TSS.[7]
  54. Gene ID: 5741. It has a TATA box (TATATATA) from -30 to -23 nts from the TSS.[7]
  55. Gene ID: 5744. It has a TATA box (TATATATA) from -30 to -23 nts from the TSS.[7]
  56. Gene ID: 5996. It has a TATA box (TATATAAA) from -30 to -23 nts from the TSS.[7]
  57. Gene ID: 6046. It has a TATA box (TATATATA) from -37 to -30 nts from the TSS.[7]
  58. Gene ID: 6224. It has a TATA box (TATATAAG) from -28 to -21 nts from the TSS.[7]
  59. Gene ID: 6232. It has a TATA box (TATATAAG) from -29 to -22 nts from the TSS.[7]
  60. Gene ID: 6432. It has a TATA box (TATATAAA) from -9 to -2 nts from the TSS.[7]
  61. Gene ID: 6707. It has a TATA box (TATATAAA) from -32 to -25 nts from the TSS.[7]
  62. Gene ID: 7021. It has a TATA box (TATATATG) from -48 to -41 nts from the TSS.[7]
  63. Gene ID: 7167. It has a TATA box (TATATAAG) from -2 to 5 nts from the TSS.[7]
  64. Gene ID: 7316. It has a TATA box (TATATAAG) from -32 to -25 nts from the TSS.[7]
  65. Gene ID: 7369. It has a TATA box (TATATATA) from -31 to -24 nts from the TSS.[7]
  66. Gene ID: 7432. It has a TATA box (TATATAAA) from -47 to -40 nts from the TSS.[7]
  67. Gene ID: 8483. It has a TATA box (TATATAAG) from -30 to -23 nts from the TSS.[7]
  68. Gene ID: 8490. It has a TATA box (TATATATA) from -29 to -22 nts from the TSS.[7]
  69. Gene ID: 8513. It has a TATA box (TATATAAG) from -28 to -21 nts from the TSS.[7]
  70. Gene ID: 8970. It has a TATA box (TATATAAA) from -30 to -23 nts from the TSS.[7]
  71. Gene ID: 9421. It has a TATA box (TATATAAG) from 7 to 14 nts from the TSS.[7]
  72. Gene ID: 9921. It has a TATA box (TATATATA) from -30 to -23 nts from the TSS.[7]
  73. Gene ID: 10631. It has a TATA box (TATATAAA) from -29 to -22 nts from the TSS.[7]
  74. Gene ID: 10761. It has a TATA box (TATATAAA) from -45 to -38 nts from the TSS.[7]
  75. Gene ID: 10769. It has a TATA box (TATATAAG) from -31 to -24 nts from the TSS.[7]
  76. Gene ID: 10912. It has a TATA box (TATATAAG) from -31 to -24 nts from the TSS.[7]
  77. Gene ID: 11009. It has a TATA box (TATATATG) from -30 to -23 nts from the TSS.[7]
  78. Gene ID: 23450. It has a TATA box (TATATATA) from -40 to -33 nts from the TSS.[7]
  79. Gene ID: 25928. It has a TATA box (TATATAAA) from -31 to -24 nts from the TSS.[7]
  80. Gene ID: 27063. It has a TATA box (TATATAAG) from -33 to -26 nts from the TSS.[7]
  81. Gene ID: 51278. It has a TATA box (TATATAAG) from -32 to -25 nts from the TSS.[7]
  82. Gene ID: 51738. It has a TATA box (TATATAAG) from -32 to -25 nts from the TSS.[7]
  83. Gene ID: 54567. It has a TATA box (TATATATA) from -35 to -28 nts from the TSS.[7]
  84. Gene ID: 55504. It has a TATA box (TATATATA) from -30 to -23 nts from the TSS.[7]
  85. Gene ID: 56987. It has a TATA box (TATATAAA) from -37 to -29 nts from the TSS.[7]
  86. Gene ID: 57626. It has a TATA box (TATATAAA) from -32 to -25 nts from the TSS.[7]
  87. Gene ID: 65108. It has a TATA box (TATATATA) from -32 to -25 nts from the TSS.[7]
  88. Gene ID: 79733. It has a TATA box (TATATATA) from -12 to -5 nts from the TSS.[7]
  89. Gene ID: 84328. It has a TATA box (TATATATA) from -45 to -38 nts from the TSS.[7]
  90. Gene ID: 84419. It has a TATA box (TATATATA) from -32 to -25 nts from the TSS.[7]
  91. Gene ID: 84790. It has a TATA box (TATATAAG) from -33 to -26 nts from the TSS.[7]
  92. Gene ID: 84889. It has a TATA box (TATATAAG) from -40 to -33 nts from the TSS.[7]
  93. Gene ID: 131377. It has a TATA box (TATATATA) from -30 to -23 nts from the TSS.[7]
  94. Gene ID: 342574. It has a TATA box (TATATAAA) from -30 to -23 nts from the TSS.[7]
  95. Gene ID: 389125. It has a TATA box (TATATAAG) from 13 to 20 nts from the TSS.[7]

TATA box (Juven-Gershon 2010) samplings

For the Basic programs (starting with SuccessablesTATAJ.bas) written to compare nucleotide sequences with the sequences on either the template strand (-) or coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the programs, what they look for, and what they found are:

  1. Negative strand, negative direction, looking for TATA(A/T)AA(A/G): 1, TATATAAA at 1602.
  2. Positive strand, negative direction: 3, TATATAAA at 2874, TATAAAAG at 223, TATAAAAG at 184.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction, looking for (C/T)TT(A/T)TATA: 1, TTTATATA at 2871.
  6. inverse complement, positive strand, negative direction: 1, TTTTTATA at 219.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.
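The Basic programs themselves are not reproduced on this page. As a rough illustration only, the kind of search described above can be sketched in Python, assuming each consensus is written with IUPAC codes (W = A/T, R = A/G, Y = C/T) and assuming a reported position is the 1-based index of the last nucleotide of a match. The names `find_hits` and `inverse_complement` and the toy sequence are hypothetical and are not taken from SuccessablesTATAJ.bas.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
# Complement of each code (W, S and N are their own complements).
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "W": "W", "S": "S", "N": "N",
    "R": "Y", "Y": "R", "K": "M", "M": "K",
}

def inverse_complement(consensus):
    """Reverse the consensus and complement each (possibly degenerate) base."""
    return "".join(COMPLEMENT[base] for base in reversed(consensus))

def find_hits(sequence, consensus):
    """Return (motif, position) pairs for every match, overlaps included.

    Positions are 1-based and mark the last nucleotide of the match; that
    convention is an assumption of this sketch.
    """
    pattern = "".join(IUPAC[base] for base in consensus)
    # The lookahead lets overlapping matches all be reported.
    regex = re.compile("(?=(" + pattern + "))")
    width = len(consensus)
    return [(m.group(1), m.start() + width) for m in regex.finditer(sequence)]

toy = "GGCTATATAAAGGCCTTTATATAGG"  # made-up sequence, not from any gene
print(inverse_complement("TATAWAAR"))                   # YTTWTATA
print(find_hits(toy, "TATAWAAR"))                       # [('TATATAAA', 11)]
print(find_hits(toy, inverse_complement("TATAWAAR")))   # [('TTTATATA', 23)]
```

The inverse complement computed here, YTTWTATA, is the same pattern as the (C/T)TT(A/T)TATA listed in item 5 above.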

TATAJ (4560-2846) UTRs

  1. inverse complement, negative strand, negative direction: TTTATATA at 2871.
  2. Positive strand, negative direction: TATATAAA at 2874.

TATAJ negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TATATAAA at 1602.
  2. Positive strand, negative direction: TATAAAAG at 223, TATAAAAG at 184.
  3. inverse complement, positive strand, negative direction: TTTTTATA at 219.
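Sorting hits into UTRs and promoter regions amounts to a range lookup on the reported position. The following Python sketch shows one way to do that for the negative direction, using only the ranges quoted in headings on this page. Treating both ends of each range as inclusive is an assumption, and `regions_for` and `group_hits` are hypothetical names. Positions that fall in none of the listed ranges are simply left ungrouped.

```python
# Position ranges quoted in headings on this page for the negative
# direction. Treating both ends as inclusive is an assumption of this sketch.
NEGATIVE_DIRECTION_REGIONS = [
    ("UTR", 2846, 4560),
    ("proximal promoter", 2596, 2811),
    ("distal promoter", 1, 2596),
]

def regions_for(position, regions=NEGATIVE_DIRECTION_REGIONS):
    """Return the names of every region whose range contains the position."""
    return [name for name, low, high in regions if low <= position <= high]

def group_hits(hits, regions=NEGATIVE_DIRECTION_REGIONS):
    """Group (motif, position) hits by region name."""
    grouped = {name: [] for name, _, _ in regions}
    for motif, position in hits:
        for name in regions_for(position, regions):
            grouped[name].append((motif, position))
    return grouped

# Hits copied from the sampling list above.
hits = [("TATATAAA", 1602), ("TATATAAA", 2874),
        ("TATAAAAG", 223), ("TATAAAAG", 184)]
for name, found in group_hits(hits).items():
    print(name, found)
```

With these inputs the hit at 2874 is grouped under the UTR and the other three under the distal promoter, which agrees with the two lists above.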

TATA box (Juven-Gershon 2010) random dataset samplings

  1. TATAJr0: 0.
  2. TATAJr1: 0.
  3. TATAJr2: 0.
  4. TATAJr3: 0.
  5. TATAJr4: 0.
  6. TATAJr5: 0.
  7. TATAJr6: 0.
  8. TATAJr7: 0.
  9. TATAJr8: 1, TATAAAAA at 708.
  10. TATAJr9: 1, TATATAAA at 3150.
  11. TATAJr0ci: 0.
  12. TATAJr1ci: 0.
  13. TATAJr2ci: 0.
  14. TATAJr3ci: 0.
  15. TATAJr4ci: 2, CTTTTATA at 4527, CTTATATA at 4254.
  16. TATAJr5ci: 0.
  17. TATAJr6ci: 1, CTTATATA at 833.
  18. TATAJr7ci: 1, TTTATATA at 4254.
  19. TATAJr8ci: 0.
  20. TATAJr9ci: 0.

TATAJr arbitrary (evens) (4560-2846) UTRs

  1. TATAJr4ci: CTTTTATA at 4527, CTTATATA at 4254.

TATAJr alternate (odds) (4560-2846) UTRs

  1. TATAJr9: TATATAAA at 3150.
  2. TATAJr7ci: TTTATATA at 4254.

TATAJr alternate positive direction (evens) (4445-4265) core promoters

  1. TATAJr4ci: CTTATATA at 4254.

TATAJr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. TATAJr7ci: TTTATATA at 4254.

TATAJr alternate positive direction (evens) (4265-4050) proximal promoters

  1. TATAJr4ci: CTTATATA at 4254.

TATAJr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATAJr8: TATAAAAA at 708.
  2. TATAJr6ci: CTTATATA at 833.

TATAJr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATAJr9: TATATAAA at 3150.

TATAJr alternate positive direction (evens) (4050-1) distal promoters

  1. TATAJr8: TATAAAAA at 708.
  2. TATAJr6ci: CTTATATA at 833.

TATA box (Juven-Gershon 2010) analysis and results

"The metazoan TATA box consensus is TATAWAAR [...]."[5]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 2 | 2 | 1 | 1 ± 0 (--1,+-1) |
| Randoms | UTR | arbitrary negative | 2 | 10 | 0.2 | 0.2 |
| Randoms | UTR | alternate negative | 2 | 10 | 0.2 | 0.2 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0.05 |
| Randoms | Core | alternate positive | 1 | 10 | 0.1 | 0.05 |
| Reals | Proximal | negative | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.1 |
| Reals | Distal | negative | 4 | 2 | 2 | 2 ± 1 (--1,+-3) |
| Randoms | Distal | arbitrary negative | 2 | 10 | 0.2 | 0.1 |
| Randoms | Distal | alternate negative | 0 | 10 | 0 | 0.1 |
| Reals | Distal | positive | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary positive | 1 | 10 | 0.1 | 0.15 |
| Randoms | Distal | alternate positive | 2 | 10 | 0.2 | 0.15 |

Comparison:

The occurrences of real TATAJ boxes in the UTRs and distal promoters are greater than those in the random datasets. This suggests that the real TATAJs are likely active or activable.
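The arithmetic behind the Occurrences and Averages columns can be sketched in Python as follows. Two things are assumed here rather than stated in the source: that Occurrences is the number of hits divided by the number of strands (or random datasets) sampled, and that the ± value for the reals is half the difference between the two per-strand counts. The function names are hypothetical.

```python
def occurrences(hits, strands):
    """Hits per strand (or per random dataset) sampled."""
    return hits / strands

def mean_and_half_range(per_strand_hits):
    """Mean of the per-strand counts and half the gap between the extremes."""
    mean = sum(per_strand_hits) / len(per_strand_hits)
    half_range = (max(per_strand_hits) - min(per_strand_hits)) / 2
    return mean, half_range

def random_average(arbitrary_hits, alternate_hits, datasets=10):
    """Average of the arbitrary and alternate occurrence values."""
    return (arbitrary_hits + alternate_hits) / (2 * datasets)

# Real distal promoters, negative direction: 1 hit on the negative
# strand and 3 on the positive strand, written (--1,+-3) in the table.
print(mean_and_half_range([1, 3]))   # (2.0, 1.0)

# Random distal promoters, negative direction: 2 arbitrary hits and
# 0 alternate hits across 10 datasets each.
print(random_average(2, 0))          # 0.1
```

Under those assumptions the sketch reproduces the table entries 2 ± 1 for the real distal promoters and 0.1 for the random ones.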

Juven-Gershon TATA box genes

  1. Gene ID: 19. It has a TATA box (TATAAAAG) from -34 to -27 nts from the TSS.[7]
  2. Gene ID: 174. It has a TATA box (TATAAAAG) from -28 to -21 nts from the TSS.[7]
  3. Gene ID: 302. It has a TATA box (TATAAAAG) from -30 to -23 nts from the TSS.[7]
  4. Gene ID: 467. It has a TATA box (TATAAAAG) from -33 to -26 nts from the TSS.[7]
  5. Gene ID: 676. It has a TATA box (TATAAAAG) from -31 to -24 nts from the TSS.[7]
  6. Gene ID: 760. It has a TATA box (TATAAAAG) from -33 to -26 nts from the TSS.[7]
  7. Gene ID: 811. It has a TATA box (TATAAAAG) from -29 to -22 nts from the TSS.[7]
  8. Gene ID: 1044. It has a TATA box (TATAAAAG) from -34 to -27 nts from the TSS.[7]
  9. Gene ID: 1081. It has a TATA box (TATAAAAG) from -30 to -23 nts from the TSS.[7]
  10. Gene ID: 1277. It has a TATA box (TATAAAAG) from -29 to -22 nts from the TSS.[7]
  11. Gene ID: 1382. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  12. Gene ID: 1393. It has a TATA box (TATAAAAG) from 36 to 43 nts from the TSS.[7]
  13. Gene ID: 1393. It has a TATA box (TATAAAAG) from -31 to -24 nts from the TSS.[7]
  14. Gene ID: 1490. It has a TATA box (TATAAAAG) from -27 to -20 nts from the TSS.[7]
  15. Gene ID: 1655. It has a TATA box (TATAAAAG) from -28 to -21 nts from the TSS.[7]
  16. Gene ID: 1893. It has a TATA box (TATAAAAG) from 17 to 24 nts from the TSS.[7]
  17. Gene ID: 2353. It has a TATA box (TATAAAAG) from -22 to -15 nts from the TSS.[7]
  18. Gene ID: 2641. It has a TATA box (TATAAAAG) from -25 to -18 nts from the TSS.[7]
  19. Gene ID: 3304. It has a TATA box (TATAAAAG) from -29 to -22 nts from the TSS.[7]
  20. Gene ID: 3397. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  21. Gene ID: 3491. It has a TATA box (TATAAAAG) from -30 to -23 nts from the TSS.[7]
  22. Gene ID: 3596. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  23. Gene ID: 3860. It has a TATA box (TATAAAAG) from -31 to -24 nts from the TSS.[7]
  24. Gene ID: 4025. It has a TATA box (TATAAAAG) from 11 to 18 nts from the TSS.[7]
  25. Gene ID: 4254. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  26. Gene ID: 4319. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  27. Gene ID: 4322. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  28. Gene ID: 4504. It has a TATA box (TATAAAAG) from -24 to -17 nts from the TSS.[7]
  29. Gene ID: 4609. It has a TATA box (TATAAAAG) from -30 to -23 nts from the TSS.[7]
  30. Gene ID: 4616. It has a TATA box (TATAAAAG) from -27 to -20 nts from the TSS.[7]
  31. Gene ID: 4744. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  32. Gene ID: 4842. It has a TATA box (TATAAAAG) from -27 to -20 nts from the TSS.[7]
  33. Gene ID: 5033. It has a TATA box (TATAAAAG) from -33 to -21 nts from the TSS.[7]
  34. Gene ID: 5054. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  35. Gene ID: 5079. It has a TATA box (TATAAAAG) from -28 to -25 nts from the TSS.[7]
  36. Gene ID: 5225. It has a TATA box (TATAAAAG) from -28 to -21 nts from the TSS.[7]
  37. Gene ID: 5406. It has a TATA box (TATAAAAG) from -24 to -21 nts from the TSS.[7]
  38. Gene ID: 5408. It has a TATA box (TATAAAAG) from -31 to -24 nts from the TSS.[7]
  39. Gene ID: 5478. It has a TATA box (TATAAAAG) from 8 to 15 nts from the TSS.[7]
  40. Gene ID: 5514. It has a TATA box (TATAAAAG) from -28 to -22 nts from the TSS.[7]
  41. Gene ID: 5741. It has a TATA box (TATAAAAG) from -29 to -21 nts from the TSS.[7]
  42. Gene ID: 6175. It has a TATA box (TATAAAAG) from -29 to -22 nts from the TSS.[7]
  43. Gene ID: 6187. It has a TATA box (TATAAAAG) from -25 to -18 nts from the TSS.[7]
  44. Gene ID: 6354. It has a TATA box (TATAAAAG) from -17 to -10 nts from the TSS.[7]
  45. Gene ID: 6356. It has a TATA box (TATAAAAG) from -44 to -37 nts from the TSS.[7]
  46. Gene ID: 6357. It has a TATA box (TATAAAAG) from -36 to -29 nts from the TSS.[7]
  47. Gene ID: 6698. It has a TATA box (TATAAAAG) from -30 to -23 nts from the TSS.[7]
  48. Gene ID: 6988. It has a TATA box (TATAAAAG) from -26 to -19 nts from the TSS.[7]
  49. Gene ID: 7038. It has a TATA box (TATAAAAG) from -35 to -28 nts from the TSS.[7]
  50. Gene ID: 7276. It has a TATA box (TATAAAAG) from -30 to -21 nts from the TSS.[7]
  51. Gene ID: 7280. It has a TATA box (TATAAAAG) from -31 to -24 nts from the TSS.[7]
  52. Gene ID: 7803. It has a TATA box (TATAAAAG) from -30 to -23 nts from the TSS.[7]
  53. Gene ID: 7852. It has a TATA box (TATAAAAG) from -26 to -19 nts from the TSS.[7]
  54. Gene ID: 8288. It has a TATA box (TATAAAAG) from -33 to -26 nts from the TSS.[7]
  55. Gene ID: 8339. It has a TATA box (TATAAAAG) from -33 to -26 nts from the TSS.[7]
  56. Gene ID: 8970. It has a TATA box (TATAAAAG) from -28 to -21 nts from the TSS.[7]
  57. Gene ID: 9510. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  58. Gene ID: 9607. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  59. Gene ID: 9709. It has a TATA box (TATAAAAG) from 1 to 8 nts from the TSS.[7]
  60. Gene ID: 11067. It has a TATA box (TATAAAAG) from 14 to 21 nts from the TSS.[7]
  61. Gene ID: 23645. It has a TATA box (TATAAAAG) from -31 to -24 nts from the TSS.[7]
  62. Gene ID: 27106. It has a TATA box (TATAAAAG) from -31 to -24 nts from the TSS.[7]
  63. Gene ID: 27106. It has a TATA box (TATAAAAG) from -31 to -24 nts from the TSS.[7]
  64. Gene ID: 50943. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  65. Gene ID: 51297. It has a TATA box (TATAAAAG) from -45 to -38 nts from the TSS.[7]
  66. Gene ID: 56675. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
  67. Gene ID: 80177. It has a TATA box (TATAAAAG) from -34 to -27 nts from the TSS.[7]
  68. Gene ID: 81285. It has a TATA box (TATAAAAG) from -29 to -22 nts from the TSS.[7]
  69. Gene ID: 83998. It has a TATA box (TATAAAAG) from -29 to -22 nts from the TSS.[7]
  70. Gene ID: 85235. It has a TATA box (TATAAAAG) from -31 to -24 nts from the TSS.[7]
  71. Gene ID: 116842. It has a TATA box (TATAAAAG) from -32 to -25 nts from the TSS.[7]
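Each entry in a list like this pairs an 8-nucleotide motif with a start and an end offset from the TSS, so the span can be checked against the motif length. The Python sketch below is one hedged way to do that. It assumes both offsets are inclusive and that there is no position 0 (the TSS is +1 and the nucleotide before it is -1), which is a common numbering convention but is not stated on this page. The names `span_length` and `flag_mismatches` are hypothetical.

```python
def span_length(start, end):
    """Count nucleotides from start to end inclusive, with no position 0."""
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the range crosses the TSS, and position 0 does not exist
    return length

def flag_mismatches(entries):
    """Return (gene_id, span) for entries whose span differs from the motif length."""
    return [
        (gene_id, span_length(start, end))
        for gene_id, motif, start, end in entries
        if span_length(start, end) != len(motif)
    ]

# A few entries copied from the list above: (Gene ID, motif, start, end).
entries = [
    (19, "TATAAAAG", -34, -27),
    (1393, "TATAAAAG", 36, 43),
    (9709, "TATAAAAG", 1, 8),
    (5033, "TATAAAAG", -33, -21),
]
print(flag_mismatches(entries))  # [(5033, 13)]
```

A flagged entry is only a prompt to recheck the cited source; the sketch cannot say which offset, if either, is wrong.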

TATA box (Basehoar 2004) samplings

For the Basic programs (starting with SuccessablesTATA.bas) written to compare nucleotide sequences with the sequences on either the template strand (-) or coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the programs, what they look for, and what they found are:

  1. Negative strand, negative direction, looking for TATA(A/T)A(A/T)(A/G): 2, TATATAAA at 1602, TATATATA at 1600.
  2. Positive strand, negative direction: 3, TATATAAA at 2874, TATAAAAG at 223, TATAAAAG at 184.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 2, TTTATATA at 2871, TATATATA at 1600.
  6. inverse complement, positive strand, negative direction: 1, TTTTTATA at 219.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.

TATA (4560-2846) UTRs

  1. inverse complement, negative strand, negative direction: TTTATATA at 2871.
  2. Positive strand, negative direction: TATATAAA at 2874.

TATA negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TATATAAA at 1602, TATATATA at 1600.
  2. Positive strand, negative direction: TATAAAAG at 223, TATAAAAG at 184.
  3. inverse complement, positive strand, negative direction: TTTTTATA at 219.

TATA box (Basehoar 2004) random dataset samplings

  1. TATAr0: 0.
  2. TATAr1: 0.
  3. TATAr2: 0.
  4. TATAr3: 0.
  5. TATAr4: 0.
  6. TATAr5: 1, TATAAATG at 1565.
  7. TATAr6: 0.
  8. TATAr7: 1, TATAAATA at 3631.
  9. TATAr8: 1, TATAAAAA at 708.
  10. TATAr9: 1, TATATAAA at 3150.
  11. TATAr0ci: 0.
  12. TATAr1ci: 0.
  13. TATAr2ci: 0.
  14. TATAr3ci: 0.
  15. TATAr4ci: 2, CTTTTATA at 4527, CTTATATA at 4254.
  16. TATAr5ci: 0.
  17. TATAr6ci: 1, CTTATATA at 833.
  18. TATAr7ci: 1, TTTATATA at 4254.
  19. TATAr8ci: 0.
  20. TATAr9ci: 0.

TATAr arbitrary (evens) (4560-2846) UTRs

  1. TATAr4ci: CTTTTATA at 4527, CTTATATA at 4254.

TATAr alternate (odds) (4560-2846) UTRs

  1. TATAr7: TATAAATA at 3631.
  2. TATAr9: TATATAAA at 3150.
  3. TATAr7ci: TTTATATA at 4254.

TATAr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. TATAr7ci: TTTATATA at 4254.

TATAr alternate positive direction (evens) (4265-4050) proximal promoters

  1. TATAr4ci: CTTATATA at 4254.

TATAr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATAr8: TATAAAAA at 708.
  2. TATAr6ci: CTTATATA at 833.

TATAr alternate negative direction (odds) (2596-1) distal promoters

  1. TATAr5: TATAAATG at 1565.

TATAr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATAr5: TATAAATG at 1565.
  2. TATAr7: TATAAATA at 3631.
  3. TATAr9: TATATAAA at 3150.

TATAr alternate positive direction (evens) (4050-1) distal promoters

  1. TATAr8: TATAAAAA at 708.
  2. TATAr6ci: CTTATATA at 833.

TATA box (Basehoar 2004) analysis and results

"About 24% of human genes have a TATA-like element and their promoters are generally AT-rich; however, only ~10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR)."[3] Several Saccharomyces genomes had the consensus sequence TATA(A/T)A(A/T)(A/G), yet only about 20% of yeast genes even contained the TATA sequence.[70]
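The relationship between the two consensus sequences used on this page, TATAWAWR (Basehoar) and TATAWAAR (Juven-Gershon), can be made concrete by expanding each into the plain 8-mers it covers. The Python sketch below does this with standard IUPAC codes; the function name `expand` is hypothetical.

```python
from itertools import product

# IUPAC codes mapped to the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT",
    "K": "GT", "M": "AC", "N": "ACGT",
}

def expand(consensus):
    """List every concrete sequence matched by a degenerate consensus."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in consensus))]

basehoar = set(expand("TATAWAWR"))
juven_gershon = set(expand("TATAWAAR"))

print(len(basehoar), len(juven_gershon))   # 8 4
print(juven_gershon <= basehoar)           # True
print(sorted(basehoar - juven_gershon))
# ['TATAAATA', 'TATAAATG', 'TATATATA', 'TATATATG']
```

Every TATAWAAR match is therefore also a TATAWAWR match, and the four extra 8-mers are those with T at the seventh position.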

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 2 | 2 | 1 | 1 ± 0 (--1,+-1) |
| Randoms | UTR | arbitrary negative | 2 | 10 | 0.2 | 0.25 |
| Randoms | UTR | alternate negative | 3 | 10 | 0.3 | 0.25 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0 |
| Reals | Proximal | negative | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.1 |
| Reals | Distal | negative | 5 | 2 | 2.5 | 2.5 ± 0.5 (--2,+-3) |
| Randoms | Distal | arbitrary negative | 2 | 10 | 0.2 | 0.15 |
| Randoms | Distal | alternate negative | 1 | 10 | 0.1 | 0.15 |
| Reals | Distal | positive | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary positive | 3 | 10 | 0.3 | 0.25 |
| Randoms | Distal | alternate positive | 2 | 10 | 0.2 | 0.25 |

Comparison:

The occurrences of real TATA boxes in the UTRs and distal promoters are greater than those in the random datasets. This suggests that the real TATAs are likely active or activable.

Basehoar-1 TATA boxes

  1. Gene ID: 173. It has a TATA box (TATAAAAA) from -30 to -23 nts from the TSS.[7]
  2. Gene ID: 229. It has a TATA box (TATAAAAA) from -32 to -24 nts from the TSS.[7]
  3. Gene ID: 229. It has a TATA box (TATAAAAA) from 23 to 30 nts from the TSS.[7]
  4. Gene ID: 759. It has a TATA box (TATAAAAA) from -31 to -24 nts from the TSS.[7]
  5. Gene ID: 1544. It has a TATA box (TATAAAAA) from -28 to -21 nts from the TSS.[7]
  6. Gene ID: 1571. It has a TATA box (TATAAAAA) from -30 to -23 nts from the TSS.[7]
  7. Gene ID: 1581. It has a TATA box (TATAAAAA) from -31 to -24 nts from the TSS.[7]
  8. Gene ID: 1906. It has a TATA box (TATAAAAA) from -32 to -25 nts from the TSS.[7]
  9. Gene ID: 1938. It has a TATA box (TATAAAAA) from -31 to -24 nts from the TSS.[7]
  10. Gene ID: 1974. It has a TATA box (TATAAAAA) from -30 to -23 nts from the TSS.[7]
  11. Gene ID: 2688. It has a TATA box (TATAAAAA) from -31 to -24 nts from the TSS.[7]
  12. Gene ID: 3605. It has a TATA box (TATAAAAA) from -44 to -37 nts from the TSS.[7]
  13. Gene ID: 4256. It has a TATA box (TATAAAAA) from 2 to 9 nts from the TSS.[7]
  14. Gene ID: 4435. It has a TATA box (TATAAAAA) from -2 to 6 nts from the TSS.[7]
  15. Gene ID: 4878. It has a TATA box (TATAAAAA) from -30 to -22 nts from the TSS.[7]
  16. Gene ID: 5743. It has a TATA box (TATAAAAA) from -31 to -25 nts from the TSS.[7]
  17. Gene ID: 6279. It has a TATA box (TATAAAAA) from -30 to -24 nts from the TSS.[7]
  18. Gene ID: 6376. It has a TATA box (TATAAAAA) from -12 to -5 nts from the TSS.[7]
  19. Gene ID: 6513. It has a TATA box (TATAAAAA) from 18 to 24 nts from the TSS.[7]
  20. Gene ID: 6518. It has a TATA box (TATAAAAA) from -31 to -24 nts from the TSS.[7]
  21. Gene ID: 6906. It has a TATA box (TATAAAAA) from -32 to -25 nts from the TSS.[7]
  22. Gene ID: 8564. It has a TATA box (TATAAAAA) from -14 to -7 nts from the TSS.[7]
  23. Gene ID: 8832. It has a TATA box (TATAAAAA) from -12 to -5 nts from the TSS.[7]
  24. Gene ID: 8969. It has a TATA box (TATAAAAA) from -27 to -20 nts from the TSS.[7]
  25. Gene ID: 10215. It has a TATA box (TATAAAAA) from -31 to -24 nts from the TSS.[7]
  26. Gene ID: 27159. It has a TATA box (TATAAAAA) from -32 to -25 nts from the TSS.[7]
  27. Gene ID: 51129. It has a TATA box (TATAAAAA) from -29 to -22 nts from the TSS.[7]
  28. Gene ID: 54106. It has a TATA box (TATAAAAA) from -30 to -23 nts from the TSS.[7]
  29. Gene ID: 55603. It has a TATA box (TATAAAAA) from -2 to 6 nts from the TSS.[7]
  30. Gene ID: 64111. It has a TATA box (TATAAAAA) from -32 to -25 nts from the TSS.[7]
  31. Gene ID: 130120. It has a TATA box (TATAAAAA) from 17 to 24 nts from the TSS.[7]
  32. Gene ID: 147183. It has a TATA box (TATAAAAA) from -30 to -23 nts from the TSS.[7]

Basehoar-2 TATA boxes

  1. Gene ID: 2. It has a TATA box (TATAAATA) from -28 to -21 nts from the TSS.[7]
  2. Gene ID: 183. It has a TATA box (TATAAATA) from -32 to -25 nts from the TSS.[7]
  3. Gene ID: 203. It has a TATA box (TATAAATA) from -4 to 4 nts from the TSS.[7]
  4. Gene ID: 279. It has a TATA box (TATAAATA) from -27 to -20 nts from the TSS.[7]
  5. Gene ID: 280. It has a TATA box (TATAAATA) from -30 to -23 nts from the TSS.[7]
  6. Gene ID: 358. It has a TATA box (TATAAATA) from 23 to 30 nts from the TSS.[7]
  7. Gene ID: 358. It has a TATA box (TATAAATA) from -31 to -24 nts from the TSS.[7]
  8. Gene ID: 846. It has a TATA box (TATAAATA) from -28 to -25 nts from the TSS.[7]
  9. Gene ID: 1051. It has a TATA box (TATAAATA) from -29 to -22 nts from the TSS.[7]
  10. Gene ID: 1152. It has a TATA box (TATAAATA) from -41 to -34 nts from the TSS.[7]
  11. Gene ID: 1191. It has a TATA box (TATAAATA) from -23 to -16 nts from the TSS.[7]
  12. Gene ID: 1278. It has a TATA box (TATAAATA) from -34 to -27 nts from the TSS.[7]
  13. Gene ID: 2250. It has a TATA box (TATAAATA) from -32 to -21 nts from the TSS.[7]
  14. Gene ID: 3458. It has a TATA box (TATAAATA) from -29 to -22 nts from the TSS.[7]
  15. Gene ID: 3906. It has a TATA box (TATAAATA) from -31 to -24 nts from the TSS.[7]
  16. Gene ID: 4144. It has a TATA box (TATAAATA) from -30 to -23 nts from the TSS.[7]
  17. Gene ID: 4747. It has a TATA box (TATAAATA) from -28 to -21 nts from the TSS.[7]
  18. Gene ID: 4843. It has a TATA box (TATAAATA) from -31 to -24 nts from the TSS.[7]
  19. Gene ID: 5449. It has a TATA box (TATAAATA) from -31 to -24 nts from the TSS.[7]
  20. Gene ID: 6288. It has a TATA box (TATAAATA) from -30 to -23 nts from the TSS.[7]
  21. Gene ID: 6289. It has a TATA box (TATAAATA) from -17 to -10 nts from the TSS.[7]
  22. Gene ID: 6364. It has a TATA box (TATAAATA) from -30 to -23 nts from the TSS.[7]
  23. Gene ID: 6414. It has a TATA box (TATAAATA) from -31 to -24 nts from the TSS.[7]
  24. Gene ID: 6428. It has a TATA box (TATAAATA) from 19 to 26 nts from the TSS.[7]
  25. Gene ID: 6707. It has a TATA box (TATAAATA) from -30 to -23 nts from the TSS.[7]
  26. Gene ID: 8431. It has a TATA box (TATAAATA) from -35 to -28 nts from the TSS.[7]
  27. Gene ID: 10563. It has a TATA box (TATAAATA) from 32 to 40 nts from the TSS.[7]
  28. Gene ID: 12723. It has a TATA box (TATAAATA) from -32 to -25 nts from the TSS.[7]
  29. Gene ID: 22928. It has a TATA box (TATAAATA) from -33 to -26 nts from the TSS.[7]
  30. Gene ID: 51582. It has a TATA box (TATAAATA) from -26 to -19 nts from the TSS.[7]
  31. Gene ID: 81606. It has a TATA box (TATAAATA) from -29 to -22 nts from the TSS.[7]
  32. Gene ID: 84223. It has a TATA box (TATAAATA) from 26 to 33 nts from the TSS.[7]
  33. Gene ID: 117158. It has a TATA box (TATAAATA) from -31 to -24 nts from the TSS.[7]

Basehoar-3 TATA boxes

  1. Gene ID: 338. It has a TATA box (TATAAATG) from -29 to -22 nts from the TSS.[7]
  2. Gene ID: 359. It has a TATA box (TATAAATG) from -30 to -23 nts from the TSS.[7]
  3. Gene ID: 383. It has a TATA box (TATAAATG) from -29 to -22 nts from the TSS.[7]
  4. Gene ID: 1427. It has a TATA box (TATAAATG) from -32 to -25 nts from the TSS.[7]
  5. Gene ID: 4357. It has a TATA box (TATAAATG) from -17 to -10 nts from the TSS.[7]
  6. Gene ID: 4741. It has a TATA box (TATAAATG) from -32 to -25 nts from the TSS.[7]
  7. Gene ID: 6280. It has a TATA box (TATAAATG) from -23 to -16 nts from the TSS.[7]
  8. Gene ID: 6435. It has a TATA box (TATAAATG) from -42 to -35 nts from the TSS.[7]
  9. Gene ID: 6916. It has a TATA box (TATAAATG) from -30 to -23 nts from the TSS.[7]
  10. Gene ID: 10482. It has a TATA box (TATAAATG) from -31 to -24 nts from the TSS.[7]
  11. Gene ID: 51313. It has a TATA box (TATAAATG) from 2 to 9 nts from the TSS.[7]
  12. Gene ID: 126393. It has a TATA box (TATAAATG) from -30 to -23 nts from the TSS.[7]

M3 motif samplings

For the Basic programs testing consensus sequence (C/G)CGGAAG(C/T) (starting with SuccessablesM3.bas) written to compare nucleotide sequences with the sequences on either the template strand (-) or coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the programs, what they look for, and what they found are:

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 1, GCGGAAGT at 2731.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 7, CCGGAAGC at 1517, GCGGAAGC at 1306, CCGGAAGT at 1265, CCGGAAGC at 1013, CCGGAAGT at 929, CCGGAAGT at 829, CCGGAAGC at 593.
  5. inverse complement, negative strand, negative direction: 1, GCTTCCGT at 1558.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.

M3 negative direction (2811-2596) proximal promoters

  1. Positive strand, negative direction: GCGGAAGT at 2731.

M3 negative direction (2596-1) distal promoters

  1. inverse complement, negative strand, negative direction: GCTTCCGT at 1558.

M3 positive direction (4050-1) distal promoters

  1. Positive strand, positive direction: CCGGAAGC at 1517, GCGGAAGC at 1306, CCGGAAGT at 1265, CCGGAAGC at 1013, CCGGAAGT at 929, CCGGAAGT at 829, CCGGAAGC at 593.

M3 random dataset samplings

  1. M3r0: 0.
  2. M3r1: 0.
  3. M3r2: 1, GCGGAAGT at 3374.
  4. M3r3: 0.
  5. M3r4: 1, GCGGAAGT at 2555.
  6. M3r5: 0.
  7. M3r6: 2, GCGGAAGC at 2580, CCGGAAGT at 1193.
  8. M3r7: 0.
  9. M3r8: 1, GCGGAAGC at 2757.
  10. M3r9: 0.
  11. M3r0ci: 0.
  12. M3r1ci: 0.
  13. M3r2ci: 1, ACTTCCGG at 3397.
  14. M3r3ci: 1, GCTTCCGC at 1824.
  15. M3r4ci: 0.
  16. M3r5ci: 0.
  17. M3r6ci: 0.
  18. M3r7ci: 0.
  19. M3r8ci: 1, ACTTCCGC at 4254.
  20. M3r9ci: 1, GCTTCCGG at 3340.

M3r arbitrary (evens) (4560-2846) UTRs

  1. M3r2: GCGGAAGT at 3374.
  2. M3r2ci: ACTTCCGG at 3397.
  3. M3r8ci: ACTTCCGC at 4254.

M3r alternate (odds) (4560-2846) UTRs

  1. M3r9ci: GCTTCCGG at 3340.

M3r arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. M3r8: GCGGAAGC at 2757.

M3r alternate positive direction (evens) (4265-4050) proximal promoters

  1. M3r8ci: ACTTCCGC at 4254.

M3r arbitrary negative direction (evens) (2596-1) distal promoters

  1. M3r4: GCGGAAGT at 2555.
  2. M3r6: GCGGAAGC at 2580, CCGGAAGT at 1193.

M3r alternate negative direction (odds) (2596-1) distal promoters

  1. M3r3ci: GCTTCCGC at 1824.

M3r arbitrary positive direction (odds) (4050-1) distal promoters

  1. M3r3ci: GCTTCCGC at 1824.
  2. M3r9ci: GCTTCCGG at 3340.

M3r alternate positive direction (evens) (4050-1) distal promoters

  1. M3r2: GCGGAAGT at 3374.
  2. M3r4: GCGGAAGT at 2555.
  3. M3r6: GCGGAAGC at 2580, CCGGAAGT at 1193.
  4. M3r8: GCGGAAGC at 2757.
  5. M3r2ci: ACTTCCGG at 3397.

M3 analysis and results

M3 (SCGGAAGY) occurs preferentially in human TATA-less core promoters.[3]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 0 | 2 | 0 | 0 |
| Randoms | UTR | arbitrary negative | 3 | 10 | 0.3 | 0.2 ± 0.1 |
| Randoms | UTR | alternate negative | 1 | 10 | 0.1 | 0.2 ± 0.1 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0 |
| Reals | Proximal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--0,+-1) |
| Randoms | Proximal | arbitrary negative | 1 | 10 | 0.1 | 0.05 |
| Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0.05 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0.05 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.05 |
| Reals | Distal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--1,+-0) |
| Randoms | Distal | arbitrary negative | 3 | 10 | 0.3 | 0.2 ± 0.1 |
| Randoms | Distal | alternate negative | 1 | 10 | 0.1 | 0.2 ± 0.1 |
| Reals | Distal | positive | 7 | 2 | 3.5 | 3.5 ± 3.5 (-+0,++7) |
| Randoms | Distal | arbitrary positive | 2 | 10 | 0.2 | 0.4 ± 0.2 |
| Randoms | Distal | alternate positive | 6 | 10 | 0.6 | 0.4 ± 0.2 |

Comparison:

The occurrences of real M3 motifs in the proximal and distal promoters are greater than those in the random datasets. This suggests that the real M3s are likely active or activable.

M22 samplings

For the Basic programs testing consensus sequence TGCGCAN(G/T) (starting with SuccessablesM22.bas) written to compare nucleotide sequences with the sequences on either the template strand (-) or coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), the programs, what they look for, and what they found are:

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 0.
  3. Negative strand, positive direction: 1, TGCGCAAG at 1525.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 0.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 3, CGTGCGCA at 1523, CCTGCGCA at 1414, CCTGCGCA at 1314.
  8. inverse complement, positive strand, positive direction: 0.

M22 positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: TGCGCAAG at 1525.
  2. inverse complement, negative strand, positive direction: CGTGCGCA at 1523, CCTGCGCA at 1414, CCTGCGCA at 1314.

M22 random dataset samplings

  1. M22r0: 0.
  2. M22r1: 1, TGCGCAAT at 1589.
  3. M22r2: 0.
  4. M22r3: 0.
  5. M22r4: 0.
  6. M22r5: 0.
  7. M22r6: 0.
  8. M22r7: 0.
  9. M22r8: 0.
  10. M22r9: 0.
  11. M22r0ci: 0.
  12. M22r1ci: 0.
  13. M22r2ci: 0.
  14. M22r3ci: 0.
  15. M22r4ci: 0.
  16. M22r5ci: 0.
  17. M22r6ci: 0.
  18. M22r7ci: 0.
  19. M22r8ci: 0.
  20. M22r9ci: 0.

M22r alternate negative direction (odds) (2596-1) distal promoters

  1. M22r1: TGCGCAAT at 1589.

M22r arbitrary positive direction (odds) (4050-1) distal promoters

  1. M22r1: TGCGCAAT at 1589.

M22 analysis and results

M22 (TGCGCANK), where K = (G/T), occurs preferentially in human TATA-less core promoters.[3]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 0 | 2 | 0 | 0 |
| Randoms | UTR | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | UTR | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0 |
| Reals | Proximal | negative | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0 |
| Reals | Distal | negative | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary negative | 0 | 10 | 0 | 0.05 |
| Randoms | Distal | alternate negative | 1 | 10 | 0.1 | 0.05 |
| Reals | Distal | positive | 4 | 2 | 2 | 2 ± 2 (-+4,++0) |
| Randoms | Distal | arbitrary positive | 1 | 10 | 0.1 | 0.05 |
| Randoms | Distal | alternate positive | 0 | 10 | 0 | 0.05 |

Comparison:

The occurrences of real M22 motifs in the distal promoters are greater than those in the random datasets. This suggests that the real M22s are likely active or activable.

Comparisons of TATA boxes for UTRs nn

| Butler (2002) | Carninci (2006) | Watson (2014) | Juven-Gershon (2008) | Basehoar (2004) |
|---|---|---|---|---|
| (4560-2846) | (4560-2846) | (4560-2846) | (4560-2846) | (4560-2846) |
| TATAAA | TATAAAA | TATA(A/T)A(A/T) | TATA(A/T)AA(A/G) | TATA(A/T)A(A/T)(A/G) |
| - | - | TATATAT at 2872 | - | - |
| - | - | ciTTATATA at 2871 | ciTTTATATA at 2871 | ciTTTATATA at 2871 |
| ciTTTATA at 2869 | ciTTTTATA at 2869 | ciTTTTATA at 2869 | - | - |
| TATAAA at 2852 | TATAAAA at 2853 | TATAAAA at 2853 | - | - |

Comparisons of TATA boxes for UTRs pn

| Butler (2002) | Carninci (2006) | Watson (2014) | Juven-Gershon (2008) | Basehoar (2004) |
|---|---|---|---|---|
| (4560-2846) | (4560-2846) | (4560-2846) | (4560-2846) | (4560-2846) |
| TATAAA | TATAAAA | TATA(A/T)A(A/T) | TATA(A/T)AA(A/G) | TATA(A/T)A(A/T)(A/G) |
| TATAAA at 2874 | - | TATATAA at 2873 | TATATAAA at 2874 | TATATAAA at 2874 |

Comparisons of TATA boxes for proximals nn

| Butler (2002) | Carninci (2006) | Watson (2014) | Juven-Gershon (2008) | Basehoar (2004) |
|---|---|---|---|---|
| (2811-2596) | (2811-2596) | (2811-2596) | (2811-2596) | (2811-2596) |
| TATAAA | TATAAAA | TATA(A/T)A(A/T) | TATA(A/T)AA(A/G) | TATA(A/T)A(A/T)(A/G) |
| ciTTTATA at 2638 | - | ciATTTATA at 2638 | - | - |


Comparisons of TATA boxes for distals nn

Butler (2002) | Carninci (2006) | Watson (2014) | Juven-Gershon (2008) | Basehoar (2004)
(2596-1) | (2596-1) | (2596-1) | (2596-1) | (2596-1)
TATAAA | TATAAAA | TATA(A/T)A(A/T) | TATA(A/T)AA(A/G) | TATA(A/T)A(A/T)(A/G)
ciTTTATA at 1740 | ciTTTTATA at 1740 | ciTTTTATA at 1740 | - | -
TATAAA at 1602 | - | TATATAA at 1601 | TATATAAA at 1602 | TATATAAA at 1602
- | - | TATATAT at 1599 | - | TATATATA at 1600

Comparisons of TATA boxes for distals pn

Butler (2002) | Carninci (2006) | Watson (2014) | Juven-Gershon (2008) | Basehoar (2004)
(2596-1) | (2596-1) | (2596-1) | (2596-1) | (2596-1)
TATAAA | TATAAAA | TATA(A/T)A(A/T) | TATA(A/T)AA(A/G) | TATA(A/T)A(A/T)(A/G)
- | - | TATATAT at 1600 | - | -
TATAAA at 221 | TATAAAA at 222 | TATAAAA at 222 | TATAAAAG at 223 | TATAAAAG at 223
ciTTTATA at 219 | ciTTTTATA at 219 | ciTTTTATA at 219 | ciTTTTTATA at 219 | ciTTTTTATA at 219
TATAAA at 182 | TATAAAA at 183 | TATAAAA at 183 | TATAAAAG at 184 | TATAAAAG at 184

Comparisons of TATA boxes for distals np

Butler (2002) | Carninci (2006) | Watson (2014) | Juven-Gershon (2008) | Basehoar (2004)
(4050-1) | (4050-1) | (4050-1) | (4050-1) | (4050-1)
TATAAA | TATAAAA | TATA(A/T)A(A/T) | TATA(A/T)AA(A/G) | TATA(A/T)A(A/T)(A/G)
ciTTTATA at 2588 | - | - | - | -

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. R. P. Lifton, M. L. Goldberg, R. W. Karp, and D. S. Hogness (1978). "The organization of the histone genes in Drosophila melanogaster: functional and evolutionary implications". Cold Spring Harbor Symposia on Quantitative Biology. 42: 1047–51. doi:10.1101/SQB.1978.042.01.105. PMID 98262.
  2. Stephen T. Smale and James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter" (PDF). Annual Review of Biochemistry. 72 (1): 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. Retrieved 2012-05-07.
  3. C Yang, E Bolotin, T Jiang, FM Sladek, E Martinez (March 2007). "Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMID 17123746.
  4. Stephen T. Smale (October 1, 2001). "Core promoters: active contributors to combinatorial gene regulation". Genes & Development. 15 (19): 2503–8. doi:10.1101/gad.937701. Retrieved 2012-04-28.
  5. Tamar Juven-Gershon and James T. Kadonaga (15 March 2010). "Regulation of gene expression via the core promoter and the basal transcriptional machinery". Developmental Biology. 339 (2): 225–9. doi:10.1016/j.ydbio.2009.08.009. Retrieved 2016-01-16.
  6. Muyu Xu, Elsie Gonzalez-Hurtado, and Ernest Martinez (April 2016). "Core promoter-specific gene regulation: TATA box selectivity and Initiator-dependent bi-directionality of serum response factor-activated transcription". Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms. 1859 (4): 553–563. doi:10.1016/j.bbagrm.2016.01.005. Retrieved 2024-06-08.
  7. Victor X Jin, Gregory AC Singer, Francisco J Agosto-Pérez, Sandya Liyanarachchi, and Ramana V Davuluri (2006). "Genome-wide analysis of core promoter elements from conserved human and mouse orthologous pairs". BMC Bioinformatics. 7: 114. doi:10.1186/1471-2105-7-114. Retrieved 2024-06-09.
  8. RefSeq (November 2016). "A2M alpha-2-macroglobulin [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-14.
  9. RefSeq (September 2019). "ACTA1 actin alpha 1, skeletal muscle [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  10. RefSeq (July 2008). "ACTC1 actin alpha cardiac muscle 1 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-14.
  11. RefSeq (August 2014). "ADM adrenomedullin [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-18.
  12. RefSeq (November 2019). "AGT angiotensinogen [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  13. RefSeq (August 2020). "AGTR1 angiotensin II receptor type 1 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-18.
  14. RefSeq (January 2022). "AK1 adenylate kinase 1 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-18.
  15. RefSeq (October 2015). "ALPL alkaline phosphatase, biomineralization associated [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  16. RefSeq (July 2008). "AMELX amelogenin X-linked [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-18.
  17. RefSeq (January 2015). "AMY2A amylase alpha 2A [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  18. RefSeq (June 2013). "AMY2B amylase alpha 2B [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  19. RefSeq (June 2013). "SLC25A5 solute carrier family 25 member 5 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-18.
  20. RefSeq (July 2008). "AOC2 amine oxidase copper containing 2 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-18.
  21. RefSeq (July 2008). "APOA2 apolipoprotein A2 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  22. RefSeq (December 2019). "APOB apolipoprotein B [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-18.
  23. RefSeq (August 2016). "AQP1 aquaporin 1 (Colton blood group) [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-18.
  24. RefSeq (October 2008). "AQP2 aquaporin 2 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-18.
  25. RefSeq (April 2011). "ATF3 activating transcription factor 3 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  26. RefSeq (March 2010). "ATP1B1 ATPase Na+/K+ transporting subunit beta 1 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  27. RefSeq (July 2008). "ATP5PB ATP synthase peripheral stalk-membrane subunit b [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  28. RefSeq (June 2011). "BRDT bromodomain testis associated [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  29. RefSeq (July 2008). "CD247 CD247 molecule [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  30. RefSeq (September 2009). "CHI3L1 chitinase 3 like 1 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-09.
  31. RefSeq (September 2009). "CLCNKB chloride voltage-gated channel Kb [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  32. RefSeq (December 2010). "CRABP2 cellular retinoic acid binding protein 2 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  33. Marcelo A. Nobrega, Ivan Ovcharenko, Veena Afzal, and Edward M. Rubin (October 2003). "Scanning human gene deserts for long-range enhancers". Science. 302 (5644): 413. doi:10.1126/science.1088328. PMID 14563999. Retrieved 2012-12-26.
  34. HGNC (December 20, 2012). "DACH1 dachshund homolog 1 (Drosophila) [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2012-12-26.
  35. RefSeq (July 2008). "DPT dermatopontin [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  36. Yutaka Suzuki, Tatsuhiko Tsunoda, Jun Sese, Hirotoshi Taira, Junko Mizushima-Sugano, Hiroko Hata, Toshio Ota, Takao Isogai, Toshihiro Tanaka, Yusuke Nakamura, Akira Suyama, Yoshiyuki Sakaki, Shinichi Morishita, Kousaku Okubo, and Sumio Sugano (11 April 2001). "Identification and Characterization of the Potential Promoter Regions of 1031 Kinds of Human Genes". Genome Research. 11 (5): 677–684. doi:10.1101/gr.164001.
  37. RefSeq (June 2016). "NR5A2 nuclear receptor subfamily 5 group A member 2 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  38. RefSeq (December 2014). "GLUL glutamate-ammonia ligase [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  39. RefSeq (July 2008). "GNAT2 G protein subunit alpha transducin 2 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  40. Alliance of Genome Resources (April 2022). "GUCA2A guanylate cyclase activator 2A [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  41. RefSeq (November 2015). "GUCA2B guanylate cyclase activator 2B [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  42. RefSeq (October 2009). "HMGCS2 3-hydroxy-3-methylglutaryl-CoA synthase 2 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  43. RefSeq (June 2016). "HSD3B1 hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 1 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  44. RefSeq (October 2009). "HSD3B2 hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 2 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  45. Thomas W. Burke and James T. Kadonaga (November 15, 1997). "The downstream core promoter element, DPE, is conserved from Drosophila to humans and is recognized by TAFII60 of Drosophila". Genes & Development. 11 (22): 3020–31. doi:10.1101/gad.11.22.3020. PMC 316699. PMID 9367984.
  46. HGNC (February 3, 2013). "HSPA4 heat shock 70kDa protein 4 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2013-02-07.
  47. RefSeq (September 2011). "CCN1 cellular communication network factor 1 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  48. RefSeq (August 2011). "LAMC2 laminin subunit gamma 2 [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2024-06-10.
  49. HGNC:3576 (June 6, 2024). "FADS3 fatty acid desaturase 3 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-07.
  50. RefSeq (July 2008). "LORICRIN loricrin cornified envelope precursor protein [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-10.
  51. RefSeq (February 2011). "MUC1 mucin 1, cell surface associated [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-10.
  52. RefSeq (July 2008). "MYOC myocilin [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-10.
  53. RefSeq (October 2015). "NPPA natriuretic peptide A [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-10.
  54. RefSeq (July 2008). "OVGP1 oviductal glycoprotein 1 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-10.
  55. RefSeq (January 2011). "PRDX1 peroxiredoxin 1 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-10.
  56. RefSeq (July 2008). "PDC phosducin [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-10.
  57. RefSeq (February 2009). "PTGS2 prostaglandin-endoperoxide synthase 2 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-10.
  58. Tetsuya Kosaka, Atsuro Miyata, Hayato Ihara, Shuntaro Hara, Tamiko Sugimoto, Osamu Takeda, Ei-ichi Takahashi, Tadashi Tanabe (May 1994). "Characterization of the human gene (PTGS2) encoding prostaglandin‐endoperoxide synthase 2". European Journal of Biochemistry. 221 (3): 889–97. doi:10.1111/j.1432-1033.1994.tb18804.x. Retrieved 2012-12-26.
  59. RefSeq (July 2008). "RGS1 regulator of G protein signaling 1 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-10.
  60. RefSeq (August 2009). "RGS2 regulator of G protein signaling 2 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-11.
  61. RefSeq (October 2017). "RPE65 retinoid isomerohydrolase RPE65 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-11.
  62. RefSeq (July 2018). "RPS27 ribosomal protein S27 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-11.
  63. RefSeq (January 2016). "S100A8 S100 calcium binding protein A8 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-11.
  64. RefSeq (June 2016). "SLC2A5 solute carrier family 2 member 5 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-11.
  65. RefSeq (September 2011). "SLC9A1 solute carrier family 9 member A1 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-11.
  66. RefSeq (October 2009). "SLC16A1 solute carrier family 16 member 1 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-11.
  67. Alliance of Genome Resources (April 2022). "SPRR1A small proline rich protein 1A [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-11.
  68. HGNC:10227 (May 13, 2024). "RNU6-1 RNA, U6 small nuclear 1 [ Homo sapiens (human) ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2024-06-08.
  69. Y Ohshima, N Okada, T Tani, Y Itoh, and M Itoh (10 October 1981). "Nucleotide sequences of mouse genomic loci including a gene or pseudogene for U6 (4.8S) nuclear RNA". Nucleic Acids Research. 9 (19): 5145–5158. doi:10.1093/nar/9.19.5145. PMID 6171774. Retrieved 2024-06-08.
  70. Basehoar, Andrew D.; Zanton, Sara J.; Pugh, B. Franklin (2004-03-05). "Identification and distinct regulation of yeast TATA box-containing genes". Cell. 116 (5): 699–709. ISSN 0092-8674. PMID 15006352.
  71. Jennifer E.F. Butler, James T. Kadonaga (October 15, 2002). "The RNA polymerase II core promoter: a key component in the regulation of gene expression". Genes & Development. 16 (20): 2583–92. doi:10.1101/gad.1026202. PMID 12381658.
  72. Carninci P., Sandelin A., Lenhard B., Katayama S., Shimokawa K., Ponjavic J., Semple C.A., Taylor M.S., Engström P.G., Frith M.C., Forrest A.R., Alkema W.B., Tan S.L., Plessy C., Kodzius R., Ravasi T., Kasukawa T., Fukuda S., Kanamori-Katayama M., Kitazume Y., Kawaji H., Kai C., Nakamura M., Konno H., Nakano K., Mottagui-Tabar S., Arner P., Chesi A., Gustincich S., Persichetti F., Suzuki H., Grimmond S.M., Wells C.A., Orlando V., Wahlestedt C., Liu E.T., Harbers M., Kawai J., Bajic V.B., Hume D.A., Hayashizaki Y. (2006). "Genome-wide analysis of mammalian promoter architecture and evolution". Nat. Genet. 38 (6): 626–35. doi:10.1038/ng1789. PMID 16645617.
  73. Watson, James D. (2014). Molecular Biology of the Gene (Seventh ed.). Boston. ISBN 9780321762436. OCLC 824087979.
