Kozak sequence gene transcriptions

Marshallsumter (talk | contribs)

Revision as of 03:49, 26 September 2024

Associate Editor(s)-in-Chief: Henry A. Hoff

The Kozak sequence is a nucleic acid motif that functions as the protein translation initiation site in most eukaryotic mRNA transcripts.[1] Regarded as the optimum sequence for initiating translation in eukaryotes, the sequence is an integral aspect of protein regulation and overall cellular health as well as having implications in human disease.[1][2]

A wrong start site can result in non-functional proteins.[3]

As the sequence has been studied further, expansions of the nucleotide sequence, bases of particular importance, and notable exceptions have been identified.[1][4][5]

The sequence was discovered through a detailed analysis of DNA genomic sequences.[6]

The Kozak sequence was determined by sequencing 699 vertebrate mRNAs and verified by site-directed mutagenesis.[7] While initially limited to a subset of vertebrates (i.e. human, cow, cat, dog, chicken, guinea pig, hamster, mouse, pig, rabbit, sheep, and Xenopus), subsequent studies confirmed its conservation in higher eukaryotes generally.[1] The sequence was defined as 5'-(gcc)gccRccATGG-3' in IUPAC nucleobase notation, where R denotes a purine (A or G).[7]

Human genes

Consensus sequences

The Kozak consensus sequence given by Matsumoto et al. is GAAAATGG.[8]

The consensus sequence given by Kozak is 5'-(GCC)GCC(A/G)CCATGG-3'.[7]
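The consensus 5'-(GCC)GCC(A/G)CCATGG-3' can be read as a search pattern in which the leading GCC is optional and the (A/G) position accepts either purine. The following is a minimal sketch of that reading as a Python regular expression. The function name `find_kozak` and the example sequence are hypothetical and are not taken from the Basic programs described on this page.

```python
import re

# (GCC) is treated as optional; (A/G) becomes the character class [AG].
KOZAK_PATTERN = re.compile(r"(?:GCC)?GCC[AG]CCATGG")


def find_kozak(seq):
    """Return 0-based start positions of non-overlapping consensus matches."""
    return [m.start() for m in KOZAK_PATTERN.finditer(seq.upper())]


# Hypothetical sequence, used only to exercise the pattern.
example = "TTGCCGCCACCATGGCT"
print(find_kozak(example))
```

Treating (GCC) as optional is one interpretation of the parentheses; a stricter search would require all 13 nucleotides.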

GCC box

See GCC box samplings, which show that GCCGCC is present in the A1BG promoters but not within TSS ± 50.

CCA box samplings

Basic programs (starting with SuccessablesCCA.bas) were written to test the consensus sequence CCATGG by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the strand, direction, sequence searched for, and number of occurrences found are:

  1. negative strand, negative direction, looking for CCATGG, 0.
  2. positive strand, negative direction, looking for CCATGG, 0.
  3. positive strand, positive direction, looking for CCATGG, 0.
  4. negative strand, positive direction, looking for CCATGG, 2, CCATGG at 4222, CCATGG at 3581.
  5. complement, negative strand, negative direction, looking for GGTACC, 0.
  6. complement, positive strand, negative direction, looking for CGGCGG(C/T)GGTACC, 0.
  7. complement, positive strand, positive direction, looking for CGGCGG(C/T)GGTACC, 0.
  8. complement, negative strand, positive direction, looking for CGGCGG(C/T)GGTACC, 0.
  9. inverse complement, negative strand, negative direction, looking for CCATGG(C/T)GGCGGC, 0.
  10. inverse complement, positive strand, negative direction, looking for CCATGG(C/T)GGCGGC, 0.
  11. inverse complement, positive strand, positive direction, looking for CCATGG(C/T)GGCGGC, 0.
  12. inverse complement, negative strand, positive direction, looking for CCATGG(C/T)GGCGGC, 0.
  13. inverse positive strand, negative direction, looking for GGTACC(A/G)CCGCCG, 0.
  14. inverse negative strand, negative direction, looking for GGTACC(A/G)CCGCCG, 0.
  15. inverse positive strand, positive direction, looking for GGTACC(A/G)CCGCCG, 0.
  16. inverse negative strand, positive direction, looking for GGTACC(A/G)CCGCCG, 0.
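Results such as "CCATGG at 4222" amount to scanning a sequence for a literal motif and recording where each occurrence starts. The sketch below shows one way to do this in Python; it is not the SuccessablesCCA.bas program. The function name `find_positions`, the 1-based numbering, and the demonstration string are all assumptions made for illustration.

```python
def find_positions(seq, motif):
    """Return 1-based start positions of every occurrence of motif in seq.

    Overlapping occurrences are counted. 1-based numbering is an assumption.
    """
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)          # convert 0-based index to 1-based
        start = seq.find(motif, start + 1)   # step by one to allow overlaps
    return positions


# Hypothetical sequence containing two copies of CCATGG.
demo = "AACCATGGTTCCATGGA"
hits = find_positions(demo, "CCATGG")
print(len(hits), hits)
```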

(Kozak) samplings

Copying an apparent consensus sequence for the Kozak sequence, (GCC)GCC(A/G)CCATGG or GCCACCAT, and searching for it with "⌘F" finds no occurrences between ZSCAN22 and A1BG and none between ZNF497 and A1BG, in agreement with the computer programs.

Basic programs (starting with SuccessablesKoz.bas) were written to test the consensus sequence GCCGCC(A/G)CCATGG by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the strand, direction, sequence searched for, and number of occurrences found are:

  1. negative strand, negative direction, looking for GCCGCC(A/G)CCATGG, 0.
  2. positive strand, negative direction, looking for GCCGCC(A/G)CCATGG, 0.
  3. positive strand, positive direction, looking for GCCGCC(A/G)CCATGG, 0.
  4. negative strand, positive direction, looking for GCCGCC(A/G)CCATGG, 0.
  5. complement, negative strand, negative direction, looking for CGGCGG(C/T)GGTACC, 0.
  6. complement, positive strand, negative direction, looking for CGGCGG(C/T)GGTACC, 0.
  7. complement, positive strand, positive direction, looking for CGGCGG(C/T)GGTACC, 0.
  8. complement, negative strand, positive direction, looking for CGGCGG(C/T)GGTACC, 0.
  9. inverse complement, negative strand, negative direction, looking for CCATGG(C/T)GGCGGC, 0.
  10. inverse complement, positive strand, negative direction, looking for CCATGG(C/T)GGCGGC, 0.
  11. inverse complement, positive strand, positive direction, looking for CCATGG(C/T)GGCGGC, 0.
  12. inverse complement, negative strand, positive direction, looking for CCATGG(C/T)GGCGGC, 0.
  13. inverse positive strand, negative direction, looking for GGTACC(A/G)CCGCCG, 0.
  14. inverse negative strand, negative direction, looking for GGTACC(A/G)CCGCCG, 0.
  15. inverse positive strand, positive direction, looking for GGTACC(A/G)CCGCCG, 0.
  16. inverse negative strand, positive direction, looking for GGTACC(A/G)CCGCCG, 0.
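The four search strings in the list above (the consensus, its complement, its inverse complement, and its inverse) can be derived mechanically from the consensus. The following sketch does so for a consensus written with degenerate positions in the (A/G) style. All function names are hypothetical, and the code illustrates the derivation only; it is not the SuccessablesKoz.bas program.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def parse(consensus):
    """Split e.g. 'GCC(A/G)CC' into a list of sets of allowed bases."""
    tokens = re.findall(r"\(([ACGT](?:/[ACGT])+)\)|([ACGT])", consensus)
    return [frozenset(group.split("/")) if group else frozenset(base)
            for group, base in tokens]


def render(positions):
    """Write the positions back in the same (X/Y) notation, bases sorted."""
    return "".join(
        next(iter(p)) if len(p) == 1 else "(" + "/".join(sorted(p)) + ")"
        for p in positions
    )


def complement(positions):
    """Complement every allowed base at every position."""
    return [frozenset(COMPLEMENT[b] for b in p) for p in positions]


def variants(consensus):
    """Return the four search strings derived from one consensus."""
    p = parse(consensus)
    return {
        "original": render(p),
        "complement": render(complement(p)),
        "inverse complement": render(complement(p)[::-1]),
        "inverse": render(p[::-1]),
    }


for name, motif in variants("GCCGCC(A/G)CCATGG").items():
    print(name, motif)
```

For GCCGCC(A/G)CCATGG this reproduces the strings used in the list: CGGCGG(C/T)GGTACC, CCATGG(C/T)GGCGGC, and GGTACC(A/G)CCGCCG.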

(Matsumoto) samplings

Copying an apparent consensus sequence for the Kozak sequence, GAAAATGG, and searching for it with "⌘F" finds no occurrences between ZSCAN22 and A1BG and none between ZNF497 and A1BG, in agreement with the computer programs.

Basic programs (starting with SuccessablesKozM.bas) were written to test the consensus sequence GAAAATGG by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the strand, direction, sequence searched for, and number of occurrences found are:

  1. negative strand, negative direction, looking for GAAAATGG, 0.
  2. positive strand, negative direction, looking for GAAAATGG, 0.
  3. positive strand, positive direction, looking for GAAAATGG, 0.
  4. negative strand, positive direction, looking for GAAAATGG, 0.
  5. complement, negative strand, negative direction, looking for CTTTTACC, 0.
  6. complement, positive strand, negative direction, looking for CTTTTACC, 0.
  7. complement, positive strand, positive direction, looking for CTTTTACC, 0.
  8. complement, negative strand, positive direction, looking for CTTTTACC, 0.
  9. inverse complement, negative strand, negative direction, looking for CCATTTTC, 0.
  10. inverse complement, positive strand, negative direction, looking for CCATTTTC, 0.
  11. inverse complement, positive strand, positive direction, looking for CCATTTTC, 0.
  12. inverse complement, negative strand, positive direction, looking for CCATTTTC, 0.
  13. inverse negative strand, negative direction, looking for GGTAAAAG, 0.
  14. inverse positive strand, negative direction, looking for GGTAAAAG, 0.
  15. inverse positive strand, positive direction, looking for GGTAAAAG, 0.
  16. inverse negative strand, positive direction, looking for GGTAAAAG, 0.
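A tally like the one above can be sketched by counting each of the four listed search strings in a given sequence. The example below uses the four strings from the list. The function names and the demonstration sequence are hypothetical, and the sketch does not model the strand and direction handling of the SuccessablesKozM.bas programs.

```python
# The four search strings listed above for the GAAAATGG consensus.
MATSUMOTO_VARIANTS = {
    "original": "GAAAATGG",
    "complement": "CTTTTACC",
    "inverse complement": "CCATTTTC",
    "inverse": "GGTAAAAG",
}


def count_occurrences(seq, motif):
    """Count occurrences of motif in seq, including overlapping ones."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def tally(seq, variants):
    """Map each variant name to the number of times its motif occurs in seq."""
    return {name: count_occurrences(seq, motif)
            for name, motif in variants.items()}


# Hypothetical sequence, not taken from the A1BG region.
demo = "ACGAAAATGGTTCCATTTTCGA"
print(tally(demo, MATSUMOTO_VARIANTS))
```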

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

References

  1. Kozak, Marilyn (February 1989). "The scanning model for translation: an update". The Journal of Cell Biology. 108 (2): 229–241. doi:10.1083/jcb.108.2.229. ISSN 0021-9525. PMID 2645293.
  2. Kozak, Marilyn (2002-10-16). "Pushing the limits of the scanning mechanism for initiation of translation". Gene. 299 (1): 1–34. doi:10.1016/S0378-1119(02)01056-9. ISSN 0378-1119. PMID 12459250.
  3. Kozak, Marilyn (1999-07-08). "Initiation of translation in prokaryotes and eukaryotes". Gene. 234 (2): 187–208. doi:10.1016/S0378-1119(99)00210-3. ISSN 0378-1119. PMID 10395892.
  4. De Angioletti M, Lacerra G, Sabato V, Carestia C (2004). "Beta+45 G --> C: a novel silent beta-thalassaemia mutation, the first in the Kozak sequence". British Journal of Haematology. 124 (2): 224–31. doi:10.1046/j.1365-2141.2003.04754.x. PMID 14687034.
  5. Hernández, Greco; Osnaya, Vincent G.; Pérez-Martínez, Xochitl (2019-07-25). "Conservation and Variability of the AUG Initiation Codon Context in Eukaryotes". Trends in Biochemical Sciences. 44 (12): 1009–1021. doi:10.1016/j.tibs.2019.07.001. ISSN 0968-0004. PMID 31353284.
  6. Kozak, Marilyn (1984-01-25). "Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs". Nucleic Acids Research. 12 (2): 857–872. doi:10.1093/nar/12.2.857. ISSN 0305-1048. PMID 6694911.
  7. Kozak, Marilyn (October 1987). "An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs". Nucleic Acids Research. 15 (20): 8125–8148. doi:10.1093/nar/15.20.8125. PMID 3313277.
  8. Takuya Matsumoto, Saemi Kitajima, Chisato Yamamoto, Mitsuru Aoyagi, Yoshiharu Mitoma, Hiroyuki Harada and Yuji Nagashima (9 August 2020). "Cloning and tissue distribution of the ATP-binding cassette subfamily G member 2 gene in the marine pufferfish Takifugu rubripes" (PDF). Fisheries Science. 86: 873–887. doi:10.1007/s12562-020-01451-z. Retrieved 27 September 2020.

External links