C2orf81
C2orf81 is a human gene encoding the c2orf81 protein, which is predicted to localize to the nucleus.
Gene
C2orf81's aliases are LOC388963 and hCG40743.[2] The gene spans from bases 74,414,176 to 74,421,591 on the minus (-) strand of chromosome 2, and contains 4 exons.[1] The coding region is 2086 base pairs, and the protein sequence contains 615 amino acids.[3]
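The genomic length implied by the coordinates above can be checked with a short calculation. This is a minimal sketch that assumes the coordinates are 1-based and inclusive (the convention used by the Ensembl genome browser); the function name `span_length` is illustrative and not taken from any cited tool.

```python
def span_length(start: int, end: int) -> int:
    """Length in bases of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text for C2orf81 on chromosome 2.
GENE_START = 74_414_176
GENE_END = 74_421_591

gene_length = span_length(GENE_START, GENE_END)
print(gene_length)  # 7416
```

Under that assumption the gene spans 7,416 bases, which includes introns and untranslated regions as well as the 4 exons.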
Expression
The protein encoded by c2orf81 is highly expressed in testis, kidneys, and about 18 other tissues in humans.[4] Disease states in which it is expressed include gliomas, non-neoplasia, skin tumors, and lymphoma.[5]
Transcription Variants
Only a few mutations have been documented in c2orf81. Three common missense mutations, which change serine to leucine in the protein, have been reported in the 3’ UTR and in the coding sequence. Nonsense mutations have also been documented, occurring exclusively in codons for proline.
mRNA
The mRNA sequence contains 2086 base pairs and has 4 isoforms.
Protein
Properties and Composition
C2orf81 has a molecular weight of 66.6 kDa and an isoelectric point of 5.32.[7] The human protein and most mammalian homologs are rich in proline, whereas non-mammalian vertebrate homologs contain a higher proportion of glutamic acid residues.[8] C2orf81 has 4 isoforms, and its most common isoform contains 615 amino acids. Isoforms 2 through 4 have 566, 520 and 588 amino acids, respectively.[3] C2orf81 is the only member of superfamily cl25621.[9]
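Compositional statements such as "rich in proline" come down to counting residues in a sequence. The sketch below shows one way to compute per-residue fractions; the function name `residue_fractions` and the toy peptide are hypothetical and are not the C2orf81 sequence or the method of the cited composition tool.

```python
from collections import Counter


def residue_fractions(sequence: str) -> dict[str, float]:
    """Return the fraction of each one-letter residue in a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    return {residue: n / len(seq) for residue, n in counts.items()}


# Hypothetical toy peptide for illustration only, NOT the C2orf81 sequence.
toy = "MPPEPSPLPE"
fractions = residue_fractions(toy)
print(round(fractions["P"], 2))  # 0.5
print(round(fractions["E"], 2))  # 0.2
```

Running the same function on the real protein sequence and on each homolog would give the proline and glutamic acid fractions that the comparison above refers to.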
Domains
Domain of unknown function (DUF) 4639 is unique to the c2orf81 protein and is conserved in eukaryotes.[10] DUF 4639 spans from amino acid 17 to the end of the protein in human c2orf81.
Subcellular Localization
C2orf81 is primarily predicted to be nuclear, but potentially also cytoplasmic and mitochondrial.[11]
Interacting proteins
The C2orf81 protein is predicted to interact strongly with enoyl-CoA hydratase and hydroxyacyl-CoA dehydrogenase, based on text mining and database searches.[12] Other predicted interacting proteins are acetyl-CoA carboxylases A and B, glycine dehydrogenase, and 3-oxoacid CoA transferase 2.
Structure
The c2orf81 protein is predicted to be composed mainly of alpha helices, with fewer beta pleated sheets, turns, and coils.[13]
Function
Although the protein consists almost entirely of a domain of unknown function, the c2orf81 gene has been analyzed in a study of sites prone to DNA methylation.[4] Another study found c2orf81 to overlap with other genes.[15] Genes from its locus have been related to Alstrom syndrome, cleft palate, neurodevelopmental delays, macrocephaly, and Perry syndrome.
Post-translational modifications
In human c2orf81, phosphorylation is predicted to occur only at serines, not at threonines or tyrosines.[16] O-linked glycosylation is predicted to occur at 3 sites toward the C-terminus.[17] These sites are well conserved in all homologs. C2orf81 contains one potential SUMOylation site toward the end of the protein, with the sequence GKAE.[18]
Homology
Paralogs
C2orf81 was found to have one paralog, Homo sapiens BAC clone RP11-523H20.[19]
Homologs
The c2orf81 protein is highly conserved in primates and other mammals, but less so in non-mammalian vertebrates. Its most distant homolog is in the Asian swamp eel.[20] The table below lists homologs of c2orf81 with their date of divergence from humans and their percent identity to the human c2orf81 protein sequence.
Species | Date of divergence (mya) | Protein identity |
---|---|---|
Bonobo | 6.4 | 99% |
Gorilla | 8.61 | 94% |
Orangutan | 15.2 | 95% |
Macaque | 28.1 | 92% |
Lemur | 82 | 72% |
Mouse | 88 | 52% |
Minke whale | 94 | 69% |
Cow | 94 | 66% |
Pig | 94 | 64% |
Chinese softshell turtle | 320 | 69% |
Ostrich | 320 | 62% |
American golden eagle | 320 | 42% |
Asian swamp eel | 432 | 35% |
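The protein identity column reports how many aligned positions match between the human protein and each homolog. The sketch below illustrates one common definition, counting matches over ungapped columns of a pairwise alignment. It is an assumption, not the cited tool's exact formula: BLAST, for example, divides by the full alignment length including gap columns, so its values can differ slightly. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, NOT real C2orf81 data.
identity = percent_identity("MKTAYIAK", "MKSAY-AK")
print(round(identity, 1))  # 85.7
```

In the toy alignment, 7 columns are ungapped and 6 of them match, giving about 85.7% identity.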
Evolution
C2orf81 has evolved quickly over time.[21] The N-terminus of the protein has evolved more slowly than the rest of the protein.
References
- ↑ 1.0 1.1 "Chromosome 2: 74,414,176-74,421,591 - Region in detail - Homo sapiens - Ensembl genome browser 92". useast.ensembl.org. Retrieved 2018-05-06.
- ↑ "Gene Cards".
- ↑ 3.0 3.1 "NCBI Protein c2orf81".
- ↑ 4.0 4.1 Seow, W. J., Kile, M. L., Baccarelli, A. A., Pan, W.-C., Byun, H.-M., Mostofa, G., Quamruzzaman, Q., Rahman, M., Lin, X. and Christiani, D. C. (2014), Epigenome-wide DNA methylation changes with development of arsenic-induced skin lesions in Bangladesh: A case–control follow-up study. Environ. Mol. Mutagen., 55: 449–456. doi:10.1002/em.21860
- ↑ Group, Schuler. "EST Profile - Hs.445377". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
- ↑ "C2orf81 chromosome 2 open reading frame 81 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
- ↑ Kozlowski, Lukasz P. "CALCULATION OF PROTEIN ISOELECTRIC POINT". isoelectric.org. Retrieved 2018-05-06.
- ↑ "Composition/Molecular Weight Calculation [PIR - Protein Information Resource]". pir.georgetown.edu. Retrieved 2018-05-06.
- ↑ group, NIH/NLM/NCBI/IEB/CDD. "NCBI CDD Conserved Protein Domain DUF4639". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
- ↑ "DUF4639". pfam.xfam.org. Retrieved 2018-05-06.
- ↑ "PSORTII".
- ↑ "STRING".
- ↑ Kumar, Prof. T. Ashok. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2018-05-06.
- ↑ "Phyre 2 Results for Undefined". www.sbg.bio.ic.ac.uk. Retrieved 2018-05-06.
- ↑ "Figure 5: Genic alleles in the DAnc(YRI, Europe, UI) tail and overlapping genes". www.nature.com. Retrieved 2018-05-11.
- ↑ "DISPHOS 1.3". www.dabi.temple.edu. Retrieved 2018-05-06.
- ↑ "DictyOGlyc 1.1". www.cbs.dtu.dk. Retrieved 2018-05-06.
- ↑ "SUMOplot™ Analysis Program | Abgent". www.abgent.com. Retrieved 2018-05-06.
- ↑ Database, GeneCards Human Gene. "C2orf81 Gene - GeneCards | CB081 Protein | CB081 Antibody". www.genecards.org. Retrieved 2018-05-06.
- ↑ "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
- ↑ "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2018-05-06.