C1orf106

VALUE_ERROR (nil)
Identifiers
Aliases
External IDs	GeneCards: [1]
Orthologs
	n/a
	n/a
	n/a
	n/a
	n/a
	n/a
	n/a
	n/a
	n/a
	n/a
	Wikidata
View/Edit Human

Uncharacterized protein C1orf106, sometimes referred to as hypothetical protein LOC55765, is a protein of unknown function that in humans is encoded by the C1orf106 gene.^[1] Less common gene aliases include FLJ10901 and MGC125608.

Gene

Location

File:Gene location.png

C1orf106 location on chromosome 1

In humans, C1orf106 is located on the long arm of chromosome 1 at locus 1q32.1. It spans from 200,891,499 to 200,915,736 (24.238 kb) on the plus strand.^[1]

Gene neighborhood

File:C1orf106 gene neighborhood.png

Gene neighborhood

C1orf106 is flanked by G protein-coupled receptor 25 (upstream) and maestro heat-like repeat family member 3 (MROH3P), a predicted downstream pseudogene. Ribosomal protein L34 pseudogene 6 (RPL34P6) is further upstream and kinesin family member 21B is further downstream.^[1]

Promoter

File:Promoter.pdf

Predicted C1orf106 promoter region with putative transcription factor binding sites

There are seven predicted promoters for C1orf106, and experimental evidence suggests that isoform 1 and 2, the most common isoforms, are transcribed using different promoters.^[2] MatInspector, a tool available through Genomatix, was used to predict transcription factor binding sites within potential promoter regions. The transcription factors that are predicted to target the anticipated promoter for isoform 1 are expressed in a range of tissues. The most common tissues of expression are the urogenital system, nervous system and bone marrow. This coincides with expression data for the C1orf106 protein, which is highly expressed in the kidney and bone marrow.^[3] A diagram of the predicted promoter region, with highlighted transcription factor binding sites, is shown to the right. The factors that are predicted to bind to the promoter region of isoform 2 differ, and twelve of the top twenty predicted factors are expressed in blood cells and/or tissues of the cardiovascular system.

Expression

C1orf106 is expressed in a wide range of tissues. Expression data from GEO profiles is shown below. The sites of highest expression, are listed in the table. Expression is moderate in the placenta, prostate, testis, lung, salivary glands and dendritic cells. It is low in the brain, most immune cells, the adrenal gland, uterus, heart and adipocytes.^[3] Expression data, from various experiments, found on GEO profiles suggests that C1orf106 expression is up-regulated in several cancers including: lung, ovarian, colorectal and breast.

File:C1orf106 Expression data.png

C1orf106 expression data from GEO profiles

Tissue	Percentile rank
B lymphocytes	90
Trachea	89
Skin	88
Human bronchial epithelial cells	88
Colorectal adenocarcinoma	87
Kidney	87
Tongue	85
Pancreas	84
Appendix	82
Bone marrow	80

mRNA

Isoforms

Nine putative isoforms are produced from the C1orf106 gene, seven of which are predicted to encode proteins.^[4] Isoform 1 and 2, shown below, are the most common isoforms.

File:C1orf106 isoforms.png

Most common C1orf106 isoforms

Isoform 1, which is the longest, is accepted as the canonical isoform. It contains ten exons, which encode a protein that is 677 amino acids long, depending on the source. Some sources report that the protein is only 663 amino acids due to the use of a start codon that is forty-two nucleotides downstream. According to NCBI, this isoform has only been predicted computationally.^[1] This may be because the Kozak sequence surrounding the downstream start codon is more similar to the consensus Kozak sequence as shown in the table below. Softberry was used to obtain the sequence of the predicted isoform.^[5] Isoform 2 is shorter due to a truncated N-terminus. Both isoforms have an alternative polyadenylation site.^[4]

Surrounding sequence of start codons compared to Kozak consensus sequence

miRNA regulation

miRNA-24 was identified as a microRNA that could potentially target C1orf106 mRNA.^[6] The binding site, which is located in the 5' untranslated region is shown.

Protein

General properties

File:C1orf106 diagram.png

C1of106 protein (isoform 1)

Isoform 1, diagramed below, contains a DUF3338 domain, two low complexity regions and a proline rich region. The protein is arginine and proline rich, and has a lower than average amount of asparagine and hydrophobic amino acids, specifically phenylalanine and isoleucine.^[7] The isoelectric point is 9.58, and the molecular weight of the unmodified protein is 72.9 kdal.^[8] The protein is not predicted to have an N-terminal signal peptide, but there are predicted nuclear localization signals (NLS) and a leucine rich nuclear export signal.^[9]^[10]^[11]

Modifications

C1orf106 is predicted to be highly phosphorylated.^[12]^[13] Phosphoylation sites predicted by PROSITE are shown in the table below. NETPhos predictions are illustrated in the diagram. Each line points to a predicted phosphorylation site, and connects to a letter which represents either serine (S), threonine (T) or tyrosine (Y).

Phosphorylation sites predicted by PROSITE

File:Phosphodiagram.png

Phosphorylation sites predicted by NETPhos. Letter corresponds to serine (S), threonine (T) or tyrosine (Y).

Structure

Coiled-coils are predicted to span from residue 130-160 and 200-260.^[14] The secondary composition was predicted to be about 60% random coils, 30% alpha helices and 10% beta sheets.^[15]

Interactions

The proteins with which the C1orf106 protein interacts are not well characterized. Text mining evidence suggests C1orf106 may interact with the following proteins: DNAJC5G, SLC7A13, PIEZO2, MUC19.^[16] Experimental evidence, from a yeast two hybrid screen, suggests the C1orf106 protein interacts with 14-3-3 protein sigma, which is an adaptor protein.^[17]

Homology

C1orf106 is well conserved in vertebrates as shown in the table below. Sequences were retrieved from BLAST^[18] and BLAT.^[19]

	Sequence	Genus and species	Common name	NCBI accession	Length(aa)	Sequence identity	Time since divergence (Mya)
*	C1orf106	Homo sapiens	Human	NP_060735.3	667	100%	NA
*	C1orf106	Macaca fascicularis	Crab-eating macaque	XP_005540414.1	703	97%	29.0
*	LOC289399	Rattus norvegicus	Norway rat	NP_001178750.1	667	86%	92.3
*	Predicted C1orf106 homolog	Odobenus rosmarus divergens	Walrus	XP_004392787.1	672	85%	94.2
*	C1orf106-like	Loxodonta africana	Elephant	XP_003410255.1	663	84%	98.7
*	Predicted C1orf106 homolog	Dasypus novemcinctus	Nine-banded armadillo	XP_004478752.1	676	81%	104.2
*	Predicted C1orf106 homolog	Ochotona princeps	American pika	XP_004578841.1	681	78%	92.3
*	Predicted C1orf106 homolog	Monodelphis domestica	Gray short-tailed opossum	XP_001367913.2	578	76%	162.2
*	Predicted C1orf106 homolog	Chrysemys picta bellii	Painted turtle	XP_005313167.1	602	56%	296.0
*	Predicted C1orf106 homolog	Geospiza fortis	Medium ground finch	XP_005426868.1	542	50%	296.0
*	Predicted C1orf106 homolog	Alligator mississippiensis	Alligator	XP_006278041.1	547	49%	296.0
*	Predicted C1orf106 homolog	Ficedula albicollis	Collared flycatcher	XP_005059352.1	542	49%	296.0
	Predicted C1orf106 homolog	Latimeria chalumnae	West Indian Ocean coelacanth	XP_005988436.1	613	46%	414.9
*	Predicted C1orf106 homolog	Lepisosteus oculatus	Spotted gar	XP_006628420.1	637	44%	400.1
*	FERM domain containing 4A	Xenopus (Silurana) tropicalis	Western clawed frog	XP_002935289.2	695	43%	371.2
*	Predicted C1orf106 homolog	Oreochromis niloticus	Nile tilapia	XP_005478188.1	576	40%	400.1
	Predicted C1orf106 homolog	Haplochromis burtoni	Astatotilapia burtoni	XP_005914919.1	576	40%	400.1
	Predicted C1orf106 homolog	Pundamilia nyererei	Haplochromis nyererei	XP_005732720.1	577	40%	400.1
*	LOC563192	Danio rerio	Zebrafish	NP_001073474.1	612	37%	400.1
	LOC101161145	Oryzias latipes	Japanese rice fish	XP_004069287.1	612	33%	400.1

A graph of the sequence identity versus the time since divergence for the asteriked entries is shown below. The colors correspond to degree of relatedness (green = closely related, purple = distantly related).

File:C1orf106Conservation.png

Percent sequence identity in relation to species relatedness

Paralogs

Proteins that are considered to be C1orf106 paralogs are not consistent between databases. A multiple sequence alignment (MSA) of potentially paralogous proteins was made to determine the likelihood of a truly paralogous relationship.^[20] The sequences were retrieved from a BLAST search in humans with the C1orf106 protein. The MSA suggests the proteins share a homologous domain, DUF3338, which is found in eukaryotes. A portion of the multiple sequence alignment is shown below. Apart from the DUF domain (boxed in green), there was little conservation. The DUF3338 domain does not have any extraordinary physical properties, however, one notable finding is that each of the proteins in the MSA is predicted to have two nuclear localization signals. The proteins in the MSA are all predicted to localize to the nucleus.^[9] A comparison of the physical properties of the proteins was also conducted using SAPS and is shown in the table.^[7]

File:DUF3338 Domain.png

Conservation of the DUF3338 domain in humans

Physical properties of potential paralogs

Clinical significance

A total of 556 single nucleotide polymorphisms (SNPs) have been identified in the gene region of C1orf106, 96 of which are associated with a clinical source.^[21] Rivas et al.^[22] identified four SNPs, shown in the table below, that may be associated with inflammatory bowel disease and Crohn's disease. According to GeneCards, other disease associations may include multiple sclerosis and ulcerative colitis.^[23]

Residue	Change	Notes
333 (rs41313912)	Tyrosine ⇒ phenylalanine	Phosphorylated, moderate conservation
376	Arginine ⇒ cysteine	Moderate conservation
397	Arginine ⇒ threonine	Not conserved
554 (rs61745433)	Arginine ⇒ cysteine	Moderate conservation

Model organisms

Model organisms have been used in the study of C1orf106 function. A conditional knockout mouse line called 5730559C18Rik^{tm2a(EUCOMM)Wtsi} was generated at the Wellcome Trust Sanger Institute.^[24] Male and female animals underwent a standardized phenotypic screen^[25] to determine the effects of deletion.^[26]^[27]^[28]^[29] Additional screens performed: - In-depth immunological phenotyping^[30] - in-depth bone and cartilage phenotyping^[31]

*5730559C18Rik* knockout mouse phenotype
Characteristic	Phenotype
All data available at.^[25]^[30]^[31]
Peripheral blood leukocytes 6 weeks	Normal
Insulin	Normal
Haematology 6 weeks	Normal
Homozygous viability at P14	Normal
Homozygous fertility	Normal
Body weight	Normal
Neurological assessment	Normal
Grip strength	Normal
Dysmorphology	Normal
Indirect calorimetry	Normal
Glucose tolerance test	Normal
Auditory brainstem response	Normal
DEXA	Normal
Radiography	Normal
Eye morphology	Normal
Clinical chemistry	Normal
Haematology 16 weeks	Normal
Peripheral blood leukocytes 16 weeks	Normal
Heart weight	Normal
Salmonella infection	Normal
Cytotoxic T cell function	Normal
Spleen immunophenotyping	Normal
Mesenteric lymph node immunophenotyping	Normal
Bone marrow immunophenotyping	Normal
Epidermal immune composition	Normal
Influenza challenge	Normal

References

↑ ^1.0 ^1.1 ^1.2 ^1.3 "NCBI Gene 55765". Retrieved 10 February 2014.
↑ "Genomatix: MatInspector". Retrieved 6 March 2014.
↑ ^3.0 ^3.1 "GEO Profiles". Retrieved 6 March 2014.
↑ ^4.0 ^4.1 "Aceview". Retrieved 6 March 2014.
↑ "Softberry". Retrieved 20 April 2014.
↑ "TargetScanHuman 6.2". Retrieved 15 April 2014.
↑ ^7.0 ^7.1 "Statistical Analysis of Protein Sequences". Retrieved 20 April 2014.
↑ "Compute pI/Mw tool". Retrieved 10 April 2014.
↑ ^9.0 ^9.1 "PSORTII". Retrieved 20 April 2014.
↑ "cNLS Mapper". Retrieved 20 April 2014.
↑ "NetNES". Retrieved 20 April 2014.
↑ "NETPhos". Retrieved 20 April 2014.
↑ "Swiss Institute of Bioinformatics: PROSITE".
↑ "ExPASy COILS". Retrieved 20 April 2014.
↑ "SOPMA". Retrieved 27 April 2014.
↑ "STRING". Retrieved 15 April 2014.
↑ "MINT". Retrieved 15 April 2014.
↑ "BLAST". Retrieved 8 March 2014.
↑ "BLAT". Retrieved 8 March 2014.
↑ "SDSC Biology Workbench: ClustalW". Retrieved 12 March 2014.
↑ "dbSNP". Retrieved 22 April 2014.
↑ Rivas MA; et al. (2011). "Deep resequencing of GWAS loci identifies independent rare variants associated with inflammatory bowel disease". Nature Genetics. 43 (11): 1066–1073. doi:10.1038/ng.952. PMC 3378381. PMID 21983784. |access-date= requires |url= (help)
↑ "GeneCards". Retrieved 1 May 2014.
↑ Gerdin AK (2010). "The Sanger Mouse Genetics Programme: high throughput characterisation of knockout mice". Acta Ophthalmologica. 88: 925–7. doi:10.1111/j.1755-3768.2010.4142.x.
↑ ^25.0 ^25.1 "International Mouse Phenotyping Consortium".
↑ Skarnes WC, Rosen B, West AP, Koutsourakis M, Bushell W, Iyer V, Mujica AO, Thomas M, Harrow J, Cox T, Jackson D, Severin J, Biggs P, Fu J, Nefedov M, de Jong PJ, Stewart AF, Bradley A (Jun 2011). "A conditional knockout resource for the genome-wide study of mouse gene function". Nature. 474 (7351): 337–42. doi:10.1038/nature10163. PMC 3572410. PMID 21677750.
↑ Dolgin E (Jun 2011). "Mouse library set to be knockout". Nature. 474 (7351): 262–3. doi:10.1038/474262a. PMID 21677718.
↑ Collins FS, Rossant J, Wurst W (Jan 2007). "A mouse for all reasons". Cell. 128 (1): 9–13. doi:10.1016/j.cell.2006.12.018. PMID 17218247.
↑ White JK, Gerdin AK, Karp NA, Ryder E, Buljan M, Bussell JN, Salisbury J, Clare S, Ingham NJ, Podrini C, Houghton R, Estabel J, Bottomley JR, Melvin DG, Sunter D, Adams NC, Sanger Institute Mouse Genetics Project, Tannahill D, Logan DW, Macarthur DG, Flint J, Mahajan VB, Tsang SH, Smyth I, Watt FM, Skarnes WC, Dougan G, Adams DJ, Ramirez-Solis R, Bradley A, Steel KP (2013). "Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes". Cell. 154 (2): 452–64. doi:10.1016/j.cell.2013.06.022. PMC 3717207. PMID 23870131.
↑ ^30.0 ^30.1 "Infection and Immunity Immunophenotyping (3i) Consortium".
↑ ^31.0 ^31.1 "OBCD Consortium".

External links

Human C1orf106 genome location and C1orf106 gene details page in the UCSC Genome Browser.

[NCBI-1] 1.0 ^1.1 ^1.2 ^1.3 "NCBI Gene 55765". Retrieved 10 February 2014.

[2] "Genomatix: MatInspector". Retrieved 6 March 2014.

[GEO_Profiles-3] 3.0 ^3.1 "GEO Profiles". Retrieved 6 March 2014.

[Aceview-4] 4.0 ^4.1 "Aceview". Retrieved 6 March 2014.

[5] "Softberry". Retrieved 20 April 2014.

[6] "TargetScanHuman 6.2". Retrieved 15 April 2014.

[SAPS-7] 7.0 ^7.1 "Statistical Analysis of Protein Sequences". Retrieved 20 April 2014.

[8] "Compute pI/Mw tool". Retrieved 10 April 2014.

[PSORT-9] 9.0 ^9.1 "PSORTII". Retrieved 20 April 2014.

[10] "cNLS Mapper". Retrieved 20 April 2014.

[11] "NetNES". Retrieved 20 April 2014.

[12] "NETPhos". Retrieved 20 April 2014.

[13] "Swiss Institute of Bioinformatics: PROSITE".

[14] "ExPASy COILS". Retrieved 20 April 2014.

[15] "SOPMA". Retrieved 27 April 2014.

[16] "STRING". Retrieved 15 April 2014.

[17] "MINT". Retrieved 15 April 2014.

[18] "BLAST". Retrieved 8 March 2014.

[19] "BLAT". Retrieved 8 March 2014.

[20] "SDSC Biology Workbench: ClustalW". Retrieved 12 March 2014.

[21] "dbSNP". Retrieved 22 April 2014.

[22] Rivas MA; et al. (2011). "Deep resequencing of GWAS loci identifies independent rare variants associated with inflammatory bowel disease". Nature Genetics. 43 (11): 1066–1073. doi:10.1038/ng.952. PMC 3378381. PMID 21983784. |access-date= requires |url= (help)

[23] "GeneCards". Retrieved 1 May 2014.

[mgp_reference-24] Gerdin AK (2010). "The Sanger Mouse Genetics Programme: high throughput characterisation of knockout mice". Acta Ophthalmologica. 88: 925–7. doi:10.1111/j.1755-3768.2010.4142.x.

[IMPCsearch_ref-25] 25.0 ^25.1 "International Mouse Phenotyping Consortium".

[pmid21677750-26] Skarnes WC, Rosen B, West AP, Koutsourakis M, Bushell W, Iyer V, Mujica AO, Thomas M, Harrow J, Cox T, Jackson D, Severin J, Biggs P, Fu J, Nefedov M, de Jong PJ, Stewart AF, Bradley A (Jun 2011). "A conditional knockout resource for the genome-wide study of mouse gene function". Nature. 474 (7351): 337–42. doi:10.1038/nature10163. PMC 3572410. PMID 21677750.

[mouse_library-27] Dolgin E (Jun 2011). "Mouse library set to be knockout". Nature. 474 (7351): 262–3. doi:10.1038/474262a. PMID 21677718.

[mouse_for_all_reasons-28] Collins FS, Rossant J, Wurst W (Jan 2007). "A mouse for all reasons". Cell. 128 (1): 9–13. doi:10.1016/j.cell.2006.12.018. PMID 17218247.

[pmid23870131-29] White JK, Gerdin AK, Karp NA, Ryder E, Buljan M, Bussell JN, Salisbury J, Clare S, Ingham NJ, Podrini C, Houghton R, Estabel J, Bottomley JR, Melvin DG, Sunter D, Adams NC, Sanger Institute Mouse Genetics Project, Tannahill D, Logan DW, Macarthur DG, Flint J, Mahajan VB, Tsang SH, Smyth I, Watt FM, Skarnes WC, Dougan G, Adams DJ, Ramirez-Solis R, Bradley A, Steel KP (2013). "Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes". Cell. 154 (2): 452–64. doi:10.1016/j.cell.2013.06.022. PMC 3717207. PMID 23870131.

[iii_ref-30] 30.0 ^30.1 "Infection and Immunity Immunophenotyping (3i) Consortium".

[obcd_ref-31] 31.0 ^31.1 "OBCD Consortium".

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

[19]

[20]

[21]

[22]

[23]

[24]

[25]

[26]

[27]

[28]

[29]

[30]

[31]

C1orf106

Contents

Gene

Location

Gene neighborhood

Promoter

Expression

mRNA

Isoforms

miRNA regulation

Protein

General properties

Modifications

Structure

Interactions

Homology

Paralogs

Clinical significance

Model organisms

References

External links

Navigation menu

C1orf106

Gene

Location

Gene neighborhood

Promoter

Expression

mRNA

Isoforms

miRNA regulation

Protein

General properties

Modifications

Structure

Interactions

Homology

Paralogs

Clinical significance

Model organisms

References

External links

Navigation menu

Search