O’Sullivan F, Keenan J, Aherne S, O’Neill F, Clarke C, Henry M, Meleady P, Breen L, Barron N, Clynes M, Horgan K, Doolan P, Murphy R. Parallel mRNA, proteomics and miRNA expression analysis in cell line models of the intestine. World J Gastroenterol 2017; 23(41): 7369-7386 [PMID: 29151691 DOI: 10.3748/wjg.v23.i41.7369]
Corresponding Author of This Article
Finbarr O’Sullivan, PhD, Associate Director, National Institute for Cellular Biotechnology, Dublin City University, Dublin D09 W6Y4, Ireland. finbarr.osullivan@dcu.ie
Research Domain of This Article
Cell Biology
Article-Type of This Article
Basic Study
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Finbarr O’Sullivan, Joanne Keenan, Sinead Aherne, Fiona O’Neill, Michael Henry, Paula Meleady, Laura Breen, Niall Barron, Martin Clynes, Padraig Doolan, National Institute for Cellular Biotechnology, Dublin City University, Dublin D09 W6Y4, Ireland
Colin Clarke, National Institute for Bioprocessing Research & Training, Blackrock, Dublin A94 X099, Ireland
Karina Horgan, Richard Murphy, Alltech, Dunboyne, Meath, A86 X006, Ireland
Author contributions: O’Sullivan F was involved at all stages of the research and co-wrote the manuscript; Doolan P coordinated the analysis of the triomic dataset and co-wrote the manuscript; Keenan J and Breen L performed the cell culture and prepared samples for analysis; Aherne S and O’Neill F performed the microarray experiments; Henry M and Meleady P performed the proteomic experiments; Clarke C and Barron N performed bioinformatics analysis on the data; Murphy R designed and coordinated the research programme with help from Clynes M and Horgan K; all authors read and reviewed the manuscript; Doolan P and Murphy R contributed equally to this work and are joint last authors.
Supported by a Strategic Alliance Programme between Alltech Ltd. and DCU, and by an Enterprise Ireland Innovation Partnership Grant (IP 2015 0375).
Conflict-of-interest statement: The authors declare they have no conflicts of interest.
Data sharing statement: All data generated or analysed during this study are included in this published article and its supplementary information files.
Telephone: +353-1-7005700
Received: May 16, 2017 Peer-review started: May 18, 2017 First decision: June 22, 2017 Revised: July 7, 2017 Accepted: August 8, 2017 Article in press: August 8, 2017 Published online: November 7, 2017 Processing time: 173 Days and 13.9 Hours
Abstract
AIM
To identify miRNA-regulated proteins differentially expressed between Caco-2 and HT-29, two principal cell line models of the intestine.
METHODS
Exponentially growing Caco-2 and HT-29 cells were harvested and prepared for mRNA, miRNA and proteomic profiling. mRNA microarray profiling was carried out using the Affymetrix GeneChip Human Gene 1.0 ST array. miRNA microarray profiling was carried out using the Affymetrix GeneChip miRNA 3.0 array. Quantitative label-free LC-MS/MS proteomic analysis was performed using a Dionex Ultimate 3000 RSLCnano system coupled to a hybrid linear ion trap/Orbitrap mass spectrometer. Peptide identities were validated in Proteome Discoverer 2.1 and subsequently imported into Progenesis QI software for further analysis. Hierarchical cluster analysis for all three parallel datasets (miRNA, proteomics, mRNA) was conducted in the R software environment using the Euclidean distance measure and Ward’s clustering algorithm. The prediction of miRNA and oppositely correlated protein/mRNA interactions was performed using TargetScan 6.1. Gene ontology (GO) biological process, molecular function and cellular component enrichment analysis was carried out for the differentially expressed (DE) miRNA, protein and mRNA lists via the Pathway Studio 11.3 Web interface using its Mammalian database.
RESULTS
Differential expression (DE) profiling comparing the intestinal cell lines HT-29 and Caco-2 identified 1795 genes, 168 proteins and 160 miRNAs as DE between the two cell lines. At the gene level, 1084 genes were upregulated and 711 were downregulated in the Caco-2 cell line relative to the HT-29 cell line. At the protein level, 57 proteins were upregulated and 111 downregulated in the Caco-2 cell line relative to the HT-29 cell line. Finally, at the miRNA level, 104 were upregulated and 56 downregulated in the Caco-2 cell line relative to the HT-29 cell line. Gene ontology (GO) analysis of the DE mRNA identified cell adhesion, migration and ECM organization, cellular lipid and cholesterol metabolic processes, small molecule transport and a range of responses to external stimuli, while similar analysis of the DE protein list identified gene expression/transcription, epigenetic mechanisms, DNA replication, differentiation and translation ontology categories. The DE protein and gene lists were found to share 15 biological processes including, for example, epithelial cell differentiation [P value ≤ 1.81613E-08 (protein list); P ≤ 0.000434311 (gene list)] and actin filament bundle assembly [P value ≤ 0.001582797 (protein list); P ≤ 0.002733714 (gene list)]. Analysis of the three data streams acquired in parallel, conducted to identify targets undergoing potential miRNA translational repression, identified 34 proteins whose respective mRNAs were detected but showed no change in expression. Of these 34 proteins, 27 were downregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 19 unique anti-correlated/upregulated microRNAs, and 7 were upregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 15 unique anti-correlated/downregulated microRNAs.
CONCLUSION
This first study providing “tri-omics” analysis of the principal intestinal cell line models Caco-2 and HT-29 has identified 34 proteins potentially undergoing miRNA translational repression.
Core tip: A unique triomics analysis of Caco-2 and HT-29, two commonly used in vitro cell line models of the intestine, was conducted. This analysis not only provided data on differentially expressed mRNAs, miRNAs and proteins but also allowed the identification of miRNA-regulated proteins differentially expressed between these two cell lines.
INTRODUCTION
In recent years both research and industry, particularly in the areas of pharmacology and nutrition, have steadily increased their use of in vitro cell models as an alternative to animal testing. This increase has been driven in part by the principles of the 3Rs (Replacement, Reduction and Refinement) and associated legislation, for example Article 4 of EU Directive 2010/63/EU. In addition, the growing use of automated high-throughput techniques in the laboratory has also increased the use of such in vitro cell lines. Thus there has been increased interest in the development of suitable intestinal in vitro models. These models are key to obtaining new information on intestinal toxicity, microbial infections and the bioavailability of new food additives or new drugs, as well as on intestinal-related diseases[1-5].
The intestinal epithelium contains a number of specialised cell types: absorptive enterocytes (the principal intestinal cell type), goblet cells (mucin secreting), enteroendocrine cells, Paneth cells, tuft cells and M (or microfold) cells[6]. However, the in vitro primary culture of intestinal epithelium is difficult; hence the use of intestinal cell lines allows for robust, reproducible in vitro culture models. Two cell lines derived from human colonic adenocarcinomas, Caco-2 and HT-29, are commonly used for the creation of such models.
The Caco-2 cell line can differentiate spontaneously to yield a polarised monolayer shown to express several characteristics and markers of enterocytes[7,8]. Thus Caco-2 cells have become a cell line of choice in the making of in vitro models of the intestine[9,10]. The HT-29 cell line is heterogeneous, consisting of a main population of undifferentiated cells and a smaller subpopulation capable of producing mucus[11-13]. Using glucose-free conditions, the HT-29 line can be differentiated to an enterocyte phenotype. In addition, various selection procedures have generated homogeneous mucin-secreting populations[11,13-15]. These two cell lines form the basis of numerous models of the intestine, either as single cell models or advanced multicellular models[16-18]. However, the creation of suitable complex in vitro models requires that the cell lines used are well characterised; this includes an understanding of the molecular controls within the cells.
There is an increasing recognition of the importance of the post-transcriptional control by miRNA of cell processes such as proliferation, differentiation, and apoptosis. The importance of miRNA in intestinal development and behaviour has been implicated in a number of studies[19,20]. In addition, the miRNAs miR-99b, miR-125a-5p and miR-1269 have been identified as having a potential role in determining Caco-2 and HT-29 behaviour[21].
miRNA exerts its control by either interfering with protein initiation and elongation or by the degradation of the target mRNA[22-25]. An additional complexity of this control mechanism is that a single mRNA can be co-operatively targeted by multiple miRNAs, while a single miRNA can target multiple mRNAs[26,27]. A number of algorithms exist to predict the potential targets of miRNAs, for example miRanda, TargetScan and PicTar[28-30]. These algorithms generally recognise regions of sequence complementarity to miRNA seed regions (nucleotides 2-7 of the miRNA) in the 3’UTR of target mRNAs and assess the thermodynamic feasibility of such binding.
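The seed-complementarity step described above can be illustrated with a minimal Python sketch. It only searches a 3’UTR for perfect matches to the reverse complement of miRNA nucleotides 2-7; it does not model binding thermodynamics, site conservation or any other feature used by miRanda, TargetScan or PicTar, and it is not the implementation of any of those tools. The function names and the toy sequences are hypothetical and chosen purely for illustration.

```python
def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence (A-U, G-C pairing only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def seed_sites(mirna, utr, start=2, end=7):
    """Return 0-based positions in the 3'UTR that perfectly pair with the miRNA seed.

    The seed is taken as nucleotides start..end of the miRNA (1-based, inclusive).
    """
    seed = mirna.upper()[start - 1:end]
    site = reverse_complement_rna(seed)      # sequence expected in the target mRNA
    utr = utr.upper().replace("T", "U")      # accept DNA-style input as well
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)    # allow overlapping matches
    return positions


# Toy sequences (hypothetical, not real miRNA or UTR sequences).
toy_mirna = "AUCGGAUCCAUGCAUGCAAGCU"   # seed (nt 2-7) is UCGGAU
toy_utr = "CCCAUCCGAGGGUUUAUCCGAAA"    # contains the site AUCCGA twice

print(seed_sites(toy_mirna, toy_utr))
```

A real prediction pipeline would then score each candidate site, which is where the thermodynamic and conservation criteria mentioned above come in.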
However, experimental validation is often required to determine the most significant miRNAs responsible for an observed effect within a biological system. This study uniquely uses a tri-omics approach combining the expression profiles of mRNA, miRNA and protein generated in parallel to identify potential miRNAs controlling translational repression and thus cell behaviour. As we have previously noted[31], profiling methodologies (including proteomic mass spectrometry, mRNA microarrays etc.) have several distinct disadvantages when used in isolation. For example, gene expression analysis using microarrays can provide wide coverage of mRNAs, but post-transcriptional processes will not be captured using this method alone. Additionally, studies examining the role of miRNAs typically rely heavily on computational methods to predict miRNA interactions and prioritise potential direct targets. A combined profiling approach could therefore also reduce the false positive/negative rates associated with purely in silico prediction analyses.
This study uses the availability of data on multiple levels to identify potential interacting miRNA-protein networks that would not have been identified from a single dataset, allowing us to provide new information for the characterisation of these two cell lines.
MATERIALS AND METHODS
Cell culture
The human colon carcinoma cell lines Caco-2 (cat. HTB37), obtained from the American Type Culture Collection, and HT-29 (cat. 91072201), obtained from the Public Health England Culture Collection, were used. HT-29 and Caco-2 were maintained in MEM supplemented with 1% L-glutamine and 5% or 10% FBS, respectively, under normal conditions (37 °C, 5% CO₂). Caco-2 cells were allowed to reach a maximum confluency of 60%-70% before trypsinizing to ensure that spontaneous differentiation did not occur. Both cell lines were tested and found to be mycoplasma negative. All chemicals (unless otherwise stated), glutamine and cell culture media were obtained from Sigma (Poole, United Kingdom). FBS and HBSS were obtained from Invitrogen-ThermoFisher Scientific. Exponentially growing cells were set up in 25 cm² flasks (Corning) for RNA extraction or 75 cm² flasks (Corning) for protein extraction.
Affymetrix miRNA microarray analysis of miRNA expression
Total RNA was extracted from triplicate biological replicate samples using the Qiagen miRNeasy kit according to the manufacturer’s instructions. RNA quantity was assessed using the NanoDrop ND-1000 spectrophotometer and RNA quality was assayed using the Agilent RNA 6000 Nano Kit and Agilent Bioanalyzer (Agilent, Santa Clara, CA, United States). The FlashTag Biotin HSR RNA Labelling Kit (Affymetrix, Santa Clara, CA, United States) was used to label 500 ng of total RNA according to the manufacturer’s procedure. The GeneChip Scanner 3000 7G System and reagents from Affymetrix were used to hybridize, wash, stain and scan the GeneChip miRNA 3.0 arrays (Affymetrix, Santa Clara, CA, United States).
Microarray profiling and associated bioinformatics analysis
Gene expression analysis was carried out on the Affymetrix GeneChip Human Gene 1.0 ST array according to the manufacturer’s instructions (Affymetrix, Santa Clara, CA, United States). The methodology and criteria used for total RNA purification, ssDNA sample processing and hybridisation to human microarrays have been previously described[32]. All microarray data were pre-processed as described previously. Prior to data analysis, probesets that did not reach the detection threshold [fluorescence level ≥ log2(100) for at least 1 sample] were identified and designated undetected. The remaining probesets were considered differentially expressed between the two cell types if a fold change ≥ 1.2 in either direction along with a Benjamini-Hochberg (BH) adjusted P value ≤ 0.05 was observed.
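The filtering criteria above can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used in the study: the raw P values are taken as given (the statistical test that produced them is not restated here), the BH adjustment is assumed to be applied to the detected probesets only, and the probeset names and expression values are invented toy data.

```python
import math


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):                         # largest P first
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def mean(values):
    return sum(values) / len(values)


def call_de(probesets, fc_cutoff=1.2, alpha=0.05, detection=math.log2(100)):
    """Return (name, direction) for probesets passing detection, fold-change and BH filters.

    Expression values are log2 intensities; direction is Caco-2 relative to HT-29.
    """
    # Keep probesets that reach the detection threshold in at least one sample.
    detected = [p for p in probesets if max(p["caco2"] + p["ht29"]) >= detection]
    # Assumption: multiple-testing correction is applied after the detection filter.
    adjusted = bh_adjust([p["p"] for p in detected])
    calls = []
    for probe, q in zip(detected, adjusted):
        log2_fc = mean(probe["caco2"]) - mean(probe["ht29"])
        if abs(log2_fc) >= math.log2(fc_cutoff) and q <= alpha:
            calls.append((probe["name"], "up" if log2_fc > 0 else "down"))
    return calls


# Invented toy data: three replicates per cell line, log2 scale.
toy_probesets = [
    {"name": "A", "caco2": [9.0, 9.1, 8.9], "ht29": [7.0, 7.1, 6.9], "p": 0.0001},
    {"name": "B", "caco2": [8.0, 8.0, 8.0], "ht29": [7.9, 7.9, 7.9], "p": 0.001},
    {"name": "C", "caco2": [5.0, 5.1, 5.2], "ht29": [6.0, 6.2, 6.1], "p": 0.5},
    {"name": "D", "caco2": [7.0, 7.2, 7.1], "ht29": [8.0, 8.1, 8.2], "p": 0.03},
]

print(call_de(toy_probesets))
```

In this toy example probeset C never reaches log2(100) and is treated as undetected, while B is detected and statistically significant but falls below the 1.2-fold cutoff.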
Proteomics profiling and associated bioinformatics analysis
Exponentially growing cells were harvested and pelleted by centrifugation. Cells were washed with ice-cold PBS; cell pellets were then snap frozen and stored at -80 °C until required. Cell lysis was carried out using a buffer containing 7 mol/L Urea, 2 mol/L Thiourea, 4% CHAPS, 30 mmol/L Tris, pH 8.5, 1x HALT protease inhibitors (Thermo Fisher Scientific), and samples were incubated on ice for 20 min with occasional vortexing. Sample lysates were centrifuged at 14000 × g for 15 min at 4 °C. The supernatant was transferred to a new microcentrifuge tube and protein concentration was determined using the Quick Start™ Bradford Protein Assay (BioRad).
Protein samples from the cell lysates were cleaned up using the ReadyPrep 2D Cleanup Kit (BioRad). Pellets were then resuspended in a buffer containing 6 mol/L Urea, 2 mol/L Thiourea, 30 mmol/L Tris, pH 8.5. Twenty micrograms from each sample was reduced with dithiothreitol at a final concentration of 5 mmol/L for 20 min at 56 °C, followed by alkylation with iodoacetamide at a final concentration of 15 mmol/L and incubation in the dark for 20 min at room temperature. Protein samples were tryptically digested (sequence grade, Promega) at a ratio of 50:1 protein/enzyme w/v at 37 °C overnight. Trifluoroacetic acid (TFA) was added to each sample to a final concentration of 0.5% to inactivate the trypsin. The digested peptide samples were then concentrated with C18 spin columns (Pierce, Thermo Fisher Scientific) and eluted peptides were gently dried with a vacuum evaporator (SpeedVac).
LC-MS/MS analysis
Dried peptides were re-solubilised in 25 μL of LC-MS grade water with 0.1% FA and 2% Acetonitrile (ACN). Peptides were separated and analysed using a Dionex Ultimate 3000 RSLCnano system (Thermo Fisher Scientific) coupled to a hybrid linear ion trap/Orbitrap mass spectrometer (LTQ Orbitrap XL; Thermo Fisher Scientific).
A 5 μL injection of sample was picked up using the autosampler and loaded onto a C18 trap column (C18 PepMap, 300 μm ID × 5 mm, 5 μm particle size, 100 Å pore size; Thermo Fisher Scientific). The sample was desalted for 3 min using a flow rate of 25 μL/min in 0.1% TFA containing 2% ACN. The trap column was then switched online with the analytical column (PepMap C18, 75 μm ID × 250 mm, 3 μm particle size, 100 Å pore size; Thermo Fisher Scientific) using a column oven at 35 °C, and peptides were eluted with the following binary gradient: 0%-25% solvent B in 240 min and 25%-50% solvent B in a further 60 min, where solvent A consisted of 2% acetonitrile (ACN) and 0.1% formic acid in water and solvent B consisted of 80% ACN and 0.08% formic acid in water. The column flow rate was set to 300 nL/min. Data were acquired with Xcalibur software, version 2.0.7 (Thermo Fisher Scientific).
The LTQ Orbitrap XL was operated in data-dependent mode and was externally calibrated. Survey MS scans were acquired in the Orbitrap in the 400-1800 m/z range with the resolution set to a value of 30000 at m/z 400. Up to three of the most intense ions (1+, 2+ and 3+) per scan were CID fragmented in the linear ion trap. Dynamic exclusion was enabled with a repeat count of 1, repeat duration of 30 s, exclusion list size of 500 and exclusion duration of 40 s. The minimum signal was set to 500. All tandem mass spectra were collected using a normalised collision energy of 35%, an isolation window of 2 m/z and an activation time of 30 ms.
Quantitative label-free LC-MS/MS data analysis
The raw MS data files obtained were processed using Progenesis QI for Proteomics software (version 2.0; Non-Linear Dynamics, a Waters company, Newcastle upon Tyne, United Kingdom). Peptide LC retention times from all MS data files were aligned to an assigned reference run as previously described[31]. The samples from the two experimental groups were then set up within the Progenesis software for differential analysis. Peptide features were filtered using the following parameters: (1) peptide features with ANOVA P value ≤ 0.05 between experimental groups; and (2) mass peaks with charge states from +1 to +3 and greater than one isotope per peptide. For the proteomic analysis, a Mascot generic file (mgf) was then generated from all exported MS/MS spectra and used for peptide and protein identification via Proteome Discoverer 2.1 using Sequest HT (Thermo Fisher Scientific) and Percolator against the human Swissprot database containing 23053 sequences. The following search parameters were used for protein identification: (1) peptide mass tolerance set to 20 ppm; (2) MS/MS mass tolerance set to 0.6 Da; (3) up to two missed cleavages allowed; (4) carbamidomethylation of cysteine set as a fixed modification; and (5) methionine oxidation set as a variable modification. Only high-confidence peptide identifications with an FDR ≤ 0.01 [identified using a SEQUEST HT workflow coupled with Percolator validation in Proteome Discoverer 2.1 (Thermo Fisher Scientific)] were imported into Progenesis QI software for further analysis. A protein was identified as differentially expressed if it had an ANOVA P value less than 0.05, a minimum of two peptides matched to the protein and a ≥ 1.2-fold change between the two cell lines (HT-29 and Caco-2).
Data were visualised using a heatmap, generated with the ggplot package in R, showing the distribution of all identified proteins based on statistics and fold change across all of the samples used in the analysis.
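To make the relative mass tolerance above concrete, the following Python sketch shows how a parts-per-million (ppm) error is computed and compared against a 20 ppm window. It uses the standard definition of ppm error and is not taken from Proteome Discoverer or Sequest HT; the function names and the example masses are hypothetical.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Relative mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tolerance_ppm=20.0):
    """True if the observed mass lies within +/- tolerance_ppm of the theoretical mass."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tolerance_ppm


# Hypothetical example: a peptide with a theoretical mass of 1000.000 Da.
# A 20 ppm window at this mass corresponds to roughly +/- 0.02 Da.
print(within_tolerance(1000.015, 1000.0))   # about 15 ppm away
print(within_tolerance(1000.025, 1000.0))   # about 25 ppm away
```

Because the tolerance is relative, the absolute window widens with peptide mass, unlike the fixed 0.6 Da tolerance applied to the MS/MS fragment ions.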
Bioinformatics QC analysis, miRNA target prediction and GO analysis
Hierarchical cluster analysis (HCA) for all three parallel datasets (miRNA, proteomics, mRNA) was conducted in the R software environment (provided in the public domain by R Foundation for Statistical Computing, Vienna, Austria, available at http://www.r-project.org/) using the Euclidean distance measure and Ward’s clustering algorithm. The prediction of miRNA and oppositely correlated protein/mRNA interactions was performed using TargetScan 6.1 (http://www.targetscan.org/vert_61/) and has been described previously[32]. miRNA names (e.g., hsa-miR-375) were annotated to miRNA accession IDs (e.g., MI0000783) using miRBase (http://www.mirbase.org/) for GO analysis. GO biological process, molecular function and cellular component enrichment analysis was carried out for the DE miRNA, protein and mRNA lists via the Pathway Studio 11.3 Web interface (https://www.elsevier.com/solutions/pathway-studio-biological-research) using its Mammalian database and has been described previously[33].
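For readers unfamiliar with Ward’s method, the clustering step described above can be sketched in plain Python as follows. This is an illustrative sketch only: the study performed the analysis in R, and the specific R function and Ward variant used are not stated, so the code below should not be read as a reproduction of it. It implements the textbook Ward criterion, merging at each step the pair of clusters whose fusion gives the smallest increase in within-cluster sum of squares under Euclidean distance. The sample names and expression values are invented toy data.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def ward_cluster(samples, n_clusters):
    """Agglomerative clustering with Ward's criterion.

    samples: dict mapping sample name -> list of expression values.
    Returns a list of clusters, each a sorted list of sample names.
    """
    clusters = [[name] for name in samples]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                centre_a = centroid([samples[s] for s in a])
                centre_b = centroid([samples[s] for s in b])
                # Increase in within-cluster sum of squares if a and b are merged.
                cost = len(a) * len(b) / (len(a) + len(b)) * squared_euclidean(centre_a, centre_b)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


# Invented toy data: triplicate samples per cell line, three features each.
toy_samples = {
    "Caco2_1": [10.0, 5.0, 7.0],
    "Caco2_2": [10.2, 5.1, 6.9],
    "Caco2_3": [9.9, 4.8, 7.1],
    "HT29_1": [6.0, 9.0, 7.0],
    "HT29_2": [6.1, 9.2, 7.2],
    "HT29_3": [5.8, 8.9, 6.8],
}

print(ward_cluster(toy_samples, 2))
```

Used as a quality-control step, the expectation is simply that replicates of the same cell line fall into the same cluster, as they do for the toy data here.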
RESULTS
Data analysis methodology of combined omics/"Tri-omics" expression profiling approach
Figure 1 describes the data analysis strategy employed in this study. Differential expression (DE) profiling of HT-29 and Caco-2 cell line triplicate samples at the miRNA, protein and mRNA levels identified three separate datasets of 160 DE miRNAs, 168 DE proteins and 1795 DE mRNA transcripts (Stage 1: Supplementary Tables 1-3). Enrichment analysis against GO was conducted using the literature mining software Pathway Studio Web for the three resulting lists to determine if any biological processes were overrepresented (Stage 2: Supplementary Tables 4-6). The availability of all three data streams acquired in parallel was essential for the identification of our priority candidates - targets undergoing potential miRNA translational repression.
Table 1 Twenty-seven downregulated proteins with no DE mRNA targeted by 19 upregulated miRNAs.
| Uniprot | Symbol | Title | Protein linear FC | Protein adj. P value | miRBase ID | miRNA logFC | miRNA LinFC | miRNA AveExpr | miRNA P value | miRNA adj. P value |
| Q09666 | AHNAK | Neuroblast differentiation-associated protein AHNAK | 1.66 | 0.02036 | hsa-miR-30b | 1.39 | 2.62 | 10.38 | 5.04E-04 | 0.011296975 |
| | | | | | hsa-miR-30d | 1.36 | 2.56 | 11.05 | 1.17E-04 | 0.003668745 |
| | | | | | hsa-miR-372 | 12.21 | 4743.91 | 7.03 | 6.55E-11 | 2.27E-07 |
| | | | | | hsa-miR-373 | 12.12 | 4451.32 | 7.23 | 2.25E-11 | 1.88E-07 |
| P04075 | ALDOA | Fructose-bisphosphate aldolase A | 2.03 | 0.00001 | hsa-miR-122 | 5.85 | 57.63 | 5.04 | 3.71E-05 | 0.001621788 |
| P50995 | ANXA11 | Annexin A11 | 4.57 | 0.00012 | hsa-miR-182 | 1.09 | 2.13 | 12.01 | 2.48E-04 | 0.006341084 |
| Q14444 | CAPRIN1 | Caprin-1 | 1.42 | 0.00658 | hsa-miR-28-5p | 1.16 | 2.24 | 9.32 | 7.05E-05 | 0.002531521 |
| | | | | | hsa-miR-320a | 1.46 | 2.76 | 12.49 | 2.74E-05 | 0.001333363 |
| | | | | | hsa-miR-320b | 1.46 | 2.76 | 12.49 | 4.80E-05 | 0.001959043 |
| | | | | | hsa-miR-320c | 1.46 | 2.74 | 12.35 | 5.43E-05 | 0.002111768 |
| | | | | | hsa-miR-371-5p | 11.89 | 3799.61 | 7.79 | 1.04E-10 | 2.62E-07 |
| P23528 | CFL1 | Cofilin-1 | 1.53 | 0.00962 | hsa-miR-182 | 1.09 | 2.13 | 12.01 | 2.48E-04 | 0.006341084 |
| P62633 | CNBP | Cellular nucleic acid-binding protein | 1.79 | 0.00064 | hsa-miR-320a | 1.46 | 2.76 | 12.49 | 2.74E-05 | 0.001333363 |
| | | | | | hsa-miR-320b | 1.46 | 2.76 | 12.49 | 4.80E-05 | 0.001959043 |
| | | | | | hsa-miR-320c | 1.46 | 2.74 | 12.35 | 5.43E-05 | 0.002111768 |
| Q14247 | CTTN | Src substrate cortactin | 1.33 | 0.00716 | hsa-miR-182 | 1.09 | 2.13 | 12.01 | 2.48E-04 | 0.006341084 |
| P68104 | EEF1A1 | Elongation factor 1-alpha 1 | 1.46 | 0.01148 | hsa-miR-371-5p | 11.89 | 3799.61 | 7.79 | 1.04E-10 | 2.62E-07 |
| P06733 | ENO1 | Alpha-enolase | 2.06 | 0.00015 | hsa-miR-22 | 1.96 | 3.90 | 10.84 | 1.01E-04 | 0.003309338 |
| Q96AE4 | FUBP1 | Far upstream element-binding protein 1 | 2.17 | 0.00290 | hsa-miR-155 | 1.85 | 3.61 | 9.51 | 5.54E-05 | 0.002127503 |
| P22626 | HNRNPA2B1 | Heterogeneous nuclear ribonucleoproteins A2/B1 | 1.72 | 0.00053 | hsa-miR-371-5p | 11.89 | 3799.61 | 7.79 | 1.04E-10 | 2.62E-07 |
| Q00839 | HNRNPU | Heterogeneous nuclear ribonucleoprotein U | 4.51 | 0.00152 | hsa-miR-92b | 1.22 | 2.33 | 6.95 | 4.51E-04 | 0.010275397 |
| | | | | | hsa-miR-122 | 5.85 | 57.63 | 5.04 | 3.71E-05 | 0.001621788 |
| P31153 | MAT2A | S-adenosylmethionine synthase isoform type-2 | 1.92 | 0.00023 | hsa-miR-30b | 1.39 | 2.62 | 10.38 | 5.04E-04 | 0.011296975 |
| | | | | | hsa-miR-30d | 1.36 | 2.56 | 11.05 | 1.17E-04 | 0.003668745 |
| Q15233 | NONO | Non-POU domain-containing octamer-binding protein | 2.74 | 0.00522 | hsa-miR-320a | 1.46 | 2.76 | 12.49 | 2.74E-05 | 0.001333363 |
| | | | | | hsa-miR-320b | 1.46 | 2.76 | 12.49 | 4.80E-05 | 0.001959043 |
| | | | | | hsa-miR-320c | 1.46 | 2.74 | 12.35 | 5.43E-05 | 0.002111768 |
| P06748 | NPM1 | Nucleophosmin | 1.51 | 0.01869 | hsa-miR-182 | 1.09 | 2.13 | 12.01 | 2.48E-04 | 0.006341084 |
| P30101 | PDIA3 | Protein disulfide-isomerase A3 | 1.84 | 0.00465 | hsa-miR-155 | 1.85 | 3.61 | 9.51 | 5.54E-05 | 0.002127503 |
| P13667 | PDIA4 | Protein disulfide-isomerase A4 | 1.74 | 0.00006 | hsa-miR-378 | 1.35 | 2.55 | 11.31 | 7.79E-05 | 0.002720502 |
| | | | | | hsa-miR-378c | 1.17 | 2.25 | 10.21 | 5.74E-04 | 0.012576882 |
| | | | | | hsa-miR-378i | 0.85 | 1.80 | 9.00 | 7.98E-04 | 0.01627753 |
| O15212 | PFDN6 | Prefoldin subunit 6 | 2.26 | 0.02810 | hsa-miR-22 | 1.96 | 3.90 | 10.84 | 1.01E-04 | 0.003309338 |
| P14618 | PKM | Pyruvate kinase PKM | 1.91 | 0.00020 | hsa-miR-122 | 5.85 | 57.63 | 5.04 | 3.71E-05 | 0.001621788 |
| P62491 | RAB11A | Ras-related protein Rab-11A | 1.74 | 0.00318 | hsa-miR-138 | 5.84 | 57.23 | 4.36 | 6.08E-08 | 2.62E-05 |
| | | | | | hsa-miR-30b | 1.39 | 2.62 | 10.38 | 5.04E-04 | 0.011296975 |
| | | | | | hsa-miR-30d | 1.36 | 2.56 | 11.05 | 1.17E-04 | 0.003668745 |
| | | | | | hsa-miR-320a | 1.46 | 2.76 | 12.49 | 2.74E-05 | 0.001333363 |
| | | | | | hsa-miR-320b | 1.46 | 2.76 | 12.49 | 4.80E-05 | 0.001959043 |
| | | | | | hsa-miR-320c | 1.46 | 2.74 | 12.35 | 5.43E-05 | 0.002111768 |
| | | | | | hsa-miR-371-5p | 11.89 | 3799.61 | 7.79 | 1.04E-10 | 2.62E-07 |
| | | | | | hsa-miR-372 | 12.21 | 4743.91 | 7.03 | 6.55E-11 | 2.27E-07 |
| | | | | | hsa-miR-373 | 12.12 | 4451.32 | 7.23 | 2.25E-11 | 1.88E-07 |
| Q09028 | RBBP4 | Histone-binding protein RBBP4 | 1.76 | 0.00339 | hsa-miR-138 | 5.84 | 57.23 | 4.36 | 6.08E-08 | 2.62E-05 |
| | | | | | hsa-miR-138 | 5.84 | 57.23 | 4.36 | 6.08E-08 | 2.62E-05 |
| | | | | | hsa-miR-182 | 1.09 | 2.13 | 12.01 | 2.48E-04 | 0.006341084 |
| | | | | | hsa-miR-371-5p | 11.89 | 3799.61 | 7.79 | 1.04E-10 | 2.62E-07 |
| Q9UHD8 | SEPT9 | Septin-9 | 1.98 | 0.00016 | hsa-miR-182 | 1.09 | 2.13 | 12.01 | 2.48E-04 | 0.006341084 |
| Q8NC51 | SERBP1 | Plasminogen activator inhibitor 1 RNA-binding protein | | | | | | | | |
Figure 1 Overview of Tri-Omics data analysis approach.
Stage 1: Differential expression analysis of “Tri-Omics” data (miRNA, protein, mRNA) between HT-29 and Caco-2 cell lines (for a full list of results, including fold-changes and associated P value statistics, see Supplementary Tables 1-3). Stage 2: Enrichment analysis using Pathway Studio to determine overrepresented GO biological processes in the DE miRNA, protein and mRNA lists (see Supplementary Tables 4-6). Stage 3: Integration of three parallel datasets to generate 3 groups of interesting candidates: (1) Potential targets of miRNA translational repression (protein DE without mRNA change and targeted by anti-correlated miRNA): see Table 1 (Downregulated Proteins targeted by Upregulated miRNAs) and Table 2 (Upregulated Proteins targeted by Downregulated miRNAs); (2) Non-microRNA-mediated gene-protein expression (matching mRNA transcripts and proteins were DE, no corresponding anti-correlated microRNA data or results were below detection level): see Supplementary Table 7; and (3) Targets where control of differential protein expression was unclear (protein DE without mRNA or corresponding targeting microRNA change). Stage 4: TargetScan analysis of both groups against miRNAs DE in the opposite direction. Stage 5: Overlap analysis of Supplementary Tables 1-3 with published profiling studies on the same cell lines: see Table 3.
Table 2 Seven upregulated proteins with no DE mRNA targeted by 15 downregulated miRNAs.
| Uniprot | Symbol | Title | Protein linear FC | Protein adj. P value | miRBase ID | miRNA logFC | miRNA LinFC | miRNA AveExpr | miRNA P value | miRNA adj. P value |
| O43852 | CALU | Calumenin | 1.86 | 0.0162 | hsa-let-7a | -1.44 | -2.72 | 12.58 | 1.30E-04 | 0.00396 |
| | | | | | hsa-let-7b | -2.61 | -6.11 | 12.84 | 1.71E-06 | 0.00017 |
| | | | | | hsa-let-7c | -3.45 | -10.94 | 11.26 | 2.39E-06 | 0.00021 |
| | | | | | hsa-let-7d | -2.66 | -6.34 | 12.12 | 2.20E-06 | 0.00020 |
| | | | | | hsa-let-7e | -1.40 | -2.65 | 12.39 | 6.41E-05 | 0.00236 |
| | | | | | hsa-let-7f | -2.85 | -7.20 | 8.75 | 5.04E-06 | 0.00034 |
| | | | | | hsa-let-7g | -1.06 | -2.08 | 8.90 | 2.99E-04 | 0.00737 |
| | | | | | hsa-let-7i | -2.31 | -4.95 | 10.22 | 2.35E-06 | 0.00021 |
| | | | | | hsa-miR-29a | -1.63 | -3.09 | 9.69 | 1.14E-04 | 0.00361 |
| | | | | | hsa-miR-200a | -2.68 | -6.43 | 9.16 | 4.14E-04 | 0.00965 |
| | | | | | hsa-miR-200b | -2.54 | -5.81 | 10.40 | 2.63E-05 | 0.00129 |
| P20810 | CAST | Calpastatin | 2.39 | 0.00011 | hsa-miR-125a-3p | -3.14 | -8.83 | 5.84 | 3.26E-05 | 0.00150 |
| | | | | | hsa-miR-224 | -2.44 | -5.42 | 7.35 | 8.21E-04 | 0.01667 |
| | | | | | hsa-miR-375 | -4.06 | -16.67 | 8.29 | 1.50E-05 | 0.00083 |
| P21333 | FLNA | Filamin-A | 3.59 | 0.00048 | hsa-miR-125a-3p | -3.14 | -8.83 | 5.84 | 3.26E-05 | 0.00150 |
| P13797 | PLS3 | Plastin-3 | 5.88 | 0.00006 | hsa-miR-200b | -2.54 | -5.81 | 10.40 | 2.63E-05 | 0.00129 |
| P78330 | PSPH | Phosphoserine phosphatase | 2.22 | 0.0004 | hsa-miR-200b | -2.54 | -5.81 | 10.40 | 2.63E-05 | 0.00129 |
| P35241 | RDX | Radixin | 5.57 | 0.00012 | hsa-let-7a | -1.44 | -2.72 | 12.58 | 1.30E-04 | 0.00396 |
| | | | | | hsa-let-7b | -2.61 | -6.11 | 12.84 | 1.71E-06 | 0.00017 |
| | | | | | hsa-let-7c | -3.45 | -10.94 | 11.26 | 2.39E-06 | 0.00021 |
| | | | | | hsa-let-7d | -2.66 | -6.34 | 12.12 | 2.20E-06 | 0.00020 |
| | | | | | hsa-let-7e | -1.40 | -2.65 | 12.39 | 6.41E-05 | 0.00236 |
| | | | | | hsa-let-7f | -2.85 | -7.20 | 8.75 | 5.04E-06 | 0.00034 |
| | | | | | hsa-let-7g | -1.06 | -2.08 | 8.90 | 2.99E-04 | 0.00737 |
| | | | | | hsa-let-7i | -2.31 | -4.95 | 10.22 | 2.35E-06 | 0.00021 |
| | | | | | hsa-miR-31 | -1.94 | -3.83 | 12.79 | 7.91E-05 | 0.00274 |
| Q13885 | TUBB2A | Tubulin beta-2A chain | 1.46 | 0.00388 | hsa-miR-29a | -1.63 | -3.09 | 9.69 | 1.14E-04 | 0.00361 |
Table 3 Overlap of DE genes, proteins, microRNAs with other HT-29/Caco-2 profiling studies.
| Identifier | Symbol | Description/Family | LinFC | P value | Overlapping study | Dataset |
| hsa-miR-196a_st | hsa-miR-196a | miR-196abc | -5.06 | 8.53E-05 | 88 | microRNA |
| hsa-let-7a_st | hsa-let-7a | let-7/98/4458/4500 | -2.72 | 6.34E-04 | 57 | microRNA |
| hsa-miR-10a_st | hsa-miR-10a | miR-10abc/10a-5p | -46.76 | 7.40E-06 | 57 | microRNA |
| hsa-miR-98_st | hsa-miR-98 | miR-98 | -11.90 | 4.50E-06 | 57 | microRNA |
| P17931 | LGALS3 | Galectin-3 | 2.09 | 1.10E-03 | 89; 90 | Protein |
| P06576 | ATP5B | ATP synthase subunit beta, mitochondrial | 2.07 | 1.20E-03 | 89 | Protein |
| P27797 | CALR | Calreticulin | 1.62 | 3.35E-02 | 89 | Protein |
| P23528 | CFL1 | Cofilin-1 | 1.53 | 5.94E-03 | 89 | Protein |
| P07858 | CTSB | Cathepsin B | 5.15 | 5.36E-05 | 89 | Protein |
| P06733 | ENO1 | Alpha-enolase | 2.06 | 4.65E-03 | 89 | Protein |
| P22626 | HNRNPA2B1 | Heterogeneous nuclear ribonucleoproteins A2/B1 | 1.72 | 9.62E-03 | 89 | Protein |
| P61978 | HNRNPK | Heterogeneous nuclear ribonucleoprotein K | 1.38 | 4.38E-04 | 89 | Protein |
| P14866 | HNRNPL | Heterogeneous nuclear ribonucleoprotein L | 1.69 | 4.02E-03 | 89 | Protein |
| P05787 | KRT8 | Keratin, type II cytoskeletal 8 | 2.21 | 1.81E-03 | 89 | Protein |
| P07237 | P4HB | Protein disulfide-isomerase | 1.41 | 3.49E-03 | 89 | Protein |
| P30101 | PDIA3 | Protein disulfide-isomerase A3 | 1.84 | 7.49E-05 | 89 | Protein |
| P18669 | PGAM1 | Phosphoglycerate mutase 1 | 1.93 | 1.42E-04 | 89 | Protein |
| P12429 | ANXA3 | Annexin A3 | 2.53 | 1.55E-06 | 90 | Protein |
| P19338 | NCL | Nucleolin | 2.74 | 4.80E-03 | 90 | Protein |
| P06748 | NPM1 | Nucleophosmin | 1.51 | 1.15E-02 | 90 | Protein |
| Q00796 | SORD | Sorbitol dehydrogenase | 4.21 | 7.73E-05 | 90 | Protein |
| 16779311 | ATP7B | ATPase, Cu++ transporting, beta polypeptide | 3.60 | 1.57E-05 | 91; 92 | Gene |
| 16823109 | ABCA3 | ATP-binding cassette, sub-family A (ABC1), member 3 | 10.07 | 1.75E-05 | 91 | Gene |
| 17059491 | ABCB1 | ATP-binding cassette, sub-family B (MDR/TAP), member 1 | 36.64 | 2.14E-06 | | Gene |
| 16701957 | AKR1E2 | aldo-keto reductase family 1, member E2 | 4.56 | 1.09E-05 | | Gene |
| 17005368 | ALDH5A1 | aldehyde dehydrogenase 5 family, member A1 | 3.38 | 4.08E-06 | | Gene |
| 16794632 | ALDH6A1 | aldehyde dehydrogenase 6 family, member A1 | 2.27 | 7.54E-05 | | Gene |
| 17067102 | CDCA2 | carboxylesterase 1 | 2.32 | 2.28E-03 | | Gene |
| 16889762 | CYP20A1 | cytochrome P450, family 20, subfamily A, polypeptide 1 | 2.78 | 8.71E-04 | | Gene |
| 16891082 | CYP27A1 | cytochrome P450, family 27, subfamily A, polypeptide 1 | 20.68 | 2.05E-06 | | Gene |
| 17067284 | EPHX2 | epoxide hydrolase 2, cytoplasmic | 6.51 | 3.62E-06 | | Gene |
| 17020103 | GSTA4 | glutathione S-transferase alpha 4 | 2.25 | 4.04E-04 | | Gene |
| 16877728 | NCOA1 | nuclear receptor coactivator 1 | 2.99 | 7.90E-04 | | Gene |
| 16675638 | NR5A2 | nuclear receptor subfamily 5, group A, member 2 | 2.33 | 1.73E-03 | | Gene |
| 16959797 | RBP2 | retinol binding protein 2, cellular | 27.20 | 2.06E-06 | | Gene |
| 16780481 | SLC15A1 | solute carrier family 15 (oligopeptide transporter), member 1 | 20.05 | 1.95E-06 | | Gene |
| 16909257 | SLC19A3 | solute carrier family 19 (thiamine transporter), member 3 | 35.13 | 6.97E-08 | | Gene |
| 16984056 | SLC1A3 | solute carrier family 1 (glial high affinity glutamate transporter), member 3 | 50.97 | 1.62E-07 | | Gene |
| 16989018 | SLC22A5 | solute carrier family 22 (organic cation/carnitine transporter), member 5 | 2.21 | 5.49E-04 | | Gene |
| 16843078 | SLC6A4 | solute carrier family 6 (neurotransmitter transporter), member 4 | 9.33 | 1.33E-06 | | Gene |
| 16820398 | SLC7A6 | solute carrier family 7 (amino acid transporter light chain, y+L system), member 6 | 4.71 | 5.00E-06 | | Gene |
| 16790744 | SLC7A8 | solute carrier family 7 (amino acid transporter light chain, L system), member 8 | 11.99 | 4.99E-06 | | Gene |
| 16729064 | SLCO2B1 | solute carrier organic anion transporter family, member 2B1 | 10.15 | 8.38E-06 | | Gene |
| 16998551 | SLCO4C1 | solute carrier organic anion transporter family, member 4C1 | 11.95 | 2.59E-03 | | Gene |
| 16884050 | SULT1C4 | sulfotransferase family, cytosolic, 1C, member 4 | 15.89 | 2.49E-07 | | Gene |
| 16951567 | THRB | thyroid hormone receptor, beta | 4.38 | 1.90E-04 | | Gene |
| 17068541 | VDAC3 | voltage-dependent anion channel 3 | 2.28 | 3.11E-04 | | Gene |
Prior to target prediction against the DE miRNA list, we separated the miRNA, protein and mRNA targets into three groups of interest (Stage 3). “Group 1” contains those targets for which a degree of post-transcriptional regulation was observed (possibly via miRNA-mediated translational repression); such regulation may contribute to phenotypic differences between the two cell lines, and this group was considered the most interesting of those studied. The 34 proteins in Group 1 were DE between HT-29 and Caco-2; their respective mRNAs were expressed above the microarray detection threshold but showed no change in expression, and the proteins were predicted to be targeted by 34 DE microRNAs whose expression changed in the opposite direction. This group comprised 27 proteins downregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 19 unique anti-correlated/upregulated microRNAs (Table 1), and 7 proteins upregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 15 unique anti-correlated/downregulated microRNAs (Table 2).
The second group of candidates (referred to as “Group 2”) comprised DE protein targets where the matching mRNA transcript was DE in the same direction. These “Group 2” candidates follow the classical mRNA-protein expression control model (i.e., non-miR-based expression control) and are detailed in Supplementary Table 7.
The third group of candidates (“Group 3”; detailed in Supplementary Table 8) comprised DE protein targets where (1) the matching mRNA transcript was not DE in the same direction, (2) the mRNA probeset was under the detection threshold or (3) was not present on the chip, and (4) the microRNAs predicted to target these proteins (using TargetScan) were not identified as DE between the two cell lines. The role of miRNA in the control of expression of these “Group 3” candidates is therefore unclear, and these proteins are considered the lowest priority of the three groups studied for the purposes of the analysis presented here.
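The three-group triage described above can be sketched in a few lines of Python. This is a minimal illustration only, not the pipeline used in the study: the `ProteinRecord` structure, its field names and the status labels are hypothetical, and the Group 2 rule is simplified to "mRNA DE in the same direction as the protein".

```python
from dataclasses import dataclass


@dataclass
class ProteinRecord:
    # All field names and labels here are hypothetical, chosen for illustration.
    symbol: str
    protein_direction: str  # "up" or "down" in Caco-2 relative to HT-29
    mrna_status: str        # "up", "down", "unchanged", "below_threshold" or "absent"
    targeting_mirna_directions: tuple = ()  # directions of DE miRNAs predicted to target it


def assign_group(rec: ProteinRecord) -> int:
    """Return 1, 2 or 3 following the grouping logic described in the text."""
    opposite = "down" if rec.protein_direction == "up" else "up"
    anti_correlated = opposite in rec.targeting_mirna_directions
    # Group 1: protein DE, mRNA detected but unchanged, anti-correlated DE miRNA predicted.
    if rec.mrna_status == "unchanged" and anti_correlated:
        return 1
    # Group 2: mRNA DE in the same direction as the protein (classical control model).
    if rec.mrna_status == rec.protein_direction:
        return 2
    # Group 3: everything else, i.e. control of expression unclear.
    return 3
```

For instance, a protein that is down in Caco-2 with an unchanged transcript and at least one upregulated predicted miRNA falls into Group 1, whereas the same protein with a transcript below the detection threshold falls into Group 3.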
Combined omics/"Tri-omics" expression profiling identifies three interesting DE candidate groups that characterise the differences between HT-29 and Caco-2
Tri-omics expression profiling using the three methodologies outlined identified 160 DE miRNAs, 168 DE proteins and 1795 DE genes from a comparison of the HT-29 and Caco-2 cell lines.
Bioinformatics analysis of the microRNA profiling data (as outlined) on the HT-29 and Caco-2 cell lines identified 1550 probesets (772 upregulated, 778 downregulated) as differentially expressed (DE). These 1550 DE probesets were species-annotated to 221 human-specific probesets (140 upregulated, 71 downregulated) and further reduced to 160 DE mature non-star human miRNAs (104 upregulated, 56 downregulated) (Supplementary Table 1). Bioinformatics analysis (as outlined) of the proteomics profiling data identified a total of 168 annotated DE proteins (57 upregulated, 111 downregulated) between the HT-29 and Caco-2 cell lines (Supplementary Table 2). Bioinformatics analysis (as outlined) of the microarray profiling data identified 1795 probesets (1084 upregulated, 711 downregulated) as DE between the HT-29 and Caco-2 cell lines (Supplementary Table 3).
Two-way hierarchical clustering analysis (HCA), as outlined, on the HT-29 and Caco-2 cell line samples confirms that the DE (miRNA, proteomics, mRNA) candidates separate the samples into their respective replicate groups, indicating that the data are of sufficient quality to yield high-priority candidates for phenotypic characterisation between both cell lines (Figure 2). For (A) miRNA and (C) mRNA, red indicates diminished expression and green indicates increased expression for all three replicate groups, while for (B) proteomics, blue indicates diminished expression and red indicates increased expression for all three replicates in each group.
Figure 2 Bioinformatics QC analysis of HT-29 and Caco-2 replicates.
Two-way hierarchical clustering analysis (HCA) on the HT-29 and Caco-2 cell line replicates using (A) the DE miRNA probeset list (160 probesets), (B) the DE proteomics list (168 proteins) and (C) the DE mRNA probeset list (1795 probesets) confirms that the DE candidates separate the samples into their respective cell line replicate groups. For (A) and (C), red indicates diminished expression and green indicates increased expression for all three replicate groups; for (B), blue indicates diminished expression and red indicates increased expression for all three replicate groups.
A TargetScan prediction analysis for (A) post-transcriptionally downregulated (green) proteins and anti-correlated/upregulated (red) miRNAs and (B) post-transcriptionally upregulated (red) proteins and anti-correlated/downregulated (green) miRNAs is shown in Figure 3. The prioritisation of these targets was possible only through integration of the miRNA, protein and mRNA datasets.
Figure 3 Summary of TargetScan predicted miRNA targets.
A: TargetScan prediction analysis for post-transcriptionally downregulated (green) proteins and anti-correlated/upregulated (red) miRNAs; B: TargetScan prediction analysis for post-transcriptionally upregulated (red) proteins and anti-correlated/downregulated (green) miRNAs. The prioritisation of these targets was possible only through integration of the miRNA, protein and mRNA datasets.
GO enrichment analysis identifies overrepresented GO categories related to cell differentiation, actin assembly and cytoskeleton organization shared between DE protein and gene lists
GO biological processes found to be overrepresented within the DE protein list included gene expression/transcription, epigenetic mechanisms, DNA replication, differentiation and translation (Supplementary Table 4).
A similar analysis of the DE mRNAs identified cell adhesion, migration and ECM organization, cellular lipid and cholesterol metabolic processes, small molecule transport and a range of responses to external stimuli to be the most significantly-enriched GO categories from this list (Supplementary Table 5).
Fifty shared biological processes were enriched in both the DE protein and microarray datasets, including epithelial cell differentiation [P value ≤ 1.81613E-08 (protein list); P ≤ 0.000434311 (gene list)], actin filament bundle assembly [P value ≤ 0.001582797 (protein list); P ≤ 0.002733714 (gene list)], cell morphogenesis involved in differentiation [P value ≤ 0.003493544 (protein list); P ≤ 0.005671882 (gene list)] and cytoskeleton organization [P value ≤ 0.004078084 (protein list); P ≤ 0.005154161 (gene list)].
GO analysis using the 160 DE microRNA list of entities did not identify any enriched biological processes in the Pathway Studio Web database.
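To make the notion of an "overrepresented" GO category concrete, the sketch below computes a one-sided hypergeometric tail probability, a test commonly used for over-representation analysis. This is an assumption made for illustration: the text does not state which statistic Pathway Studio applies, and the function name and arguments are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) when `sample` entities are drawn without replacement from a
    `population` in which `annotated` entities carry the GO term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

Here `population` would be the number of detectable entities, `annotated` the number carrying a given GO term, `sample` the size of the DE list and `hits` the number of DE entities carrying the term; a small value indicates that the term occurs in the DE list more often than expected by chance.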
Integration of 3 parallel datasets identifies a priority group of 37 DE proteins and 44 associated miRs that may characterise phenotypic differences between HT-29 and Caco-2
The availability of all three levels of expression data (miRNA, protein, mRNA), acquired in parallel on identical cell line samples, was essential for the identification of our priority candidates for characterising the differences between the HT-29 and Caco-2 cell lines, namely protein targets that could undergo miRNA-mediated translational repression.
These potential targets of miRNA translational repression were identified as differentially-expressed (DE) proteins that could be targeted by a DE miRNA in the opposite direction (anti-correlated miRNA) and where the corresponding gene transcript was unchanged (protein DE without mRNA change and targeted by anti-correlated miRNA). From this 3-way overlap analysis, a total of 34 DE proteins and 34 DE microRNAs were identified to be of interest (Tables 1 and 2). Table 1 (Downregulated Proteins targeted by Upregulated miRNAs) displays the 27 proteins observed to be downregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 19 unique anti-correlated/upregulated microRNAs. Table 2 (Upregulated Proteins targeted by Downregulated miRNAs) contains the 7 proteins upregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 15 unique anti-correlated/downregulated microRNAs. Annotation information and associated statistical data for both DE proteins and microRNAs are included.
Integration of 3 parallel datasets identifies a group of 36 DE proteins and corresponding mRNA transcripts that may characterise phenotypic differences between HT-29 and Caco-2 that are not miR-mediated
The data-overlapping strategy similarly identified a group of 35 DE proteins (16 upregulated in Caco-2 relative to HT-29 and 19 downregulated in Caco-2 relative to HT-29) where the corresponding gene transcript was also DE in the same direction and where the protein was not predicted to be targeted by any of the DE microRNAs (or where the DE miRNA data results were below detection levels). These proteins and genes (listed together with their attendant annotation information and associated statistical results in Supplementary Table 6) were considered to follow the standard molecular model for gene-protein expression, without being miRNA-mediated (matching mRNA transcripts and proteins were DE (36), no corresponding anti-correlated miRNA data).
Integration of 3 parallel datasets identifies a group of 132 DE proteins where the control of differential protein expression was unclear
Finally, the data-overlapping strategy identified a list of 132 DE proteins (40 upregulated in Caco-2 relative to HT-29 and 92 downregulated in Caco-2 relative to HT-29) where the corresponding gene transcript was (1) not changing or probeset expression levels were below the detection threshold and (2) were not predicted to be targeted by any of the DE microRNAs, or where the DE miRNA data results were below detection levels. As a result, the control of expression of these proteins (listed together with their attendant annotation information and associated statistical results in Supplementary Table 7) is currently unclear (protein DE without mRNA or corresponding targeting miRNA change).
DISCUSSION
Data analysis methodology identifies unique targets involved in actin organisation
The utilisation of three expression profiling technologies (miRNA, proteomics, mRNA) in parallel as part of an integrative methodology on replicate HT-29 and Caco-2 cell lines provides a unique opportunity for characterising these two cell lines, complementing and adding to the previous literature. A total of 37 differentially-expressed proteins and targeting miRNAs were identified as a result of this approach, which has (to our knowledge) never before been employed to analyse these two key intestinal cell line models and which will assist in characterising miRNA/protein regulatory networks that may contribute to their respective phenotypic differences.
The key aspect of the experimental design is the combination of data from multiple expression profiling methods and in-silico prediction to study potential miRNA-protein networks that may give rise to differences between the two cell lines. We have previously shown that the use of such combined data can address some of the disadvantages of solitary profiling methodologies, while also allowing us to identify potential interacting miRNA-protein networks that would not have been identified by a single dataset[31]. As a result, we have prioritised our discussion of these potential interacting networks, whose identification is possible only via the integration of the “tri-omics” (miRNA, protein, mRNA) datasets and which may undergo classical miRNA-mediated translational repression when comparing the mucin-producing HT-29 cell phenotype to the more absorptive Caco-2 cell phenotype.
When differing gene- and protein-expression profiles between the cell lines were compared, the key areas of common ontological enrichment between the larger, non-overlapped DE gene and protein lists were in cytoskeleton organization and actin filament bundle assembly, possibly pointing to differences in the shape, support and movement of the cells. We have focused our discussion on a selection of these DE protein and miRNA targets that have demonstrated potential roles in cytoskeleton organization and biogenesis.
Cofilin-1 (CFL1), which was downregulated at the protein level in the Caco-2 cell line relative to HT-29 and is targeted by the upregulated miR-182 (Table 1), is part of the ADF/cofilin family of actin-binding proteins, which disassemble actin filaments. The role of CFL1 in positively regulating actin filament depolymerisation and severing, as well as in cytoskeleton organization, function and biogenesis, has been extensively reviewed[34-36]. It has been previously reported that CFL1 levels increase transiently in Caco-2 cells during early differentiation and that this may be linked to the changes required in the actin cytoskeleton[37]. Additionally, several groups have reported that CFL1 positively regulates lamellipodium formation and cytoskeletal actin projections on the leading edge of highly mobile cells[38-44], although a recent study[45] indicated that it may both inhibit and promote lamellipodium formation. Loss of Cofilin-1 has also been shown to hinder apical constriction in mouse endothelial cells[46].
Also targeted by the upregulated miR-182 was the protein Cortactin (CTTN), which was downregulated in Caco-2 relative to HT-29 (Table 1). Cortactin is a monomeric protein located in the cytoplasm of cells that can be activated by external stimuli to promote actin filament branching, rearrangement of the actin cytoskeleton[47,48] and lamellipodium formation[49,50].
Proteins that were upregulated (in the Caco-2 cell line relative to HT-29) included the actin binding protein Radixin (upregulated 5.57 fold) which was targeted by several DE members of the let-7 miRNA family, as well as miR-31 (Table 2). Radixin is one of three members of the ERM family of proteins, which have a number of roles in the stabilisation and regulation of the membrane cytoskeleton interface[51]. Radixin has been found to be important in the linking of ABCC2 (MRP2) and P-glycoprotein to the actin cytoskeleton in Caco-2 and murine small intestinal cells respectively[52,53].
In addition to roles involving the actin cytoskeleton, the group of 37 proteins also includes those with metabolic functions, such as α-enolase (ENO1), which catalyses the conversion of 2-phosphoglycerate to phosphoenolpyruvate in glycolysis, is expressed at a 2.06 fold lower level in Caco-2 cells relative to HT-29 cells and is potentially targeted by miR-22 (Table 1). Expression of α-enolase has previously been reported to increase and then decrease in early differentiating Caco-2 cells, and to be decreased in formula-fed preterm pigs developing necrotizing enterocolitis[37,54]. NONO (2.74 fold lower in Caco-2 vs HT-29, Table 1) is involved in regulating androgen receptor and carbonic anhydrase activity[55,56].
Identification of differentially expressed miRNA
The profiling of miRNA expression in Caco-2 compared to HT-29 found 160 miRNAs to be differentially expressed, a number of which (for example miR-152, miR-99b, miR-125a-5p, miR-10a, miR-196b and miR-222) have previously been associated with intestinal differentiation, function and immune response in a number of species[19-21,57-59]. Among the top 10 upregulated miRNAs in Caco-2 compared to HT-29, an association with cell viability and proliferation has previously been reported for several, for example miR-372 (4743.91 fold up), miR-373 (4451.32 fold up) and miR-122 (57.63 fold up)[60-62]. Similarly, among the top 10 downregulated miRNAs in Caco-2 compared to HT-29, there is again an association with cell viability and proliferation, for example for miR-10a (46.76 fold down) and miR-935 (13.38 fold down)[63,64].
Identification of differentially expressed mRNA
A total of 1795 mRNA transcripts were identified as DE between HT-29 and Caco-2 cells, comprising 1084 upregulated and 711 downregulated transcripts (Supplementary Table 3).
GO analysis found that biological processes linked to lipid metabolism, extracellular matrix organisation, cell adhesion and the cytoskeleton were over-represented. This enrichment for processes linked to the cytoskeleton can be seen in a number of targets, both down- and upregulated.
For example, Moesin is one of the most strongly downregulated transcripts (173 fold) in Caco-2 compared to HT-29 (Supplementary Table 3). In addition to being downregulated at the transcript level, it was also downregulated at the protein level (6.48 fold) in Caco-2 compared to HT-29 (Supplementary Table 2). Moesin is a member of the ERM family of proteins and is involved in regulating actin filament depolymerisation along with apical junction assembly and focal adhesion assembly[51]. Moesin often shows different cellular distribution and expression from the other two ERM proteins, radixin and ezrin[65]. It has been previously reported that it is not expressed in Caco-2 or in normal colorectal mucosa epithelial cells[52,66].
Similarly, Galectin 4 (LGALS4), whose transcript was downregulated in Caco-2 compared to HT-29 (109 fold, Supplementary Table 3), has as one of its principal functions the improved stabilisation of apical membrane rafts and the trafficking of apical cell membrane[67,68]. Variable staining for Galectin 4 has been reported in gut villi, with the upper 1/4 of the villus showing negative staining[69]. Galectin 4 is expressed after confluence in Caco-2 cells along with the formation of the brush border[68]. Galectin 4 has been linked to the segregation of glycoproteins for apical secretion in HT-29 cells[68].
The Ca2+-binding protein Calbindin 2 (CALB2), also known as Calretinin, was downregulated 102 fold in Caco-2 compared to HT-29 cells (Supplementary Table 3). The exact role of calretinin in intestinal epithelia is unknown, but it is believed to have roles in the stabilisation of cytokeratins and microtubules, differentiation status and butyrate response[70-72]. The expression profile obtained is in agreement with previous studies, which report expression in undifferentiated colon adenocarcinoma cell lines (e.g., HT-29) and no expression in normal differentiated colon epithelial cells or in more differentiated cell lines (e.g., Caco-2)[71].
Among the transcripts that were upregulated in Caco-2 relative to HT-29, a number of targets were again linked to cytoskeleton function, for example Fibronectin 1 (FN1), whose mRNA was upregulated 441 fold in the Caco-2 cell line relative to HT-29 (Supplementary Table 3). In addition to having an elevated transcript level, fibronectin was upregulated 5.15 fold at the protein level in Caco-2 relative to HT-29. Fibronectin is a glycoprotein of the extracellular matrix that binds to membrane-spanning receptor integrins and plays a major role in cell adhesion and actin cytoskeleton organisation[73]. Previously, the differentiation of Caco-2 cells has been associated with a downregulation of fibronectin[74,75], but our studies were on proliferating cells.
Another upregulated transcript with an associated cytoskeleton function was BicC1 (upregulated 206.75 fold in the Caco-2 cell line relative to HT-29, Supplementary Table 3). BicC1 is believed to bind mRNA and regulate its translation via a number of mechanisms, such as clustering of target mRNAs or acting as a chaperone which recruits specific miRNA precursors and Dicer to target mRNAs and subsequently transfers these to AGO for silencing[76,77]. Its expression has been linked positively to the formation of E-cadherin adherens junctions and cortical actin distribution[78].
Also upregulated (192.74 fold) at the transcript level was LRP2, also known as Megalin, a member of the low density lipoprotein receptor family (Supplementary Table 3). LRP2/Megalin has been found to be expressed on the apical surface of a number of epithelial cells, including the brush border membrane of Caco-2 cells, and to have upregulated expression in the ileum of suckling rats[79-81].
Identification of differentially expressed protein
A total of 168 proteins were differentially expressed between Caco-2 and HT-29 (57 were upregulated in Caco-2 and 111 were upregulated in HT-29). Using Pathway Studio to generate GO biological processes for the DE proteins (Supplemental Table 1), it was found that there was an over-representation of processes linked to translation, carbohydrate metabolism and actin cytoskeleton organisation. For example, Insulin-like growth factor 2 mRNA binding protein 1 (IGF2BP1, IMP1) was upregulated 51.58 fold in Caco-2 relative to HT-29. The expression of IGF2BP1 has previously been linked to morphogenesis of the small intestine[82].
Interestingly, the expression of Insulin-like growth factor 2 mRNA binding protein has been linked to SERPINH1, a 47 kDa stress protein also called Heat Shock Protein 47 (HSP47). HSP47 is upregulated 9.11 fold in this study in Caco-2 relative to HT-29 (Supplementary Table 2). HSP47 is localised to the endoplasmic reticulum, where it is a principal chaperone in the collagen biosynthesis pathway. In addition to HSP47’s role as a collagen chaperone, it has been suggested that it plays an important role in reorganising the actin cytoskeleton in Caco-2, increasing permeability[83].
The protein showing the largest downregulation (91.89 fold) in Caco-2 relative to HT-29 was the secreted glycoprotein Galectin-3-binding protein (also known as Mac2-binding protein) (Supplementary Table 2). Galectin-3-binding protein has been linked to the innate immune response[84,85].
The second most downregulated protein in Caco-2 relative to HT-29 was Annexin A13 (downregulated 42.38 fold, Supplementary Table 2). Annexin A13 is expressed in the small intestine and is involved in the localisation of proteins to the apical membrane of cells[86,87].
Overlap comparison of individual datasets with already-published studies identifies several common DE gene, protein and miRNA candidates
In order to determine the comparability of our analysis of these cell line models to other published studies on the same cell lines, we overlapped each of our three individual DE datasets (gene, protein and miRNA) with two published studies that each combined both cell lines (HT-29 and Caco-2) in the one analysis.
For the miRNA lists, we utilised two studies examining the role of miRNAs in colorectal cancer metastasis in the HT-29 and Caco-2 cell lines: Ma et al[88] (2011), which reported a DE list of just 8 microRNAs, and Zhang et al[57] (2012), which disclosed a DE list of 16 microRNAs. Despite the shortness of these previously published lists, we identified the hsa-miR-196a microRNA as commonly DE in both our study and that of Ma et al[88] (2011). Additionally, overlapping our 160 microRNAs with the 16 microRNAs identified by Zhang et al[57] (2012) identified a further 3 microRNAs (hsa-let-7a, hsa-miR-10a and hsa-miR-98) which were commonly DE between HT-29 and Caco-2 in both studies (see Table 3)[88].
At the protein level, we incorporated two proteomics profiling studies combining HT-29 and Caco-2: one utilising two-dimensional gel electrophoresis and a newer study incorporating LC-MS/MS[89,90]. Overlap analysis of our DE protein dataset with these two studies identified only one protein, LGALS3 (Galectin-3), as DE between all three studies (Table 3). Individual overlaps with each of the two studies separately identified an additional four proteins [ANXA3 (Annexin A3); NCL (Nucleolin); NPM1 (Nucleophosmin) and SORD (Sorbitol dehydrogenase)] commonly DE between our study and that of Uzozie et al[90] (2014), with a further 12 proteins [ATP5B (ATP synthase subunit beta, mitochondrial); CALR (Calreticulin); CFL1 (Cofilin-1); CTSB (Cathepsin B); ENO1 (Alpha-enolase); HNRNPA2B1 (Heterogeneous nuclear ribonucleoproteins A2/B1); HNRNPL (Heterogeneous nuclear ribonucleoprotein L); HNRNPK (Heterogeneous nuclear ribonucleoprotein K); KRT8 (Keratin, type II cytoskeletal 8); P4HB (Protein disulfide-isomerase); PDIA3 (Protein disulfide-isomerase A3) and PGAM1 (Phosphoglycerate mutase 1)] commonly DE between Lenaerts et al[89] (2007) and our study (Table 3)[90].
To overlap our DE genes, we utilised a study examining the expression of 377 xenobiotic metabolism genes (incl. xenobiotic-metabolizing enzymes, transporters, and nuclear receptors and transcription factors)[91], as well as a newer study examining copper homeostasis and distribution, which utilised the same microarray chip that was used in this study[92] but only disclosed data on 33 DE genes. Overlap analysis of our DE gene dataset with these two studies identified only one gene, the ATP7B (ATPase, Cu2+ transporting, beta polypeptide) transporter, as DE between all three studies (Table 3). Overlapping our dataset with only that of Bourgine et al[91] (2012) identified an additional 25 genes (ABCA3, ABCB1, AKR1E2, ALDH5A1, ALDH6A1, CDCA2, CYP20A1, CYP27A1, EPHX2, GSTA4, NCOA1, NR5A2, RBP2, SLC15A1, SLC19A3, SLC1A3, SLC22A5, SLC6A4, SLC7A6, SLC7A8, SLCO2B1, SLCO4C1, SULT1C4, THRB and VDAC3) commonly DE between both cell lines[91]. The greater degree of overlap identified was most likely due to the roughly 10-fold greater number of DE genes disclosed (377) by Bourgine et al[91] (2012), compared to the 33 genes disclosed by Barresi et al[92] (2016).
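The overlap comparisons described above reduce to set intersections on gene, protein or miRNA identifiers. The sketch below is illustrative only: the function name and the placeholder identifiers are hypothetical and do not reproduce the published lists.

```python
def overlap_report(ours, study_a, study_b):
    """Split our DE identifiers into those shared with both published studies
    and those shared with only one of them."""
    ours, study_a, study_b = set(ours), set(study_a), set(study_b)
    return {
        "all_three": sorted(ours & study_a & study_b),
        "ours_and_a_only": sorted((ours & study_a) - study_b),
        "ours_and_b_only": sorted((ours & study_b) - study_a),
    }


# Placeholder identifiers, not real data from any of the studies.
example = overlap_report(
    ours=["GENE1", "GENE2", "GENE3", "GENE4"],
    study_a=["GENE1", "GENE2", "GENE9"],
    study_b=["GENE1", "GENE3"],
)
```

Reporting the three-way overlap separately from the pairwise "only" buckets mirrors the way the text counts one candidate common to all three studies and then the additional candidates shared with each individual study.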
DE proteins targeted by microRNAs in other studies
In this study, we propose that several of the proteins described here as DE between the HT-29 and Caco-2 cell lines may be targeted by anti-correlated microRNAs (also DE here between these cell lines), and several of these protein-miRNA regulatory interactions have been demonstrated in other cell systems. For example, microRNA-148a has been shown to inhibit the proliferation and promote the paclitaxel-induced apoptosis of ovarian cancer cells by targeting PDIA3[93]. Additionally, YWHAZ (14-3-3 zeta) was confirmed to be a target of miR-302d by proteomic comparison of human embryonic stem (ES) cells and their differentiated T3DF fibroblasts[94], while LASP1 has been shown to be directly regulated by miR-145 in bladder cancer cell lines[95-97].
In conclusion, the intestinal cell lines Caco-2 and HT-29 are commonly used for the creation of in vitro models of the intestine. However, the use of such cell lines to model the intestine requires that they are fully characterised, which will require a better understanding of the molecular controls that these cells exhibit. One such molecular control is the miRNA-mediated control of translation. In this study, by performing a tri-omics analysis (genes, proteomics, miRNAs) on parallel data sets generated from these two cell lines, potential miRNA-protein networks were identified that would not have been immediately apparent otherwise. An advantage of such an approach was that the potential biological noise of the study was reduced by running identical samples in parallel to generate each of the profiles. This has the effect of reducing the potential false-negative and false-positive rates of the in silico prediction and yields a high-priority list of candidates for functional validation. It is interesting to note that a large number of targets were associated with the actin cytoskeleton and its associated cell processes such as motility, polarisation, endocytosis and cell division. Most of the identified proteins in Tables 1 and 2 have important roles in the establishment of apical-basal membrane polarity and the formation of the apical microvilli brush border in enterocytes. The data in this study substantially expand the lists of differentially expressed miRNAs, proteins and mRNAs between the Caco-2 and HT-29 cell lines. Furthermore, it is to our knowledge the first to provide a “tri-omics” analysis (genes, proteomics, miRNAs) on the Caco-2 and HT-29 cell lines in combination. It uses the availability of data on multiple levels to identify potential interactions that would not have been identified by a single dataset, allowing us to provide new information for the characterisation of these two cell lines, which are important in intestinal modelling, and to help focus on the most likely miRNA candidates for further analysis.
COMMENTS
Background
The development of suitable in vitro models of the intestine is of great interest to the food and pharmaceutical industries. Two commonly used cell lines for the generation of such in vitro models are Caco-2 and HT-29. The use of “-omics” studies (transcriptomics, proteomics and metabolomics) has provided insights into how in vitro models work in comparison to in vivo scenarios and how to improve them. However, the complexity involved in identifying miRNA targets still remains a significant challenge to researchers. This study addresses that complexity by using a “tri-omics” approach to identify targets.
Research frontiers
The development and maturation of “-omics” technologies has revolutionised our understanding of biological processes. However, while there have been previous studies investigating the differences between these cell lines, these have focused on only one aspect at a time.
Innovations and breakthroughs
This study is the first to use an integrated “tri-omics” analysis (mRNA, proteomics, miRNAs) on Caco-2 and HT-29 to identify potential miRNA-protein networks that would not have been immediately apparent otherwise. These networks contain a large number of targets associated with the actin cytoskeleton and its associated cell processes such as motility, polarisation, and endocytosis.
Applications
The use of an integrated “tri-omics” analysis will support a better understanding of the molecular events underpinning the different biological behaviours of these two cell lines, which are so important to pharmacological and nutritional research.
Terminology
Ward’s method is a robust clustering method that defines the proximity between two clusters as the increase in squared error that results when the two clusters are merged.
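As a minimal illustration of this definition, the sketch below computes the merge cost for two small clusters directly, as the increase in the within-cluster sum of squared errors, and checks it against the equivalent closed form, |A||B|/(|A|+|B|) multiplied by the squared distance between the two centroids. The function names and the toy points are hypothetical and are not taken from the study.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)


def ward_cost(a, b):
    """Increase in total squared error caused by merging clusters a and b."""
    return sse(a + b) - sse(a) - sse(b)


def ward_cost_closed_form(a, b):
    """Equivalent closed form based on cluster sizes and centroid distance."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


# Toy clusters in two dimensions (hypothetical values).
cluster_a = [[0.0, 0.0], [2.0, 0.0]]
cluster_b = [[4.0, 0.0]]
```

At each step of agglomerative clustering with Ward's criterion, the pair of clusters with the smallest such cost is merged.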
Peer-review
This study is well elaborated, uses innovative methodology and supports organoid studies aiming at greater knowledge of enteric epithelial cells.
Footnotes
Manuscript source: Unsolicited manuscript
Specialty type: Gastroenterology and hepatology
Country of origin: Ireland
Peer-review report classification
Grade A (Excellent): A
Grade B (Very good): 0
Grade C (Good): C
Grade D (Fair): D
Grade E (Poor): 0
P- Reviewer: Gassler N, Rajendran V, Sipahi AM S- Editor: Ma YJ L- Editor: A E- Editor: Huang Y
Nakamura Y, Yogosawa S, Izutani Y, Watanabe H, Otsuji E, Sakai T. A combination of indol-3-carbinol and genistein synergistically induces apoptosis in human colon cancer HT-29 cells by inhibiting Akt phosphorylation and progression of autophagy. Mol Cancer. 2009;8:100.
Fogh J, Wright WC, Loveless JD. Absence of HeLa cell contamination in 169 cell lines derived from human tumors. J Natl Cancer Inst. 1977;58:209-214.
Rousset M, Laburthe M, Pinto M, Chevalier G, Rouyer-Fessard C, Dussaulx E, Trugnan G, Boige N, Brun JL, Zweibaum A. Enterocytic differentiation and glucose utilization in the human colon tumor cell line Caco-2: modulation by forskolin. J Cell Physiol. 1985;123:377-385.
Maoret JJ, Font J, Augeron C, Codogno P, Bauvy C, Aubery M, Laboisse CL. A mucus-secreting human colonic cancer cell line. Purification and partial characterization of the secreted mucins. Biochem J. 1989;258:793-799.
Augeron C, Laboisse CL. Emergence of permanently differentiated cell clones in a human colonic cancer cell line in culture after treatment with sodium butyrate. Cancer Res. 1984;44:3961-3969.
Lesuffleur T, Barbat A, Dussaulx E, Zweibaum A. Growth adaptation to methotrexate of HT-29 human colon carcinoma cells is associated with their ability to differentiate into columnar absorptive and mucus-secreting cells. Cancer Res. 1990;50:6334-6343.
Chastre E, Emami S, Rosselin G, Gespach C. Vasoactive intestinal peptide receptor activity and specificity during enterocyte-like differentiation and retrodifferentiation of the human colonic cancerous subclone HT29-18. FEBS Lett. 1985;188:197-204.
Mathonnet G, Fabian MR, Svitkin YV, Parsyan A, Huck L, Murata T, Biffo S, Merrick WC, Darzynkiewicz E, Pillai RS. MicroRNA inhibition of translation initiation in vitro by targeting the cap-binding complex eIF4F. Science. 2007;317:1764-1767.
Clarke C, Henry M, Doolan P, Kelly S, Aherne S, Sanchez N, Kelly P, Kinsella P, Breen L, Madden SF. Integrated miRNA, mRNA and protein expression analysis reveals the role of post-transcriptional regulation in controlling CHO cell growth rate. BMC Genomics. 2012;13:656.
Nikitin A, Egorov S, Daraselia N, Mazo I. Pathway studio--the analysis and navigation of molecular networks. Bioinformatics. 2003;19:2155-2157.
Weed SA, Karginov AV, Schafer DA, Weaver AM, Kinley AW, Cooper JA, Parsons JT. Cortactin localization to sites of actin assembly in lamellipodia requires interactions with F-actin and the Arp2/3 complex. J Cell Biol. 2000;151:29-40.
Kinley AW, Weed SA, Weaver AM, Karginov AV, Bissonette E, Cooper JA, Parsons JT. Cortactin interacts with WIP in regulating Arp2/3 activation and membrane protrusion. Curr Biol. 2003;13:384-393.
Arpin M, Chirivino D, Naba A, Zwaenepoel I. Emerging role for ERM proteins in cell adhesion and migration. Cell Adh Migr. 2011;5:199-206.
Jiang P, Siggers JL, Ngai HH, Sit WH, Sangild PT, Wan JM. The small intestine proteome is changed in preterm pigs developing necrotizing enterocolitis in response to formula feeding. J Nutr. 2008;138:1895-1901.
Kuwahara S, Ikei A, Taguchi Y, Tabuchi Y, Fujimoto N, Obinata M, Uesugi S, Kurihara Y. PSPC1, NONO, and SFPQ are expressed in mouse Sertoli cells and may function as coregulators of androgen receptor-mediated transcription. Biol Reprod. 2006;75:352-359.
Kanaan Z, Rai SN, Eichenberger MR, Barnes C, Dworkin AM, Weller C, Cohen E, Roberts H, Keskey B, Petras RE. Differential microRNA expression tracks neoplastic progression in inflammatory bowel disease-associated colorectal cancer. Hum Mutat. 2012;33:551-560.
Stadthagen G, Tehler D, Høyland-Kroghsbo NM, Wen J, Krogh A, Jensen KT, Santoni-Rugiu E, Engelholm LH, Lund AH. Loss of miR-10a activates Lpo and collaborates with activated Wnt signaling in inducing intestinal neoplasia in female mice. PLoS Genet. 2013;9:e1003913.
Berryman M, Franck Z, Bretscher A. Ezrin is concentrated in the apical microvilli of a wide variety of epithelial cells whereas moesin is found primarily in endothelial cells. J Cell Sci. 1993;105:1025-1043.
Cargnello R, Celio MR, Schwaller B, Gotzos V. Change of calretinin expression in the human colon adenocarcinoma cell line HT29 after differentiation. Biochim Biophys Acta. 1996;1313:201-208.
Chen C, Chi H, Sun BG, Sun L. The galectin-3-binding protein of Cynoglossus semilaevis is a secreted protein of the innate immune system that binds a wide range of bacteria and is involved in host phagocytosis. Dev Comp Immunol. 2013;39:399-408.
Uzozie A, Nanni P, Staiano T, Grossmann J, Barkow-Oesterreicher S, Shay JW, Tiwari A, Buffoli F, Laczko E, Marra G. Sorbitol dehydrogenase overexpression and other aspects of dysregulated protein expression in human precancerous colorectal neoplasms: a quantitative proteomics study. Mol Cell Proteomics. 2014;13:1198-1218.
Bourgine J, Billaut-Laden I, Happillon M, Lo-Guidice JM, Maunoury V, Imbenotte M, Broly F. Gene expression profiling of systems involved in the metabolism and the disposition of xenobiotics: comparison between human intestinal biopsy samples and colon cell lines. Drug Metab Dispos. 2012;40:694-705.