Clinical and Translational Research
Copyright ©The Author(s) 2023. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastrointest Oncol. Jul 15, 2023; 15(7): 1215-1226
Published online Jul 15, 2023. doi: 10.4251/wjgo.v15.i7.1215
Integrated analysis of single-cell and bulk RNA-seq establishes a novel signature for prediction in gastric cancer
Fei Wen, Xin Guan, Hai-Xia Qu, Xiang-Jun Jiang
Fei Wen, Qingdao University, Medical College, Qingdao 266000, Shandong Province, China
Xin Guan, Hai-Xia Qu, Xiang-Jun Jiang, Department of Gastroenterology, Qingdao Municipal Hospital, Qingdao 266071, Shandong Province, China
Author contributions: Jiang XJ designed and coordinated the study; Wen F, Qu HX, and Guan X performed data collection and analysis; Wen F interpreted the data and wrote the manuscript; All authors approved the final version of the article.
Institutional review board statement: Given that our article is based on a study of sequencing data in the public database, GEO, there are no ethical issues involved, so the institutional review board approval form or document and institutional animal care and use committee approval form or document are not applicable.
Conflict-of-interest statement: All the authors report having no relevant conflicts of interest for this article.
Data sharing statement: No additional data are available.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Xiang-Jun Jiang, PhD, Doctor, Department of Gastroenterology, Qingdao Municipal Hospital, No. 1 Jiaozhou Road, Qingdao 266071, Shandong Province, China. drjxj@163.com
Received: February 1, 2023
Peer-review started: February 1, 2023
First decision: March 21, 2023
Revised: March 31, 2023
Accepted: May 8, 2023
Article in press: May 8, 2023
Published online: July 15, 2023
Processing time: 161 Days and 2.5 Hours
ARTICLE HIGHLIGHTS
Research background

Improving early diagnosis rates of gastric cancer (GC) is of great importance for reducing GC-related deaths. This study aimed to construct a predictive model for GC by integrating single-cell sequencing data and bulk RNA sequencing (bulk RNA-seq) data to identify potential targets for GC prediction.

Research motivation

Identifying predictive targets for GC is an important approach to reducing GC-related deaths, and this need motivated the present study.

Research objectives

The objective of this study was to develop a predictive model for GC by combining single-cell sequencing data and bulk RNA-seq data and to identify potential targets for predicting GC.

Research methods

We downloaded GC single-cell sequencing and bulk RNA-seq datasets from the Gene Expression Omnibus and University of California at Santa Cruz databases. The single-cell sequencing data were analyzed using the Seurat package, and the bulk RNA-seq data were analyzed using the limma package. The GC prediction model was constructed using the least absolute shrinkage and selection operator (LASSO) and random forest methods. Survival analysis was conducted using the KM-PLOTTER online database.
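To illustrate how a LASSO-type classifier selects genes, the following is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent. It is not the implementation used in the study, which is not described here; the function names, the toy expression values, the gene labels, and the penalty strength are all hypothetical and chosen only for illustration.

```python
import math


def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero; this step produces exact zeros."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lasso_logistic(X, y, penalty, step=0.1, iterations=2000):
    """Fit logistic regression with an L1 penalty on the weights.

    Each iteration takes one gradient step on the mean log-loss and then
    applies soft-thresholding to the weights (the intercept is unpenalized).
    """
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    intercept = 0.0
    for _ in range(iterations):
        # Residuals (predicted probability minus label) at the current point.
        residuals = [
            sigmoid(intercept + sum(w * x for w, x in zip(weights, row))) - label
            for row, label in zip(X, y)
        ]
        intercept -= step * sum(residuals) / n
        for j in range(p):
            grad_j = sum(r * row[j] for r, row in zip(residuals, X)) / n
            weights[j] = soft_threshold(weights[j] - step * grad_j, step * penalty)
    return intercept, weights


# Hypothetical toy data: column 0 separates tumor (1) from normal (0),
# column 1 carries no class information.
X = [
    [2.0, 0.1], [1.5, -0.1], [1.8, 0.1], [1.6, -0.1],
    [-1.7, 0.1], [-2.0, -0.1], [-1.5, 0.1], [-1.9, -0.1],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

intercept, weights = fit_lasso_logistic(X, y, penalty=0.1)
selected = [j for j, w in enumerate(weights) if w != 0.0]
```

In this toy setting the penalty drives the weight of the uninformative column to exactly zero, so `selected` contains only the informative column. This is the property that makes LASSO useful for narrowing a list of differentially expressed genes down to a compact signature.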

Research results

By analyzing single-cell RNA sequencing data from 70707 cells from GC tissue, normal gastric tissue, and chronic gastric tissue, we identified 10 different cell types and screened for genes differentially expressed between GC and normal epithelial cells. After identifying differentially expressed genes from bulk RNA sequencing data of GC and normal gastric samples, we constructed a GC prediction classifier using LASSO and random forest methods. The LASSO classifier performed well both in validation and when verified using The Cancer Genome Atlas and Genotype-Tissue Expression datasets [area under the curve (AUC)_min = 0.988, AUC_1se = 0.994], and the random forest model also achieved good results with the validation set (AUC = 0.92). Genes including TIMP1, PLOD3, CKS2, TYMP, TNFRSF10B, CPNE1, GDF15, BCAP31, and CLDN7 had significant importance in multiple GC prediction models, and KM-PLOTTER analysis showed their relevance to GC prognosis, indicating their potential value in GC diagnosis and treatment. A limitation of our study is the lack of clinical sample validation for the GC prediction models.
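For readers less familiar with the AUC metric reported above, the following minimal sketch computes it from its pairwise definition: the fraction of tumor/normal sample pairs in which the tumor sample receives the higher classifier score, with ties counted as half. The labels and scores are made-up toy values, not data from this study, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive (e.g. tumor), 0 for negative (e.g. normal).
    scores: classifier output, higher meaning more likely positive.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy example: three tumor and three normal samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)  # 8 of the 9 tumor/normal pairs are ranked correctly
```

An AUC of 1.0 means every tumor sample scores above every normal sample, while 0.5 corresponds to random ranking.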

Research conclusions

This study demonstrates that the combination of single-cell sequencing data and bulk RNA-seq data is feasible for constructing a GC prediction model.

Research perspectives

Using single-nucleus sequencing to assist in constructing GC prediction models may lead to more reliable results, as it has advantages in identifying epithelial cells.