Review
Copyright © The Author(s) 2019. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastroenterol. Jun 28, 2019; 25(24): 2990-3008
Published online Jun 28, 2019. doi: 10.3748/wjg.v25.i24.2990
Application of Big Data analysis in gastrointestinal research
Ka-Shing Cheung, Wai K Leung, Wai-Kay Seto
Ka-Shing Cheung, Wai K Leung, Wai-Kay Seto, Department of Medicine, The University of Hong Kong, Queen Mary Hospital, Hong Kong, China
Ka-Shing Cheung, Wai-Kay Seto, Department of Medicine, The University of Hong Kong-Shenzhen Hospital, Shenzhen 518053, Guangdong Province, China
Author contributions: All authors contributed equally to this paper with literature review and analysis, drafting and critical revision and editing, and approval of the final version of this article.
Conflict-of-interest statement: WKL has received honoraria for attending advisory board meetings of Boehringer Ingelheim and Takeda. WKS has received honoraria for attending advisory board meetings of AbbVie, Celltrion and Gilead; speaker fees from AbbVie, AstraZeneca, Eisai, Gilead and Ipsen; and research funding from Gilead.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Corresponding author: Wai-Kay Seto, FRCP (C), MBBS, MD, MRCP, Associate Professor, Department of Medicine, The University of Hong Kong, Queen Mary Hospital, 102 Pokfulam Road, Hong Kong, China. wkseto@hku.hk
Telephone: +86-852-22553994 Fax: +86-852-28725828
Received: March 12, 2019
Peer-review started: March 13, 2019
First decision: April 11, 2019
Revised: April 14, 2019
Accepted: April 29, 2019
Article in press: April 29, 2019
Published online: June 28, 2019
Processing time: 109 Days and 0.4 Hours
Abstract

Big Data, characterized by traits such as volume, velocity and value, have revolutionized research in multiple fields, including medicine. Big Data in health care are defined as large datasets that are collected routinely or automatically and stored electronically. With the rapidly expanding volume of health data collection, it is envisioned that the Big Data approach can improve not only individual health but also the performance of health care systems. The application of Big Data analysis in gastroenterology and hepatology has also opened new research approaches. Big Data analysis retains most of the advantages and avoids some of the disadvantages of traditional observational studies (case-control and prospective cohort studies), and it also allows for phenomapping of disease heterogeneity, enhancement of drug safety, and the development of precision medicine, prediction models and personalized treatment. Unlike randomized controlled trials, it reflects the real-world situation and includes patients who are often under-represented in such trials. However, residual and/or unmeasured confounding remains a major concern and calls for meticulous study design and various statistical adjustment methods. Other potential drawbacks include concerns about data validity, missing data, incomplete data capture due to the unavailability of diagnosis codes for certain clinical situations, and individual privacy. With continuous technological advances, some of the current limitations of Big Data may be further minimized. This review illustrates the use of Big Data research in gastrointestinal and liver diseases using recently published examples.

Keywords: Healthcare dataset; Epidemiology; Gastric cancer; Inflammatory bowel disease; Colorectal cancer; Hepatocellular carcinoma; Gastrointestinal bleeding

Core tip: Digital collection and storage of data have led to the generation of Big Data. Big Data analysis in gastroenterology and hepatology allows for phenomapping of heterogeneous diseases (e.g., inflammatory bowel disease, gastrointestinal and liver cancers), and hence the development of precision medicine, enhancement of drug safety and faster drug discovery. It has also revolutionized clinical study approaches. Although Big Data approaches still have limitations, some of them may be further minimized with continuous technological advances.