Tunable structure priors for Bayesian rule learning for knowledge integrated biomarker discovery

doi:10.5306/wjco.v9.i5.98


Journal: World Journal of Clinical Oncology (ISSN 2218-4333)

Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Basic Study

World J Clin Oncol. Sep 14, 2018; 9(5): 98-109
Published online Sep 14, 2018. doi: 10.5306/wjco.v9.i5.98

Tunable structure priors for Bayesian rule learning for knowledge integrated biomarker discovery

Jeya Balaji Balasubramanian, Vanathi Gopalakrishnan

Jeya Balaji Balasubramanian, Intelligent Systems Program, School of Computing and Information, University of Pittsburgh, Pittsburgh, PA 15260, United States

Vanathi Gopalakrishnan, Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15206, United States

Author contributions: Balasubramanian JB developed the concept, conducted the research, and prepared the first draft of the manuscript in consultation with research mentor and senior author Gopalakrishnan V; All authors contributed to writing and editing the manuscript.

Supported by National Institute of General Medical Sciences of the National Institutes of Health, No. R01GM100387.

Conflict-of-interest statement: The authors declare no conflicts of interest with respect to the submitted manuscript.

Open-Access: This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/

Correspondence to: Vanathi Gopalakrishnan, PhD, Associate Professor, Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Room 530, 5607 Baum Boulevard, Pittsburgh, PA 15206, United States. vanathi@pitt.edu

Telephone: +1-412-6243290 Fax: +1-412-6245310

Received: April 27, 2018
Peer-review started: April 27, 2018
First decision: July 9, 2018
Revised: July 24, 2018
Accepted: August 5, 2018
Article in press: August 5, 2018
Published online: September 14, 2018
Processing time: 140 Days and 15.9 Hours

Abstract

AIM

To develop a framework to incorporate background domain knowledge into classification rule learning for knowledge discovery in biomedicine.

METHODS

Bayesian rule learning (BRL) is a rule-based classifier that uses a greedy best-first search over a space of Bayesian belief networks (BNs) to find the BN that best explains the input dataset, and then infers classification rules from this BN. BRL uses a Bayesian score to evaluate the quality of BNs. In this paper, we extended the Bayesian score to include informative structure priors, which encode our prior domain knowledge about the dataset. We call this extension BRL_p. The structure prior has a hyperparameter, λ, that allows the user to tune how strongly the prior knowledge is incorporated into the model learning process. We studied the effect of λ on model learning using a simulated dataset and a real-world lung cancer prognostic biomarker dataset, by measuring the degree to which our specified prior knowledge was incorporated. We also monitored the effect of λ on the predictive performance of the model. Finally, we compared BRL_p to other state-of-the-art classifiers commonly used in biomedicine.
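The interaction between a Bayesian score and a tunable structure prior can be illustrated with a minimal Python sketch. It is not the BRL_p implementation: the likelihood term is the standard K2 marginal likelihood for discrete data, the search is a simple greedy forward selection of parents for the class variable, and the prior form (a penalty of λ per edge that disagrees with the specified knowledge) is an assumption made for illustration only. The names `log_k2_likelihood`, `log_structure_prior`, `greedy_parents`, and the toy variables `x1`, `x2`, `y` are hypothetical.

```python
from collections import Counter
from math import lgamma


def log_k2_likelihood(rows, target, parents):
    """Log marginal likelihood of `target` given `parents` (uniform Dirichlet, K2)."""
    r = len({row[target] for row in rows})  # number of target classes
    joint = Counter((tuple(row[p] for p in parents), row[target]) for row in rows)
    per_config = Counter()
    for (config, _), n in joint.items():
        per_config[config] += n
    score = sum(lgamma(r) - lgamma(n + r) for n in per_config.values())
    score += sum(lgamma(n + 1) for n in joint.values())
    return score


def log_structure_prior(parents, expected_parents, lam):
    """ASSUMED prior form: subtract `lam` for each edge that differs from
    the edges specified as prior knowledge (not the paper's formula)."""
    return -lam * len(set(parents) ^ set(expected_parents))


def greedy_parents(rows, target, candidates, expected_parents, lam):
    """Greedy forward selection of parents maximizing likelihood + prior."""
    def score(ps):
        return (log_k2_likelihood(rows, target, ps)
                + log_structure_prior(ps, expected_parents, lam))

    parents = []
    best = score(parents)
    improved = True
    while improved:
        improved = False
        for c in candidates:
            if c in parents:
                continue
            s = score(parents + [c])
            if s > best:
                best, best_c, improved = s, c, True
        if improved:
            parents.append(best_c)
    return parents


# Toy data: y copies x1; x2 is unrelated to y.
rows = [{"x1": i % 2, "x2": (i // 2) % 2, "y": i % 2} for i in range(40)]

# With lam = 0 the data alone drive the search; with a large lam the
# specified prior edge (x2 -> y) dominates the score.
print(greedy_parents(rows, "y", ["x1", "x2"], ["x2"], 0.0))
print(greedy_parents(rows, "y", ["x1", "x2"], ["x2"], 100.0))
```

In this sketch, λ = 0 reduces the score to the data-driven term, while increasing λ shifts the selected structure toward the specified edges, which mirrors the tuning behavior described above.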

RESULTS

Using simulated data, we evaluated the degree of incorporation of prior knowledge into BRL_p by measuring the Graph Edit Distance between the true data-generating model and the model learned by BRL_p. We specified the true model using informative structure priors. We observed that increasing the value of λ increased the influence of the specified structure priors on model learning. With a large value of λ, BRL_p returned the true model. This also led to a gain in predictive performance, measured by the area under the receiver operating characteristic curve (AUC). We then obtained a publicly available real-world lung cancer prognostic biomarker dataset and specified a known biomarker from the literature, the epidermal growth factor receptor (EGFR) gene, as prior knowledge. We again observed that larger values of λ led to increased incorporation of EGFR into the final BRL_p model. This relevant background knowledge also led to a gain in AUC.
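The two evaluation measures mentioned here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' evaluation code: the graph edit distance is simplified to the number of edge insertions and deletions between two graphs over the same labeled nodes, and the AUC is computed from its rank interpretation (the probability that a positive case scores higher than a negative case, with ties counted as one half). The function names and example values are hypothetical.

```python
def edge_edit_distance(true_edges, learned_edges):
    """Number of edge insertions/deletions needed to turn one graph into
    the other, ASSUMING both graphs share the same labeled node set."""
    return len(set(true_edges) ^ set(learned_edges))


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical structures: the learned model has one wrong and one missing edge.
true_model = {("X1", "Y"), ("X2", "Y")}
learned_model = {("X1", "Y"), ("X3", "Y")}
print(edge_edit_distance(true_model, learned_model))

# Hypothetical predictions for four cases.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A distance of zero under this simplified measure means the learned structure has exactly the same edges as the true model.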

CONCLUSION

BRL_p enables tunable structure priors to be incorporated during Bayesian classification rule learning, integrating data and knowledge, as demonstrated using lung cancer biomarker data.

Keywords: Supervised machine learning; Rule-based models; Bayesian methods; Background knowledge; Informative priors; Biomarker discovery

Core tip: Bayesian rule learning is a unique rule learning algorithm that infers rule models from Bayesian networks found by search. We extended it to allow the incorporation of prior domain knowledge using a mathematically robust Bayesian framework with structure priors. The hyperparameter of the structure priors enables the user to control the influence of their specified prior knowledge. This opens up many possibilities, including the incorporation of uncertain knowledge that can interact with the data accordingly during inference.