Comparative evaluation of artificial intelligence systems' accuracy in providing medical drug dosages: A methodological study

doi:10.5662/wjm.v14.i4.92802

Journal Information

Publication Name: World Journal of Methodology
ISSN: 2222-0682
Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Observational Study

World J Methodol. Dec 20, 2024; 14(4): 92802
Published online Dec 20, 2024. doi: 10.5662/wjm.v14.i4.92802

Table 1 Total count and percentage of 'Yes' responses for each Large Language Model

System	Neutral	No	Yes	Total
GPT 4	1	74	387 (83.77)	462
GPT 3.5	5	101	356 (77.06)	462
Bard	129	80	253 (54.76)	462

Data are n (%).
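
The percentages in the "Yes" column can be checked from the counts. The following is a minimal sketch, assuming each percentage is the "Yes" count divided by the row total; the names `TABLE_1` and `yes_percentage` are illustrative and do not come from the article.

```python
# Counts transcribed from Table 1 as (neutral, no, yes) per system.
TABLE_1 = {
    "GPT 4": (1, 74, 387),
    "GPT 3.5": (5, 101, 356),
    "Bard": (129, 80, 253),
}


def yes_percentage(neutral: int, no: int, yes: int) -> float:
    """Share of 'Yes' responses as a percentage of all responses.

    Assumed definition: yes / (neutral + no + yes) * 100, rounded to 2 places.
    """
    total = neutral + no + yes
    return round(100 * yes / total, 2)


for system, counts in TABLE_1.items():
    # Each row should sum to the 'Total' column (462) in Table 1.
    print(f"{system}: total={sum(counts)}, yes={yes_percentage(*counts)}%")
```

Under that assumption, each row sums to 462 and the computed shares agree with the tabulated 83.77, 77.06, and 54.76.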

Table 2 Weighted accuracy comparison across the Large Language Models

Model	Weighted accuracy
ChatGPT 4	0.6775
ChatGPT 3.5	0.5519
Bard	0.3745
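
The tables do not state how the weighted accuracy is computed. As a hedged sketch, assuming "Yes" is scored +1, "Neutral" 0, and "No" -1, with the summed score divided by the total number of responses, the Table 1 counts give the same values as Table 2 to four decimal places. The weights and the names `WEIGHTS`, `COUNTS`, and `weighted_accuracy` are assumptions for illustration, not the authors' stated method.

```python
# Assumed scoring (inferred from the tabulated values, not stated in the tables).
WEIGHTS = {"yes": 1, "neutral": 0, "no": -1}

# Response counts from Table 1 (which labels the first two systems
# "GPT 4" and "GPT 3.5"), keyed here by the names used in Table 2.
COUNTS = {
    "ChatGPT 4": {"neutral": 1, "no": 74, "yes": 387},
    "ChatGPT 3.5": {"neutral": 5, "no": 101, "yes": 356},
    "Bard": {"neutral": 129, "no": 80, "yes": 253},
}


def weighted_accuracy(counts: dict) -> float:
    """Mean response score under the assumed +1 / 0 / -1 weighting."""
    total = sum(counts.values())
    score = sum(WEIGHTS[label] * n for label, n in counts.items())
    return round(score / total, 4)


for model, counts in COUNTS.items():
    print(f"{model}: {weighted_accuracy(counts)}")
```

Because "Neutral" contributes nothing under this weighting, a model with many neutral responses (Bard, 129 of 462) is penalized relative to its raw "Yes" share without being scored as wrong.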

Table 3 Disease accuracy comparison

Name of disease	ChatGPT 4	ChatGPT 3.5	Bard
Acromegaly	1.0	1.0	1.0
Orthostatic hypotension	1.0	1.0	0.0
Myasthenia gravis	1.0	1.0	0.5
Myoclonus	1.0	1.0	1.0
Myotonic dystrophy	1.0	-1.0	1.0
Neonatal onset multisystem inflammatory disease	1.0	1.0	1.0
Neoplastic spinal cord compression	1.0	1.0	1.0
Nephrolithiasis	1.0	1.0	1.0
Neurological infections	1.0	1.0	0.0
Neuromyelitis optica	1.0	0.0	1.0
Thiamine deficiency	-1.0	1.0	1.0
Anaphylactic reaction	-1.0	-1.0	0.0
Reactive arthritis	-1.0	-1.0	0.0
Fibrous dysplasia	-1.0	-1.0	1.0
Hypothyroidism	-1.0	-1.0	-1.0
Multiple sclerosis	-1.0	-1.0	-1.0
Hypophosphatemia	-1.0	-1.0	1.0
Hypomagnesemia	-1.0	-1.0	0.0
Alcohol intoxication	-1.0	-1.0	0.0
Post-concussive state	-1.0	-1.0	0.0

Table 4 Detailed accuracy values for each organ system across the three Large Language Models

Organ system	ChatGPT 4	ChatGPT 3.5	Bard
Cardiovascular system, respiratory system	1.0000	0.6667	0.6667
Hematology	1.0000	1.0000	-1.0000
Respiratory	1.0000	0.3333	0.3333
Respiratory system	1.0000	1.0000	0.5000
Infectious diseases	0.8039	0.7451	0.2059
Immune system	0.6752	0.4188	0.2650
Central nervous system	0.6585	0.6220	0.5610
Hematological malignancies	0.6429	0.5714	0.4286
Cardiovascular system	0.6000	0.6667	0.3333
Endocrine system	0.5556	0.4444	0.5714
Renal	0.5556	0.3704	0.5185
Gastrointestinal tract	0.5385	0.2308	0.2308

Citation: Ramasubramanian S, Balaji S, Kannan T, Jeyaraman N, Sharma S, Migliorini F, Balasubramaniam S, Jeyaraman M. Comparative evaluation of artificial intelligence systems' accuracy in providing medical drug dosages: A methodological study. World J Methodol 2024; 14(4): 92802
URL: https://www.wjgnet.com/2222-0682/full/v14/i4/92802.htm
DOI: https://dx.doi.org/10.5662/wjm.v14.i4.92802