Published online Jun 18, 2022. doi: 10.5312/wjo.v13.i6.603
Peer-review started: November 30, 2021
First decision: January 11, 2022
Revised: January 20, 2022
Accepted: May 13, 2022
Article in press: May 13, 2022
Published online: June 18, 2022
Processing time: 198 Days and 15.7 Hours
Artificial intelligence (AI) based on deep learning (DL) has demonstrated promising results for the interpretation of radiographs. To develop a machine learning (ML) program capable of interpreting orthopedic radiographs accurately, a project called DL algorithm for orthopedic radiographs was initiated. In its first phase, it was used to diagnose knee osteoarthritis (KOA) using the Kellgren-Lawrence scale.
DL methods trained by senior radiologists and orthopedic surgeons in larger hospitals could give smaller institutions the expertise they need and free up capacity in emergency care, where medical professionals may not be readily available. Providing care in this manner could substantially improve access.
This study aimed to explore the use of transfer learning convolutional neural networks for medical image classification, using KOA as the clinical scenario. Specifically, it sought to compare eight transfer learning DL models for detecting the grade of KOA from a radiograph, to compare the accuracy of the AI models with that of expert human interpretation, and to identify the most appropriate model for detecting the grade of KOA.
Three orthopedic surgeons reviewed these independent cases, graded their OA severity on the Kellgren-Lawrence scale, and settled disagreements through consensus. To assess the efficacy of ML in accurately classifying radiographs for KOA, eight models were used: ResNet50, VGG-16, InceptionV3, MobileNetV2, EfficientNetB7, DenseNet201, Xception, and NASNetMobile. A total of 2068 images were used, of which 70% were used initially to train the model, 10% were then used to test the model, and 20% were used for accuracy testing and validation of each model.
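The 70%/10%/20% partition of the 2068 images can be sketched as follows. This is a minimal illustration, not the study's pipeline: the function name `split_indices`, the fixed random seed, and the rounding rule (training and test sizes are rounded down, and the remainder goes to validation) are all assumptions, since the text gives only the percentages and the total.

```python
import random


def split_indices(n_items, train_frac=0.70, test_frac=0.10, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into three disjoint subsets.

    Assumption: training and test sizes are rounded down and every
    remaining index goes to validation; the source gives no exact counts.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible

    n_train = int(n_items * train_frac)
    n_test = int(n_items * test_frac)

    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]
    return train, test, validation


# 2068 is the image count reported in the text.
train_idx, test_idx, val_idx = split_indices(2068)
print(len(train_idx), len(test_idx), len(val_idx))
```

Shuffling before cutting keeps the three subsets disjoint, so no radiograph used for training is reused for testing or validation.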
Overall, the networks showed a high degree of accuracy for detecting KOA, ranging from 54% to 93%. Some networks were highly accurate, but few had an efficiency of more than 50%. The DenseNet model was the most accurate, at 93%, whereas expert human interpretation achieved an accuracy of 74%.
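For readers unfamiliar with the metric, the sketch below shows one common way to compute such an accuracy figure. It assumes that accuracy means the fraction of radiographs whose assigned Kellgren-Lawrence grade exactly matches the reference grade; the text does not define the metric, and the grade lists are invented for illustration and are not study data.

```python
def accuracy(predicted, reference):
    """Return the fraction of items whose predicted grade equals the reference grade."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one graded item is required")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical Kellgren-Lawrence grades (0-4), invented for illustration only.
consensus_grades = [0, 1, 2, 3, 4, 2, 1, 0]
model_grades = [0, 1, 2, 3, 3, 2, 2, 0]

print(f"{accuracy(model_grades, consensus_grades):.0%}")
```

The same function applies unchanged to a human reader's grades, which is what makes a like-for-like comparison between model and expert possible.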
The study compared the accuracy of expert human interpretation with that of AI models. It showed that AI models based on various transfer learning convolutional neural networks can successfully classify knee X-ray images according to the grade of KOA present, and it measured their performance against human readers classifying the same images. The purpose of the study was to pave the way for the development of more accurate models and tools, which can improve the classification of medical images by ML and provide insight into orthopedic disease pathology.
AI can operate only within the domains on which it has been trained, whereas human intelligence and interpretation are not limited to a trained domain. A key difference between humans and machines is that humans can solve problems in unforeseen domains, whereas machines cannot. Model performance may be improved by increasing the size or number of parameters in the ML model, examining the complexity or type of the model, increasing the time spent training, and increasing the number of iterations until the loss function is minimized.
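The idea of iterating until the loss function is minimized can be illustrated with a toy gradient descent loop. This is only a sketch under stated assumptions: the one-parameter quadratic loss, the learning rate, the stopping tolerance, and the name `train_until_converged` are hypothetical and stand in for the much larger networks discussed here.

```python
def train_until_converged(loss_fn, grad_fn, w0, lr=0.1, tol=1e-8, max_iters=10_000):
    """Run gradient descent until the loss improves by less than tol per step.

    Returns the final parameter, the final loss, and the number of iterations run.
    """
    w = w0
    previous_loss = loss_fn(w)
    for iteration in range(1, max_iters + 1):
        w = w - lr * grad_fn(w)          # one gradient descent update
        current_loss = loss_fn(w)
        if abs(previous_loss - current_loss) < tol:
            return w, current_loss, iteration
        previous_loss = current_loss
    return w, previous_loss, max_iters   # iteration budget exhausted


# Toy loss with its minimum at w = 3 (hypothetical, for illustration only).
def toy_loss(w):
    return (w - 3.0) ** 2


def toy_grad(w):
    return 2.0 * (w - 3.0)


w_final, final_loss, n_iters = train_until_converged(toy_loss, toy_grad, w0=0.0)
print(w_final, final_loss, n_iters)
```

Raising `max_iters` or tightening `tol` corresponds to training for longer. In practice the stopping rule is usually applied to a held-out validation loss rather than the training loss.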