
New research at ACR Convergence 2023, the annual meeting of the American College of Rheumatology (ACR), shows that a deep learning system can accurately identify joints and predict joint space narrowing and erosion scores in hand X-rays of patients with rheumatoid arthritis (RA) (Abstract #0745).
X-rays are the most commonly used imaging technique for detecting and monitoring RA in the hand. Radiologists often use the well-validated Sharp/van der Heijde (SvH) method to evaluate joint space narrowing and erosions by assessing specific locations in each hand and wrist. However, SvH scoring is time-consuming and requires expertise that is not always available. This has led to increased use of deep learning (a subset of machine learning) to analyze hand X-ray data in RA.
According to Carol Hitchon, MD, FRCPC, MSc, associate professor at the University of Manitoba and clinical scientist in rheumatology and lead co-author of the study: “Machine learning offers a powerful and complementary approach to traditional RA detection and diagnosis methods. It improves the accuracy, efficiency and objectivity of RA radiograph assessment, while providing the opportunity for early detection of damage and valuable insights into the disease.”
For the current study, Hitchon and colleagues aimed to develop and validate a deep learning system for the automated detection of joints and prediction of SvH scores on hand radiographs of patients with RA.
They used a convolutional neural network (CNN)-based algorithm called You Only Look Once (YOLO). A CNN is a type of deep learning neural network commonly used in computer vision and recognition tasks that has been applied successfully to medical image classification. YOLO is a CNN model designed for real-time object detection in images and videos, known for its speed and efficiency. Hitchon and colleagues used a recent version, YOLOv5, which they showed to be more than 90% accurate in detecting hand joints.
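Detection accuracy of this kind is typically judged by how well predicted bounding boxes overlap clinician-drawn ones, usually measured by intersection-over-union (IoU). The following is a minimal sketch of that idea, not the authors' evaluation code; the box coordinates and the 0.5 match threshold are illustrative assumptions:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def detection_hits(predicted, ground_truth, threshold=0.5):
    """Count predicted boxes that overlap some ground-truth joint at >= threshold IoU."""
    return sum(
        any(iou(p, g) >= threshold for g in ground_truth) for p in predicted
    )

# Two predicted joint boxes vs. one clinician-labeled box: only the first matches.
hits = detection_hits([(0, 0, 2, 2), (5, 5, 6, 6)], [(0, 0, 2, 2)])
```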
The YOLO model was trained and evaluated on 240 pediatric hand radiographs from the Radiological Society of North America database.
The researchers boxed and labeled the different joints of interest: proximal interphalangeal, metacarpophalangeal, wrist, distal radius, and distal ulna. The joint detection model was validated with 54 clinician-labeled radiographs from four adult RA patients followed for more than ten years.
Researchers then applied a vision transformer model (VTM) to predict the erosion and joint space narrowing score of each joint. Hitchon explains that a VTM is a deep learning architecture designed to efficiently process and understand sets of data.
It works by splitting an image into small patches, flattening each patch into a sequence, projecting the flattened patches into low-dimensional linear embeddings, adding positional embeddings, and then feeding the encoded sequence into a standard transformer encoder for the prediction task.
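The patch-embedding pipeline described above can be sketched numerically. This is a minimal NumPy illustration, not the study's model; the image size, patch size, and embedding dimension are arbitrary assumptions, and random matrices stand in for learned weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a hand radiograph: a 64x64 grayscale image.
image = rng.standard_normal((64, 64))

patch = 16        # patch side length -> (64/16)^2 = 16 patches
d_model = 32      # embedding dimension (illustrative)

# 1. Split the image into non-overlapping 16x16 patches and flatten each one.
patches = image.reshape(64 // patch, patch, 64 // patch, patch)
patches = patches.transpose(0, 2, 1, 3).reshape(-1, patch * patch)  # (16, 256)

# 2. Project each flattened patch to a low-dimensional linear embedding.
W = rng.standard_normal((patch * patch, d_model)) * 0.02
tokens = patches @ W                                                # (16, 32)

# 3. Add positional embeddings so the patch order survives.
pos = rng.standard_normal((tokens.shape[0], d_model)) * 0.02
tokens = tokens + pos

# The (16, 32) token sequence is what a standard transformer encoder consumes.
```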
The VTM was validated using more than 2,200 hand radiographs from 381 RA patients that had been assigned SvH scores by physicians. Patients were from the Canadian Early Arthritis Cohort, a multicenter Canadian study. These scored radiographs served as the gold standard for this study.
The joint detection model was trained to detect the entire wrist, but the researchers had SvH scores for individual wrist joints, so they trained a separate model to detect joint space narrowing and erosion in each joint.
When they evaluated the accuracy of their models, they found:
- The joint detection model accurately identified target joints, with an F1 score of 0.991 for pediatric radiographs and 0.812 for adult radiographs. (In machine learning, the F1 score is the harmonic mean of precision and recall, a common measure of a model's accuracy.)
- VTM predictions for joint space narrowing and erosion were very accurate: the root mean square error (RMSE), which measures how far predictions deviate from the true scores, was 0.91 and 0.93, respectively.
- The multitask models predicted SvH erosion and joint space narrowing scores of individual wrist joints with moderate accuracy (0.6 to 0.91).
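The F1 score and a squared-error metric such as root mean square error (RMSE) are straightforward to compute. A minimal sketch with made-up numbers (none of these values come from the study):

```python
import math

# Hypothetical joint-detection outcomes: true positives, false positives, false negatives.
tp, fp, fn = 90, 10, 12
precision = tp / (tp + fp)                          # 0.9
recall = tp / (tp + fn)                             # ~0.882
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean, ~0.891

# Hypothetical predicted vs. clinician SvH scores for five joints.
pred = [1.2, 0.8, 2.1, 0.0, 3.4]
truth = [1.0, 1.0, 2.0, 0.0, 3.0]
rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))
```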
Hitchon says they weren’t surprised by the performance of their model.
“The AI technologies we applied in this study have been successfully and widely used in other domains, some of which have been commercialized. Compared to the model’s performance in other domains, our performance is relatively low in predicting X-ray scores for some joint types, such as the wrist. [This] may be due to the relatively small sample size in our study or to the complexity of the anatomy of the wrist joint,” she notes.
Hitchon also says the model’s performance does not match that of human radiologists for joints such as the wrist.
“The AI models cannot replace human radiologists at this stage, but they will be excellent complementary tools that can improve the overall quality and efficiency of radiograph scoring analysis when used in conjunction with the radiologist’s judgment. [These models] may be applicable to the interpretation of large volumes of radiographs in clinical trials.”
The study has two major limitations. First, the X-rays were obtained from cohorts composed almost entirely of white women, so the findings may not apply to races and ethnicities traditionally underrepresented in research studies; Hitchon acknowledges that the findings need to be replicated in other groups. Second, the model cannot yet learn and become more accurate with subsequent images, although Hitchon says they are developing a new deep learning framework that will let the model learn continuously as new data becomes available.
This study received local funding from the Health Science Center Foundation, a hospital charity in Winnipeg, Manitoba, Canada. One of the co-authors, Pingzhao Hu, is supported by the Canada Research Chair Program. The Canadian Early Arthritis Cohort, which provided one set of radiographs, is funded by multiple sources.
Source:
American College of Rheumatology
