Medical imaging is a significant part of modern healthcare. It boosts the precision, reliability and development of treatment for various diseases. Artificial intelligence has also been widely used to enhance the process further.
However, conventional medical image diagnosis that employs AI algorithms requires large amounts of annotations as supervision signals for model training. To acquire accurate labels for the AI algorithms, radiologists prepare radiology reports for each of their patients as part of the clinical routine, after which annotation staff extract and confirm structured labels from those reports using human-defined rules and existing natural language processing (NLP) tools. The ultimate accuracy of the extracted labels hinges on the quality of the human work and of the NLP tools. The method comes at a heavy price, being both labour-intensive and time-consuming.
An engineering team at the University of Hong Kong (HKU) has developed a new approach, “REFERS” (Reviewing Free-text Reports for Supervision), which can cut the human cost by 90% by enabling the automatic acquisition of supervision signals from hundreds of thousands of radiology reports at the same time. It attains high accuracy in predictions, surpassing conventional medical image diagnosis that employs AI algorithms.
The innovative approach marks a solid step towards realising generalised medical artificial intelligence. The breakthrough was published in Nature Machine Intelligence in the paper titled “Generalized radiograph representation learning via cross-supervision between images and free-text radiology reports”.
Professor YU Yizhou, the leader of the team from HKU’s Department of Computer Science under the Faculty of Engineering, stated that AI-enabled medical image diagnosis has the potential to support medical specialists in reducing their workload and improving diagnostic efficiency and accuracy, including but not limited to reducing diagnosis time and detecting subtle disease patterns.
The team believes that abstract and complex logical reasoning sentences in radiology reports provide sufficient information for learning easily transferable visual features. With appropriate training, REFERS directly learns radiograph representations from free-text reports without the need to involve manpower in labelling, Professor Yu remarked.
To train REFERS, the research team used a public database of 370,000 X-ray images and associated radiology reports covering 14 common chest diseases, including atelectasis, cardiomegaly, pleural effusion, pneumonia and pneumothorax.
The researchers were able to build a radiograph recognition model using only 100 radiographs and attain 83% accuracy in predictions. When the number was increased to 1,000, their model exhibited amazing performance with an accuracy of 88.2%, surpassing its counterpart trained with 10,000 radiologist annotations (accuracy of 87.6%). When 10,000 radiographs were used, the accuracy reached 90.1%. In general, an accuracy above 85% in predictions is useful in real-world clinical applications.
REFERS achieves the goal by accomplishing two report-related tasks, i.e., report generation and radiograph–report matching. In the first task, REFERS translates radiographs into text reports by first encoding radiographs into an intermediate representation, which is then used to predict text reports via a decoder network.
A cost function is defined to measure the similarity between predicted and real report texts, based on which gradient-based optimisation is employed to train the neural network and update its weights.
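The idea of a cost function driving gradient-based updates can be illustrated with a minimal sketch. It assumes a word-level cross-entropy loss, a common choice for text generation that this article does not confirm as the one used in REFERS. The toy vocabulary, the function names and the use of raw word scores in place of real network weights are all hypothetical simplifications.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution over the vocabulary."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def report_loss(logits_per_token, target_ids):
    """Mean cross-entropy between predicted word distributions and the real report."""
    losses = []
    for logits, target in zip(logits_per_token, target_ids):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)

def gradient_step(logits_per_token, target_ids, lr=0.5):
    """One gradient-descent update applied directly to the scores.

    Assumption: the scores stand in for network weights to keep the sketch small.
    """
    n = len(target_ids)
    updated = []
    for logits, target in zip(logits_per_token, target_ids):
        probs = softmax(logits)
        # d(loss)/d(logit_k) = (p_k - 1[k == target]) / n
        grads = [(p - (1.0 if k == target else 0.0)) / n for k, p in enumerate(probs)]
        updated.append([x - lr * g for x, g in zip(logits, grads)])
    return updated

# Hypothetical vocabulary of 4 words; the "real report" is the word sequence [2, 0, 3].
target_ids = [2, 0, 3]
logits = [[0.0, 0.0, 0.0, 0.0] for _ in target_ids]

loss_before = report_loss(logits, target_ids)
for _ in range(20):
    logits = gradient_step(logits, target_ids)
loss_after = report_loss(logits, target_ids)
```

In this sketch the loss starts at the value for a uniform guess over four words and falls as the updates push probability towards the words of the real report. In a real system the same signal would be propagated back through the decoder and the radiograph encoder.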
As for the second task, REFERS first encodes both radiographs and free-text reports into the same semantic space, where representations of each report and its associated radiographs are aligned via contrastive learning.
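The alignment step can be sketched in the same spirit. The example below assumes a symmetric InfoNCE-style contrastive loss with a temperature parameter, a widely used formulation that this article does not confirm as the exact one in REFERS. It also assumes one embedding per study, whereas a report may be associated with several radiographs. The embeddings and function names are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contrastive_loss(image_embs, report_embs, temperature=0.1):
    """Symmetric InfoNCE-style loss; image i and report i form the positive pair."""
    imgs = [normalize(v) for v in image_embs]
    reps = [normalize(v) for v in report_embs]
    n = len(imgs)
    sims = [[dot(i, r) / temperature for r in reps] for i in imgs]

    def row_loss(row, positive):
        # Cross-entropy of picking the positive among all candidates in the row.
        m = max(row)
        log_sum = m + math.log(sum(math.exp(s - m) for s in row))
        return log_sum - row[positive]

    image_to_report = sum(row_loss(sims[i], i) for i in range(n)) / n
    columns = [[sims[i][j] for i in range(n)] for j in range(n)]
    report_to_image = sum(row_loss(columns[j], j) for j in range(n)) / n
    return 0.5 * (image_to_report + report_to_image)

# Hypothetical 2-D embeddings for two studies.
aligned_images = [[1.0, 0.0], [0.0, 1.0]]
aligned_reports = [[0.9, 0.1], [0.1, 0.9]]
swapped_reports = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = contrastive_loss(aligned_images, aligned_reports)
loss_swapped = contrastive_loss(aligned_images, swapped_reports)
```

The loss is small when each radiograph sits close to its own report in the shared space and large when the pairs are swapped, which is the pressure that pulls matching radiographs and reports together during training.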
The paper’s first author, Dr ZHOU Hong-Yu, said that compared to conventional methods that heavily rely on human annotations, REFERS can acquire supervision from each word in the radiology reports. The amount of data annotation can be substantially reduced (i.e., by 90%) and the cost to build medical artificial intelligence is also lowered. This marks a significant step towards realising generalised medical artificial intelligence.