Deep learning with weak annotation from diagnostic reports for the detection of multiple cranial disorders: a multicenter prospective study

Research in context

Evidence before this study

We searched PubMed and Google Scholar for studies on the use of artificial intelligence (AI) and deep learning on computed tomography (CT) for the diagnosis of head disorders, published between January 1, 2017, and March 15, 2022, using the search terms “brain disorder”, “head disorder”, “brain injury”, “hemorrhages”, “fractures”, “ischemia”, “stroke”, “deep learning”, “loosely supervised learning”, or “artificial intelligence”, without language restrictions. One study reported an area under the curve (AUC) of 0.991 (SD 0.006) for acute intracranial hemorrhage in an intra-center test; no between-center test was performed. The model in that study was trained with 4,396 expert-annotated CT scans. Another study reported an AUC of 0.942 (95% CI 0.919–0.965) for detecting intracranial hemorrhage and 0.962 (0.920–1.000) for detecting fractures in an inter-center test performed on the CQ500 dataset (from India); that model was trained with 5,423 expert-annotated CT scans. A third study reported an AUC of 0.961 (95% CI 0.927–0.986) for detection of intracranial hemorrhage in a prospective test set, with a model trained on 904 expert-annotated scans. The other studies identified by our search did not match the performance reported in the studies above. All of the studies we identified covered no more than two types of disorder and required heavy expert annotation. We found no publications showing how AI systems could assist radiologists and improve their performance.

Added value of this study

In this study, we report a new annotator-less deep learning system that uses weak annotation from diagnostic reports for accurate and generalizable detection of head disorders. To the best of our knowledge, this system is the first to simultaneously require no expert-annotated CT scans for model training, cover four types of head disorders, achieve accurate performance for multiple types of disorders in an intra-center test, generalize well across different centers, CT equipment, and countries, and improve the performance of radiologists in clinical practice. This study also proposes a new deep learning algorithm with the following features that distinguish it from existing AI and deep learning models for the detection of head disorders from CT scans. Our system uses keyword matching on textual diagnostic reports to generate disorder labels for each CT scan, which requires no expert effort. Building a large dataset for model training therefore requires less effort, and the larger dataset in turn supports accurate and generalizable performance. We also propose RoLo, a new weakly supervised learning algorithm that combines a noise-tolerant mechanism for robust learning with a multi-instance learning strategy and an attention module to locate lesions, even without precise and exact labels. Finally, the learning framework is task-independent, so it offers the flexibility to incorporate new types of disorders, and the principles of this system could be applied to the construction of computer-aided diagnostic systems for a range of other diseases.
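
As an illustration of how keyword matching on report text can yield scan-level labels without expert annotation, the following is a minimal sketch in Python. It is not the study's implementation: the keyword lists, the `weak_labels` function name, and the simple same-sentence negation check are all assumptions made for this example. Labels produced this way are expected to be noisy, which is why a noise-tolerant training mechanism is needed downstream.

```python
import re
from typing import Dict, List

# Hypothetical keyword lists for the four disorder types named in the text.
# The vocabulary actually used in the study is not given in this section.
KEYWORDS: Dict[str, List[str]] = {
    "hemorrhage": ["hemorrhage", "haemorrhage", "hematoma"],
    "ischemia": ["ischemia", "ischaemia", "infarct"],
    "fracture": ["fracture"],
    "tumor": ["tumor", "tumour", "mass lesion"],
}

# Assumption: a crude negation rule. A keyword is ignored when one of these
# cues appears earlier in the same sentence.
NEGATION = re.compile(r"\b(no|not|without|negative for)\b")


def weak_labels(report: str) -> Dict[str, int]:
    """Return a 0/1 label per disorder from free-text report content."""
    labels = {disorder: 0 for disorder in KEYWORDS}
    # Split the lower-cased report into rough sentences.
    sentences = re.split(r"[.;\n]+", report.lower())
    for sentence in sentences:
        for disorder, words in KEYWORDS.items():
            for word in words:
                idx = sentence.find(word)
                if idx == -1:
                    continue  # keyword absent from this sentence
                if NEGATION.search(sentence[:idx]):
                    continue  # keyword present but negated
                labels[disorder] = 1
    return labels


example_report = (
    "Acute subdural hematoma in the left frontal region. "
    "No skull fracture. No evidence of infarct."
)
example_labels = weak_labels(example_report)
```

For the example report, only the hemorrhage label is set, because the fracture and infarct mentions are negated within their own sentences. A rule this simple will mislabel some reports (for instance, hedged or historical findings), so it should be read as a starting point rather than a complete labeling pipeline.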

Implications of all available evidence

Head disorders, such as cerebral ischemia, hemorrhages, tumors, and skull fractures, significantly affect the structure and function of the head and brain, resulting in high morbidity and mortality. Our system could be a useful aid for radiologists to diagnose several types of head disorders efficiently and accurately. The generalizability of our system has ensured stable performance across different institutions and countries, indicating a potentially wide deployment of our system worldwide.