AANS Online Scientific Session: Pediatrics
Machine Versus Human: Deep Learning to Automatically Detect and Diagnose Pediatric Brain Tumors in a Large Multi-Institutional Study
Video Transcription
Hello, my name is Jennifer Quon and I'm a PGY-5 resident at Stanford. In this presentation, I'm going to be discussing our deep learning model for diagnosing posterior fossa tumors. I have no disclosures.

No matter where you look, you'll see news about artificial intelligence. But how much of this is hype and how much is reality? Well, many of these tools are already being used in our daily lives, and they're also being explored in medicine. Groups at Stanford have shown that deep learning models are just as good as clinicians at reading chest x-rays and EKGs. And that's not just in research: deep learning has also been approved by the FDA for retinal scans and stroke detection.

But what about neurosurgery? As we know, tumor subtypes vary widely in their behavior and management, and neurosurgical biopsy and resection are an important part of diagnosing and managing tumors. We wanted to know if we could use artificial intelligence tools to help us do better — specifically, whether we could use deep learning to help diagnose posterior fossa tumors using imaging alone.

We identified a cohort of 617 patients with posterior fossa tumors across five children's hospitals. All patients underwent routine imaging protocols at their own institutions, and for our study we used pre-intervention T2 MRIs. We also obtained 200 normal brain MRIs from our repository at Stanford. Patients with medulloblastoma, pilocytic astrocytoma, and ependymoma all had radiologic and confirmed pathologic diagnoses; again, the images used for our model were obtained prior to biopsy or resection. Patients with DIPG had primarily radiologic diagnoses.

We separated images into training, validation, and a held-out test set. The model was developed primarily using tumor images, and final model performance was tested on 135 tumor and 183 normal images it had never seen before. Our model architecture is a ResNeXt-50 pre-trained on ImageNet but fine-tuned using our own tumor images.
The final layer in the network predicts tumor type, as well as the relative position of the tumor slice in the scan. We enhanced predictive accuracy using an ensemble model comprising a majority vote among five individual models. For each slice, the model predicts the presence or absence of tumor, and the slice predictions are tallied. If the number of predicted tumor slices exceeds a certain threshold, the entire scan is flagged as containing a tumor, and the tumor type with the most votes is used to classify the scan.

In deep learning, we're asking models to perform tasks but not dictating what features they use to do so, so it can sometimes feel like a black box. To gain insight into how the model is making predictions, we used class activation maps. Red indicates hot areas, and you can see that the model is focusing on the cystic portion of the pilocytic astrocytoma. Below are its secondary predictions, with a level of confidence in those predictions. In this case, the model correctly predicts pilocytic astrocytoma. Here, the model is focusing on the tumor region, as well as the surrounding area of tumor invasion, and correctly predicts medulloblastoma.

We also used class activation maps to visualize the incorrect predictions, which is perhaps even more helpful. This brainstem pilocytic astrocytoma was incorrectly predicted to be a DIPG, but you can see on the bottom that the model's secondary prediction was actually pilocytic astrocytoma. In this case, a cystic medulloblastoma was incorrectly predicted to be a pilocytic astrocytoma, but again, the secondary prediction was actually the correct one.

As I mentioned earlier, we don't pre-define what features the model uses, but we can examine the pattern of features the model found relevant to detection and classification. We used principal component analysis to visualize this pattern from the final layers of our model. You can see that DIPG has the most distinctive feature space, followed by pilocytic astrocytoma and medulloblastoma.
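The scan-level decision rule described above — tallying per-slice predictions and taking a majority vote — can be sketched as follows. This is a minimal illustration of the logic as described in the talk, not the authors' code; the label names and the threshold value are assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label receiving the most votes among the predictions."""
    return Counter(predictions).most_common(1)[0][0]

def classify_scan(slice_votes, tumor_threshold=3):
    """Decide whether a scan contains a tumor, and of which type.

    slice_votes: one label per slice, after the five-model majority vote.
    'normal' marks a slice with no predicted tumor (assumed label name).
    """
    tumor_labels = [v for v in slice_votes if v != "normal"]
    # Too few tumor-positive slices: the scan is not flagged as tumor.
    if len(tumor_labels) < tumor_threshold:
        return "normal"
    # The tumor type with the most slice votes classifies the scan.
    return majority_vote(tumor_labels)
```

For example, a scan whose slices vote normal, medulloblastoma, medulloblastoma, ependymoma, medulloblastoma would clear the threshold and be classified as medulloblastoma.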
You can see that the ependymoma feature space overlaps significantly with medulloblastoma. We also compared tumor detection between the model, in blue, and the radiologist, in black, and both had very high detection rates. We also looked at classification accuracy compared to the radiologists. We had two neuroradiologists, one pediatric body radiologist, and one interventionalist. The model was more accurate than the pediatric radiologist and the interventionalist, less accurate than one of the neuroradiologists, and comparable to the other.

We also examined model performance by tumor type. Striped, on the left, is the model; gray is the average of the radiologists. The model had higher sensitivity and specificity for classifying DIPG, medulloblastoma, and pilocytic astrocytoma. However, the model was worse at classifying ependymoma, as we might expect after looking at the principal component analysis.

In conclusion, we thought of our model as a fifth radiologist, as its performance was comparable to that of a clinical expert. We think this could be a useful clinical adjunct, especially for trainees and in places where pediatric subspecialists are limited. Despite the model's limitations, automated diagnosis is faster: each radiologist took about 2-3 hours to read the roughly 300 test scans, whereas the model produced its output in seconds. We are also working to improve model performance; we'd like to expand to other tumor types and image sequences, and we hope that future work can start to incorporate information about genetic subtype. I'd like to thank our multidisciplinary group at Stanford, and all of our outside collaborators. Thank you.
Video Summary
In this video, Jennifer Quon, a PGY-5 resident at Stanford, discusses a deep learning model for diagnosing posterior fossa tumors. She explains that deep learning models are already being used in various fields, including medicine, where they have matched clinicians in reading chest x-rays and EKGs and have received FDA approval for retinal scans and stroke detection. The study focused on using artificial intelligence tools to diagnose posterior fossa tumors using imaging alone. The model architecture was based on a ResNeXt-50 pre-trained on ImageNet and fine-tuned using tumor images, and its performance was tested on tumor and normal images it had never seen before. The model performed comparably to clinical experts and provided much faster automated diagnosis. The team hopes to expand the model to other tumor types and incorporate genetic subtype information in the future.
Asset Subtitle
Jennifer Lauren Quon, MD
Keywords
deep learning model
diagnosing posterior fossa tumors
artificial intelligence tools
ResNeXt-50
automated diagnosis