AmnioML

Accurately predicting the volume of amniotic fluid is fundamental to assessing pregnancy risks, but the task usually requires many hours of laborious work by medical experts. AmnioML is a machine learning solution that leverages deep learning and conformal prediction to output accurate volume estimates and segmentation masks from fetal MRIs in a few seconds, with a Dice coefficient over 0.9, along with valid predictive intervals.
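To make the two quantities above concrete, here is a minimal sketch of how a Dice coefficient and a fluid volume can be computed from binary segmentation masks. It is not AmnioML's own code: the function names, the toy masks, and the voxel spacing of (1.0, 1.0, 5.0) mm are assumptions chosen for illustration.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, target).sum() / denom

def mask_volume_ml(mask, voxel_spacing_mm):
    """Volume in mL: voxel count times per-voxel volume (1 mL = 1000 mm^3)."""
    voxel_mm3 = float(np.prod(voxel_spacing_mm))
    return float(np.asarray(mask, dtype=bool).sum()) * voxel_mm3 / 1000.0

# Toy masks (hypothetical): the target has 8 voxels, the prediction has 12,
# and all 8 target voxels are also predicted.
target = np.zeros((4, 4, 4), dtype=bool)
target[1:3, 1:3, 1:3] = True
pred = np.zeros_like(target)
pred[1:3, 1:3, 1:4] = True

dice = dice_coefficient(pred, target)                  # 2*8 / (12+8) = 0.8
volume = mask_volume_ml(target, (1.0, 1.0, 5.0))       # 8 * 5 mm^3 = 0.04 mL
```

In practice the voxel spacing would be read from the exam's metadata rather than hard-coded.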

We make available a novel, curated dataset of fetal MRIs with 853 exams, and benchmark the performance of many recent deep learning architectures on it. The dataset can be found here, in two formats: NRRD files and HDF files. (If you link to it, please use either https://impa.br/~daniel.csillag/projects/amnioml/dataset or https://impa.br/~pauloo/amnioml/data.)

Example slice of an input exam along with its target amniotic fluid segmentation.

We also introduce modifications to conformal prediction tools that yield narrower predictive intervals with valid coverage, thus aiding doctors in quantifying pregnancy risks.

A comparison of AmnioML’s predicted amniotic fluid volumes and their corresponding target volumes, with AmnioML’s generated predictive intervals (for a confidence of 90%) displayed.
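For readers unfamiliar with conformal prediction, the following is a minimal sketch of the standard split conformal procedure for a scalar prediction such as a volume. It is not the modified method introduced in this work, which aims at narrower intervals; the function names and the toy calibration values are hypothetical.

```python
import math

def conformal_quantile(scores, alpha):
    """Return the ceil((n+1)(1-alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this coverage level
    return sorted(scores)[k - 1]

def split_conformal_interval(cal_preds, cal_targets, new_pred, alpha=0.1):
    """Symmetric interval around new_pred using absolute calibration residuals."""
    scores = [abs(y - p) for p, y in zip(cal_preds, cal_targets)]
    q = conformal_quantile(scores, alpha)
    return new_pred - q, new_pred + q

# Toy calibration set (hypothetical volumes in mL): residuals are 1, 2, ..., 19.
cal_preds = [500.0] * 19
cal_targets = [500.0 + i for i in range(1, 20)]

# With n = 19 and alpha = 0.1, the 18th smallest residual (18.0) is used.
low, high = split_conformal_interval(cal_preds, cal_targets, 600.0, alpha=0.1)
```

Under the usual exchangeability assumption between calibration and test exams, an interval built this way covers the true value with probability at least 1 - alpha.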

A case study of AmnioML use in a medical setting is also reported. Segmentations were rated by specialists from 1 to 5:

1. Worse than automatic thresholding;
2. Same quality as automatic thresholding;
3. A lot of manual adjustments were necessary;
4. A few manual adjustments were necessary; and
5. No manual adjustments were necessary.

Ratings (1) and (2) compare AmnioML against thresholding, a basic but popular first-step color filtering technique commonly used in fetal segmentation that typically requires extensive refinement. Ratings (3)-(5) indicate the level of manual work required to post-process AmnioML’s automatic segmentation beyond what is provided by simple thresholding.
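As a rough illustration of the thresholding baseline mentioned above, the sketch below marks every voxel at or above an intensity cutoff as foreground. The cutoff value, the direction of the comparison, and the toy slice are assumptions for illustration, not settings taken from the case study.

```python
import numpy as np

def threshold_segmentation(volume, cutoff):
    """Mark every voxel with intensity >= cutoff as foreground."""
    return np.asarray(volume) >= cutoff

# Toy 3x3 slice with hypothetical intensities.
slice_ = np.array([[10, 200, 220],
                   [15, 210,  30],
                   [12,  18, 205]])
mask = threshold_segmentation(slice_, cutoff=128)
```

A mask produced this way typically includes any other tissue of similar intensity, which is why it needs the extensive manual refinement described above.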

Real-world clinical benefits include up to 20x reduction in segmentation time, with over 60% of segmentations requiring no further human intervention. AmnioML’s volume predictions were found to be highly accurate in practice, with a mean absolute error below 55 mL and tight predictive intervals.
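The two quality measures named here, mean absolute error and interval coverage, can be computed as in the following sketch. The numbers are made-up toy values in mL, not results from the case study, and the function names are hypothetical.

```python
def mean_absolute_error(preds, targets):
    """Average of |prediction - target| over all exams."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)

def empirical_coverage(intervals, targets):
    """Fraction of targets that fall inside their predictive interval."""
    hits = sum(lo <= t <= hi for (lo, hi), t in zip(intervals, targets))
    return hits / len(targets)

# Toy values (hypothetical, in mL).
preds = [480.0, 1020.0, 750.0, 610.0]
targets = [500.0, 1000.0, 760.0, 700.0]
intervals = [(p - 50.0, p + 50.0) for p in preds]

mae = mean_absolute_error(preds, targets)            # (20 + 20 + 10 + 90) / 4
coverage = empirical_coverage(intervals, targets)    # 3 of 4 targets covered
```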

Segmentation times with and without the aid of AmnioML, split by the rating of predictions (ratings of 1 and 2 were negligible and are not displayed). The average time reduction was 20x, reaching around 60x in many cases.

The code, both for reproducing the results in the paper and for our deployed solution (a plugin for 3D Slicer), is available at dccsillag/amnioml.