In this work, we compared the results of a caries detection system, developed by us at ParallelDots, with those of three practicing dentists. We found that our system has a higher agreement (F-score) with clinically verified ground truth than each of the three dentists individually (the difference between the system's F-score and the average F-score of the dentists is over 17%). Our system also has higher sensitivity than each of the dentists individually, and hence can be used as a tool to ease the work of dentists by suggesting possible caries that they can then verify and treat. A breakdown of the metrics from our test is given in the attached table. Please note that Recall and Precision are the names the Machine Learning community generally uses for Sensitivity and Positive Predictive Value, respectively.
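For readers less familiar with these metrics, here is a minimal sketch of how Precision (Positive Predictive Value), Recall (Sensitivity) and the F-score relate to each other. It assumes the balanced F1 form of the F-score, which may differ from the exact variant used in our paper. The function name `detection_metrics` and the counts in the example are hypothetical and chosen only for illustration; they are not results from our test.

```python
def detection_metrics(true_pos, false_pos, false_neg):
    """Return (precision, recall, f_score) from raw detection counts.

    Assumption: the F-score is the balanced F1 (harmonic mean of
    precision and recall). Each ratio falls back to 0.0 when its
    denominator is zero.
    """
    predicted = true_pos + false_pos   # everything flagged as caries
    actual = true_pos + false_neg      # every caries in the ground truth

    precision = true_pos / predicted if predicted else 0.0  # PPV
    recall = true_pos / actual if actual else 0.0           # sensitivity

    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Hypothetical counts, for illustration only (not data from the study).
precision, recall, f_score = detection_metrics(true_pos=8, false_pos=2, false_neg=2)
print(f"precision={precision:.2f} recall={recall:.2f} f_score={f_score:.2f}")
```

A reader with higher sensitivity misses fewer caries (fewer false negatives), possibly at the cost of more false positives. That trade-off suits a suggestion tool, since the dentist verifies each flagged region before treating it.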
Our paper detailing the experiments performed has been accepted at the NIPS 2017 Workshop on Machine Learning for Health, which is being held on the theme "What Parts of Healthcare are Ripe for Disruption by Machine Learning Right Now?". NIPS (Neural Information Processing Systems) is among the top Machine Learning conferences globally and has two tracks of papers and multiple focused workshops.