News – AI for Surgery Lab at JHU

Tae Soo Kim et al., IPCAI 2019, Rennes, France

[expand title="More Info"]

PURPOSE:

Objective assessment of intraoperative technical skill is necessary for technology to improve patient care through surgical training. Our objective in this study was to develop and validate deep learning techniques for technical skill assessment using videos of the surgical field.

METHODS:

We used a data set of 99 videos of capsulorhexis, a critical step in cataract surgery. One expert surgeon annotated each video for technical skill using a standard structured rating scale, the International Council of Ophthalmology's Ophthalmology Surgical Competency Assessment Rubric:phacoemulsification (ICO-OSCAR:phaco). Using two capsulorhexis indices in this scale (commencement of flap and follow-through; formation and completion), we labeled a performance as expert when at least one of the indices was 5 and the other index was at least 4, and as novice otherwise. In addition, we used scores for capsulorhexis commencement and capsulorhexis formation as separate ground truths (Likert scale of 2 to 5; analyzed as 2/3, 4 and 5). We crowdsourced annotations of instrument tips. We separately modeled instrument trajectories and optical flow using temporal convolutional neural networks to predict a skill class (expert/novice) and a score on each item for capsulorhexis in ICO-OSCAR:phaco. We evaluated the algorithms in a five-fold cross-validation and computed accuracy and area under the receiver operating characteristic curve (AUC).
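The labeling rule described above can be written out as a short sketch. The function names `skill_class` and `score_bin` are hypothetical and chosen only for illustration; the thresholds and the 2/3, 4, 5 grouping come from the text, and nothing here is taken from the study's own code.

```python
def skill_class(commencement: int, formation: int) -> str:
    """Binary skill label from the two capsulorhexis items (each scored 2 to 5).

    Expert: at least one item is 5 and the other is at least 4.
    Novice: everything else.
    """
    higher = max(commencement, formation)
    lower = min(commencement, formation)
    return "expert" if higher == 5 and lower >= 4 else "novice"


def score_bin(score: int) -> str:
    """Collapse a single item score into the three analysis classes."""
    if score not in (2, 3, 4, 5):
        raise ValueError(f"score outside the 2 to 5 scale: {score}")
    return "2/3" if score <= 3 else str(score)


# A (5, 4) rating counts as expert; a (4, 4) or (5, 3) rating does not.
print(skill_class(5, 4), skill_class(4, 4), skill_class(5, 3))
print(score_bin(3), score_bin(4), score_bin(5))
```

Under this rule, a single top score is not enough on its own: the second item must also be 4 or higher.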

RESULTS:

The accuracy and AUC were 0.848 and 0.863 for instrument tip velocities, and 0.634 and 0.803 for optical flow fields, respectively.

CONCLUSIONS:

Deep neural networks effectively model surgical technical skill in capsulorhexis given structured representation of intraoperative data such as optical flow fields extracted from video or crowdsourced tool localization information.

[/expand]

Yu et al., JAMA Network Open, 2019

[expand title="More Info"]

PubMed

Importance

Competence in cataract surgery is a public health necessity, and videos of cataract surgery are routinely available to educators and trainees but currently are of limited use in training. Machine learning and deep learning techniques can yield tools that efficiently segment videos of cataract surgery into constituent phases for subsequent automated skill assessment and feedback.

Objective

To evaluate machine learning and deep learning algorithms for automated phase classification of manually presegmented phases in videos of cataract surgery.

Design, Setting, and Participants

This was a cross-sectional study using a data set of videos from a convenience sample of 100 cataract procedures performed by faculty and trainee surgeons in an ophthalmology residency program from July 2011 to December 2017. Demographic characteristics for surgeons and patients were not captured. Ten standard labels in the procedure and 14 instruments used during surgery were manually annotated, which served as the ground truth.

Exposures

Five algorithms with different input data: (1) a support vector machine input with cross-sectional instrument label data; (2) a recurrent neural network (RNN) input with a time series of instrument labels; (3) a convolutional neural network (CNN) input with cross-sectional image data; (4) a CNN-RNN input with a time series of images; and (5) a CNN-RNN input with time series of images and instrument labels. Each algorithm was evaluated with 5-fold cross-validation.
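As a rough illustration of 5-fold cross-validation on a data set like this one, the sketch below partitions video identifiers into five folds and yields one train/test pair per fold. It assumes folds are formed over whole videos so that no video contributes to both sides of a split; the abstract does not say how the folds were drawn, and the helper name `k_fold_splits`, the seed, and the integer video IDs are all hypothetical.

```python
import random


def k_fold_splits(video_ids, k=5, seed=0):
    """Yield (train_ids, test_ids) pairs, one per fold, split by whole video."""
    ids = list(video_ids)
    random.Random(seed).shuffle(ids)          # reproducible shuffle
    folds = [ids[i::k] for i in range(k)]     # k near-equal, disjoint folds
    for i in range(k):
        test_ids = folds[i]
        train_ids = [v for j, fold in enumerate(folds) if j != i for v in fold]
        yield train_ids, test_ids


# With 100 videos, each fold holds out 20 videos and trains on the other 80.
for train_ids, test_ids in k_fold_splits(range(100)):
    print(len(train_ids), len(test_ids))
```

Each video appears in exactly one test fold, so every procedure is scored once by a model that never saw it during training.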

Main Outcomes and Measures

Accuracy, area under the receiver operating characteristic curve, sensitivity, specificity, and precision.
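For a multi-class problem such as phase classification, these measures are commonly computed per phase by treating that phase as the positive class and all others as negative. The sketch below shows that one-vs-rest calculation on made-up labels; the function name `one_vs_rest_metrics` and the toy data are hypothetical, and the abstract does not state that the study computed its figures exactly this way. The area under the receiver operating characteristic curve is left out because it needs predicted scores rather than hard labels.

```python
def one_vs_rest_metrics(y_true, y_pred, phase):
    """Accuracy, sensitivity, specificity and precision for a single phase."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == phase and p == phase for t, p in pairs)
    fn = sum(t == phase and p != phase for t, p in pairs)
    fp = sum(t != phase and p == phase for t, p in pairs)
    tn = sum(t != phase and p != phase for t, p in pairs)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Toy example: a rare phase (2 of 20 segments), of which only one is detected.
y_true = ["incision"] * 2 + ["other"] * 18
y_pred = ["incision"] + ["other"] * 19
print(one_vs_rest_metrics(y_true, y_pred, "incision"))
```

In the toy example, accuracy and specificity stay high even though half of the rare phase is missed, because the many correctly rejected segments dominate both measures.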

Results

Unweighted accuracy for the 5 algorithms ranged between 0.915 and 0.959. Area under the receiver operating characteristic curve for the 5 algorithms ranged between 0.712 and 0.773, with small differences among them. The area under the receiver operating characteristic curve for the image-only CNN-RNN (0.752) was significantly greater than that of the CNN with cross-sectional image data (0.712) (difference, −0.040; 95% CI, −0.049 to −0.033) and the CNN-RNN with images and instrument labels (0.737) (difference, 0.016; 95% CI, 0.014 to 0.018). While specificity was uniformly high for all phases with all 5 algorithms (range, 0.877 to 0.999), sensitivity ranged between 0.005 (95% CI, 0.000 to 0.015) for the support vector machine for wound closure (corneal hydration) and 0.974 (95% CI, 0.957 to 0.991) for the RNN for main incision. Precision ranged between 0.283 and 0.963.

Conclusions and Relevance

Time series modeling of instrument labels and video images using deep learning techniques may yield useful tools for the automated detection of phases in cataract surgery procedures.