2016_09_08__DeepLearningSeminar – Research Group CAMMA

Date & Time: Thursday, September 8th, 2016 at 2pm

Location: amphitheater ‘L. HIRSCH’, IRCAD

Speakers:

Fausto Milletari: “Neural networks for volumetric medical image segmentation”

Abstract:
“Convolutional Neural Networks (CNNs) have recently been employed to solve problems in both computer vision and medical image analysis. Despite their popularity, most approaches can only process 2D images, while most medical data used in clinical practice consists of 3D volumes. In this talk I will introduce two approaches to 3D image segmentation based on volumetric neural networks. The first approach (Hough-CNN) uses Hough voting coupled with CNNs to deliver robust segmentation of poorly visible anatomies in both ultrasound and MRI. By imposing implicit shape constraints on the final segmentation outcomes, and thanks to the anatomy localisation capabilities of this method, we can apply our algorithm to very challenging datasets. The second approach introduces a CNN trained end-to-end on MRI volumes depicting the prostate, which learns to predict the segmentation for the whole volume at once. We introduce a novel objective function, based on the Dice coefficient, that we optimise during training. In this way we can deal with situations where there is a strong imbalance between the number of foreground and background voxels. Our experiments showed that our approaches achieve excellent performance on challenging test data while being time efficient and fully automatic.”
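
To illustrate why a Dice-based objective helps with foreground/background imbalance, here is a minimal sketch in plain Python. It is not the speaker's implementation: the function names, the squared-term denominator, and the smoothing constant `eps` are assumptions chosen for illustration, and the exact formulation presented in the talk may differ.

```python
def soft_dice(pred, target, eps=1e-7):
    """Soft Dice overlap between two flattened volumes.

    pred   -- predicted foreground probabilities in [0, 1], one per voxel
    target -- ground-truth labels (0.0 or 1.0), one per voxel
    eps    -- hypothetical smoothing term that avoids division by zero
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    # Assumed variant: squared terms in the denominator.
    denominator = sum(p * p for p in pred) + sum(t * t for t in target)
    return (2.0 * intersection + eps) / (denominator + eps)


def dice_loss(pred, target):
    """Loss to minimise during training: 1 minus the soft Dice overlap."""
    return 1.0 - soft_dice(pred, target)


# Toy "volume": 10 foreground voxels out of 1000.
target = [1.0] * 10 + [0.0] * 990
all_background = [0.0] * 1000

# Per-voxel accuracy rewards the trivial all-background prediction...
accuracy = sum(p == t for p, t in zip(all_background, target)) / len(target)
# ...whereas the Dice loss penalises it almost maximally.
loss = dice_loss(all_background, target)
```

In this toy case the all-background prediction is correct on 99% of voxels, yet its Dice loss is close to 1, so the objective cannot be satisfied by ignoring the small foreground region.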

Short Bio:
Fausto Milletari has been a Ph.D. candidate at the Technical University of Munich (TUM) since October 2013. After earning his M.Sc. in informatics with high distinction, he joined the Chair for Computer Aided Medical Procedures, directed by Professor Nassir Navab. Fausto’s major research topic is segmentation of ultrasound images of the brain. In addition, he works on a variety of other computer vision problems, such as object tracking and detection. His work focuses on pattern recognition and machine learning, and in particular on voting-based approaches using state-of-the-art learning techniques. Several of his contributions have been presented at recent editions of MICCAI, IPCAI, and BMVC. Outside of the lab, Fausto strives to spread scientific knowledge about machine vision to a wider audience. He recently founded the computer vision and medical image analysis meetup group of Munich, which hosts monthly events that bring together academics and industry representatives interested in the field.

Wadim Kehl: “Deep Learning of Local RGB-D Patches for 3D Object Detection and 6D Pose Estimation”

Abstract:
We present a 3D object detection method that uses regressed descriptors of locally-sampled RGB-D patches for 6D vote casting. For regression, we employ a convolutional auto-encoder that has been trained on a large collection of random local patches. During testing, scene patch descriptors are matched against a database of synthetic model view patches and cast 6D object votes which are subsequently filtered to refined hypotheses. We evaluate on three datasets to show that our method generalizes well to previously unseen input data, delivers robust detection results that compete with and surpass the state-of-the-art while being scalable in the number of objects.

Short Bio:
Wadim Kehl is a PhD student in the computer vision group at Nassir Navab’s CAMP chair, supervised by Slobodan Ilic and Federico Tombari and supported by Toyota Motor Corporation. His main research focus is reconstruction and scalable detection of 3D objects using RGB-D imagery. Recently, he started to look into deep-learning techniques to replace methods based on hand-crafted features.

Robert DiPietro: “Recognising surgical activities with recurrent neural networks”

Abstract:
We apply recurrent neural networks to the task of recognizing surgical activities from robot kinematics. Prior work in this area focuses on recognizing short, low-level activities, or gestures, and has been based on variants of hidden Markov models and conditional random fields. In contrast, we work on recognizing both gestures and longer, higher-level activities, or maneuvers, and we model the mapping from kinematics to gestures/maneuvers with recurrent neural networks. To our knowledge, we are the first to apply recurrent neural networks to this task. Using a single model and a single set of hyperparameters, we match state-of-the-art performance for gesture recognition and advance state-of-the-art performance for maneuver recognition, in terms of both accuracy and edit distance. Code is available at https://github.com/rdipietro/miccai-2016-surgical-activity-rec
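
For readers unfamiliar with edit distance as a recognition metric, the following is a minimal sketch of one plausible way to compute it. It is not taken from the speaker's repository: the helper names, the step of collapsing per-frame labels into segments, and the gesture labels are assumptions made for illustration.

```python
from itertools import groupby


def to_segments(frame_labels):
    """Collapse consecutive identical frame labels into one segment label."""
    return [label for label, _ in groupby(frame_labels)]


def edit_distance(a, b):
    """Levenshtein distance between two label sequences.

    Counts the minimum number of insertions, deletions, and substitutions
    needed to turn sequence `a` into sequence `b`.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical per-frame gesture labels for a short recording.
truth_frames = ["G1", "G1", "G2", "G2", "G3"]
pred_frames = ["G1", "G2", "G2", "G2", "G2"]

truth_segments = to_segments(truth_frames)  # ["G1", "G2", "G3"]
pred_segments = to_segments(pred_frames)    # ["G1", "G2"]
distance = edit_distance(pred_segments, truth_segments)
```

Frame-wise accuracy and segment-level edit distance capture different errors: the former penalises every mislabelled frame, while the latter only counts mistakes in the order and identity of the recognised activities.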

Short Bio:
Robert DiPietro is a PhD student in the Department of Computer Science at Johns Hopkins, where he is advised by Prof. Gregory D. Hager and Prof. Nassir Navab. His research focuses primarily on recurrent-neural-network based models of time-series data and on applications in the health-care domain. Previously, he was an associate research-staff member at MIT Lincoln Laboratory and a BS/MS student at Northeastern University.