Unlocking the potential of neural networks in resource and data constrained environments
Ramesh Raskar, PhD
Associate Professor of Media Arts and Sciences, MIT
Dan Raviv, PhD
School of Electrical Engineering, Faculty of Engineering, Tel-Aviv University, Israel
David Cox, PhD
Assistant Professor of Molecular & Cellular Biology & Computer Science, Center for Brain Science, Harvard University
Data-driven methods based on deep neural networks (DNNs) have ushered in a new era in machine learning and computer vision. Conventional algorithmic approaches are being replaced by end-to-end deep learning systems that can leverage big data. Deep learning has begun revolutionizing human-centric fields such as health care and finance, finding its way into automated screening and diagnosis. At present, developing and training artificial neural network architectures demands both human expertise and labor: millions of labeled data points for training and hours of engineering effort to develop the best-performing architectures.
In this dissertation, my goal is to make deep learning more accessible by developing algorithms for low-shot learning (learning from a few examples). This work includes new semi-supervised approaches for learning from datasets in which only a fraction of the examples are labeled, deep learning methods for learning from data generated with simulation-based techniques, and methods for learning to optimize neural networks for smaller datasets. Specifically, this dissertation focuses on three questions, whose answers contribute toward both technical and conceptual advances in the literature:
1.) How can we use invariant-based approaches when training from small datasets?
2.) How can we enable training from multiple data sources, each carrying a very small amount of data?
3.) How can we use a meta-modeling approach to automatically generate high-performing DNNs?
To address these questions, this dissertation describes the following machine learning algorithms: (a) an action recognition autoencoder that learns from very small datasets; (b) an algorithm to train deep neural networks over multiple entities, each carrying a small amount of data; and (c) a meta-modeling approach to automatically generate high-performing architectures for small datasets. We also provide the following new datasets, made publicly available to researchers through interactive online resources: (i) a dataset of facial video clips containing 250 million face images annotated with facial landmark locations and (ii) a dataset of neural network topologies used for predicting the accuracy of a topology before training.