Evan Shelhamer

I recently earned my PhD in computer science from UC Berkeley, where I was advised by Trevor Darrell as part of BAIR.

I believe in DIY science and open tooling for research and engineering.
I was the lead developer of the Caffe deep learning framework from version 0.1 to 1.0, and I still engage in open sourcery when I can.

Before Berkeley, I earned dual degrees in computer science (artificial intelligence concentration) and psychology at UMass Amherst, where I was advised by Erik Learned-Miller.

I take my coffee black.

shelhamer@cs.berkeley.edu  /  Google Scholar  /  GitHub  /  CV


I'm interested in computer vision and machine learning, in particular the reconciliation of visual structure with end-to-end learning, plus dynamic inference by adaptive model complexity and computation.

Selected Projects

Fully Convolutional Networks for Semantic Segmentation
Evan Shelhamer*, Jon Long*, Trevor Darrell   (*equal contribution)
PAMI, 2017
CVPR, 2015   (Best Paper Honorable Mention)
PAMI arxiv / CVPR arxiv / code & models / slides / bib

Fully convolutional networks are machines for image-to-image learning and inference.
These local models alone, trained end-to-end and pixels-to-pixels, improved semantic segmentation accuracy by 30% relative and efficiency by 300x on PASCAL VOC.
Skip connections across layers help resolve what and where.

Caffe Deep Learning Framework
Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, T. Darrell, and our community contributors!
BVLC + BAIR, 2013–2017
ACM MM, 2014   (Winner of the Open Source Software Competition)
project / code / ACM MM'14 arxiv / slides / bib

Caffe is a deep learning framework made with expression, speed, and modularity in mind. The deep learning shift was in part a sea change on the wave of open science and toolkits, including Caffe and its Model Zoo.

Blurring the Line between Structure and Learning to Optimize and Adapt Receptive Fields
Evan Shelhamer, Dequan Wang, Trevor Darrell
In submission, 2019
arxiv / slides / bib

Composing structured Gaussian filters with free-form filters, and learning both, optimizes over receptive field size and shape. In effect this controls the degree of locality: changes in our parameters would require changes in architecture for standard networks. Dynamic inference adapts receptive field size to cope with scale variation.
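As a rough illustration of the composition idea, the sketch below convolves a 1D Gaussian with a small free-form filter. It is a toy under stated assumptions, not the method from the paper: the 1D setting, the truncation at three standard deviations, and the names `gaussian_kernel`, `convolve_full`, and `composed_filter` are all hypothetical, and no learning is shown. It only shows that changing `sigma` changes the extent of the composed filter while the free-form filter keeps the same number of parameters.

```python
import math


def gaussian_kernel(sigma, truncate=3.0):
    """Normalized 1D Gaussian sampled at integer offsets within truncate * sigma."""
    radius = max(1, int(math.ceil(truncate * sigma)))
    weights = [math.exp(-0.5 * (x / sigma) ** 2) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_full(a, b):
    """Full 1D convolution of two kernels; output length is len(a) + len(b) - 1."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def composed_filter(free_form, sigma):
    """Effective filter from applying a Gaussian and then a free-form filter."""
    return convolve_full(gaussian_kernel(sigma), free_form)


# Hypothetical 3-tap free-form filter (a simple difference filter).
free_form = [-1.0, 0.0, 1.0]
small = composed_filter(free_form, sigma=1.0)
large = composed_filter(free_form, sigma=3.0)
print(len(small), len(large))
```

Here `small` has 9 taps and `large` has 21, while `free_form` keeps its three parameters in both cases. In a standard network, covering the larger extent would instead mean changing kernel sizes, dilation, or depth.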

Dynamic Scale Inference by Entropy Minimization
Dequan Wang*, Evan Shelhamer*, Bruno Olshausen, Trevor Darrell
In submission, 2019
arxiv / bib

Unsupervised optimization during inference gives top-down feedback to iteratively adjust feedforward prediction. Minimizing output entropy with respect to model parameters optimizes for certainty, and tunes the model to each test input. Extending dynamic scale inference with this optimization refines predictions and generalizes better to scale shifts.
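A minimal sketch of entropy minimization at test time, under loud assumptions: the two-class toy model in `predict`, its single scalar parameter `theta`, and the finite-difference gradient are all hypothetical stand-ins chosen to keep the example self-contained. A real network would use backpropagation and update many parameters; nothing here reproduces the paper's model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def predict(theta, x, centers=(-1.0, 1.0)):
    """Toy classifier: logits are negative squared distances from theta * x to class centers."""
    z = theta * x
    return softmax([-(z - c) ** 2 for c in centers])


def minimize_entropy(theta, x, steps=10, lr=0.1, eps=1e-4):
    """Gradient descent on output entropy for one test input, with no labels."""
    for _ in range(steps):
        # Central finite difference stands in for backpropagation.
        grad = (entropy(predict(theta + eps, x)) - entropy(predict(theta - eps, x))) / (2 * eps)
        theta -= lr * grad
    return theta


x = 0.5
theta_before = 0.4
theta_after = minimize_entropy(theta_before, x)
print(entropy(predict(theta_before, x)), entropy(predict(theta_after, x)))
```

The second printed entropy is lower than the first: the parameter has been tuned to this one input so that the prediction is more certain. Certainty is not the same as correctness, so the number of steps and the step size matter in practice.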

More Projects

Infinite Mixture Prototypes for Few-Shot Learning
Kelsey R. Allen, Evan Shelhamer*, Hanul Shin*, Joshua B. Tenenbaum
ICML, 2019
arxiv / bib

Infinite mixture prototypes adaptively adjust model capacity by representing classes as sets of clusters and inferring their number. This handles both simple and complex few-shot tasks, and improves alphabet recognition accuracy by 25% absolute over uni-modal prototypes.
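To make "inferring their number" concrete, here is a small sketch that assumes a single DP-means-style pass over the embedded examples of one class: a point starts a new cluster when it is farther than a threshold from every existing cluster mean. The function names, the threshold `new_cluster_dist2`, and the 2D points are hypothetical, and the inference procedure in the paper may differ.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def class_prototypes(points, new_cluster_dist2):
    """Return one prototype per inferred cluster for the examples of a single class."""
    clusters = []
    for p in points:
        if clusters:
            means = [mean(c) for c in clusters]
            k = min(range(len(means)), key=lambda i: dist2(p, means[i]))
            if dist2(p, means[k]) <= new_cluster_dist2:
                clusters[k].append(p)
                continue
        # No existing cluster is close enough, so this point starts a new one.
        clusters.append([p])
    return [mean(c) for c in clusters]


# Hypothetical embeddings: one compact class and one class with two modes.
simple_class = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
complex_class = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
simple_protos = class_prototypes(simple_class, new_cluster_dist2=1.0)
complex_protos = class_prototypes(complex_class, new_cluster_dist2=1.0)
print(len(simple_protos), len(complex_protos))
```

The compact class collapses to a single prototype, as a uni-modal method would give, while the two-mode class gets two prototypes without the number being set in advance.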

Few-shot Segmentation Propagation with Guided Networks
Kate Rakelly*, Evan Shelhamer*, Trevor Darrell, Alexei A. Efros, Sergey Levine
arXiv, 2018
arxiv / code / bib

Extracting a latent task representation from local supervision allows for non-local propagation within and across images with quick updates for real-time interaction.

(Note: this subsumes our ICLRW'18 paper on conditional networks.)

Deep Layer Aggregation
Fisher Yu, Dequan Wang, Evan Shelhamer, Trevor Darrell
CVPR, 2018   (Oral)
arxiv / code / bib

Deepening aggregation, the iterative and hierarchical merging of features across layers, improves recognition and resolution.

Loss Is Its Own Reward: Self-Supervision for Reinforcement Learning
Evan Shelhamer, Parsa Mahmoudieh, Max Argus, Trevor Darrell
ICLRW, 2017
arxiv / slides / bib

Loss is where you find it. With self-supervision for representation learning, experience without reward need not be so unrewarding for reinforcement learning.

Clockwork Convnets for Video Semantic Segmentation
Evan Shelhamer*, Kate Rakelly*, Judy Hoffman*, Trevor Darrell
ECCVW, 2016
arxiv / code / slides / bib

Adaptively computing layers according to their rate of change improves the efficiency of video processing without sacrificing accuracy.
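One way to picture this is a stage that caches its output and recomputes only when its input has changed enough since the last update. The sketch below assumes a threshold on the mean absolute difference of the input; `CachedStage`, `mean_abs_diff`, the threshold value, and the toy frames are hypothetical and are not the update rule or schedule from the paper.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b)) / len(a)


class CachedStage:
    """Wraps a stage so that it is recomputed only when its input changes enough."""

    def __init__(self, fn, threshold):
        self.fn = fn
        self.threshold = threshold
        self.last_input = None
        self.last_output = None
        self.updates = 0

    def __call__(self, x):
        changed = (
            self.last_input is None
            or mean_abs_diff(x, self.last_input) > self.threshold
        )
        if changed:
            # Recompute and remember the input that triggered this update.
            self.last_input = list(x)
            self.last_output = self.fn(x)
            self.updates += 1
        return self.last_output


# Hypothetical "video": five tiny frames with one abrupt change at frame 3.
frames = [[0.0, 0.0], [0.01, 0.0], [0.02, 0.01], [1.0, 1.0], [1.01, 1.0]]
stage = CachedStage(lambda x: [2.0 * v for v in x], threshold=0.1)
outputs = [stage(f) for f in frames]
print(stage.updates, "updates for", len(frames), "frames")
```

On these frames the stage runs twice, on the first frame and on the abrupt change, and reuses its cached output otherwise. The threshold sets the trade between computation saved and staleness of the cached features.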


Tutorial Organizer: DIY Deep Learning with Caffe at CVPR'15 and ECCV'14.


Graduate Student Instructor, CS188 Fall 2013

Graduate Student Instructor, DIY Deep Learning Fall 2014