Code and data

Spatial-Aware Object Embeddings for Zero-Shot Localization and Classification of Actions

The code to generate spatial-aware object embeddings is available on GitHub.


The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection

Caffe model | Prototxt | ImageNet IDs | Number of classes
Bottom-up [4k] model (92 MB) | Bottom-up [4k] deploy | Bottom-up [4k] ids | 4,437
Bottom-up [8k] model (136 MB) | Bottom-up [8k] deploy | Bottom-up [8k] ids | 8,201
Bottom-up [13k] model (192 MB) | Bottom-up [13k] deploy | Bottom-up [13k] ids | 12,988
Top-down [4k] model (87 MB) | Top-down [4k] deploy | Top-down [4k] ids | 4,000
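
Each model above comes with an ids file that ties the network's output units to ImageNet classes. The following is a minimal Python sketch of how such a file might be used to label a score vector. It assumes, without confirmation from this page, that the ids file is plain text with one ImageNet ID per line, in the same order as the model's output units. The function names load_ids and top_k_ids are hypothetical helpers introduced only for illustration.

```python
from typing import List, Sequence, Tuple


def load_ids(path: str) -> List[str]:
    """Read ImageNet IDs from a text file.

    Assumption: one ID per line, ordered like the model's output units.
    Blank lines are skipped.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def top_k_ids(
    scores: Sequence[float], ids: Sequence[str], k: int = 5
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (ImageNet ID, score) pairs, best first."""
    if len(scores) != len(ids):
        raise ValueError("expected exactly one score per ImageNet ID")
    # Pair each ID with its score, then sort by score in descending order.
    ranked = sorted(zip(ids, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

Here the scores would typically be the output-layer values from a forward pass of the matching Caffe model, so the vector length should equal the number of classes listed for that model.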

Spot On: Action Localization from Pointly-Supervised Proposals

The spatio-temporal annotations for Hollywood2Tubes can be downloaded here.


Bag-of-Fragments: Selecting and Encoding Video Fragments for Event Detection and Recounting

The 14,088,893 tags available for 4,770,156 Flickr images in ImageNet can be downloaded here.


Water Detection through Spatio-Temporal Invariant Descriptors

The complete Video Water Database can be downloaded here (warning: 14 GB).