# Video-Caffe: Caffe with C3D implementation and video reader

[![Build Status](https://travis-ci.org/chuckcho/video-caffe.svg?branch=master)](https://travis-ci.org/chuckcho/video-caffe)

This is a 3D convolution (C3D) and video reader implementation in the latest Caffe (as of Dec 2016). The original [Facebook C3D implementation](https://github.com/facebook/C3D/) branched off Caffe on July 17, 2014 at git commit [b80fc86](https://github.com/BVLC/caffe/tree/b80fc862952ba4e068cf74acc0823785ce1cc0e9) and has not been rebased on the original Caffe since, so it misses quite a few features added to Caffe in the meantime. I therefore pulled the C3D concept and an accompanying video reader into the latest Caffe, and will try to rebase this repo onto upstream whenever an important new feature lands there. This repo was last rebased on [99bd997](https://github.com/BVLC/caffe/commit/99bd99795dcdf0b1d3086a8d67ab1782a8a08383), on Aug 21, 2018. Please reach out to [me](https://github.com/chuckcho) with any feedback or questions.

Check out the [original Caffe readme](README-original.md) for Caffe-specific information.

## Branches

The [`refactor` branch](https://github.com/chuckcho/video-caffe/tree/refactor) is a recent re-work based on the [original Caffe](https://github.com/BVLC/caffe) and the [Nd convolution and pooling with cuDNN PR](https://github.com/BVLC/caffe/pull/3983). It is a cleaner, less hacky implementation of 3D convolution/pooling than the `master` branch, and is expected to be more stable, so feel free to try it. One feature still missing from the `refactor` branch is the Python wrapper.

## Requirements

In addition to the [prerequisites for Caffe](http://caffe.berkeleyvision.org/installation.html#prerequisites), video-caffe depends on cuDNN. It is known to work with cuDNN versions 4 and 5; building against v3 may take some extra effort.

* If you build with `make`, make sure `Makefile.config` points to the right paths for CUDA and cuDNN.
* If you build with `cmake`, double-check that `CUDNN_INCLUDE` and `CUDNN_LIBRARY` are correct. If not, you may want something like `cmake -DCUDNN_INCLUDE="/your/path/to/include" -DCUDNN_LIBRARY="/your/path/to/lib" ${video-caffe-root}`.

## Building video-caffe

Key steps to build video-caffe are listed below (a full example session follows the list):

1. `git clone git@github.com:chuckcho/video-caffe.git`
2. `cd video-caffe`
3. `mkdir build && cd build`
4. `cmake ..`
5. Make sure CUDA and cuDNN are detected and their paths are correct.
6. `make all -j8`
7. `make install`
8. (optional) `make runtest`
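
Putting the steps together, a complete out-of-source build session might look like the sketch below. The cuDNN paths passed to `cmake` are illustrative assumptions; substitute your actual install locations, or drop the two `-D` flags entirely if `cmake` detects cuDNN on its own.

```
# Sketch of a full build; the cuDNN paths are hypothetical examples --
# replace them with your real install locations.
git clone git@github.com:chuckcho/video-caffe.git
cd video-caffe
mkdir build && cd build
cmake \
  -DCUDNN_INCLUDE="/usr/local/cuda/include" \
  -DCUDNN_LIBRARY="/usr/local/cuda/lib64/libcudnn.so" \
  ..
# Check cmake's configuration summary to confirm CUDA and cuDNN were
# detected, then compile and (optionally) run the tests.
make all -j8
make install
make runtest   # optional
```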

## Usage

Look at [`${video-caffe-root}/examples/c3d_ucf101/c3d_ucf101_train_test.prototxt`](examples/c3d_ucf101/c3d_ucf101_train_test.prototxt) to see how 3D convolution and pooling are used. In a nutshell, use an `NdConvolution` or `NdPooling` layer with `{kernel,stride,pad}_shape` parameters that specify 3D shapes as (L x H x W), where `L` is the temporal length (usually 16 frames).

```
...
# ----- video/label input -----
layer {
  name: "data"
  type: "VideoData"
  top: "data"
  top: "label"
  video_data_param {
    source: "examples/c3d_ucf101/c3d_ucf101_train_split1.txt"
    batch_size: 50
    new_height: 128
    new_width: 171
    new_length: 16
    shuffle: true
  }
  include {
    phase: TRAIN
  }
  transform_param {
    crop_size: 112
    mirror: true
    mean_value: 90
    mean_value: 98
    mean_value: 102
  }
}
...
# ----- 1st group -----
layer {
  name: "conv1a"
  type: "NdConvolution"
  bottom: "data"
  top: "conv1a"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_shape { dim: 3 dim: 3 dim: 3 }
    stride_shape { dim: 1 dim: 1 dim: 1 }
    pad_shape    { dim: 1 dim: 1 dim: 1 }
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
...
layer {
  name: "pool1"
  type: "NdPooling"
  bottom: "conv1a"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_shape { dim: 1 dim: 2 dim: 2 }
    stride_shape { dim: 1 dim: 2 dim: 2 }
  }
}
...
```

## UCF-101 training demo

Scripts and training files for C3D training on UCF-101 are located in [examples/c3d_ucf101/](examples/c3d_ucf101/). Steps to train C3D on UCF-101:

1. Download the UCF-101 dataset from the [UCF-101 website](http://crcv.ucf.edu/data/UCF101.php).
2. Unzip the dataset: e.g. `unrar x UCF101.rar`.
3. (Optional) The video reader works more stably with extracted frames than directly with video files. Extract frames from the UCF-101 videos by revising and running the helper script [`${video-caffe-root}/examples/c3d_ucf101/extract_UCF-101_frames.sh`](examples/c3d_ucf101/extract_UCF-101_frames.sh) (see the ffmpeg sketch after this list).
4. Change `${video-caffe-root}/examples/c3d_ucf101/c3d_ucf101_{train,test}_split1.txt` to correctly point to the UCF-101 videos or the directories that contain the extracted frames.
5. Modify [`${video-caffe-root}/examples/c3d_ucf101/c3d_ucf101_train_test.prototxt`](examples/c3d_ucf101/c3d_ucf101_train_test.prototxt) to match your taste or hardware; in particular, `batch_size` may need to be adjusted to fit GPU memory.
6. Run the training script: e.g. `cd ${video-caffe-root} && examples/c3d_ucf101/train_ucf101.sh` (optionally pass `--gpu` to use multiple GPUs).
7. (Optional) Occasionally run [`${video-caffe-root}/tools/extra/plot_training_loss.sh`](tools/extra/plot_training_loss.sh) to plot training loss and validation accuracy (top-1/5). It's pretty hacky, so look at the file and adapt it to your needs.
8. After 7 epochs of training, clip accuracy should be around 45%.
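
For step 3, the repository ships [`extract_UCF-101_frames.sh`](examples/c3d_ucf101/extract_UCF-101_frames.sh); if you would rather roll your own, a minimal frame-extraction loop in the same spirit could look like the sketch below. The source/destination paths and the JPEG quality setting are assumptions for illustration, not necessarily what the helper script uses.

```
#!/usr/bin/env bash
# Hypothetical minimal frame extractor; SRC_DIR/DST_DIR and the JPEG
# quality are illustrative assumptions.
SRC_DIR=/data/UCF-101          # directory holding the unpacked .avi files
DST_DIR=/data/UCF-101-frames   # per-video frame folders are created here

find "$SRC_DIR" -name '*.avi' | while read -r video; do
  # Mirror the class/video directory layout under $DST_DIR.
  rel=${video#"$SRC_DIR"/}
  out="$DST_DIR/${rel%.avi}"
  mkdir -p "$out"
  # Dump every frame as a numbered JPEG (000001.jpg, 000002.jpg, ...).
  ffmpeg -nostdin -i "$video" -qscale:v 2 "$out/%06d.jpg"
done
```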

A typical training run yields the following loss and top-1 accuracy curves: ![iter-loss-accuracy plot](examples/c3d_ucf101/c3d_ucf101_train_loss_accuracy.png?raw=true "Iteration vs training loss and top-1 accuracy")

## Pre-trained model

A pre-trained model for UCF101 (trained from scratch) is available ([download link](https://www.dropbox.com/s/gglm2c67154nltr/c3d_ucf101_iter_20000.caffemodel?dl=0)); it achieves a top-1 accuracy of ~47%.
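
If you want to build on these weights instead of training from scratch, the standard Caffe command line accepts a `--weights` argument for initializing from a `.caffemodel`. A minimal sketch follows, assuming the downloaded snapshot sits in the repo root and that the solver file is the one used by `train_ucf101.sh` (check that script for the actual path):

```
# Fine-tune from the pre-trained snapshot; the solver path is an
# assumption -- confirm it against examples/c3d_ucf101/train_ucf101.sh.
./build/tools/caffe train \
    --solver=examples/c3d_ucf101/c3d_ucf101_solver.prototxt \
    --weights=c3d_ucf101_iter_20000.caffemodel

# Or evaluate the snapshot directly on the test split.
./build/tools/caffe test \
    --model=examples/c3d_ucf101/c3d_ucf101_train_test.prototxt \
    --weights=c3d_ucf101_iter_20000.caffemodel \
    --iterations=100
```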

## To-do

1. Feature extractor script.
2. Python demo script that loads a video and classifies it.
3. Convert the Sport1M pre-trained model and make it available.

## License and Citation

Caffe is released under the [BSD 2-Clause license](https://github.com/BVLC/caffe/blob/master/LICENSE).

