This is an updated version of 3D-ResNets-PyTorch.
This is the PyTorch code for the following papers:
This code includes training, fine-tuning and testing on Kinetics, Moments in Time, ActivityNet, UCF-101, and HMDB-51.
Note that this code may not exactly reproduce the results of the above papers because some updates are included.
Pre-trained models are available here.
All models are trained on Kinetics.
ResNeXt-101 achieved the best performance in our experiments. (See the paper for details.)
resnet-18-kinetics.pth: --model resnet --model_depth 18 --resnet_shortcut A
resnet-34-kinetics.pth: --model resnet --model_depth 34 --resnet_shortcut A
resnet-34-kinetics-cpu.pth: CPU ver. of resnet-34-kinetics.pth
resnet-50-kinetics.pth: --model resnet --model_depth 50 --resnet_shortcut B
resnet-101-kinetics.pth: --model resnet --model_depth 101 --resnet_shortcut B
resnet-152-kinetics.pth: --model resnet --model_depth 152 --resnet_shortcut B
resnet-200-kinetics.pth: --model resnet --model_depth 200 --resnet_shortcut B
preresnet-200-kinetics.pth: --model preresnet --model_depth 200 --resnet_shortcut B
wideresnet-50-kinetics.pth: --model wideresnet --model_depth 50 --resnet_shortcut B --wide_resnet_k 2
resnext-101-kinetics.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32
densenet-121-kinetics.pth: --model densenet --model_depth 121
densenet-201-kinetics.pth: --model densenet --model_depth 201
Some fine-tuned models on UCF-101 and HMDB-51 (split 1) are also available; a short loading sketch follows the list.
resnext-101-kinetics-ucf101_split1.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32
resnext-101-64f-kinetics-ucf101_split1.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32 --sample_duration 64
resnext-101-kinetics-hmdb51_split1.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32
resnext-101-64f-kinetics-hmdb51_split1.pth: --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32 --sample_duration 64
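Below is a minimal, hedged sketch of how one of these .pth checkpoints can be inspected with plain PyTorch. The key names ('arch', 'state_dict') and the 'module.' prefix handling are assumptions about how the checkpoints were saved (e.g. via nn.DataParallel); print the keys to confirm before relying on them.

```python
import torch

# Load a downloaded checkpoint on the CPU (works without a GPU).
ckpt = torch.load('resnet-34-kinetics.pth', map_location='cpu')
print(ckpt.keys())  # inspect the layout; 'state_dict' is assumed below

# Fall back to the whole object if it is already a plain state dict.
state_dict = ckpt.get('state_dict', ckpt)

# Checkpoints saved from nn.DataParallel usually prefix names with 'module.';
# strip it when loading into a single-GPU/CPU model.
state_dict = {k.replace('module.', '', 1): v for k, v in state_dict.items()}
print(next(iter(state_dict)))  # name of the first parameter tensor
```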
This table shows the accuracy of each model on Kinetics, averaged over top-1 and top-5.
| Method | Average accuracy (%) |
|---|---|
| ResNet-18 | 66.1 | 
| ResNet-34 | 71.0 | 
| ResNet-50 | 72.2 | 
| ResNet-101 | 73.3 | 
| ResNet-152 | 73.7 | 
| ResNet-200 | 73.7 | 
| ResNet-200 (pre-act) | 73.4 | 
| Wide ResNet-50 | 74.7 | 
| ResNeXt-101 | 75.4 | 
| DenseNet-121 | 70.8 | 
| DenseNet-201 | 72.3 | 
- PyTorch (v1.0+)

```bash
conda install pytorch torchvision cudatoolkit -c pytorch
```

- FFmpeg, FFprobe

```bash
wget http://johnvansickle.com/ffmpeg/releases/ffmpeg-release-64bit-static.tar.xz
tar xvf ffmpeg-release-64bit-static.tar.xz
cd ./ffmpeg-3.3.3-64bit-static/; sudo cp ffmpeg ffprobe /usr/local/bin;
```

- Python 3
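Before preprocessing, a short Python check (a sketch, not part of the repository) can confirm that PyTorch and the FFmpeg binaries are visible:

```python
import shutil
import torch

# Quick sanity check of the requirements listed above.
print('PyTorch version:', torch.__version__)          # expect 1.0 or newer
print('CUDA available: ', torch.cuda.is_available())
print('ffmpeg found at: ', shutil.which('ffmpeg'))    # None means not on PATH
print('ffprobe found at:', shutil.which('ffprobe'))
```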
 
ActivityNet:

- Download videos using the official crawler.
- Convert from avi to jpg files using `utils/video_jpg.py`:

```bash
python utils/video_jpg.py avi_video_directory jpg_video_directory
```

- Generate fps files using `utils/fps.py`:

```bash
python utils/fps.py avi_video_directory jpg_video_directory
```

Kinetics:

- Download videos using the official crawler.
  - Locate the test set in `video_directory/test`.
- Convert from avi to jpg files using `utils/video_jpg_kinetics.py`:

```bash
python utils/video_jpg_kinetics.py avi_video_directory jpg_video_directory
```

- Generate n_frames files using `utils/n_frames_kinetics.py` (a sketch of this step is given after the directory layout below):

```bash
python utils/n_frames_kinetics.py jpg_video_directory
```

- Generate an annotation file in JSON format, similar to ActivityNet, using `utils/kinetics_json.py`.
  - The CSV files (kinetics_{train, val, test}.csv) are included in the crawler.

```bash
python utils/kinetics_json.py train_csv_path val_csv_path test_csv_path dst_json_path
```

UCF-101:

- Download videos and train/test splits here.
- Convert from avi to jpg files using `utils/video_jpg_ucf101_hmdb51.py`:

```bash
python utils/video_jpg_ucf101_hmdb51.py avi_video_directory jpg_video_directory
```

- Generate n_frames files using `utils/n_frames_ucf101_hmdb51.py`:

```bash
python utils/n_frames_ucf101_hmdb51.py jpg_video_directory
```

- Generate an annotation file in JSON format, similar to ActivityNet, using `utils/ucf101_json.py`.
  - `annotation_dir_path` includes classInd.txt, trainlist0{1, 2, 3}.txt, and testlist0{1, 2, 3}.txt.

```bash
python utils/ucf101_json.py annotation_dir_path
```

HMDB-51:

- Download videos and train/test splits here.
- Convert from avi to jpg files using `utils/video_jpg_ucf101_hmdb51.py`:

```bash
python utils/video_jpg_ucf101_hmdb51.py avi_video_directory jpg_video_directory
```

- Generate n_frames files using `utils/n_frames_ucf101_hmdb51.py`:

```bash
python utils/n_frames_ucf101_hmdb51.py jpg_video_directory
```

- Generate an annotation file in JSON format, similar to ActivityNet, using `utils/hmdb51_json.py`.
  - `annotation_dir_path` includes brush_hair_test_split1.txt, ...

```bash
python utils/hmdb51_json.py annotation_dir_path
```

Assume the structure of data directories is the following:
```
~/
  data/
    kinetics_videos/
      jpg/
        .../ (directories of class names)
          .../ (directories of video names)
            ... (jpg files)
    results/
      save_100.pth
    kinetics.json
```
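For reference, the n_frames step above amounts to counting the extracted jpg frames of each video and storing that count beside them. The following is a simplified, hedged stand-in for the bundled utils/n_frames_*.py scripts; the image_*.jpg pattern and the n_frames file name are assumptions about the frame-extraction convention, so prefer the provided utilities for actual preprocessing.

```python
import glob
import os
import sys

def write_n_frames(jpg_video_directory):
    """Write an n_frames file into every video directory.

    Assumes the layout shown above: class directories containing video
    directories, each holding frames named like image_00001.jpg (the naming
    is an assumption; adapt the glob pattern to your extraction step).
    """
    for class_dir in sorted(os.listdir(jpg_video_directory)):
        class_path = os.path.join(jpg_video_directory, class_dir)
        if not os.path.isdir(class_path):
            continue
        for video_dir in sorted(os.listdir(class_path)):
            video_path = os.path.join(class_path, video_dir)
            if not os.path.isdir(video_path):
                continue
            n_frames = len(glob.glob(os.path.join(video_path, 'image_*.jpg')))
            with open(os.path.join(video_path, 'n_frames'), 'w') as f:
                f.write(str(n_frames))

if __name__ == '__main__':
    write_n_frames(sys.argv[1])  # e.g. python count_frames.py jpg_video_directory
```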
Confirm all options:

```bash
python main.py -h
```

Train ResNet-34 on the Kinetics dataset (400 classes) with 4 CPU threads (for data loading).
Batch size is 128.
Models are saved every 5 epochs.
All GPUs are used for training.
If you want to use only a subset of GPUs, set CUDA_VISIBLE_DEVICES=....
```bash
python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --model resnet \
--model_depth 34 --n_classes 400 --batch_size 128 --n_threads 4 --checkpoint 5
```

Continue training from epoch 101 (~/data/results/save_100.pth is loaded).
```bash
python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --resume_path results/save_100.pth \
--model_depth 34 --n_classes 400 --batch_size 128 --n_threads 4 --checkpoint 5
```

Fine-tune conv5_x and fc layers of a pretrained model (~/data/models/resnet-34-kinetics.pth) on UCF-101 (a sketch of this layer selection follows the command below).
```bash
python main.py --root_path ~/data --video_path ucf101_videos/jpg --annotation_path ucf101_01.json \
--result_path results --dataset ucf101 --n_classes 400 --n_finetune_classes 101 \
--pretrain_path models/resnet-34-kinetics.pth --ft_begin_index 4 \
--model resnet --model_depth 34 --resnet_shortcut A --batch_size 128 --n_threads 4 --checkpoint 5
```
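To illustrate what fine-tuning only conv5_x and fc means in PyTorch terms, here is a hedged sketch that freezes every parameter except those in the last residual stage (layer4, i.e. conv5_x) and the classifier (fc), and hands only the trainable ones to the optimizer. It uses a 2D torchvision ResNet-34 as a stand-in for the 3D model and assumes standard ResNet module names; in this repository, the --ft_begin_index option shown above is the supported way to select the layers to fine-tune.

```python
import torch
import torchvision

def fine_tuning_parameters(model, trainable_prefixes=('layer4', 'fc')):
    """Freeze all parameters except those whose names start with the given
    prefixes (assumed here to correspond to conv5_x and the final classifier)."""
    trainable = []
    for name, param in model.named_parameters():
        if name.startswith(trainable_prefixes):
            param.requires_grad = True
            trainable.append(param)
        else:
            param.requires_grad = False
    return trainable

# Stand-in model: a 2D ResNet-34 with 101 output classes (UCF-101).
model = torchvision.models.resnet34(num_classes=101)
params = fine_tuning_parameters(model)
optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9)
print(sum(p.numel() for p in params), 'trainable parameters')
```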