PytorchInsight

This is a PyTorch library with state-of-the-art architectures, pretrained models, and continuously updated results.

This repository aims to accelerate deep learning research by making results reproducible and experiments easier to run, all in PyTorch.

Included papers (to be updated):

Attention Models

  • SENet: Squeeze-and-Excitation Networks (paper) (a minimal sketch of this block follows this list)
  • SKNet: Selective Kernel Networks (paper)
  • CBAM: Convolutional Block Attention Module (paper)
  • GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond (paper)
  • BAM: Bottleneck Attention Module (paper)
  • SGENet: Spatial Group-wise Enhance: Enhancing Semantic Feature Learning in Convolutional Networks (paper)
  • SRMNet: SRM: A Style-based Recalibration Module for Convolutional Neural Networks (paper)
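
As a concrete flavor of what these attention modules look like, below is a minimal, simplified sketch of an SE block (illustrative only; `SEBlock` and its layout are my naming, not necessarily how this repo structures its model files):

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: reweight channels using globally pooled context."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: B x C x 1 x 1
        self.fc = nn.Sequential(                     # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel gates in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # recalibrate feature maps
```

The other modules above (SK, CBAM, GC, BAM, SGE, SRM) follow the same pattern of computing a cheap attention signal and rescaling features with it, differing in where the signal comes from (channel, spatial, or grouped context).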

Non-Attention Models

  • OctNet: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution (paper)
  • imagenet_tricks.py: Bag of Tricks for Image Classification with Convolutional Neural Networks (paper)
  • Understanding the Disharmony between Weight Normalization Family and Weight Decay: e-shifted L2 Regularizer (to appear)
  • Generalization Bound Regularizer: A Unified Framework for Understanding Weight Decay (to appear)
  • mixup: Beyond Empirical Risk Minimization (paper) (a minimal sketch follows this list)
  • CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features (paper)
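
Since mixup and CutMix are training-time augmentations rather than architectures, a short sketch may help. This is a generic mixup step under common defaults (the function name and alpha value are illustrative, not this repo's API):

```python
import numpy as np
import torch
import torch.nn.functional as F

def mixup_batch(x, y, alpha=0.2):
    """Blend each image with a random partner from the batch; keep both labels."""
    lam = np.random.beta(alpha, alpha)                # mixing coefficient
    perm = torch.randperm(x.size(0), device=x.device)
    x_mixed = lam * x + (1 - lam) * x[perm]
    return x_mixed, y, y[perm], lam

# Training step: the loss is the same convex combination of the two targets.
# out = model(x_mixed)
# loss = lam * F.cross_entropy(out, y_a) + (1 - lam) * F.cross_entropy(out, y_b)
```

CutMix replaces the convex blend with a pasted rectangular patch, with lam set to the patch's area ratio.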

Trained Models and Performance Table

Single-crop validation error on ImageNet-1k (center 224x224 crop from a resized image with shorter side = 256).

Classification training settings for medium and large models:

|  |  |
| :-: | :-: |
| Details | RandomResizedCrop, RandomHorizontalFlip; 0.1 init lr, 100 total epochs, decay at every 30 epochs; SGD with naive softmax cross entropy loss, 1e-4 weight decay, 0.9 momentum; 8 GPUs, 32 images per GPU |
| Examples | ResNet50 |
| Note | The newest code adds one default operation: setting all bias wd = 0 (see the sketch below); please refer to the theoretical analysis in "Generalization Bound Regularizer: A Unified Framework for Understanding Weight Decay" (to appear). Training accuracy is thereby slightly boosted. |

Classification training settings for mobile/small models:

|  |  |
| :-: | :-: |
| Details | RandomResizedCrop, RandomHorizontalFlip; 0.4 init lr, 300 total epochs, 5 linear warm-up epochs, cosine lr decay; SGD with softmax cross entropy loss and label smoothing 0.1, 4e-5 weight decay on conv weights, 0 weight decay on all other weights, 0.9 momentum; 8 GPUs, 128 images per GPU |
| Examples | ShuffleNetV2 |
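
The "bias wd = 0" setting above amounts to splitting parameters into two optimizer groups. A minimal sketch of how such grouping is commonly implemented (the helper name and heuristics are illustrative; the repo exposes this behavior through flags such as --nowd-bn and --nowd-conv):

```python
import torch.optim as optim

def build_sgd(model, lr=0.1, wd=1e-4):
    """SGD with weight decay on multi-dimensional weights only; biases and
    norm-layer parameters (all 1-D tensors) get zero weight decay."""
    decay, no_decay = [], []
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        if p.ndim <= 1 or name.endswith(".bias"):    # biases, BN gamma/beta
            no_decay.append(p)
        else:                                        # conv/linear weights
            decay.append(p)
    return optim.SGD(
        [{"params": decay, "weight_decay": wd},
         {"params": no_decay, "weight_decay": 0.0}],
        lr=lr, momentum=0.9)
```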

Typical Training & Testing Tips

Small Models

ShuffleNetV2_1x

```
python -m torch.distributed.launch --nproc_per_node=8 imagenet_mobile.py --cos -a shufflenetv2_1x --data /path/to/imagenet1k/ \
--epochs 300 --wd 4e-5 --gamma 0.1 -c checkpoints/imagenet/shufflenetv2_1x --train-batch 128 --opt-level O0 --nowd-bn # Training
```

```
python -m torch.distributed.launch --nproc_per_node=2 imagenet_mobile.py -a shufflenetv2_1x --data /path/to/imagenet1k/ \
-e --resume ../pretrain/shufflenetv2_1x.pth.tar --test-batch 100 --opt-level O0 # Testing, ~69.6% top-1 Acc
```
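
The --cos flag corresponds to the mobile-model schedule above (5 linear warm-up epochs, then cosine decay). A minimal sketch of that schedule as a plain function (assuming the settings in the table; this is not the repo's exact implementation):

```python
import math

def lr_at_epoch(epoch, total_epochs=300, warmup_epochs=5, base_lr=0.4):
    """Linear warm-up for the first epochs, then cosine decay toward zero."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))
```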

Large Models

SGE-ResNet

```
python -W ignore imagenet.py -a sge_resnet101 --data /path/to/imagenet1k/ --epochs 100 --schedule 30 60 90 \
--gamma 0.1 -c checkpoints/imagenet/sge_resnet101 --gpu-id 0,1,2,3,4,5,6,7 # Training
```

```
python -m torch.distributed.launch --nproc_per_node=8 imagenet_fast.py -a sge_resnet101 --data /path/to/imagenet1k/ \
--epochs 100 --schedule 30 60 90 --wd 1e-4 --gamma 0.1 -c checkpoints/imagenet/sge_resnet101 --train-batch 32 \
--opt-level O0 --wd-all --label-smoothing 0. --warmup 0 # Training (faster)
```

```
python -W ignore imagenet.py -a sge_resnet101 --data /path/to/imagenet1k/ --gpu-id 0,1 -e \
--resume ../pretrain/sge_resnet101.pth.tar # Testing ~78.8% top-1 Acc
```

```
python -m torch.distributed.launch --nproc_per_node=2 imagenet_fast.py -a sge_resnet101 --data /path/to/imagenet1k/ -e --resume \
../pretrain/sge_resnet101.pth.tar --test-batch 100 --opt-level O0 # Testing (faster) ~78.8% top-1 Acc
```
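
The --label-smoothing flag above disables smoothing (0.) for large models, while the mobile recipe uses 0.1. For reference, a minimal label-smoothed cross entropy (a generic sketch, not the repo's exact loss code):

```python
import torch.nn.functional as F

def smoothed_cross_entropy(logits, target, eps=0.1):
    """Cross entropy against a target mixed with the uniform distribution:
    weight (1 - eps) on the true class, eps spread evenly over all classes."""
    logp = F.log_softmax(logits, dim=-1)
    nll = -logp.gather(-1, target.unsqueeze(1)).squeeze(1)  # true-class term
    uniform = -logp.mean(dim=-1)                            # uniform term
    return ((1 - eps) * nll + eps * uniform).mean()
```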

WS-ResNet with e-shifted L2 regularizer, e = 1e-3

```
python -m torch.distributed.launch --nproc_per_node=8 imagenet_fast.py -a ws_resnet50 --data /path/to/imagenet1k/ \
--epochs 100 --schedule 30 60 90 --wd 1e-4 --gamma 0.1 -c checkpoints/imagenet/es1e-3_ws_resnet50 --train-batch 32 \
--opt-level O0 --label-smoothing 0. --warmup 0 --nowd-conv --mineps 1e-3 --el2 # Training
```
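
WS here stands for Weight Standardization, one member of the weight-normalization family that the e-shifted L2 paper studies. A minimal sketch of a weight-standardized convolution (assuming the standard WS formulation; the repo's ws_resnet* models may differ in detail):

```python
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    """Conv2d whose filters are standardized (zero mean, unit std over
    input channels and kernel positions) before every forward pass."""
    def forward(self, x):
        w = self.weight
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5   # avoid divide-by-zero
        return F.conv2d(x, (w - mean) / std, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```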

Results of "SGENet: Spatial Group-wise Enhance: Enhancing Semantic Feature Learning in Convolutional Networks"

Note that the following (old) results do not set bias wd = 0 for large models.

Classification

| Model | #P | GFLOPs | Top-1 Acc | Top-5 Acc | Download1 | Download2 | log |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| ShuffleNetV2_1x | 2.28M | 0.151 | 69.6420 | 88.7200 | | GoogleDrive | shufflenetv2_1x.log |
| ResNet50 | 25.56M | 4.122 | 76.3840 | 92.9080 | BaiduDrive(zuvx) | GoogleDrive | old_resnet50.log |
| SE-ResNet50 | 28.09M | 4.130 | 77.1840 | 93.6720 | | | |
| SK-ResNet50 | 26.15M | 4.185 | 77.5380 | 93.7000 | BaiduDrive(tfwn) | GoogleDrive | sk_resnet50.log |
| BAM-ResNet50 | 25.92M | 4.205 | 76.8980 | 93.4020 | BaiduDrive(z0h3) | GoogleDrive | bam_resnet50.log |
| CBAM-ResNet50 | 28.09M | 4.139 | 77.6260 | 93.6600 | BaiduDrive(bram) | GoogleDrive | cbam_resnet50.log |
| SGE-ResNet50 | 25.56M | 4.127 | 77.5840 | 93.6640 | BaiduDrive(gxo9) | GoogleDrive | sge_resnet50.log |
| ResNet101 | 44.55M | 7.849 | 78.2000 | 93.9060 | BaiduDrive(js5t) | GoogleDrive | old_resnet101.log |
| SE-ResNet101 | 49.33M | 7.863 | 78.4680 | 94.1020 | BaiduDrive(j2ox) | GoogleDrive | se_resnet101.log |
| SK-ResNet101 | 45.68M | 7.978 | 78.7920 | 94.2680 | BaiduDrive(boii) | GoogleDrive | sk_resnet101.log |
| BAM-ResNet101 | 44.91M | 7.933 | 78.2180 | 94.0180 | BaiduDrive(4bw6) | GoogleDrive | bam_resnet101.log |
| CBAM-ResNet101 | 49.33M | 7.879 | 78.3540 | 94.0640 | BaiduDrive(syj3) | GoogleDrive | cbam_resnet101.log |
| SGE-ResNet101 | 44.55M | 7.858 | 78.7980 | 94.3680 | BaiduDrive(wqn6) | GoogleDrive | sge_resnet101.log |

Detection

| Model | #P | GFLOPs | Detector | Neck | AP50:95 (%) | AP50 (%) | AP75 (%) | Download |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| ResNet50 | 23.51M | 88.0 | Faster RCNN | FPN | 37.5 | 59.1 | 40.6 | GoogleDrive |
| SGE-ResNet50 | 23.51M | 88.1 | Faster RCNN | FPN | 38.7 | 60.8 | 41.7 | GoogleDrive |
| ResNet50 | 23.51M | 88.0 | Mask RCNN | FPN | 38.6 | 60.0 | 41.9 | GoogleDrive |
| SGE-ResNet50 | 23.51M | 88.1 | Mask RCNN | FPN | 39.6 | 61.5 | 42.9 | GoogleDrive |
| ResNet50 | 23.51M | 88.0 | Cascade RCNN | FPN | 41.1 | 59.3 | 44.8 | GoogleDrive |
| SGE-ResNet50 | 23.51M | 88.1 | Cascade RCNN | FPN | 42.6 | 61.4 | 46.2 | GoogleDrive |
| ResNet101 | 42.50M | 167.9 | Faster RCNN | FPN | 39.4 | 60.7 | 43.0 | GoogleDrive |
| SE-ResNet101 | 47.28M | 168.3 | Faster RCNN | FPN | 40.4 | 61.9 | 44.2 | GoogleDrive |
| SGE-ResNet101 | 42.50M | 168.1 | Faster RCNN | FPN | 41.0 | 63.0 | 44.3 | GoogleDrive |
| ResNet101 | 42.50M | 167.9 | Mask RCNN | FPN | 40.4 | 61.6 | 44.2 | GoogleDrive |
| SE-ResNet101 | 47.28M | 168.3 | Mask RCNN | FPN | 41.5 | 63.0 | 45.3 | GoogleDrive |
| SGE-ResNet101 | 42.50M | 168.1 | Mask RCNN | FPN | 42.1 | 63.7 | 46.1 | GoogleDrive |
| ResNet101 | 42.50M | 167.9 | Cascade RCNN | FPN | 42.6 | 60.9 | 46.4 | GoogleDrive |
| SE-ResNet101 | 47.28M | 168.3 | Cascade RCNN | FPN | 43.4 | 62.2 | 47.2 | GoogleDrive |
| SGE-ResNet101 | 42.50M | 168.1 | Cascade RCNN | FPN | 44.4 | 63.2 | 48.4 | GoogleDrive |

Results of "Understanding the Disharmony between Weight Normalization Family and Weight Decay: e-shifted L2 Regularizer"

Note that the following models are with bias wd = 0.

Classification

| Model | Top-1 | Download |
| :-: | :-: | :-: |
| WS-ResNet50 | 76.74 | GoogleDrive |
| WS-ResNet50 (e = 1e-3) | 76.86 | GoogleDrive |
| WS-ResNet101 | 78.07 | GoogleDrive |
| WS-ResNet101 (e = 1e-6) | 78.29 | GoogleDrive |
| WS-ResNeXt50 (e = 1e-3) | 77.88 | GoogleDrive |
| WS-ResNeXt101 (e = 1e-3) | 78.80 | GoogleDrive |
| WS-DenseNet201 (e = 1e-8) | 77.59 | GoogleDrive |
| WS-ShuffleNetV1 (e = 1e-8) | 68.09 | GoogleDrive |
| WS-ShuffleNetV2 (e = 1e-8) | 69.70 | GoogleDrive |
| WS-MobileNetV1 (e = 1e-6) | 73.60 | GoogleDrive |

Results of "Generalization Bound Regularizer: A Unified Framework for Understanding Weight Decay"

To appear


Citation

If you find our related works useful in your research, please consider citing the paper:

```bibtex
@inproceedings{li2019selective,
  title={Selective Kernel Networks},
  author={Li, Xiang and Wang, Wenhai and Hu, Xiaolin and Yang, Jian},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

@article{li2019spatial,
  title={Spatial Group-wise Enhance: Enhancing Semantic Feature Learning in Convolutional Networks},
  author={Li, Xiang and Hu, Xiaolin and Xia, Yan and Yang, Jian},
  journal={arXiv preprint arXiv:1905.09646},
  year={2019}
}

@article{li2019understanding,
  title={Understanding the Disharmony between Weight Normalization Family and Weight Decay: e-shifted L2 Regularizer},
  author={Li, Xiang and Chen, Shuo and Yang, Jian},
  journal={arXiv preprint arXiv:},
  year={2019}
}

@article{li2019generalization,
  title={Generalization Bound Regularizer: A Unified Framework for Understanding Weight Decay},
  author={Li, Xiang and Chen, Shuo and Gong, Chen and Xia, Yan and Yang, Jian},
  journal={arXiv preprint arXiv:},
  year={2019}
}
```
