PyTorch Image Models, etc

What's New

Jan 11/12, 2020

Master may be a bit unstable with respect to training; these changes have been tested, but not in all combinations
Implementations of AugMix added alongside the existing RandAugment (RA) and AutoAugment (AA), including numerous supporting pieces like JSD loss (Jensen-Shannon divergence + CE) and AugMixDataset
SplitBatchNorm adaptation layer added for implementing Auxiliary BN as per AdvProp paper
ResNet-50 AugMix trained model w/ 79% top-1 added
seresnext26tn_32x4d - 77.99 top-1, 93.75 top-5 added to tiered experiment, higher img/s than 't' and 'd'
Command lines/hparams and more AugMix and related model updates for above coming soon...
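
As a rough illustration of the "Jensen-Shannon divergence + CE" loss mentioned above, here is a minimal, framework-free sketch. It is not the repository's implementation: the function names, the use of exactly two augmented views, and the weight alpha=12.0 are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_div(p, q, eps=1e-7):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def jsd_cross_entropy(logits_clean, logits_aug1, logits_aug2, target, alpha=12.0):
    """CE on the clean view plus alpha * JSD across the three views.

    alpha and the three-view layout are assumptions, not values from this repo.
    """
    p_clean = softmax(logits_clean)
    p_aug1 = softmax(logits_aug1)
    p_aug2 = softmax(logits_aug2)
    views = [p_clean, p_aug1, p_aug2]
    # Mixture distribution: element-wise mean of the three predictions.
    mixture = [sum(ps) / len(views) for ps in zip(*views)]
    jsd = sum(kl_div(p, mixture) for p in views) / len(views)
    ce = -math.log(p_clean[target])
    return ce + alpha * jsd, ce, jsd
```

When all three views produce the same prediction the divergence term vanishes and the loss reduces to plain cross-entropy on the clean view; the more the augmented predictions disagree, the larger the penalty.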

Jan 3, 2020

Add RandAugment trained EfficientNet-B0 weight with 77.7 top-1. Trained by Michael Klachko with this code and recent hparams (see Training section)
Add avg_checkpoints.py script for post training weight averaging and update all scripts with header docstrings and shebangs.
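
Post-training weight averaging can be pictured as an element-wise mean over several saved checkpoints. The sketch below is only an illustration of that idea using plain dicts of float lists; it is not the code in avg_checkpoints.py, and the function name and data layout are hypothetical.

```python
def average_checkpoints(state_dicts):
    """Element-wise mean of several checkpoints with identical keys and shapes.

    Each checkpoint is modelled as a dict mapping parameter name -> list of floats.
    """
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    keys = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        if sd.keys() != keys:
            raise ValueError("checkpoints have mismatched parameter names")
    n = len(state_dicts)
    averaged = {}
    for name in keys:
        # Group the i-th value of this parameter from every checkpoint.
        columns = zip(*(sd[name] for sd in state_dicts))
        averaged[name] = [sum(vals) / n for vals in columns]
    return averaged
```

For example, averaging {"w": [1.0, 2.0]} and {"w": [3.0, 4.0]} yields {"w": [2.0, 3.0]}.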

Dec 30, 2019

Merge Dushyant Mehta's PR for SelecSLS (Selective Short and Long Range Skip Connections) networks. Good GPU memory consumption and throughput. Original: https://github.com/mehtadushy/SelecSLS-Pytorch

Dec 28, 2019

Add new model weights and training hparams (see Training Hparams section)
- efficientnet_b3 - 81.5 top-1, 95.7 top-5 at default res/crop, 81.9, 95.8 at 320x320 1.0 crop-pct
  - trained with RandAugment, ended up with an interesting but less than perfect result (see training section)
- seresnext26d_32x4d - 77.6 top-1, 93.6 top-5
  - deep stem (32, 32, 64), avgpool downsample
  - stem/downsample from bag-of-tricks paper
- seresnext26t_32x4d - 78.0 top-1, 93.7 top-5
  - deep tiered stem (24, 48, 64), avgpool downsample (a modified 'D' variant)
  - stem sizing mods from Jeremy Howard and fastai devs discussing ResNet architecture experiments

Dec 23, 2019

Add RandAugment trained MixNet-XL weights with 80.48 top-1.
--dist-bn argument added to train.py, will distribute BN stats between nodes after each train epoch, before eval
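
To make the --dist-bn idea concrete, the following is a small simulation of synchronizing BatchNorm running statistics across workers before evaluation. It does not use any distributed backend and is not the train.py code; the function name, the per-worker dict layout, and the choice between averaging and copying from the first worker are assumptions for illustration.

```python
def distribute_bn_stats(per_worker_stats, reduce=True):
    """Make BN running stats identical on every worker.

    per_worker_stats: one dict per worker, mapping buffer name -> list of floats.
    reduce=True averages across workers; reduce=False copies worker 0's stats.
    """
    world_size = len(per_worker_stats)
    reference = per_worker_stats[0]
    synced = {}
    for name in reference:
        if reduce:
            columns = zip(*(stats[name] for stats in per_worker_stats))
            synced[name] = [sum(vals) / world_size for vals in columns]
        else:
            synced[name] = list(reference[name])
    # Every worker receives its own copy of the synchronized buffers.
    return [{k: list(v) for k, v in synced.items()} for _ in range(world_size)]
```

Either way, the point is that every worker evaluates with the same normalization statistics, so validation numbers do not depend on which worker's model is used.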

Dec 4, 2019

Added weights from the first training from scratch of an EfficientNet (B2) with my new RandAugment implementation. Much better than my previous B2 and very close to the official AdvProp ones (80.4 top-1, 95.08 top-5).

Nov 29, 2019

Brought EfficientNet and MobileNetV3 up to date with my https://github.com/rwightman/gen-efficientnet-pytorch code. Torchscript and ONNX export compat excluded.
- AdvProp weights added
- Official TF MobileNetv3 weights added
EfficientNet and MobileNetV3 hook based 'feature extraction' classes added. Will serve as basis for using models as backbones in obj detection/segmentation tasks. Lots more to be done here...
HRNet classification models and weights added from https://github.com/HRNet/HRNet-Image-Classification
Consistency in global pooling, reset_classifier, and forward_features across models
- forward_features always returns unpooled feature maps now
Reasonable chance I broke something... let me know

Nov 22, 2019

Add ImageNet training RandAugment implementation alongside AutoAugment. PyTorch Transform compatible format, using PIL. Currently training two EfficientNet models from scratch with promising results... will update.
drop-connect cmd line arg finally added to train.py, no need to hack model fns. Works for efficientnet/mobilenetv3 based models, ignored otherwise.
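
For readers unfamiliar with drop-connect as used in EfficientNet-style models, it is commonly described as randomly dropping a block's residual branch per sample during training and rescaling the survivors. The sketch below illustrates that behaviour on nested lists; it is not the repository's implementation, and the function name and the rng parameter are hypothetical.

```python
import random


def drop_connect(branch_outputs, drop_rate, training=True, rng=random):
    """Randomly zero whole per-sample residual branches and rescale survivors.

    branch_outputs: list of per-sample activation lists (one inner list per sample).
    """
    if not training or drop_rate == 0.0:
        return [list(sample) for sample in branch_outputs]
    keep_prob = 1.0 - drop_rate
    result = []
    for sample in branch_outputs:
        keep = rng.random() < keep_prob
        # Dividing by keep_prob keeps the expected activation unchanged.
        scale = 1.0 / keep_prob if keep else 0.0
        result.append([v * scale for v in sample])
    return result
```

At evaluation time (training=False) the branch output passes through unchanged, which is why the argument only matters during training.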

Introduction

For each competition, personal, or freelance project involving images + Convolutional Neural Networks, I build on top of an evolving collection of code and models. This repo contains a (somewhat) cleaned up and pared down iteration of that code. Hopefully it'll be of use to others.

The work of many others is present here. I've tried to make sure all source material is acknowledged:

Training/validation scripts evolved from early versions of the PyTorch Imagenet Examples
CUDA specific performance enhancements have been pulled from NVIDIA's APEX Examples
LR scheduler ideas from AllenNLP, FAIRseq, and SGDR: Stochastic Gradient Descent with Warm Restarts (https://arxiv.org/abs/1608.03983)
Random Erasing from Zhun Zhong (https://arxiv.org/abs/1708.04896)
Optimizers:
- RAdam by Liyuan Liu (https://arxiv.org/abs/1908.03265)
- NovoGrad by Masashi Kimura (https://arxiv.org/abs/1905.11286)
- Lookahead adapted from impl by Liam (https://arxiv.org/abs/1907.08610)

Models

I've included a few of my favourite models, but this is not an exhaustive collection. You can't do better than Cadene's collection in that regard. Most models do have pretrained weights from their respective sources or original authors.

Included models:

ResNet/ResNeXt (from torchvision with mods by myself)
- ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152, ResNeXt50 (32x4d), ResNeXt101 (32x4d and 64x4d)
- 'Bag of Tricks' / Gluon C, D, E, S variations (https://arxiv.org/abs/1812.01187)
- Instagram trained / ImageNet tuned ResNeXt101-32x8d to 32x48d from facebookresearch
- Res2Net (https://github.com/gasvn/Res2Net, https://arxiv.org/abs/1904.01169)
DLA
- Original (https://github.com/ucbdrive/dla, https://arxiv.org/abs/1707.06484)
- Res2Net (https://github.com/gasvn/Res2Net, https://arxiv.org/abs/1904.01169)
DenseNet (from torchvision)
- DenseNet-121, DenseNet-169, DenseNet-201, DenseNet-161
Squeeze-and-Excitation ResNet/ResNeXt (from Cadene with some pretrained weight additions by myself)
- SENet-154, SE-ResNet-18, SE-ResNet-34, SE-ResNet-50, SE-ResNet-101, SE-ResNet-152, SE-ResNeXt-26 (32x4d), SE-ResNeXt50 (32x4d), SE-ResNeXt101 (32x4d)
Inception-ResNet-V2 and Inception-V4 (from Cadene)
Xception
- Original variant from Cadene
- MXNet Gluon 'modified aligned' Xception-65 and 71 models from Gluon ModelZoo
PNasNet & NASNet-A (from Cadene)
DPN (from me, weights hosted by Cadene)
- DPN-68, DPN-68b, DPN-92, DPN-98, DPN-131, DPN-107
EfficientNet (from my standalone GenMobileNet) - A generic model that implements many of the efficient models that utilize similar DepthwiseSeparable and InvertedResidual blocks
- EfficientNet AdvProp (B0-B8) (https://arxiv.org/abs/1911.09665) -- TF weights ported
- EfficientNet (B0-B7) (https://arxiv.org/abs/1905.11946) -- TF weights ported, B0-B2 finetuned PyTorch
- EfficientNet-EdgeTPU (S, M, L) (https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html) -- TF weights ported
- MixNet (https://arxiv.org/abs/1907.09595) -- TF weights ported, PyTorch finetuned (S, M, L) or trained models (XL)
- MNASNet B1, A1 (Squeeze-Excite), and Small (https://arxiv.org/abs/1807.11626) -- trained in PyTorch
- MobileNet-V2 (https://arxiv.org/abs/1801.04381)
- FBNet-C (https://arxiv.org/abs/1812.03443) -- trained in PyTorch
- Single-Path NAS (https://arxiv.org/abs/1904.02877) -- pixel1 variant
MobileNet-V3 (https://arxiv.org/abs/1905.02244) -- pretrained PyTorch model, official TF weights ported
HRNet
- code from https://github.com/HRNet/HRNet-Image-Classification, paper https://arxiv.org/abs/1908.07919
SelecSLS
- code from https://github.com/mehtadushy/SelecSLS-Pytorch, paper https://arxiv.org/abs/1907.00837

Use the --model arg to specify the model for the train, validation, and inference scripts. The value must match the all-lowercase creation fn of the model you'd like.

Features

Several (less common) features that I often utilize in my projects are included. Many of their additions are the reason why I maintain my own set of models, instead of using others' via PIP:

All models have a common default configuration interface and API for
- accessing/changing the classifier - get_classifier and reset_classifier
- doing a forward pass on just the features - forward_features
- these make it easy to write consistent network wrappers that work with any of the models
All models have a consistent pretrained weight loader that adapts last linear if necessary, and from 3 to 1 channel input if desired
The train script works in several process/GPU modes:
- NVIDIA DDP w/ a single GPU per process, multiple processes with APEX present (AMP mixed-precision optional)
- PyTorch DistributedDataParallel w/ multi-gpu, single process (AMP disabled as it crashes when enabled)
- PyTorch w/ single GPU single process (AMP optional)
A dynamic global pool implementation that allows selecting from average pooling, max pooling, average + max, or concat([average, max]) at model creation. All global pooling is adaptive average by default and compatible with pretrained weights.
A 'Test Time Pool' wrapper that can wrap any of the included models and usually provide improved performance doing inference with input images larger than the training size. Idea adapted from original DPN implementation when I ported (https://github.com/cypw/DPNs)
Training schedules and techniques that provide competitive results (Cosine LR, Random Erasing, Label Smoothing, etc)
Mixup (as in https://arxiv.org/abs/1710.09412) - currently implementing/testing
An inference script that dumps output to CSV is provided as an example
AutoAugment (https://arxiv.org/abs/1805.09501) and RandAugment (https://arxiv.org/abs/1909.13719) ImageNet configurations modeled after impl for EfficientNet training (https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/autoaugment.py)
AugMix w/ JSD loss (https://arxiv.org/abs/1912.02781), JSD w/ clean + augmented mixing support works with AutoAugment and RandAugment as well
SplitBatchNorm - allows splitting batch norm layers between clean and augmented (auxiliary batch norm) data
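
The selectable global pooling modes listed above (average, max, average + max, concat([average, max])) can be sketched without any framework. This is an illustration only, not the repository's pooling layer: the mode names "avg", "max", "avgmax", and "catavgmax" are assumed labels, and "average + max" is assumed here to mean the mean of the two pooled values.

```python
def global_pool(feature_map, pool_type="avg"):
    """Pool a C x H x W feature map (nested lists) down to a per-channel vector."""
    avg = [sum(sum(row) for row in ch) / sum(len(row) for row in ch) for ch in feature_map]
    mx = [max(max(row) for row in ch) for ch in feature_map]
    if pool_type == "avg":
        return avg
    if pool_type == "max":
        return mx
    if pool_type == "avgmax":
        # Assumed combination: the mean of the average- and max-pooled values.
        return [0.5 * (a + m) for a, m in zip(avg, mx)]
    if pool_type == "catavgmax":
        # Concatenation doubles the channel count seen by the classifier.
        return avg + mx
    raise ValueError(f"unknown pool type: {pool_type}")
```

Note that the concatenating mode changes the number of features fed to the classifier, so only the default average pooling keeps the classifier shape that pretrained weights expect.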

Results

A CSV file containing an ImageNet-1K validation results summary for all included models with pretrained weights and default configurations is located here

Self-trained Weights

I've leveraged the training scripts in this repository to train a few of the models with missing weights to good levels of performance. These numbers are all for 224x224 training and validation image sizing with the usual 87.5% validation crop.
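
The relationship between image size and validation crop percentage can be worked through with a little arithmetic. The sketch below assumes the common convention that the image is first resized so that the final center crop covers crop_pct of the resized side; the helper name is hypothetical and this is not taken from the repository's data pipeline.

```python
import math


def eval_resize_size(img_size, crop_pct=0.875):
    """Resize target so that a center crop of img_size covers crop_pct of that side."""
    return int(math.floor(img_size / crop_pct))
```

Under that assumption, a 224x224 evaluation with an 87.5% crop resizes to 256 before cropping (224 / 0.875 = 256), while a 1.0 crop-pct at 320x320 resizes directly to 320 and crops nothing away.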

Name and owner	huggingface/pytorch-image-models
Primary language	Python
Languages	Python (3 languages in total)
Platforms	Linux, Mac, Windows
License	Apache License 2.0

Created	2019-02-02 05:51:12
Last pushed	2025-07-10 16:04:41
Releases	67
Latest release	v1.0.17
First release	v0.1-weights
