Implements the ConvNeXt architecture from "ConvNeXt: A ConvNet for the 2020s".

model_convnext_tiny_1k(
  pretrained = FALSE,
  progress = TRUE,
  channels = 3,
  num_classes = 1000,
  ...
)

model_convnext_tiny_22k(
  pretrained = FALSE,
  progress = TRUE,
  channels = 3,
  num_classes = 21841,
  ...
)

model_convnext_small_22k(
  pretrained = FALSE,
  progress = TRUE,
  channels = 3,
  num_classes = 21841,
  ...
)

model_convnext_small_22k1k(
  pretrained = FALSE,
  progress = TRUE,
  channels = 3,
  num_classes = 21841,
  ...
)

model_convnext_base_1k(
  pretrained = FALSE,
  progress = TRUE,
  channels = 3,
  num_classes = 1000,
  ...
)

model_convnext_base_22k(
  pretrained = FALSE,
  progress = TRUE,
  channels = 3,
  num_classes = 21841,
  ...
)

model_convnext_large_1k(
  pretrained = FALSE,
  progress = TRUE,
  channels = 3,
  num_classes = 1000,
  ...
)

model_convnext_large_22k(
  pretrained = FALSE,
  progress = TRUE,
  channels = 3,
  num_classes = 21841,
  ...
)

Arguments

pretrained

(bool): If TRUE, returns a model pre-trained on ImageNet.

progress

(bool): If TRUE, displays a progress bar of the download to stderr.

channels

The number of channels in the input image. Default: 3.

num_classes

Number of output classes. Default: 1000 for the `_1k` functions and 21841 for the others, as shown in the usage above.

...

Other parameters passed to the model implementation.
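
As a minimal sketch of how `channels` and `num_classes` fit together, the following builds an untrained ConvNeXt Tiny for a hypothetical 10-class task on single-channel 224 x 224 images and inspects the output shape. It assumes the torch and torchvision packages are installed with a working LibTorch backend. The class count, batch size, and input size are illustrative choices, and the expected shape is an assumption rather than a documented guarantee.

```r
library(torch)
library(torchvision)

# Hypothetical task: 10 classes, grayscale (1-channel) input.
model <- model_convnext_tiny_1k(
  pretrained = FALSE,  # random initialization, no weights are downloaded
  channels = 1,
  num_classes = 10
)
model$eval()

# Dummy batch of 2 images, laid out as (batch, channels, height, width).
x <- torch_randn(2, 1, 224, 224)

# Inference only, so gradient tracking is switched off.
with_no_grad({
  logits <- model(x)
})

# Assumed result: one row per image and one column per class.
logits$shape
```

When `pretrained = TRUE`, it is safest to keep the defaults for `channels` and `num_classes`, since the downloaded weights were trained with those dimensions.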

Functions

  • model_convnext_tiny_1k(): ConvNeXt Tiny model trained on ImageNet 1k.

  • model_convnext_tiny_22k(): ConvNeXt Tiny model trained on ImageNet 22k.

  • model_convnext_small_22k(): ConvNeXt Small model trained on ImageNet 22k.

  • model_convnext_small_22k1k(): ConvNeXt Small model pretrained on ImageNet 1k and fine-tuned on ImageNet 22k classes.

  • model_convnext_base_1k(): ConvNeXt Base model trained on ImageNet 1k.

  • model_convnext_base_22k(): ConvNeXt Base model trained on ImageNet 22k.

  • model_convnext_large_1k(): ConvNeXt Large model trained on ImageNet 1k.

  • model_convnext_large_22k(): ConvNeXt Large model trained on ImageNet 22k.

Variants

Model Summary and Performance for pretrained weights

| Model                | Top-1 Acc| Params | GFLOPS | File Size | `num_classes`| image size |
|----------------------|----------|--------|--------|-----------|--------------|------------|
| convnext_tiny_1k     | 82.1%    | 28M    | 4.5    | 109 MB    |         1000 | 224 x 224  |
| convnext_tiny_22k    | 82.9%    | 29M    | 4.5    | 170 MB    |        21841 | 224 x 224  |
| convnext_small_22k   | 84.6%    | 50M    | 8.7    | 252 MB    |        21841 | 224 x 224  |
| convnext_small_22k1k | 84.6%    | 50M    | 8.7    | 252 MB    |        21841 | 224 x 224  |
| convnext_base_1k     | 85.1%    | 89M    | 15.4   | 338 MB    |         1000 | 224 x 224  |
| convnext_base_22k    | 85.8%    | 89M    | 15.4   | 420 MB    |        21841 | 224 x 224  |
| convnext_large_1k    | 84.3%    | 198M   | 34.4   | 750 MB    |         1000 | 224 x 224  |
| convnext_large_22k   | 86.6%    | 198M   | 34.4   | 880 MB    |        21841 | 224 x 224  |

Examples

if (FALSE) { # \dontrun{
# 1. Load a sample image (cat)
norm_mean <- c(0.485, 0.456, 0.406) # ImageNet normalization constants, see
# https://pytorch.org/vision/stable/models.html
norm_std  <- c(0.229, 0.224, 0.225)
img_url <- "https://en.wikipedia.org/wiki/Special:FilePath/Felis_catus-cat_on_snow.jpg"
img <- base_loader(img_url)

# 2. Convert to tensor (RGB only), resize and normalize
input <- img %>%
 transform_to_tensor() %>%
 transform_resize(c(224, 224)) %>%
 transform_normalize(norm_mean, norm_std)
batch <- input$unsqueeze(1)

# 3. Load the pretrained model
model_tiny <- model_convnext_tiny_1k(pretrained = TRUE, root = tempdir())
model_tiny$eval()

# 4. Forward pass (returns raw logits, one per class)
output <- model_tiny(batch)

# 5. Show top-5 predictions as softmax probabilities in percent
probs <- torch::nnf_softmax(output, dim = 2)
topk <- probs$topk(k = 5, dim = 2)
indices <- as.integer(topk[[2]][1, ])
scores <- as.numeric(topk[[1]][1, ]) * 100
glue::glue("{seq_along(indices)}. {imagenet_label(indices)} ({round(scores, 2)}%)")
} # }
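
A classification model's final layer produces unnormalized logits. Softmax converts them into probabilities that sum to 1, which is what makes a percentage next to each label meaningful. The following is a minimal base-R sketch of that conversion. The helper name `softmax` and the logit values are hypothetical and not part of the package.

```r
# Numerically stable softmax: subtracting the maximum avoids overflow in exp().
softmax <- function(logits) {
  shifted <- logits - max(logits)
  exp(shifted) / sum(exp(shifted))
}

# Hypothetical logits for three classes.
logits <- c(2.0, 1.0, 0.1)
probs <- softmax(logits)

# Probabilities sum to 1 and keep the same ranking as the logits.
sum(probs)
round(100 * probs, 1)
```

Because softmax preserves ordering, the top-k class indices are the same whether they are taken from the logits or from the probabilities. Only the reported scores change.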