Package index • torchvision

Transforms Image transformation functions
Unitary transformation
`transform_adjust_brightness()`	Adjust the brightness of an image
`transform_adjust_contrast()`	Adjust the contrast of an image
`transform_adjust_gamma()`	Adjust the gamma of an RGB image
`transform_adjust_hue()`	Adjust the hue of an image
`transform_adjust_saturation()`	Adjust the color saturation of an image
`transform_affine()`	Apply affine transformation on an image keeping image center invariant
`transform_center_crop()`	Crops the given image at the center
`transform_convert_image_dtype()`	Convert a tensor image to the given `dtype` and scale the values accordingly
`transform_crop()`	Crop the given image at specified location and output size
`transform_grayscale()`	Convert image to grayscale
`transform_hflip()`	Horizontally flip a PIL Image or Tensor
`transform_linear_transformation()`	Transform a tensor image with a square transformation matrix and a mean_vector computed offline
`transform_normalize()`	Normalize a tensor image with mean and standard deviation
`transform_pad()`	Pad the given image on all sides with the given "pad" value
`transform_perspective()`	Perspective transformation of an image
`transform_resize()`	Resize the input image to the given size
`transform_rgb_to_grayscale()`	Convert RGB Image Tensor to Grayscale
`transform_rotate()`	Angular rotation of an image
`transform_to_tensor()`	Convert an image to a tensor
`transform_vflip()`	Vertically flip a PIL Image or Tensor
Random transformation
`transform_color_jitter()`	Randomly change the brightness, contrast and saturation of an image
`transform_random_affine()`	Random affine transformation of the image keeping center invariant
`transform_random_crop()`	Crop the given image at a random location
`transform_random_erasing()`	Randomly selects a rectangular region in an image and erases its pixel values
`transform_random_grayscale()`	Randomly convert image to grayscale with a given probability
`transform_random_horizontal_flip()`	Horizontally flip an image randomly with a given probability
`transform_random_perspective()`	Random perspective transformation of an image with a given probability
`transform_random_resized_crop()`	Crop image to random size and aspect ratio
`transform_random_rotation()`	Rotate the image by a randomly chosen angle
`transform_random_vertical_flip()`	Vertically flip an image randomly with a given probability
Combining / multiplying transformations
`transform_five_crop()`	Crop image into four corners and a central crop
`transform_random_apply()`	Apply a list of transformations randomly with a given probability
`transform_random_choice()`	Apply single transformation randomly picked from a list
`transform_random_order()`	Apply a list of transformations in a random order
`transform_resized_crop()`	Crop an image and resize it to a desired size
`transform_ten_crop()`	Crop an image and the flipped image each into four corners and a central crop
Models Computer Vision deep-learning Model architectures
Classification models Models providing an output vector of size `num_classes` holding the logit of each class.
`model_alexnet()`	AlexNet Model Architecture
`model_convnext_tiny_1k()` `model_convnext_tiny_22k()` `model_convnext_small_22k()` `model_convnext_small_22k1k()` `model_convnext_base_1k()` `model_convnext_base_22k()` `model_convnext_large_1k()` `model_convnext_large_22k()`	ConvNeXt Implementation
`model_efficientnet_b0()` `model_efficientnet_b1()` `model_efficientnet_b2()` `model_efficientnet_b3()` `model_efficientnet_b4()` `model_efficientnet_b5()` `model_efficientnet_b6()` `model_efficientnet_b7()`	EfficientNet Models
`model_efficientnet_v2_s()` `model_efficientnet_v2_m()` `model_efficientnet_v2_l()`	EfficientNetV2 Models
`model_inception_v3()`	Inception v3 model
`model_maxvit()`	MaxViT Model
`model_mobilenet_v2()`	MobileNetV2 Model
`model_mobilenet_v3_large()` `model_mobilenet_v3_small()`	MobileNetV3 Model
`model_resnet18()` `model_resnet34()` `model_resnet50()` `model_resnet101()` `model_resnet152()` `model_resnext50_32x4d()` `model_resnext101_32x8d()` `model_wide_resnet50_2()` `model_wide_resnet101_2()`	ResNet implementation
`model_vgg11()` `model_vgg11_bn()` `model_vgg13()` `model_vgg13_bn()` `model_vgg16()` `model_vgg16_bn()` `model_vgg19()` `model_vgg19_bn()`	VGG implementation
`model_vit_b_16()` `model_vit_b_32()` `model_vit_l_16()` `model_vit_l_32()` `model_vit_h_14()`	Vision Transformer Implementation
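Because the classification models return logits rather than probabilities, a softmax is usually applied before reading the output as class probabilities. The following is a minimal base R sketch on a made-up logit vector. The `softmax()` helper and the values are illustrative assumptions and are not part of the package.

```r
# Hypothetical helper, not part of torchvision: numerically stable softmax.
softmax <- function(logits) {
  shifted <- logits - max(logits)  # subtract the max to avoid overflow in exp()
  exp(shifted) / sum(exp(shifted))
}

# Made-up logits for a 3-class problem (num_classes = 3).
logits <- c(2.0, 1.0, 0.1)
probs <- softmax(logits)

sum(probs)        # the probabilities sum to 1
which.max(probs)  # index of the predicted class (1-based in R)
```

With real model output, the same idea would be applied along the class dimension of the returned tensor.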
Object detection models Models providing an output list that includes the bounding boxes of the objects detected in the image, each in $c(x_{min}, y_{min}, x_{max}, y_{max})$ format.
`model_facenet_pnet()` `model_facenet_rnet()` `model_facenet_onet()` `model_mtcnn()` `model_inception_resnet_v1()`	MTCNN Face Detection Networks
Semantic segmentation models Models providing an output list that includes, for each segmentation class covered, a binary mask over the pixels of the image.
`model_deeplabv3_resnet50()` `model_deeplabv3_resnet101()`	DeepLabV3 Models
`model_fcn_resnet50()` `model_fcn_resnet101()`	Fully Convolutional Network for Semantic Segmentation
Other models
Datasets Readily available datasets. Each item has an `x` variable holding the input image.
for Image Classification Datasets whose items have a `y` holding the target class identifier.
`caltech101_dataset()` `caltech256_dataset()`	Caltech Datasets
`cifar10_dataset()` `cifar100_dataset()`	CIFAR datasets
`eurosat_dataset()` `eurosat_all_bands_dataset()` `eurosat100_dataset()`	EuroSAT datasets
`fer_dataset()`	FER-2013 Facial Expression Dataset
`fgvc_aircraft_dataset()`	FGVC Aircraft Dataset
`flowers102_dataset()`	Oxford Flowers 102 Dataset
`image_folder_dataset()`	Create an image folder dataset
`lfw_people_dataset()` `lfw_pairs_dataset()`	LFW Datasets
`mnist_dataset()` `kmnist_dataset()` `qmnist_dataset()` `fashion_mnist_dataset()` `emnist_dataset()`	MNIST and Derived Datasets
`oxfordiiitpet_dataset()` `oxfordiiitpet_binary_dataset()`	Oxford-IIIT Pet Classification Datasets
`places365_dataset()` `places365_dataset_large()`	Places365 Dataset
`tiny_imagenet_dataset()`	Tiny ImageNet dataset
`whoi_small_plankton_dataset()` `whoi_plankton_dataset()`	WHOI Plankton Datasets
`whoi_small_coralnet_dataset()`	Coralnet Dataset
for Object Detection Datasets whose items have a `y` that is a named list of bounding boxes and labels.
`coco_detection_dataset()`	COCO Detection Dataset
`pascal_segmentation_dataset()` `pascal_detection_dataset()`	Pascal VOC Datasets
`rf100_biology_collection()`	RoboFlow 100 Biology dataset Collection
`rf100_damage_collection()`	RoboFlow 100 Damages dataset Collection
`rf100_document_collection()`	RF100 Document Collection Datasets
`rf100_infrared_collection()`	RoboFlow 100 Infrared dataset Collection
`rf100_medical_collection()`	RoboFlow 100 Medical dataset Collection
`rf100_underwater_collection()`	RoboFlow 100 Underwater dataset Collection
for Image captioning Datasets whose items have a `y` holding one or more captions of the image.
`coco_caption_dataset()`	COCO Caption Dataset
`flickr8k_caption_dataset()` `flickr30k_caption_dataset()`	Flickr Caption Datasets
for Semantic segmentation Datasets whose items have a `y` that is a named list containing a segmentation mask and labels.
`oxfordiiitpet_segmentation_dataset()`	Oxford-IIIT Pet Segmentation Dataset
`pascal_segmentation_dataset()` `pascal_detection_dataset()`	Pascal VOC Datasets
`rf100_peixos_segmentation_dataset()`	RF100 Peixos Segmentation Dataset
Displaying
Image loading Tools for loading images
`magick_loader()`	Load an Image using ImageMagick
`base_loader()`	Base loader
Image visualization Tools for image manipulation and visualization
`draw_bounding_boxes()`	Draws bounding boxes on image.
`draw_keypoints()`	Draws Keypoints
`draw_segmentation_masks()`	Draw segmentation masks
`tensor_image_browse()`	Display image tensor
`tensor_image_display()`	Display image tensor
`vision_make_grid()`	A simplified version of torchvision.utils.make_grid
Misc
`imagenet_classes()` `imagenet_label()`	ImageNet Class Labels
`batched_nms()`	Batched Non-maximum Suppression (NMS)
`nms()`	Non-maximum Suppression (NMS)
`box_area()`	Box Area
`box_convert()`	Box Convert
`box_cxcywh_to_xyxy()`	Convert boxes from (cx, cy, w, h) to (xmin, ymin, xmax, ymax) format
`box_iou()`	Box IoU
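`box_xywh_to_xyxy()`	Convert boxes from (xmin, ymin, w, h) to (xmin, ymin, xmax, ymax) format
`box_xyxy_to_cxcywh()`	Convert boxes from (xmin, ymin, xmax, ymax) to (cx, cy, w, h) format
`box_xyxy_to_xywh()`	Convert boxes from (xmin, ymin, xmax, ymax) to (xmin, ymin, w, h) format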
`clip_boxes_to_image()`	Clip Boxes to Image
`generalized_box_iou()`	Generalized Box IoU
`remove_small_boxes()`	Remove Small Boxes
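The box helpers listed above revolve around a few coordinate formats and the intersection-over-union (IoU) measure. The arithmetic behind them can be sketched in base R as follows. The functions `xyxy_to_cxcywh()` and `iou_xyxy()` are hypothetical illustrations operating on plain matrices and vectors. They are not the package's implementations, which may differ in input types and edge-case handling.

```r
# Hypothetical helpers for illustration only, not the package's code.

# Convert boxes from (xmin, ymin, xmax, ymax) to (cx, cy, w, h).
# `boxes` is a numeric matrix with one box per row.
xyxy_to_cxcywh <- function(boxes) {
  w <- boxes[, 3] - boxes[, 1]
  h <- boxes[, 4] - boxes[, 2]
  cbind(cx = boxes[, 1] + w / 2, cy = boxes[, 2] + h / 2, w = w, h = h)
}

# Intersection over union of two boxes `a` and `b`,
# each a numeric vector in (xmin, ymin, xmax, ymax) format.
iou_xyxy <- function(a, b) {
  inter_w <- max(0, min(a[3], b[3]) - max(a[1], b[1]))
  inter_h <- max(0, min(a[4], b[4]) - max(a[2], b[2]))
  inter <- inter_w * inter_h
  area_a <- (a[3] - a[1]) * (a[4] - a[2])
  area_b <- (b[3] - b[1]) * (b[4] - b[2])
  inter / (area_a + area_b - inter)
}

# Two made-up, partially overlapping 10 x 10 boxes.
boxes <- rbind(c(0, 0, 10, 10), c(5, 5, 15, 15))

xyxy_to_cxcywh(boxes)             # centers (5, 5) and (10, 10), both 10 wide and 10 high
iou_xyxy(boxes[1, ], boxes[2, ])  # 25 / 175, about 0.143
```

Non-maximum suppression builds on the same IoU computation: it keeps the highest-scoring box and discards any remaining box whose IoU with it exceeds a threshold.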
