Loads and preprocesses the QMNIST dataset, including optional support for the NIST digit subset.

qmnist_dataset(
  root = tempdir(),
  split = "train",
  transform = NULL,
  target_transform = NULL,
  download = FALSE
)

Arguments

root

(string): Root directory where the QMNIST dataset files are stored, or where they are saved when download = TRUE.

split

(string, optional): Which subset to load: one of "train", "test", or "nist". Defaults to "train". The "nist" option loads the full NIST digit set.

transform

(callable, optional): A function/transform that takes in an image and returns a transformed version. E.g., transform_random_crop().

target_transform

(callable, optional): A function/transform that takes in the target and transforms it.

download

(bool, optional): If TRUE, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
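
As a rough illustration of the transform argument, the sketch below defines a small rescaling function and passes it to the dataset. The helper scale_pixels is hypothetical (it is not part of {torchvision}), and the sketch assumes that each raw image can be coerced with as.array() and holds intensities in the 0-255 range. The dataset call is guarded because it needs {torchvision} and network access.

```r
# Hypothetical helper (not part of torchvision): rescale raw pixel
# intensities from the assumed 0-255 range to [0, 1].
scale_pixels <- function(img) {
  as.array(img) / 255
}

# Quick check on a stand-in 28 x 28 image; no download required.
fake_img <- matrix(255L, nrow = 28, ncol = 28)
scaled <- scale_pixels(fake_img)
range(scaled)

# Passing the helper to the dataset needs {torchvision} and network
# access, so this part only runs in an interactive session.
if (interactive() && requireNamespace("torchvision", quietly = TRUE)) {
  ds <- torchvision::qmnist_dataset(
    split = "test",
    transform = scale_pixels,
    download = TRUE
  )
  # The transform is applied when an item is accessed.
  ds[1]$x
}
```

A target_transform can be supplied the same way: any function that takes the label and returns the value to use in its place.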

Value

An R6 dataset object compatible with the {torch} package, providing indexed access to (image, label) pairs from the specified QMNIST subset.

Details

This dataset is an extended version of the original MNIST, offering more samples and precise label information. It is suitable for benchmarking modern machine learning models and can serve as a drop-in replacement for MNIST in most image classification tasks.

Supported Subsets

  • "train": 60,000 training examples (compatible with MNIST)

  • "test": 60,000 test examples (extended QMNIST test set)

  • "nist": Entire NIST digit dataset (for advanced benchmarking)
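
One way to compare the subsets listed above is to load each in turn and report its size. This is a minimal sketch that assumes length() is available for the returned dataset object (as it is for {torch} datasets in general). The "nist" download may be considerably larger than the other two, so the loop is guarded to run only in an interactive session.

```r
# Subset names taken from the split argument documented above.
splits <- c("train", "test", "nist")

if (interactive() && requireNamespace("torchvision", quietly = TRUE)) {
  for (s in splits) {
    ds <- torchvision::qmnist_dataset(split = s, download = TRUE)
    # length() reports the number of (image, label) pairs in the subset.
    cat(s, ":", length(ds), "items\n")
  }
}
```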

Examples

if (FALSE) { # \dontrun{
qmnist <- qmnist_dataset(split = "train", download = TRUE)
first_item <- qmnist[1]
# image in item 1
first_item$x
# label of item 1
first_item$y
} # }
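
Because the returned object is a {torch} dataset, it can usually be wrapped in a dataloader for batched access. The following sketch assumes that transform_to_tensor() accepts the raw QMNIST images, as it does for mnist_dataset(); the batch size is an arbitrary choice for illustration.

```r
# Arbitrary batch size chosen for illustration.
batch_size <- 32L

if (interactive() &&
    requireNamespace("torch", quietly = TRUE) &&
    requireNamespace("torchvision", quietly = TRUE)) {
  ds <- torchvision::qmnist_dataset(
    split = "train",
    transform = torchvision::transform_to_tensor,
    download = TRUE
  )

  # Wrap the dataset so items are collated into shuffled batches.
  dl <- torch::dataloader(ds, batch_size = batch_size, shuffle = TRUE)

  # Pull a single batch to inspect its structure.
  it <- torch::dataloader_make_iter(dl)
  batch <- torch::dataloader_next(it)
  dim(batch$x)  # batched images
  batch$y       # batched labels
}
```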