RoboFlow 100 Document Dataset Collection

rf100_document_collection(
  dataset,
  split = c("train", "test", "valid"),
  transform = NULL,
  target_transform = NULL,
  download = FALSE
)

Arguments

dataset

Name of the dataset to load. One of c("tweeter_post", "tweeter_profile", "document_part", "activity_diagram", "signature", "paper_part", "tabular_data", "paragraph").

split

The subset of the dataset to load. One of c("train", "test", "valid").

transform

Optional transform function applied to the image.

target_transform

Optional transform function applied to the target.

download

Logical. If TRUE, downloads the dataset if it is not already available locally.

Value

A torch dataset. Each element is a named list with:

  • x: H x W x 3 array representing the image.

  • y: a list containing the target with:

    • image_id: numeric identifier of the image x.

    • labels: numeric class identifiers of the N bounding boxes.

    • boxes: a torch_tensor of shape (N, 4) with bounding boxes, each in \((x_{min}, y_{min}, x_{max}, y_{max})\) format.

Each returned item inherits from the class image_with_bounding_box, so it can be visualised with helper functions such as draw_bounding_boxes().

Details

Loads one of the RoboFlow 100 Document datasets with COCO-style bounding box annotations for object detection tasks.
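
The boxes returned here use corner coordinates (xmin, ymin, xmax, ymax), whereas COCO annotation files conventionally store boxes as (x, y, width, height). The following is a minimal sketch of converting between the two layouts. It assumes the boxes have already been converted to a plain N x 4 numeric matrix; the helper name boxes_xyxy_to_xywh and the example values are hypothetical and not part of the package.

```r
# Hypothetical helper, not part of the package: convert an N x 4 matrix of
# (xmin, ymin, xmax, ymax) boxes to (x, y, width, height).
boxes_xyxy_to_xywh <- function(boxes) {
  stopifnot(is.matrix(boxes), ncol(boxes) == 4)
  cbind(
    x = boxes[, 1],
    y = boxes[, 2],
    width = boxes[, 3] - boxes[, 1],
    height = boxes[, 4] - boxes[, 2]
  )
}

# Made-up boxes, for illustration only
boxes <- matrix(
  c(10, 20, 110, 70,
     5,  5,  25, 45),
  ncol = 4, byrow = TRUE
)
xywh <- boxes_xyxy_to_xywh(boxes)
```

The width and height columns can then be used, for example, to filter out degenerate boxes before training.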

Examples

if (FALSE) { # \dontrun{
ds <- rf100_document_collection(
  dataset = "tweeter_post",
  split = "train",
  transform = transform_to_tensor,
  download = TRUE
)

# Retrieve a sample and inspect annotations
item <- ds[1]
item$y$labels
item$y$boxes

# Draw bounding boxes and display the image
boxed_img <- draw_bounding_boxes(item)
tensor_image_browse(boxed_img)
} # }