A class that holds a dataset whose samples are stored on disk. Each sample is stored in its own file; when queued operations are executed, the files are loaded, processed, and saved to a new folder.
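The load, process, and save cycle described above can be sketched in base R. This is only an illustration of the idea, not the class's implementation: the RDS file format, the directory names, and the two queued functions are assumptions made for the example.

```r
# Hypothetical input and output folders (assumed names, not part of GCIMS).
in_dir <- tempfile("step0_")
out_dir <- tempfile("step1_")
dir.create(in_dir)
dir.create(out_dir)

# One file per sample. RDS is an assumed storage format for this sketch.
saveRDS(c(1, 2, 3), file.path(in_dir, "s1.rds"))
saveRDS(c(10, 20), file.path(in_dir, "s2.rds"))

# Operations are queued first and only run later, all at once.
queue <- list(
  function(x) x * 2,
  function(x) x + 1
)

# Executing the queue: load each file, apply every queued operation in order,
# and save the result under the same file name in the new folder.
for (f in list.files(in_dir)) {
  x <- readRDS(file.path(in_dir, f))
  for (op in queue) {
    x <- op(x)
  }
  saveRDS(x, file.path(out_dir, f))
}

result_s1 <- readRDS(file.path(out_dir, "s1.rds"))
```

The input folder is left untouched, so the unprocessed samples remain available after the queue has run.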

Details

This class is not exported. If you want to use it, contact us at https://github.com/sipss/GCIMS/issues/ and we will export it.

Super class

GCIMS::DelayedDatasetBase -> DelayedDatasetDisk

Active bindings

sampleNames

A character vector of unique sample names. Renaming samples also renames the corresponding files in obj$getCurrentDir().

scratchDir

The directory where intermediate and processed files are saved.

Methods


Method new()

Create a delayed dataset on disk

Usage

DelayedDatasetDisk$new(
  samples,
  scratch_dir,
  keep_intermediate = FALSE,
  sample_class = NULL
)

Arguments

samples

A named vector. The names are sample ids; the values are either filenames or sample objects. Sample objects are dumped to disk. Filenames are interpreted relative to base_dir.

scratch_dir

The directory where samples being processed will be saved.

keep_intermediate

A logical value indicating whether intermediate realization steps should be saved.

sample_class

The class of the samples in the dataset, used only to validate the contract between the delayed actions and the samples. If NULL, the return values of actions are not checked.

Returns

The DelayedDatasetDisk object
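As a rough sketch of the two shapes the samples argument can take, the base R snippet below builds one named vector of filenames and one named collection of in-memory objects, then dumps the objects to a scratch folder. The sample ids, the RDS format, and the file naming scheme are assumptions for illustration; the constructor itself is not called here because the class is not exported.

```r
# Shape 1: names are sample ids, values are filenames (hypothetical names).
samples_as_files <- c(sample_a = "sample_a.rds", sample_b = "sample_b.rds")

# Shape 2: names are sample ids, values are in-memory sample objects.
# Plain data frames stand in for real sample objects in this sketch.
samples_as_objects <- list(
  sample_a = data.frame(x = 1:3),
  sample_b = data.frame(x = 4:6)
)

# Dumping objects to disk, one file per sample id (assumed naming scheme).
scratch_dir <- tempfile("scratch_demo_")
dir.create(scratch_dir)
for (id in names(samples_as_objects)) {
  saveRDS(samples_as_objects[[id]], file.path(scratch_dir, paste0(id, ".rds")))
}

dumped_files <- list.files(scratch_dir)
```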


Method getSample()

Get a sample from the dataset

Usage

DelayedDatasetDisk$getSample(sample, dataset)

Arguments

sample

Either an integer (sample index) or a string (sample name)

dataset

The dataset, passed so that it can be realized if there are enqueued actions.

Returns

The sample object


Method getCurrentDir()

Get the path to the location of the processed samples

Usage

DelayedDatasetDisk$getCurrentDir()

Returns

The full path to the directory where samples are saved


Method checkSampleFiles()

Get missing sample files

Usage

DelayedDatasetDisk$checkSampleFiles(on_error = "nothing")

Arguments

on_error

Either "abort" or "nothing". The action to take if there are missing files.

Returns

A vector of the missing files, named by sample id.
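The check described above can be approximated in base R. The helper check_sample_files() below is hypothetical and written only for this example; it is not the method's actual code, and the way it reports an error under "abort" is an assumption.

```r
# Hypothetical helper: returns the missing files, named by sample id.
check_sample_files <- function(files, dir, on_error = c("nothing", "abort")) {
  on_error <- match.arg(on_error)
  exists_on_disk <- file.exists(file.path(dir, files))
  # Logical subsetting keeps the names, so the result stays named by sample id.
  missing_files <- files[!exists_on_disk]
  if (length(missing_files) > 0 && on_error == "abort") {
    stop("Missing sample files: ", paste(missing_files, collapse = ", "))
  }
  missing_files
}

# Demo folder holding only one of the two expected files.
demo_dir <- tempfile("samples_")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, "a.rds")))

files <- c(sample_a = "a.rds", sample_b = "b.rds")
missing_files <- check_sample_files(files, demo_dir)
```

With on_error = "nothing" the caller decides what to do with the returned vector; with "abort" the sketch stops as soon as anything is missing.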


Method updateScratchDir()

Copy the samples to a new scratch directory and save the dataset there as well

Usage

DelayedDatasetDisk$updateScratchDir(
  new_scratch_dir,
  dataset = NULL,
  override_current_dir = NULL
)

Arguments

new_scratch_dir

A new scratch directory to store samples

dataset

If an object is given, it is saved under new_scratch_dir, alongside the samples.

override_current_dir

If not NULL, assume samples are in this directory, instead of in self$getCurrentDir(). Useful when loading samples from a saved directory.


Method subset()

Subset the dataset, keeping only the selected samples

Usage

DelayedDatasetDisk$subset(sample)

Arguments

sample

A numeric vector of indices, a character vector of sample names, or a logical vector

Returns

The delayed dataset, modified in place
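The three accepted selector types mirror ordinary R vector indexing. The snippet below shows the equivalence on a plain named vector with hypothetical sample ids; it does not call the method itself, and unlike the method it creates copies rather than modifying anything in place.

```r
# Hypothetical sample ids mapped to file names.
sample_files <- c(s1 = "s1.rds", s2 = "s2.rds", s3 = "s3.rds")

# Numeric indices, sample names, and a logical mask select the same samples.
by_index <- sample_files[c(1, 3)]
by_name <- sample_files[c("s1", "s3")]
by_mask <- sample_files[c(TRUE, FALSE, TRUE)]
```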


Method clone()

The objects of this class are cloneable with this method.

Usage

DelayedDatasetDisk$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.