Build a peak table — peakTable • GCIMS

Extract the volume of each ROI across samples to create a peak table.

peakTable(peak_list_clustered, aggregate_conflicting_peaks = NULL)

Arguments

peak_list_clustered: A peak list with clusters assigned. You can also build this input by hand as a data frame and pass it as peak_list_clustered (see the first example below).
aggregate_conflicting_peaks: NULL or a function. Determines what happens when two or more peaks from the same sample have been assigned to the same cluster. If NULL, an error is thrown. If a function such as mean or max is given, the conflicting volumes are summarized into a single value with it (e.g. max keeps the largest of the conflicting volumes).

Value

A list with three fields: peak_table, peak_table_matrix, and peak_table_duplicity. peak_table and peak_table_matrix both hold the cluster volumes. peak_table is a data frame in which rows represent clusters and columns represent samples. peak_table_matrix holds the same information in matrix form, with rows representing samples and columns representing clusters. Finally, peak_table_duplicity is a data frame that reports how many peaks from each sample were assigned to each cluster. Ideally, at most one peak per sample should belong to a cluster.
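To make the relationship between the input peak list and the returned fields concrete, here is a minimal base R sketch. It is not the GCIMS implementation: it only mimics, under the assumption of a hypothetical data frame `peaks` with SampleID, cluster and Volume columns, how a samples-by-clusters volume matrix and a per-cluster peak count could be derived, and how a function such as max resolves a conflict.

```r
# Hypothetical long-format peak list: S1 has two peaks in Cluster1 (a conflict)
peaks <- data.frame(
  SampleID = c("S1", "S1", "S1", "S2", "S2"),
  cluster  = c("Cluster1", "Cluster1", "Cluster2", "Cluster1", "Cluster2"),
  Volume   = c(10, 12, 20, 8, 18)
)

# Samples-by-clusters matrix of volumes; conflicting peaks are
# summarized with max, so S1/Cluster1 becomes 12
volumes <- tapply(peaks$Volume, list(peaks$SampleID, peaks$cluster), max)

# Number of peaks per (cluster, sample) pair; any count above 1
# marks a pair where aggregation was needed
duplicity <- table(peaks$cluster, peaks$SampleID)

print(volumes)
print(duplicity)
```

Swapping max for mean in the tapply() call would average the conflicting volumes instead, which is the kind of choice the aggregate_conflicting_peaks argument exposes.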

Examples

# Create a clustered peak list from scratch and build the peak table from it:
pl <- data.frame(
  SampleID = c("S1", "S1", "S2", "S2"),
  cluster = c("Cluster1", "Cluster2", "Cluster1", "Cluster2"),
  Volume = c(10, 20, 8, 18)
)
peak_table <- peakTable(pl)

peak_table$peak_table_matrix
#>    Cluster1 Cluster2
#> S1       10       20
#> S2        8       18

# If your peak table contains missing values, you can use
# imputePeakTable() to fill them in

# If the clustering does not work well, you may end up with two peaks
# from the same sample in the same cluster. This is not physically
# meaningful, because a cluster should contain at most one peak per
# sample. When such an ambiguity occurs, peakTable() throws an error
# by default.
#
# If you want, you can override the error by taking the average volume
# of those ambiguous peaks, or the maximum, using,
# e.g. `aggregate_conflicting_peaks = max`.
#
# In any case, you will get information on how many peaks were aggregated
# in the `peak_table_duplicity` field (ideally should be full of `1`):
peak_table$peak_table_duplicity
#> # A tibble: 2 × 3
#>   cluster     S1    S2
#>   <chr>    <int> <int>
#> 1 Cluster1     1     1
#> 2 Cluster2     1     1
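
The comments above mention `aggregate_conflicting_peaks = max` without showing a call. The following sketch illustrates one, assuming the GCIMS package is installed and that a hand-built data frame like the one in the first example is accepted as input. The peak list `pl_conflict` is hypothetical, and no output is shown because it has not been verified here.

```r
# Hypothetical peak list: sample S1 has two peaks assigned to Cluster1
pl_conflict <- data.frame(
  SampleID = c("S1", "S1", "S1", "S2", "S2"),
  cluster = c("Cluster1", "Cluster1", "Cluster2", "Cluster1", "Cluster2"),
  Volume = c(10, 12, 20, 8, 18)
)

# Only run the GCIMS call when the package is available
if (requireNamespace("GCIMS", quietly = TRUE)) {
  # Summarize conflicting peaks with max instead of raising an error
  peak_table_max <- GCIMS::peakTable(
    pl_conflict,
    aggregate_conflicting_peaks = max
  )
  print(peak_table_max$peak_table_matrix)
  # Counts above 1 mark the sample/cluster pairs that were aggregated
  print(peak_table_max$peak_table_duplicity)
}
```

Passing mean instead of max would average the conflicting volumes. Leaving the argument as NULL keeps the default behavior of stopping with an error.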