% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/quality_control.R
\name{tof_assess_flow_rate}
\alias{tof_assess_flow_rate}
\title{Detect flow rate abnormalities in high-dimensional cytometry data}
\usage{
tof_assess_flow_rate(
  tof_tibble,
  time_col,
  group_cols,
  num_timesteps = nrow(tof_tibble)/1000,
  alpha_threshold = 0.01,
  visualize = FALSE,
  ...,
  augment = FALSE
)
}
\arguments{
\item{tof_tibble}{A `tof_tbl` or `tibble`.}

\item{time_col}{An unquoted column name indicating which column in `tof_tibble`
contains the time at which each cell was collected.}

\item{group_cols}{Optional. Unquoted column names indicating which columns
should be used to group cells before analysis. Flow rate calculation is then
performed independently within each group. Supports tidyselect helpers.}

\item{num_timesteps}{The number of bins into which `time_col` should be split
to define "timesteps" of the data collection process. The number of cells analyzed
by the cytometer will be counted in each bin separately and will represent
the relative average flow rate for that timestep in data collection. Defaults to
`nrow(tof_tibble) / 1000`.}

\item{alpha_threshold}{A scalar between 0 and 1 indicating the two-tailed significance level
at which to draw outlier thresholds in the t-distribution with `num_timesteps` - 1
degrees of freedom. Defaults to 0.01.}

\item{visualize}{A boolean value indicating if a plot should be generated to
visualize each timestep's relative flow rate (by group) instead of returning
the tibble directly. Defaults to FALSE.}

\item{...}{Optional additional arguments to pass to \code{\link[ggplot2]{facet_wrap}}.
Ignored if visualize = FALSE.}

\item{augment}{A boolean value indicating if the output should column-bind the
computed flags for each cell (see below) as new columns in `tof_tibble` (TRUE) or if
a tibble including only the computed flags should be returned (FALSE, the default).}
}
\value{
A tibble with the same number of rows as `tof_tibble`. If augment = FALSE
(the default), it will have 3 columns: "\{time_col\}" (the same column as `time_col`),
"timestep" (the numeric timestep to which each cell was assigned based on its
value for `time_col`), and "flagged_window" (a boolean vector indicating if
each cell was collected during a timestep flagged for having a high or low
flow rate). If augment = TRUE, these 3 columns will be column-bound to `tof_tibble`
to return an augmented version of the input dataset. (Note that in this case, `time_col` will
not be duplicated.) If visualize = TRUE, then a ggplot object is returned instead
of a tibble.
}
\description{
This function performs a simplified version of
\href{https://academic.oup.com/bioinformatics/article/32/16/2473/2240408}{flowAI's}
statistical test to detect time periods with abnormal flow rates over the
course of a flow cytometry experiment. Briefly, the relative flow rates for each timestep
throughout data acquisition are calculated (see \link{tof_calculate_flow_rate}), and
outlier timesteps with particularly high or low flow rates (i.e. those beyond
extreme values of the t-distribution across timesteps) are flagged.
}
\examples{
set.seed(1000L)
sim_data <-
    data.frame(
        cd4 = rnorm(n = 1000, mean = 5, sd = 0.5),
        cd8 = rnorm(n = 1000, mean = 0, sd = 0.1),
        cd33 = rnorm(n = 1000, mean = 10, sd = 0.1),
        file_name = c(rep("a", times = 500), rep("b", times = 500)),
        time =
            c(
                sample(1:100, size = 200, replace = TRUE),
                sample(100:400, size = 300, replace = TRUE),
                sample(1:150, size = 400, replace = TRUE),
                sample(1:500, size = 100, replace = TRUE)
            )
    )

sim_data |>
    tof_assess_flow_rate(
        time_col = time,
        num_timesteps = 20,
        visualize = TRUE
    )

sim_data |>
    tof_assess_flow_rate(
        time_col = time,
        group_cols = file_name,
        num_timesteps = 20,
        visualize = TRUE
    )
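
# A hedged sketch of the non-plotting path, assuming the package is attached
# as in the examples above. With visualize left at its default (FALSE), the
# documented return value is a tibble of per-cell flags rather than a plot.
# `flow_flags` is an illustrative name only.
flow_flags <-
    sim_data |>
    tof_assess_flow_rate(
        time_col = time,
        group_cols = file_name,
        num_timesteps = 20
    )

# Tabulate how many cells fall in flagged vs. unflagged timesteps,
# using the "flagged_window" column described in the Value section.
table(flow_flags$flagged_window)

# With augment = TRUE, the flags should instead be column-bound to the input.
sim_data |>
    tof_assess_flow_rate(
        time_col = time,
        group_cols = file_name,
        num_timesteps = 20,
        augment = TRUE
    )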

}
