% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/transform_counts.R
\name{transform_counts}
\alias{transform_counts}
\title{Transform the raw counts provided by the recount3 project}
\usage{
transform_counts(
  rse,
  by = c("auc", "mapped_reads"),
  targetSize = 4e+07,
  L = 100,
  round = TRUE,
  ...
)
}
\arguments{
\item{rse}{A
\link[SummarizedExperiment:RangedSummarizedExperiment-class]{RangedSummarizedExperiment-class}
created by \code{create_rse()}.}

\item{by}{Either \code{auc} or \code{mapped_reads}. If set to \code{auc} it
will compute the scaling factor by the total coverage of the sample. That is,
the area under the curve (AUC) of the coverage. If set to \code{mapped_reads} it
will scale the counts by the number of mapped reads (in the QC annotation),
whether the library was paired-end or not, and the desired read length (\code{L}).}

\item{targetSize}{A \code{numeric(1)} specifying the target library size in number
of single end reads.}

\item{L}{A \code{integer(1)} specifying the target read length. It is only used
when \code{by = 'mapped_reads'} since it cancels out in the calculation when
using \code{by = 'auc'}.}

\item{round}{A \code{logical(1)} specifying whether to round the transformed
counts or not.}

\item{...}{Further arguments passed to \code{compute_scale_factors()}.}
}
\value{
A \code{matrix()} with the transformed (scaled) counts.
}
\description{
In preparation for a differential expression analysis, you will have to
choose how to scale the raw counts provided by the recount3 project. These
raw counts are similar to those provided by the recount2 project, except
that they were generated with a different aligner and a modified counting
approach. The raw coverage counts for recount2 are described with
illustrative figures at \url{https://doi.org/10.12688/f1000research.12223.1}.
Note that the raw counts are the sum of the base level coverage so you have
to take into account the total base-pair coverage for the given sample
(default option) by using the area under the coverage (AUC), or alternatively
use the mapped read lengths. You might want to do some further scaling to
take into account the gene or exon lengths. If you prefer to calculate read
counts without scaling check the function \code{compute_read_counts()}.
}
\details{
This function is similar to
\code{recount::scale_counts()} but more general and with a different name to
avoid NAMESPACE conflicts.
}
\examples{

## Create a RSE object at the gene level
rse_gene_SRP009615 <- create_rse_manual("SRP009615")

## Scale the counts using the AUC
assays(rse_gene_SRP009615)$counts <- transform_counts(rse_gene_SRP009615)

## See that now we have two assayNames()
rse_gene_SRP009615
assayNames(rse_gene_SRP009615)

## You can compare the scaled counts against those from
## recount::scale_counts() from the recount2 project
## which used a different RNA-seq aligner
## If needed, install recount, the R/Bioconductor package for recount2:
# BiocManager::install("recount")
recount2_sizes <- colSums(assay(recount::scale_counts(
    recount::rse_gene_SRP009615,
    by = "auc"
), "counts")) / 1e6
recount3_sizes <- colSums(assay(rse_gene_SRP009615, "counts")) / 1e6
recount_sizes <- data.frame(
    recount2 = recount2_sizes[order(names(recount2_sizes))],
    recount3 = recount3_sizes[order(names(recount3_sizes))]
)
plot(recount2 ~ recount3, data = recount_sizes)
abline(a = 0, b = 1, col = "purple", lwd = 2, lty = 2)

## Compute RPKMs
assays(rse_gene_SRP009615)$RPKM <- recount::getRPKM(rse_gene_SRP009615)
colSums(assay(rse_gene_SRP009615, "RPKM"))

## Compute TPMs
assays(rse_gene_SRP009615)$TPM <- recount::getTPM(rse_gene_SRP009615)
colSums(assay(rse_gene_SRP009615, "TPM")) / 1e6 ## Should all be equal to 1
}
\seealso{
Other count transformation functions: 
\code{\link{compute_read_counts}()},
\code{\link{compute_scale_factors}()},
\code{\link{is_paired_end}()}
}
\concept{count transformation functions}
