% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/inferTumHetero.R
\name{inferHeterogeneity}
\alias{inferHeterogeneity}
\title{Clusters variants based on Variant Allele Frequencies (VAF).}
\usage{
inferHeterogeneity(
  maf,
  tsb = NULL,
  top = 5,
  vafCol = NULL,
  segFile = NULL,
  ignChr = NULL,
  minVaf = 0,
  maxVaf = 1,
  useSyn = FALSE,
  dirichlet = FALSE
)
}
\arguments{
\item{maf}{an \code{\link{MAF}} object generated by \code{\link{read.maf}}}

\item{tsb}{sample names (Tumor_Sample_Barcodes) for which clustering should be performed.}

\item{top}{if \code{tsb} is NULL, uses the top n most mutated samples. Defaults to 5.}

\item{vafCol}{name of the column containing VAFs. By default, looks for a column named 't_vaf'.}

\item{segFile}{path to CBS segmented copy number file. Column names should be Sample, Chromosome, Start, End, Num_Probes and Segment_Mean (log2 scale).}

\item{ignChr}{chromosomes to exclude from the analysis, e.g., the sex chromosomes chrX and chrY. Default NULL.}

\item{minVaf}{filter low-frequency variants. Low-VAF variants may be due to sequencing error. Default 0 (on a scale of 0 to 1).}

\item{maxVaf}{filter high-frequency variants. High-VAF variants may be due to copy number alterations or impure tumor. Default 1 (on a scale of 0 to 1).}

\item{useSyn}{Use synonymous variants. Default FALSE.}

\item{dirichlet}{Deprecated and no longer supported. Previously used a nonparametric Dirichlet process for clustering. Default FALSE, which uses finite mixture models.}
}
\value{
A list of clustering tables.
}
\description{
Takes the output generated by \code{read.maf} and clusters variants to infer tumor heterogeneity. This function requires VAFs for clustering and density estimation.
VAFs can be on a scale of 0-1 or 0-100. Optionally, if copy number information is available, it can be provided as a segmented file (e.g., from Circular Binary Segmentation). Variants in
copy number altered regions are ignored.
}
\details{
This function clusters variants based on VAF to estimate univariate density and cluster classification. Two clustering methods have been implemented.
The default uses parametric finite mixture models. The other, nonparametric infinite mixture models (Dirichlet process), is deprecated and no longer supported (see the \code{dirichlet} argument).
}
\examples{
\dontrun{
laml.maf <- system.file("extdata", "tcga_laml.maf.gz", package = "maftools")
laml <- read.maf(maf = laml.maf)
TCGA.AB.2972.clust <- inferHeterogeneity(maf = laml, tsb = 'TCGA-AB-2972', vafCol = 'i_TumorVAF_WU')
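
# A hedged sketch of typical follow-up steps, not a definitive workflow.
# Assumption: the returned list exposes an element named clusterMeans;
# check names(TCGA.AB.2972.clust) against your installed maftools version.
names(TCGA.AB.2972.clust)
print(TCGA.AB.2972.clust$clusterMeans)

# Visualize the inferred clusters (see plotClusters).
plotClusters(clusters = TCGA.AB.2972.clust)

# Optionally supply segmented copy number data so that variants in copy
# number altered regions are ignored during clustering.
# Assumption: the example segment file named below ships with maftools;
# if it does not, substitute the path to your own CBS segment file.
seg <- system.file("extdata", "TCGA.AB.3009.hg19.seg.txt", package = "maftools")
TCGA.AB.3009.clust <- inferHeterogeneity(maf = laml, tsb = 'TCGA-AB-3009',
                                         segFile = seg, vafCol = 'i_TumorVAF_WU')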
}
}
\references{
Fraley C, Raftery AE. Model-based clustering, discriminant analysis, and density estimation. Journal of the American
Statistical Association. 2002;97:611-631.

Jara A, Hanson TE, Quintana FA, Muller P, Rosner GL. DPpackage: Bayesian Semi- and Nonparametric Modeling in R. Journal of Statistical Software. 2011;40(5):1-30.

Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics. 2004;5(4):557-72.
}
\seealso{
\code{\link{plotClusters}}
}
