% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/clonality.R
\name{clonality}
\alias{clonality}
\title{Clonality}
\usage{
clonality(file.list)
}
\arguments{
\item{file.list}{A list of data frames of antigen receptor sequencing data 
imported by the LymphoSeq function \code{readImmunoSeq}.  "aminoAcid", "count", 
and "frequencyCount" are required columns.  "estimatedNumberGenomes" is optional.  
Note that clonality is usually calculated from productive nucleotide sequences.  
Therefore, it is not recommended to run this function on a productive sequence 
list aggregated by amino acid.}
}
\value{
Returns a data frame giving the total number of sequences, number of 
unique productive sequences, number of genomes, clonality, Gini coefficient, 
and the frequency (\%) of the top productive sequence in each sample.
}
\description{
Creates a data frame giving the total number of sequences, number of unique 
productive sequences, number of genomes, entropy, clonality, Gini 
coefficient, and the frequency (\%) of the top productive sequence for each 
sample in a list of sample data frames.
}
\details{
Clonality is derived from the Shannon entropy of the productive sequence 
frequencies.  The entropy is divided by the logarithm of the total number of 
unique productive sequences, and this normalized entropy value is then 
inverted (1 - normalized entropy) to produce the clonality metric.  
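
As a sketch of the calculation described above, write \eqn{p_i}{p_i} for the 
frequency of the i-th productive sequence and \eqn{N}{N} for the number of 
unique productive sequences.  These symbols are introduced here for 
illustration only.  Clonality \eqn{C}{C} can then be written as
\deqn{C = 1 - \frac{-\sum_{i=1}^{N} p_i \log p_i}{\log N}}{C = 1 - (-sum(p_i * log(p_i)) / log(N))}
The base of the logarithm cancels in the ratio, provided the same base is 
used in the numerator and the denominator.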

The Gini coefficient is an alternative metric used to quantify repertoire 
diversity and is derived from the Lorenz curve.  The Lorenz curve is drawn 
such that the x-axis represents the cumulative percentage of unique sequences 
and the y-axis represents the cumulative percentage of reads.  A line passing 
through the origin with a slope of 1 reflects equal frequencies of all clones.  
The Gini coefficient is the ratio of the area between the line of equality 
and the observed Lorenz curve over the total area under the line of equality.  
Both the Gini coefficient and clonality are reported on a scale from 0 to 1 where 
0 indicates all sequences have the same frequency and 1 indicates the 
repertoire is dominated by a single sequence.
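
The ratio that defines the Gini coefficient can be sketched in symbols, which 
are introduced here for illustration only.  Let \eqn{A}{A} be the area between 
the line of equality and the Lorenz curve, and \eqn{B}{B} the area under the 
Lorenz curve.  If both axes are rescaled to run from 0 to 1, the area under the 
line of equality is 1/2, so the Gini coefficient \eqn{G}{G} is
\deqn{G = \frac{A}{A + B} = 2A}{G = A / (A + B) = 2 * A}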
}
\examples{
file.path <- system.file("extdata", "TCRB_sequencing", package = "LymphoSeq")

file.list <- readImmunoSeq(path = file.path)

clonality(file.list = file.list)
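
# A minimal sketch of the clonality calculation described in Details, using
# hypothetical counts rather than real sequencing data. The object names
# below are illustrative only and are not part of the LymphoSeq API.
toy.counts <- c(50, 30, 15, 5)

# Convert counts to frequencies that sum to 1
toy.frequency <- toy.counts / sum(toy.counts)

# Shannon entropy of the frequencies
toy.entropy <- -sum(toy.frequency * log2(toy.frequency))

# Normalize by the log of the number of unique sequences, then invert
toy.clonality <- 1 - toy.entropy / log2(length(toy.counts))

toy.clonality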
}
\seealso{
\code{\link{lorenzCurve}}
}
