% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/summix_local.R
\name{summix_local}
\alias{summix_local}
\title{summix_local}
\usage{
summix_local(
  data,
  reference,
  observed,
  goodness.of.fit = TRUE,
  type = "variants",
  algorithm = "fastcatch",
  minVariants = 0,
  maxVariants = 0,
  maxWindowSize = 0,
  minWindowSize = 0,
  windowOverlap = 200,
  maxStepSize = 1000,
  diffThreshold = 0.02,
  NSimRef = NULL,
  override_fit = FALSE,
  override_removeSmallAnc = FALSE,
  selection_scan = FALSE,
  position_col = "POS",
  nSimSE = 1000
)
}
\arguments{
\item{data}{a data frame of the observed group and reference group allele frequencies for N genetic variants on a single chromosome. Must contain a column specifying the genetic variant positions.}

\item{reference}{a character vector of the column names for K reference groups.}

\item{observed}{a character value that is the column name for the observed group.}

\item{goodness.of.fit}{Default is TRUE, which returns the scaled objective as the goodness-of-fit measure. Set to FALSE to override this default and return the raw loss from SLSQP.}

\item{type}{user choice of how to define window size; options "variants" and "bp" are available where "variants" defines window size as the number of variants in a given window and "bp" defines window size as the number of base pairs in a given window. Default is "variants".}

\item{algorithm}{user choice of algorithm to define local substructure blocks; options "fastcatch" and "windows" are available. "windows" uses a fixed window size in a sliding-window algorithm. "fastcatch" allows dynamic window sizes. The "fastcatch" algorithm is recommended, although it is computationally slower. Default is "fastcatch".}

\item{minVariants}{Used if algorithm = "fastcatch" and type = "variants". A numeric value that specifies the minimum number of genetic variants allowed to define a given window.}

\item{maxVariants}{Used if type = "variants". A numeric value that specifies the maximum number of genetic variants allowed to define a given window.}

\item{maxWindowSize}{Used if type = "bp". A numeric value that defines the maximum allowed window size by the number of base pairs in a given window.}

\item{minWindowSize}{Used if algorithm = "fastcatch" and type = "bp". A numeric value that specifies the minimum number of base pairs allowed to define a given window.}

\item{windowOverlap}{Used if algorithm = "windows". A numeric value that defines the number of variants or the number of base pairs that overlap between the given sliding windows. Default is 200.}

\item{maxStepSize}{a numeric value that defines the maximum gap in base pairs between two consecutive genetic variants within a given window. Default is 1000.}

\item{diffThreshold}{Used if algorithm = "fastcatch". A numeric value that defines the percent difference threshold to mark the end of a local substructure block. Default is 0.02.}

\item{NSimRef}{Used if selection_scan = TRUE. A numeric vector of the sample sizes for each of the K reference groups, given in the same order as the reference parameter. This is used in a simulation framework that calculates the standard error within each local substructure block.}

\item{override_fit}{Default is FALSE. If set to TRUE, the user overrides the auto-stop of summix_local() that occurs when the global goodness-of-fit value is greater than 1.5 (indicating a poor fit of the reference data to the observed data).}

\item{override_removeSmallAnc}{Default is FALSE. If set to TRUE, the user overrides the automatic removal of reference ancestries with <2\% global proportions. Overriding this removal is not recommended.}

\item{selection_scan}{user option to perform a selection scan on the given chromosome. Default is FALSE. If set as TRUE, a test statistic will be calculated for each local substructure block. Note: the user can expect extended computation time if this option is set as TRUE.}

\item{position_col}{a character value that is the column name for the genetic variant positions. Default is "POS".}

\item{nSimSE}{user choice of number of internal simulations to run to calculate standard error of estimates. Default is 1000.}
}
\value{
data frame with a row for each local substructure block and the following columns:

goodness.of.fit: scaled objective reflecting the fit of the reference data. Values between 0.5 and 1.5 indicate a moderate fit and should be used with caution. Values greater than 1.5 indicate a poor fit, and users should not perform further analyses using summix.

iterations: number of iterations for SLSQP algorithm

time: time in seconds of SLSQP algorithm

filtered: number of SNPs not used in estimation due to missing values

K columns of mixture proportions of reference groups input into the function

nSNPs: number of SNPs in the given local substructure block
}
\description{
Estimates local substructure mixture proportions in genetic summary data. Optionally performs a selection scan that identifies potential regions of selection along the given chromosome.
}
\examples{
data(ancestryData)
results <- summix_local(data = ancestryData,
                        reference = c("reference_AF_afr",
                                      "reference_AF_eas",
                                      "reference_AF_eur",
                                      "reference_AF_iam",
                                      "reference_AF_sas"),
                        NSimRef = c(704,787,741,47,545),
                        observed="gnomad_AF_afr",
                        goodness.of.fit = TRUE,
                        type = "variants",
                        algorithm = "fastcatch",
                        minVariants = 150,
                        maxVariants = 250,
                        maxStepSize = 1000,
                        diffThreshold = .02,
                        override_fit = FALSE,
                        override_removeSmallAnc = TRUE,
                        selection_scan = FALSE,
                        position_col = "POS")
print(results$results)
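
# Hypothetical sketch, not part of the Summix API: label each block using the
# goodness.of.fit thresholds described in the Value section (0.5 and 1.5).
# The toy data frame stands in for block-level output; its values are made up,
# and treating values at or below 0.5 as "good" is an assumption.
toy_blocks <- data.frame(goodness.of.fit = c(0.21, 0.87, 1.93))
toy_blocks$fit_label <- cut(toy_blocks$goodness.of.fit,
                            breaks = c(-Inf, 0.5, 1.5, Inf),
                            labels = c("good", "moderate", "poor"))
print(toy_blocks)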

}
\references{
\url{https://github.com/hendriau/Summix2}
}
\seealso{
\url{https://github.com/hendriau/Summix2} for further documentation.
}
\author{
Hayley Wolff (Stoneman), \email{hayley.wolff@cuanschutz.edu}

Audrey Hendricks, \email{audrey.hendricks@cuanschutz.edu}
}
\keyword{admixture}
\keyword{ancestry}
\keyword{distribution}
\keyword{genetics}
\keyword{local}
\keyword{mixture}
\keyword{population}
\keyword{stratification}
