% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/compute_probabilities.R
\name{compute_HMRFHiC_probabilities}
\alias{compute_HMRFHiC_probabilities}
\title{Compute HMRFHiC Probabilities of Assigning an Interaction to Each Component}
\usage{
compute_HMRFHiC_probabilities(data, chain_betas, iterations, dist = "ZINB")
}
\arguments{
\item{data}{A \code{data.frame} containing the following required columns:
\itemize{
  \item \code{start}: Genomic start position for locus i.
  \item \code{end}: Genomic end position for locus j.
  \item \code{interactions}: Observed interaction counts between loci i and j.
  \item \code{GC}: GC content or related genomic feature for the interaction.
  \item \code{TES}: Transposable Elements data or similar regulatory measure.
  \item \code{ACC}: Accessibility measure or another continuous genomic covariate.
}
The \code{start} and \code{end} columns must correspond to the indices of interacting loci (i and j).}

\item{chain_betas}{A list of MCMC chain results generated by the HMRFHiC model-fitting procedure. Each element should correspond to one chain and must include:
\itemize{
  \item \code{chains}: A list of 3 MCMC chains (one for each component), each containing posterior samples of regression coefficients.
  \item \code{theta}: (If applicable) Posterior samples of the zero-inflation parameter \eqn{\theta}.
  \item \code{size}: (If applicable) Posterior samples for the overdispersion parameter used in Negative Binomial or Zero-Inflated Negative Binomial models.
}}

\item{iterations}{An integer giving the total number of MCMC iterations. The first half of the samples (\code{iterations/2}) is discarded as burn-in.}

\item{dist}{A character string indicating the chosen distribution for modeling interaction counts. One of:
\itemize{
  \item \code{"ZIP"}: Zero-Inflated Poisson
  \item \code{"NB"}: Negative Binomial
  \item \code{"Poisson"}: Poisson
  \item \code{"ZINB"}: Zero-Inflated Negative Binomial
}
The default is \code{"ZINB"}.}
}
\value{
A \code{data.frame} containing the original input columns plus three new columns:
\itemize{
  \item \code{prob1}: The posterior probability that the interaction belongs to component 1.
  \item \code{prob2}: The posterior probability that the interaction belongs to component 2.
  \item \code{prob3}: The posterior probability that the interaction belongs to component 3.
}

The returned data frame thus provides a soft (probabilistic) classification of each observed interaction across the three modeled components; within each row, \code{prob1}, \code{prob2}, and \code{prob3} sum to 1.
}
\description{
This function computes the posterior probability that each genomic interaction belongs to each of three mixture components (latent states) in a Hidden Markov Random Field (HMRF) model.
It takes the posterior means of the regression parameters obtained from MCMC sampling and evaluates them under the user-specified count distribution (standard or zero-inflated)
to produce component probabilities for each observed interaction.
}
\details{
This function is part of the HMRFHiC pipeline that models genomic interactions (e.g., Hi-C interaction counts) using a mixture model approach.
The model considers three components (latent states), each characterized by a distinct mean structure and potentially different
zero-inflation or overdispersion parameters, depending on the chosen distribution.

The function:
\enumerate{
  \item Extracts posterior means of regression parameters from MCMC chains, discarding the initial half of the samples as burn-in.
  \item Estimates the mean interaction intensity (\eqn{\lambda}) for each component using a log-linear model with the covariates distance (log of \eqn{|end - start|}), GC, TES, and ACC (each transformed by \eqn{\log(x + 1)}).
  \item Given the specified distribution (\code{dist}), calculates the probability (on the natural scale, not the log scale) of observing the recorded interaction count under each component.
  \item Normalizes these probabilities so that each interaction is assigned a set of three probabilities summing to 1.
}
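
In symbols, steps 2 to 4 can be sketched as follows. This is an illustrative reading of the description above, not a transcription of the source code: the coefficient names \eqn{\beta_{k0}, \ldots, \beta_{k4}}, the index \eqn{n}, and the symbol \eqn{f_k} are notation assumed here, and the presence of an intercept \eqn{\beta_{k0}} is an assumption (it is consistent with the five-column coefficient matrices used in the example). For interaction \eqn{n} with observed count \eqn{y_n} and component \eqn{k = 1, 2, 3}:
\deqn{\log \lambda_{nk} = \beta_{k0} + \beta_{k1} d_n + \beta_{k2} \log(GC_n + 1) + \beta_{k3} \log(TES_n + 1) + \beta_{k4} \log(ACC_n + 1)}
\deqn{prob_{nk} = \frac{f_k(y_n \mid \lambda_{nk})}{\sum_{j=1}^{3} f_j(y_n \mid \lambda_{nj})}}
where \eqn{d_n} is the log-distance covariate from step 2 and \eqn{f_k} is the probability mass function selected by \code{dist}, evaluated with the parameters of component \eqn{k}. By construction, \eqn{prob_{n1} + prob_{n2} + prob_{n3} = 1}.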

For zero-inflated distributions (\code{"ZIP"}, \code{"ZINB"}), a \eqn{\theta} parameter captures the probability of an excess zero.
For negative binomial distributions (\code{"NB"}, \code{"ZINB"}), an overdispersion parameter is included.
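
As a sketch, assuming the standard zero-inflated parameterization (the exact form used internally is not stated here), let \eqn{g} denote the Poisson or Negative Binomial mass function and \eqn{I(y = 0)} the indicator of a zero count. The zero-inflated mass function is then
\deqn{P(Y = y) = \theta \, I(y = 0) + (1 - \theta) \, g(y)}
so a zero count can arise either as an excess zero (with probability \eqn{\theta}) or from the count distribution itself, while positive counts come only from \eqn{g}, scaled by \eqn{1 - \theta}.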

The computed probabilities can be used for downstream analysis, such as segmenting interactions into classes or modeling spatial dependence in a hidden Markov random field.
}
\examples{
# Synthetic data
set.seed(123)


large_data <- data.frame(
  start = c(1, 10, 20),
  end = c(5, 15, 30),
  interactions = c(10, 20, 30),
  GC = c(0.5, 0.8, 0.3),
  TES = c(0.2, 0.5, 0.7),
  ACC = c(0.9, 0.4, 0.6)
)

chain_betas <- list(
  list(
    chains = list(
      matrix(runif(25, 0.1, 1), ncol = 5),
      matrix(runif(25, 0.1, 1), ncol = 5),
      matrix(runif(25, 0.1, 1), ncol = 5)
    ),
    theta = runif(5, 0.1, 0.9),
    size = matrix(runif(15, 1, 10), nrow = 3)
  )
)

result <- compute_HMRFHiC_probabilities(
  data = large_data,
  chain_betas = chain_betas,
  iterations = 5,
  dist = "Poisson"
)
print(result)
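
# Illustrative sketch only (not the package's internal code): steps 2 to 4
# from Details, done by hand in base R for the Poisson case.
# Assumptions: the values in `beta_hat` are made up, an intercept is included,
# and log(x + 1) is also applied to the distance.
toy <- data.frame(
  start = c(1, 10, 20), end = c(5, 15, 30),
  interactions = c(10, 20, 30),
  GC = c(0.5, 0.8, 0.3), TES = c(0.2, 0.5, 0.7), ACC = c(0.9, 0.4, 0.6)
)
# Design matrix: intercept plus the four log-transformed covariates
X <- cbind(1, log(abs(toy$end - toy$start) + 1), log(toy$GC + 1),
           log(toy$TES + 1), log(toy$ACC + 1))
# One hypothetical posterior-mean coefficient vector per component
beta_hat <- list(c(0.5, 0.3, 0.2, 0.1, 0.1),
                 c(1.5, 0.3, 0.2, 0.1, 0.1),
                 c(2.5, 0.3, 0.2, 0.1, 0.1))
# Poisson likelihood of each observed count under each component
lik <- sapply(beta_hat, function(b) {
  lambda <- exp(rowSums(sweep(X, 2, b, "*")))  # log-linear mean per row
  dpois(toy$interactions, lambda)
})
# Normalize across components so that each row sums to 1
probs <- lik / rowSums(lik)
colnames(probs) <- c("prob1", "prob2", "prob3")
print(round(probs, 3))
print(rowSums(probs))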
# See vignette("HMRFHiC_vignette") for detailed examples with real Hi-C data.

}
\seealso{
\code{\link{dpois}} and \code{\link{dnbinom}} for the underlying probability mass functions.
}
