% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ColnamesToFactors.R
\name{ColnamesToFactors}
\alias{ColnamesToFactors}
\title{Extract factor information and create suitable column names
from the column names of a dataset.}
\usage{
ColnamesToFactors(
  ExprData,
  Column.gene,
  Group.position,
  Time.position,
  Individual.position
)
}
\arguments{
\item{ExprData}{Data.frame with \eqn{N_g} rows and (\eqn{N_s+k}) columns,
where \eqn{N_g} is the number of genes,
\eqn{N_s} is the number of samples and
\eqn{k=1} if a column is used to specify gene names, or \eqn{k=0} otherwise.
If \eqn{k=1}, the position of the column containing gene names is given
by \code{Column.gene}.
The data.frame contains numeric values giving the expression of each gene
in each sample.
Gene expression values can be raw counts or normalized raw counts.
Column names of the data.frame must describe each sample's information
(individual, biological condition and time) and
have the structure described in the section \code{Details}.}

\item{Column.gene}{Integer indicating the column where gene names are given.
Set \code{Column.gene=NULL} if there is no such column.}

\item{Group.position}{Integer indicating the position of the group
information in each sample name (see \code{Details}).
Set \code{Group.position=NULL} if the sample names contain only one
biological condition or none.}

\item{Time.position}{Integer indicating the position of the time measurement
information in each sample name (see \code{Details}).
Set \code{Time.position=NULL} if the sample names contain only one
time measurement or none.}

\item{Individual.position}{Integer indicating the position of the name of
the individual (e.g. patient, replicate, mouse, yeast culture ...)
in each sample name (see \code{Details}).
The names of different individuals must all be distinct.
Furthermore, if individual names are just numbers, they will be converted
to a vector of class "character" by
\code{\link[=CharacterNumbers]{CharacterNumbers()}} and
an "r" will be added to each individual name ("r" for replicate).}
}
\value{
The function returns new column names of the dataset,
a vector indicating the name of the individual for each sample,
a vector indicating the time for each sample and/or
a vector indicating the biological condition for each sample.
}
\description{
This function generates new reduced column names according to the presence
of biological conditions and/or time points, and extracts the different
factors (individuals' names, time measurements, biological conditions)
from the column names of the dataset (see \code{Details}).
}
\details{
The column names of \code{ExprData} must be a character vector containing
\itemize{
\item a string of characters (if \eqn{k=1}) which is the label of the column
containing gene names.
\item \eqn{N_s} sample names, each of which must contain at least:
the name of the individual (e.g. patient, mouse, yeast culture),
its biological condition (if there are at least two conditions) and
the time at which the data were collected (if there are at least two
time points). Times must be written as 't0', 'T0' or '0' for time 0,
't1', 'T1' or '1' for time 1, ...
}

All of this sample information must be separated by underscores
in the sample name. For instance, 'CLL_P_t0_r1'
corresponds to the patient 'r1' belonging to the biological condition 'P',
with data collected at time 't0'.
In this example, 'CLL' describes the type of cells
(here chronic lymphocytic leukemia) and is not used in our analysis.

In the string of characters 'CLL_P_t0_r1',
'r1' is located after the third underscore,
so \code{Individual.position=4},
'P' is located after the first underscore, so \code{Group.position=2} and
't0' is located after the second underscore, so \code{Time.position=3}.
}
\examples{
## Data simulated with our function RawCountsSimulation()
Data.sim <- RawCountsSimulation(Nb.Group=3, Nb.Time=2, Nb.per.GT=3,
                                Nb.Gene=10)
##------------------------------------------------------------------------##
res.test.colnames <- ColnamesToFactors(ExprData=Data.sim$Sim.dat,
                                       Column.gene=1,
                                       Group.position=1,
                                       Time.position=2,
                                       Individual.position=3)
print(res.test.colnames)
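##------------------------------------------------------------------------##
## Illustration only: a minimal base R sketch (not the internal code of
## ColnamesToFactors()) showing how the positions described in Details
## map onto the underscore-separated fields of a sample name.
## 'CLL_P_t0_r1' is the sample name used in Details; the object names
## 'sample.name' and 'name.parts' are introduced here for illustration.
sample.name <- "CLL_P_t0_r1"
name.parts <- strsplit(sample.name, split="_", fixed=TRUE)[[1]]
print(name.parts[2]) ## biological condition, i.e. Group.position=2
print(name.parts[3]) ## time measurement, i.e. Time.position=3
print(name.parts[4]) ## individual, i.e. Individual.position=4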
}
\seealso{
The \code{\link[=ColnamesToFactors]{ColnamesToFactors()}} function
is used by the following functions of our package:
\code{\link[=DATAprepSE]{DATAprepSE()}},
\code{\link[=PCApreprocessing]{PCApreprocessing()}},
\code{\link[=MFUZZclustersNumber]{MFUZZclustersNumber()}} and
\code{\link[=MFUZZanalysis]{MFUZZanalysis()}}.
}
