% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tr2g.R
\name{tr2g_TxDb}
\alias{tr2g_TxDb}
\title{Get transcript and gene info from TxDb objects}
\usage{
tr2g_TxDb(
  txdb,
  Genome = NULL,
  get_transcriptome = TRUE,
  out_path = ".",
  write_tr2g = TRUE,
  chrs_only = TRUE,
  compress_fa = FALSE,
  overwrite = FALSE
)
}
\arguments{
\item{txdb}{A \code{\link{TxDb}} object with gene annotation.}

\item{Genome}{Either a \code{\link{BSgenome}} or a \code{\link{XStringSet}}
object of genomic sequences, from which the transcript sequences will be
extracted. Use \code{\link{genomeStyles}} to check which styles are supported
for your organism of interest; supported styles can be interconverted. If the
style in your genome or annotation is not supported, then the chromosome names
in the genome and annotation should be manually made consistent with each
other.}

\item{get_transcriptome}{Logical, whether to extract the transcriptome from
the genome using the gene annotation. If the annotation is filtered, for
instance by chromosome, then the filtered \code{GRanges} will be used to
extract the transcriptome.}

\item{out_path}{Directory to save the outputs written to disk. If this
directory does not exist, then it will be created. Defaults to the current
working directory.}

\item{write_tr2g}{Logical, whether to write tr2g to disk. If \code{TRUE}, then
a file \code{tr2g.tsv} will be written into \code{out_path}.}

\item{chrs_only}{Logical, whether to include chromosomes only. GTF and GFF
files can contain annotations for scaffolds, which are not incorporated into
chromosomes; these are excluded, as are haplotypes. Defaults to \code{TRUE}.
Only applicable to species found in \code{genomeStyles()}.}

\item{compress_fa}{Logical, whether to compress the output fasta file. If
\code{TRUE}, then the fasta file will be gzipped.}

\item{overwrite}{Logical, whether to overwrite existing files that have the
same names as the outputs written to disk.}
}
\value{
A data frame with 3 columns: \code{gene} for gene ID, \code{transcript}
for transcript ID, and \code{tx_id} for internal transcript IDs used to avoid
duplicate transcript names. For TxDb packages from Bioconductor, gene IDs are
Entrez IDs; for \code{TxDb.Hsapiens.UCSC.hg38.knownGene}, transcript IDs are
Ensembl IDs with version numbers. In some cases, the transcript IDs contain
duplicates, which are resolved by appending numbers to make the IDs unique.

}
\description{
The genome and gene annotations of some species can be obtained from
Bioconductor packages, which is more convenient than downloading GTF files
from Ensembl and reading them into R. In these packages, the gene annotation
is stored in a \code{\link{TxDb}} object, which has standardized names for
gene IDs, transcript IDs, exon IDs, and so on. In GTF and GFF3 files, by
contrast, this information is stored in metadata fields whose names are not
standardized. This function extracts transcript and corresponding gene
information from gene annotation stored in a \code{\link{TxDb}} object.
}
\examples{
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(BSgenome.Hsapiens.UCSC.hg38)
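# With the default arguments, this writes transcriptome.fa and tr2g.tsv to
# the current working directory
tr2g_TxDb(TxDb.Hsapiens.UCSC.hg38.knownGene, BSgenome.Hsapiens.UCSC.hg38)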
# Clean up
file.remove("transcriptome.fa", "tr2g.tsv")
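# A lighter-weight sketch: return only the transcript-to-gene data frame
# without writing anything to disk. This assumes that Genome can be left at
# its NULL default when get_transcriptome = FALSE, since no sequences need to
# be extracted in that case.
tr2g_df <- tr2g_TxDb(TxDb.Hsapiens.UCSC.hg38.knownGene,
  get_transcriptome = FALSE, write_tr2g = FALSE
)
# Columns documented in the Value section: gene, transcript, tx_id
head(tr2g_df)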
}
\seealso{
Other functions to retrieve transcript and gene info: 
\code{\link{sort_tr2g}()},
\code{\link{tr2g_EnsDb}()},
\code{\link{tr2g_ensembl}()},
\code{\link{tr2g_fasta}()},
\code{\link{tr2g_gff3}()},
\code{\link{tr2g_gtf}()},
\code{\link{transcript2gene}()}
}
\concept{functions to retrieve transcript and gene info}
