Generate a complete ChromSCape analysis
generate_analysis(
  input_data_folder,
  analysis_name = "Analysis_1",
  output_directory = "./",
  input_data_type = c("scBED", "DenseMatrix", "SparseMatrix", "scBAM")[1],
  feature_count_on = c("bins", "genebody", "peaks")[1],
  feature_count_parameter = 50000,
  ref_genome = c("hg38", "mm10")[1],
  run = c("filter", "CNA", "cluster", "consensus", "peak_call", "coverage", "DA", "GSA", "report")[c(1, 3, 6, 7, 8, 9)],
  min_reads_per_cell = 1000,
  max_quantile_read_per_cell = 99,
  n_top_features = 40000,
  norm_type = "CPM",
  subsample_n = NULL,
  exclude_regions = NULL,
  n_clust = NULL,
  corr_threshold = 99,
  percent_correlation = 1,
  maxK = 10,
  qval.th = 0.1,
  logFC.th = 1,
  enrichment_qval = 0.1,
  doBatchCorr = FALSE,
  batch_sels = NULL,
  control_samples_CNA = NULL,
  genes_to_plot = c("Krt8", "Krt5", "Tgfb1", "Foxq1", "Cdkn2b", "Cdkn2a", "chr7:15000000-20000000")
)
| Argument | Description |
|---|---|
| input_data_folder | Directory containing the input data. |
| analysis_name | Name given to the analysis. |
| output_directory | Directory where to create the analysis and the HTML report. |
| input_data_type | The type of input data: one of 'scBED', 'DenseMatrix', 'SparseMatrix' or 'scBAM'. |
| feature_count_on | For raw data types, the features on which to count: one of 'bins', 'genebody' or 'peaks'. |
| feature_count_parameter | Additional parameter that depends on 'feature_count_on'. For 'bins' it must be a numeric (e.g. 50000); for 'peaks' it must be a character string giving the path to a BED peak file. |
| ref_genome | The reference genome, either 'hg38' or 'mm10'. |
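| run | Which steps to run. By default runs 'filter', 'cluster', 'coverage', 'DA', 'GSA' and 'report'; 'CNA', 'consensus' and 'peak_call' are not run unless requested. Some steps are required in order to run downstream steps. |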
| min_reads_per_cell | Minimum number of reads per cell. |
| max_quantile_read_per_cell | Upper quantile of reads per cell above which cells are considered doublets. |
| n_top_features | Number of features to keep in the analysis. |
| norm_type | Normalization type. |
| subsample_n | Number of cells per condition to downsample to, mainly to improve performance. |
| exclude_regions | Path to a BED file containing CNA regions to exclude from the analysis (optional). |
| n_clust | Number of clusters; set this to force the number of clusters chosen. |
| corr_threshold | Quantile of correlation above which two cells are considered as correlated. |
| percent_correlation | Percentage of the total cells that a cell must be correlated with in order to be kept in the analysis. |
| maxK | Maximum number of clusters to test with ConsensusClusterPlus. |
| qval.th | Adjusted p-value below which to consider features differential. |
| logFC.th | Log2 fold change threshold: a feature is considered enriched above logFC.th and depleted below -logFC.th. |
| enrichment_qval | Adjusted p-value below which to consider a gene set as significantly enriched in differential features. |
| doBatchCorr | Logical indicating if batch correction using fastMNN should be run. |
| batch_sels | If doBatchCorr is TRUE, a named list containing the samples in each batch. |
| control_samples_CNA | If running CopyNumber Analysis, a character vector of the sample names that are 'normal'. |
| genes_to_plot | A character vector of genes of interest, or genomic regions such as "chr7:15000000-20000000", for which to plot the coverage. |
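
The default value of `run` is written as an indexed vector, which can be hard to read at a glance. The short base R sketch below only reproduces the indexing from the usage above to show which steps are selected by default and which are skipped; it does not call ChromSCape, and the variable names `all_steps`, `default_run` and `skipped` are illustrative.

```r
# All steps listed in the usage of generate_analysis()
all_steps <- c("filter", "CNA", "cluster", "consensus", "peak_call",
               "coverage", "DA", "GSA", "report")

# Default selection, exactly as indexed in the usage: positions 1, 3, 6, 7, 8, 9
default_run <- all_steps[c(1, 3, 6, 7, 8, 9)]
print(default_run)

# Steps that are not run unless explicitly requested
skipped <- setdiff(all_steps, default_run)
print(skipped)
```

To include one of the skipped steps, pass the step names explicitly, for example `run = c("filter", "CNA", "cluster", "coverage", "report")`, keeping in mind that some steps are required by downstream ones.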
Creates a ChromSCape-readable directory and saved objects, as well as a multi-tabbed HTML report summarizing the analysis.
if (FALSE) {
  generate_analysis("/path/to/data/", "Analysis_1")
}
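
A more customized call might look like the sketch below, which assembles the arguments in a list first so their names can be checked against the documented ones before anything is run. The paths, the analysis name, the sample names and the batch names are all hypothetical, and the sample names in `batch_sels` are assumed to match samples present in the input data. As in the example above, the actual call is guarded by `if (FALSE)` so that nothing is executed.

```r
# Argument names as listed in the usage of generate_analysis()
documented_args <- c(
  "input_data_folder", "analysis_name", "output_directory", "input_data_type",
  "feature_count_on", "feature_count_parameter", "ref_genome", "run",
  "min_reads_per_cell", "max_quantile_read_per_cell", "n_top_features",
  "norm_type", "subsample_n", "exclude_regions", "n_clust", "corr_threshold",
  "percent_correlation", "maxK", "qval.th", "logFC.th", "enrichment_qval",
  "doBatchCorr", "batch_sels", "control_samples_CNA", "genes_to_plot"
)

# Hypothetical settings, for illustration only
custom_args <- list(
  input_data_folder = "/path/to/data/",
  analysis_name = "Analysis_batch_corrected",
  output_directory = "./",
  input_data_type = "scBED",
  feature_count_on = "bins",
  feature_count_parameter = 50000,   # numeric, as required for 'bins'
  ref_genome = "mm10",
  run = c("filter", "cluster", "coverage", "DA", "GSA", "report"),
  doBatchCorr = TRUE,
  # Named list: one element per batch, each holding that batch's sample names
  batch_sels = list(
    batch_1 = c("sample_A", "sample_B"),
    batch_2 = c("sample_C", "sample_D")
  ),
  genes_to_plot = c("Krt8", "Krt5")
)

# Catch misspelled argument names before launching a long analysis
unknown_args <- setdiff(names(custom_args), documented_args)
stopifnot(length(unknown_args) == 0)

if (FALSE) {
  do.call(ChromSCape::generate_analysis, custom_args)
}
```

Building the list first is only a convenience: the same arguments can be passed directly to `generate_analysis()`.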