# Chimeric mouse embryo (10X Genomics)

<script>
document.addEventListener("click", function (event) {
    if (event.target.classList.contains("rebook-collapse")) {
        event.target.classList.toggle("active");
        var content = event.target.nextElementSibling;
        if (content.style.display === "block") {
            content.style.display = "none";
        } else {
            content.style.display = "block";
        }
    }
})
</script>

<style>
.rebook-collapse {
  background-color: #eee;
  color: #444;
  cursor: pointer;
  padding: 18px;
  width: 100%;
  border: none;
  text-align: left;
  outline: none;
  font-size: 15px;
}

.rebook-content {
  padding: 0 18px;
  display: none;
  overflow: hidden;
  background-color: #f1f1f1;
}
</style>

## Introduction

We perform an analysis of the @pijuansala2019single dataset on mouse gastrulation.
Here, we examine chimeric embryos at the E8.5 stage of development,
generated by injecting td-Tomato-positive embryonic stem cells (ESCs) into a wild-type blastocyst.

## Data loading


``` r
library(MouseGastrulationData)
sce.chimera <- WTChimeraData(samples=5:10)
sce.chimera
```

```
## class: SingleCellExperiment 
## dim: 29453 20935 
## metadata(0):
## assays(1): counts
## rownames(29453): ENSMUSG00000051951 ENSMUSG00000089699 ...
##   ENSMUSG00000095742 tomato-td
## rowData names(2): ENSEMBL SYMBOL
## colnames(20935): cell_9769 cell_9770 ... cell_30702 cell_30703
## colData names(11): cell barcode ... doub.density sizeFactor
## reducedDimNames(2): pca.corrected.E7.5 pca.corrected.E8.5
## mainExpName: NULL
## altExpNames(0):
```

We replace the Ensembl identifiers in the row names with gene symbols where these are available, for easier interpretation.




``` r
library(scater)
rownames(sce.chimera) <- uniquifyFeatureNames(
    rowData(sce.chimera)$ENSEMBL, rowData(sce.chimera)$SYMBOL)
```
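
The call above swaps the Ensembl identifiers for gene symbols while keeping the row names unique.
A rough base-R sketch of the general idea is shown below.
The function `uniquify_sketch()` and the toy identifiers are hypothetical and only approximate what the package function does: fall back to the identifier when the symbol is missing, and append the identifier when a symbol is shared by several rows.

``` r
# Hypothetical helper, not the package implementation.
uniquify_sketch <- function(ids, symbols) {
    out <- symbols
    missing <- is.na(symbols)
    out[missing] <- ids[missing]              # no symbol: use the identifier
    dup <- out %in% out[duplicated(out)]      # symbols shared by several rows
    fix <- dup & !missing
    out[fix] <- paste0(symbols[fix], "_", ids[fix])
    out
}

toy.ids <- c("ENS1", "ENS2", "ENS3", "ENS4")
toy.symbols <- c("Gene1", "Gene2", "Gene2", NA)
toy.names <- uniquify_sketch(toy.ids, toy.symbols)
toy.names
```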

## Quality control

Quality control on the cells has already been performed by the authors, so we will not repeat it here.
We additionally remove cells that are labelled as stripped nuclei or doublets.


``` r
drop <- sce.chimera$celltype.mapped %in% c("stripped", "Doublet")
sce.chimera <- sce.chimera[,!drop]
```

## Normalization

We use the pre-computed size factors stored in `sce.chimera`, which `logNormCounts()` uses by default.


``` r
sce.chimera <- logNormCounts(sce.chimera)
```
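
As a minimal sketch of what log-normalization computes, the toy example below reproduces the calculation in base R.
It assumes the default behaviour of dividing each cell's counts by its unit-mean size factor, adding a pseudo-count of 1 and taking the base-2 logarithm.
The count matrix and size factors are made up for illustration.

``` r
# Toy data: 3 genes by 2 cells, with one size factor per cell.
toy.counts <- matrix(c(0, 4, 10, 2, 8, 20), nrow=3,
    dimnames=list(paste0("gene", 1:3), c("cellA", "cellB")))
toy.sf <- c(cellA=0.5, cellB=1.5)

# Centre the size factors so that their mean across cells is 1.
toy.sf <- toy.sf / mean(toy.sf)

# Divide each column (cell) by its size factor, then log2-transform
# with a pseudo-count of 1.
toy.logcounts <- log2(sweep(toy.counts, 2, toy.sf, "/") + 1)
toy.logcounts
```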

## Variance modelling

We retain all genes with any positive biological component, to preserve as much signal as possible across a very heterogeneous dataset.


``` r
library(scran)
dec.chimera <- modelGeneVar(sce.chimera, block=sce.chimera$sample)
chosen.hvgs <- dec.chimera$bio > 0
```
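
The selection rule can be illustrated on a toy table, assuming that the biological component is the total variance minus the technical component estimated from the trend.
The values in `dec.toy` are invented, and only genes with a strictly positive biological component are kept.

``` r
# Toy variance decomposition for four hypothetical genes.
dec.toy <- data.frame(
    row.names=paste0("gene", 1:4),
    total=c(0.50, 0.20, 1.10, 0.05),
    tech=c(0.30, 0.25, 0.40, 0.05)
)
dec.toy$bio <- dec.toy$total - dec.toy$tech

# Keep genes whose variance exceeds the technical expectation.
keep.toy <- dec.toy$bio > 0
rownames(dec.toy)[keep.toy]
```

This is a permissive threshold: any gene that sits above the trend is retained, however small the excess.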


``` r
# One panel per sample (block), two panels per figure.
par(mfrow=c(1,2))
blocked.stats <- dec.chimera$per.block
for (i in colnames(blocked.stats)) {
    current <- blocked.stats[[i]]
    plot(current$mean, current$total, main=i, pch=16, cex=0.5,
        xlab="Mean of log-expression", ylab="Variance of log-expression")
    curfit <- metadata(current)
    curve(curfit$trend(x), col='dodgerblue', add=TRUE, lwd=2)
}
```

<div class="figure">
<img src="pijuan-embryo_files/figure-html/unref-pijuan-var-1.png" alt="Per-gene variance as a function of the mean for the log-expression values in the Pijuan-Sala chimeric mouse embryo dataset. Each point represents a gene (black) with the mean-variance trend (blue) fitted to the variances." width="672" />
<p class="caption">(\#fig:unref-pijuan-var-1)Per-gene variance as a function of the mean for the log-expression values in the Pijuan-Sala chimeric mouse embryo dataset. Each point represents a gene (black) with the mean-variance trend (blue) fitted to the variances.</p>
</div><div class="figure">
<img src="pijuan-embryo_files/figure-html/unref-pijuan-var-2.png" alt="Per-gene variance as a function of the mean for the log-expression values in the Pijuan-Sala chimeric mouse embryo dataset. Each point represents a gene (black) with the mean-variance trend (blue) fitted to the variances." width="672" />
<p class="caption">(\#fig:unref-pijuan-var-2)Per-gene variance as a function of the mean for the log-expression values in the Pijuan-Sala chimeric mouse embryo dataset. Each point represents a gene (black) with the mean-variance trend (blue) fitted to the variances.</p>
</div><div class="figure">
<img src="pijuan-embryo_files/figure-html/unref-pijuan-var-3.png" alt="Per-gene variance as a function of the mean for the log-expression values in the Pijuan-Sala chimeric mouse embryo dataset. Each point represents a gene (black) with the mean-variance trend (blue) fitted to the variances." width="672" />
<p class="caption">(\#fig:unref-pijuan-var-3)Per-gene variance as a function of the mean for the log-expression values in the Pijuan-Sala chimeric mouse embryo dataset. Each point represents a gene (black) with the mean-variance trend (blue) fitted to the variances.</p>
</div>

## Merging

We use a hierarchical merge to first merge together replicates with the same genotype, 
and then merge samples across different genotypes.


``` r
library(batchelor)
set.seed(01001001)
merged <- correctExperiments(sce.chimera, 
    batch=sce.chimera$sample, 
    subset.row=chosen.hvgs,
    PARAM=FastMnnParam(
        merge.order=list(
            list(1,3,5), # WT (3 replicates)
            list(2,4,6)  # td-Tomato (3 replicates)
        )
    )
)
```
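
To see which samples each group in `merge.order` refers to, the sketch below maps the integer indices back to sample identifiers.
It assumes that the indices refer to batches in the order of the sorted unique values of `batch`, which is worth confirming against the package documentation.
The names `toy.samples` and `merge.groups` are introduced here for illustration only.

``` r
# Sample identifiers loaded in this chapter (samples=5:10).
toy.samples <- 5:10

# Assumed ordering of batches: sorted unique values of the batch variable.
batch.levels <- sort(unique(toy.samples))

# The two groups of indices used in the hierarchical merge.
merge.groups <- list(c(1, 3, 5), c(2, 4, 6))
merged.samples <- lapply(merge.groups, function(idx) batch.levels[idx])
merged.samples
```

A lookup of this kind can be compared against the sample metadata to check that each group contains the intended replicates.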

We use the proportion of variance lost in each sample at each merge step as a diagnostic:


``` r
metadata(merged)$merge.info$lost.var
```

```
##              5         6         7         8        9       10
## [1,] 0.000e+00 0.0204238 0.000e+00 0.0169321 0.000000 0.000000
## [2,] 0.000e+00 0.0007403 0.000e+00 0.0004431 0.000000 0.015455
## [3,] 3.089e-02 0.0000000 2.012e-02 0.0000000 0.000000 0.000000
## [4,] 9.042e-05 0.0000000 8.298e-05 0.0000000 0.018044 0.000000
## [5,] 4.318e-03 0.0072489 4.123e-03 0.0078254 0.003827 0.007779
```
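
The values printed above are proportions, so 0.03 corresponds to 3% of the variance.
As a hedged sketch of how the matrix might be summarised, the code below re-enters the printed values by hand and reports the largest loss per sample.
The 10% cut-off in `threshold` is an assumed rule of thumb for illustration, not a value taken from this chapter.

``` r
# Values copied from the output above (rows: merge steps, columns: samples).
lost.var <- rbind(
    c(0,         0.0204238, 0,         0.0169321, 0,        0),
    c(0,         0.0007403, 0,         0.0004431, 0,        0.015455),
    c(3.089e-02, 0,         2.012e-02, 0,         0,        0),
    c(9.042e-05, 0,         8.298e-05, 0,         0.018044, 0),
    c(4.318e-03, 0.0072489, 4.123e-03, 0.0078254, 0.003827, 0.007779)
)
colnames(lost.var) <- as.character(5:10)

# Largest loss for each sample across all merge steps, as a percentage.
max.lost <- apply(lost.var, 2, max)
round(100 * max.lost, 2)

# Assumed rule of thumb: flag any sample losing 10% or more at some step.
threshold <- 0.1
all(max.lost < threshold)
```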

## Clustering


``` r
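# Build a shared nearest-neighbor graph on the MNN-corrected coordinates,
# then detect communities with the Louvain algorithm.
g <- buildSNNGraph(merged, use.dimred="corrected")
clusters <- igraph::cluster_louvain(g)
colLabels(merged) <- factor(clusters$membership)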
```

We examine the distribution of cells across clusters and samples.


``` r
table(Cluster=colLabels(merged), Sample=merged$sample)
```

```
##        Sample
## Cluster   5   6   7   8   9  10
##      1   87  20  62  53 151  73
##      2  146  37 132 110 231 215
##      3   96  16 162 124 364 272
##      4  127  97 185 433 362 457
##      5  103  41 284 362 145 195
##      6  207  52 344 208 556 646
##      7  153  73  86  89 166 383
##      8  130  97 110  63 159 311
##      9   82  20  75  33 165 203
##      10  97  19  36  18  50  35
##      11 114  43  43  38  39 154
##      12 122  64  62  51  63 139
##      13 150  75 128  99 134 391
##      14 110  69  73  96 127 255
##      15  99  54 195 408 255 682
##      16  42  34  81  80  85 355
##      17 180  47 225 182 212 384
##      18  74  38 179 106 318 458
##      19  50  27  94  62  98 158
##      20  39  41  50  49 130 127
##      21   1   5   0  84   0  66
##      22  17   7  13  17  20  37
##      23  50  24  76  63  75 179
##      24   9   7  18  13  30  27
##      25  11  16  20   9  47  57
##      26   2   1   7   3  75 137
##      27   0   2   0  51   0   5
```
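
Clusters 21 and 27 in the table above contain almost no cells from samples 5, 7 and 9.
One way to make such imbalances easier to spot is to convert the counts to within-cluster proportions.
The sketch below does this in base R for three rows copied by hand from the table; `tab` and `frac.subset` are names introduced here for illustration.

``` r
# Rows for clusters 1, 21 and 27, copied from the table above.
tab <- rbind(
    "1"=c(87, 20, 62, 53, 151, 73),
    "21"=c(1, 5, 0, 84, 0, 66),
    "27"=c(0, 2, 0, 51, 0, 5)
)
colnames(tab) <- as.character(5:10)

# Proportion of each cluster's cells contributed by each sample.
props <- prop.table(tab, margin=1)

# Fraction of each cluster drawn from samples 6, 8 and 10.
frac.subset <- rowSums(props[, c("6", "8", "10")])
round(frac.subset, 3)
```

Clusters drawn almost entirely from a subset of samples may merit a closer look before interpretation.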

## Dimensionality reduction

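We set `external_neighbors=TRUE` so that the nearest-neighbor search is performed by an external algorithm, rather than inside the $t$-SNE and UMAP implementations, for greater speed.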


``` r
merged <- runTSNE(merged, dimred="corrected", external_neighbors=TRUE)
merged <- runUMAP(merged, dimred="corrected", external_neighbors=TRUE)
```


``` r
gridExtra::grid.arrange(
    plotTSNE(merged, colour_by="label", text_by="label", text_colour="red"),
    plotTSNE(merged, colour_by="batch")
)
```

<div class="figure">
<img src="pijuan-embryo_files/figure-html/unref-pijuan-tsne-1.png" alt="Obligatory $t$-SNE plots of the Pijuan-Sala chimeric mouse embryo dataset, where each point represents a cell and is colored according to the assigned cluster (top) or sample of origin (bottom)." width="672" />
<p class="caption">(\#fig:unref-pijuan-tsne)Obligatory $t$-SNE plots of the Pijuan-Sala chimeric mouse embryo dataset, where each point represents a cell and is colored according to the assigned cluster (top) or sample of origin (bottom).</p>
</div>

## Session Info {-}

<button class="rebook-collapse">View session info</button>
<div class="rebook-content">
```
R Under development (unstable) (2025-10-20 r88955)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.3 LTS

Matrix products: default
BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB              LC_COLLATE=C              
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] batchelor_1.27.0             scran_1.39.0                
 [3] scater_1.39.0                ggplot2_4.0.1               
 [5] scuttle_1.21.0               MouseGastrulationData_1.25.0
 [7] SpatialExperiment_1.21.0     SingleCellExperiment_1.33.0 
 [9] SummarizedExperiment_1.41.0  Biobase_2.71.0              
[11] GenomicRanges_1.63.1         Seqinfo_1.1.0               
[13] IRanges_2.45.0               S4Vectors_0.49.0            
[15] BiocGenerics_0.57.0          generics_0.1.4              
[17] MatrixGenerics_1.23.0        matrixStats_1.5.0           
[19] BiocStyle_2.39.0             rebook_1.21.0               

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3        jsonlite_2.0.0           
  [3] CodeDepends_0.6.6         magrittr_2.0.4           
  [5] ggbeeswarm_0.7.3          magick_2.9.0             
  [7] farver_2.1.2              rmarkdown_2.30           
  [9] vctrs_0.6.5               memoise_2.0.1            
 [11] DelayedMatrixStats_1.33.0 htmltools_0.5.9          
 [13] S4Arrays_1.11.1           AnnotationHub_4.1.0      
 [15] curl_7.0.0                BiocNeighbors_2.5.0      
 [17] SparseArray_1.11.9        sass_0.4.10              
 [19] bslib_0.9.0               httr2_1.2.2              
 [21] cachem_1.1.0              ResidualMatrix_1.21.0    
 [23] igraph_2.2.1              lifecycle_1.0.4          
 [25] pkgconfig_2.0.3           rsvd_1.0.5               
 [27] Matrix_1.7-4              R6_2.6.1                 
 [29] fastmap_1.2.0             digest_0.6.39            
 [31] AnnotationDbi_1.73.0      dqrng_0.4.1              
 [33] irlba_2.3.5.1             ExperimentHub_3.1.0      
 [35] RSQLite_2.4.5             beachmat_2.27.0          
 [37] filelock_1.0.3            labeling_0.4.3           
 [39] httr_1.4.7                abind_1.4-8              
 [41] compiler_4.6.0            bit64_4.6.0-1            
 [43] withr_3.0.2               S7_0.2.1                 
 [45] BiocParallel_1.45.0       viridis_0.6.5            
 [47] DBI_1.2.3                 rappdirs_0.3.3           
 [49] DelayedArray_0.37.0       rjson_0.2.23             
 [51] bluster_1.21.0            tools_4.6.0              
 [53] vipor_0.4.7               otel_0.2.0               
 [55] beeswarm_0.4.0            glue_1.8.0               
 [57] grid_4.6.0                Rtsne_0.17               
 [59] cluster_2.1.8.1           gtable_0.3.6             
 [61] BiocSingular_1.27.1       ScaledMatrix_1.19.0      
 [63] metapod_1.19.1            XVector_0.51.0           
 [65] ggrepel_0.9.6             BiocVersion_3.23.1       
 [67] pillar_1.11.1             limma_3.67.0             
 [69] BumpyMatrix_1.19.0        dplyr_1.1.4              
 [71] BiocFileCache_3.1.0       lattice_0.22-7           
 [73] bit_4.6.0                 tidyselect_1.2.1         
 [75] locfit_1.5-9.12           Biostrings_2.79.2        
 [77] knitr_1.50                gridExtra_2.3            
 [79] bookdown_0.46             edgeR_4.9.1              
 [81] xfun_0.54                 statmod_1.5.1            
 [83] yaml_2.3.12               evaluate_1.0.5           
 [85] codetools_0.2-20          tibble_3.3.0             
 [87] BiocManager_1.30.27       graph_1.89.1             
 [89] cli_3.6.5                 uwot_0.2.4               
 [91] jquerylib_0.1.4           dichromat_2.0-0.1        
 [93] Rcpp_1.1.0.8.1            dir.expiry_1.19.0        
 [95] dbplyr_2.5.1              png_0.1-8                
 [97] XML_3.99-0.20             parallel_4.6.0           
 [99] blob_1.2.4                sparseMatrixStats_1.23.0 
[101] viridisLite_0.4.2         scales_1.4.0             
[103] purrr_1.2.0               crayon_1.5.3             
[105] rlang_1.1.6               cowplot_1.2.0            
[107] KEGGREST_1.51.1          
```
</div>
