xchrom.tl.calc_nsls_score

xchrom.tl.calc_nsls_score(ad_rna: AnnData, ad_atac: AnnData, n: int = 100, label: str = 'celltype', test_cells: list | ndarray | None = None, use_rep_rna: str = 'X_pca', use_rep_atac: str = 'X_pca')[source]

Calculate clustering metrics for scATAC data: the ratio of shared neighbors and the ratio of shared labels.

Parameters:
  • ad_rna (anndata.AnnData) – scRNA-seq data, used to calculate scRNA cell neighborhoods. It should already have been processed with scanpy or a similar tool, and must contain a cell representation computed from the scRNA-seq data, such as ‘X_pca’.

  • ad_atac (anndata.AnnData) – scATAC-seq raw or predicted data, used to calculate scATAC cell neighborhoods. Must contain cell type labels, or clustering results from paired scRNA-seq data, to calculate label scores.

  • n (int) – The number of neighbors per cell. Different scales can be evaluated, such as 100, 50, or 10.

  • label (str) – The key of the cell type labels. Default is ‘celltype’; clustering results from scRNA-seq data, such as ‘leiden’, can also be used. The labels must be assigned to ad_atac.obs[label].

  • test_cells (list | ndarray | None) – The cells to compute scores for. If None, all cells are used.

  • use_rep_rna (str) – The key of the scRNA dimension reduction, default is ‘X_pca’. It is used to compute the scRNA neighbor list, which is treated as the genuine neighbor relationship.

  • use_rep_atac (str) – The key of the scATAC dimension reduction, default is ‘X_pca’ (for example, from scanpy.tl.pca). It is used to compute the scATAC neighbor list.

Returns:

  • the ratio of shared neighbors (float) – The number of shared neighbors divided by the number of neighbors.

  • the ratio of shared labels (float) – The number of shared labels divided by the number of neighbors.

Data Requirements

  • ad_rna.obsm must contain

    • Cell dimension reduction (default: ‘X_pca’) for computing neighborhoods from scRNA-seq data

  • ad_atac.obsm must contain

    • Cell dimension reduction (default: ‘X_pca’) for computing neighborhoods from scATAC-seq data

  • ad_atac.obs must contain

    • Cell type labels (default: ‘celltype’) for label consistency evaluation, which can be true cell type labels or paired scRNA-seq clustering results
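
To make the two ratios concrete, here is a minimal NumPy sketch of one way such scores could be computed. It is not xchrom's implementation. The helper names knn_indices and sketch_nsls are hypothetical, and the choices of Euclidean distance, excluding a cell from its own neighbors, comparing neighbor labels against the cell's own label, and averaging per-cell ratios are all assumptions made for illustration.

```python
import numpy as np


def knn_indices(embedding: np.ndarray, n: int) -> np.ndarray:
    """Return the indices of the n nearest neighbors of every row (self excluded)."""
    # Brute-force pairwise squared Euclidean distances; only suitable for small inputs.
    diff = embedding[:, None, :] - embedding[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # assumption: a cell is not its own neighbor
    return np.argsort(dist, axis=1)[:, :n]


def sketch_nsls(rna_emb, atac_emb, labels, n):
    """Hypothetical helper: mean shared-neighbor and shared-label ratios."""
    labels = np.asarray(labels)
    rna_nn = knn_indices(np.asarray(rna_emb, dtype=float), n)
    atac_nn = knn_indices(np.asarray(atac_emb, dtype=float), n)
    # Shared neighbors: overlap between a cell's RNA and ATAC neighbor sets, divided by n.
    ns_per_cell = [len(set(r) & set(a)) / n for r, a in zip(rna_nn, atac_nn)]
    # Shared labels (assumed definition): fraction of a cell's ATAC neighbors
    # that carry the same label as the cell itself.
    ls_per_cell = (labels[atac_nn] == labels[:, None]).mean(axis=1)
    return float(np.mean(ns_per_cell)), float(np.mean(ls_per_cell))


# Toy data: stand-ins for ad_rna.obsm['X_pca'], ad_atac.obsm['X_pca'], ad_atac.obs['celltype'].
rng = np.random.default_rng(0)
rna = rng.normal(size=(30, 5))
cell_labels = np.repeat(["A", "B", "C"], 10)

# Identical embeddings share every neighbor, so the neighbor ratio is 1.
ns_same, ls_same = sketch_nsls(rna, rna, cell_labels, n=5)
```

Under these assumptions both values lie between 0 and 1, and a higher shared-neighbor ratio means the scATAC embedding reproduces more of the scRNA neighborhood structure.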

Examples

>>> # Calculate the cluster metrics of scATAC data, including the number of shared neighbors and labels.
>>> ns, ls = calc_nsls_score(ad_rna, ad_atac, n=100, label='celltype', test_cells=None, use_rep_rna='X_pca', use_rep_atac='X_pca')
>>> print(f"Ratio of shared neighbors: {ns:.4f}, ratio of shared labels: {ls:.4f}")