xchrom.tl.denoise_nsls

xchrom.tl.denoise_nsls(cell_embedding_ad: str | Path | AnnData, input_folder: str | Path = './train_data', output_path: str | Path = './eval_out', model_path: str | Path = './train_out/E1000best_model.h5', cal_preddata: bool = True, cellembed_raw: str = 'X_pca', celltype: str = 'celltype', save_pred: bool = False, plot_umap: bool = False)

Evaluate denoising performance on within-sample data by calculating the neighbor score (k=10, 50, 100) and the label score (k=10, 50, 100) for all cells.

Parameters:
  • cell_embedding_ad (str or Path or anndata.AnnData) – Path to the cell embedding AnnData file, or an AnnData object. Provides the cell input embedding for XChrom model prediction.

  • input_folder (str or Path) – Path to the train and test data folder, which should be generated by XChrom_preprocess.py. Must contain ‘splits.h5’, ‘ad.h5ad’, and ‘all_seqs.h5’.

  • output_path (str or Path) – Path to the output folder.

  • model_path (str or Path) – Path to the trained model.

  • cal_preddata (bool) – Whether to calculate the nsls scores from the predicted data. If False, the scores are calculated from the raw ATAC data.

  • cellembed_raw (str) – Key of the raw cell input embedding in the cell embedding AnnData, used to calculate RNA neighbors.

  • save_pred (bool) – Whether to save the prediction matrix in .npy format.

  • plot_umap (bool) – Whether to plot the UMAP of the test cells.
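
Because input_folder must contain three specific files, it can be useful to check for them before starting an evaluation. The following is a minimal sketch of such a check; the helper name missing_input_files is hypothetical and is not part of the xchrom API. Only the three file names come from the parameter description above.

```python
from pathlib import Path

# File names taken from the input_folder description above.
REQUIRED_FILES = ("splits.h5", "ad.h5ad", "all_seqs.h5")


def missing_input_files(input_folder):
    """Return the required file names that are absent from input_folder.

    Hypothetical helper for illustration; not provided by xchrom.
    """
    folder = Path(input_folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


missing = missing_input_files("./train_data")
if missing:
    print("Missing files:", ", ".join(missing))
```

An empty list means the folder has the expected layout; it does not verify that the files themselves are valid.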

Returns:

metrics – Dictionary containing the neighbor score (k=10, 50, 100) and the label score (k=10, 50, 100) for all cells.

Return type:

dict
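
This page does not define the two scores, so the following is only an illustrative sketch under stated assumptions, not XChrom's implementation. It assumes the neighbor score is the mean fraction of each cell's k nearest neighbors shared between two embeddings (for example, one derived from ATAC data and one from RNA), and that the label score is the mean fraction of each cell's k nearest neighbors carrying the same cell-type label. The function names and the use of Euclidean distance are assumptions as well.

```python
import numpy as np


def knn_indices(embedding, k):
    """Indices of the k nearest neighbors of each row (Euclidean, self excluded)."""
    embedding = np.asarray(embedding, dtype=float)
    sq = np.sum(embedding ** 2, axis=1)
    # Squared pairwise distances via ||a||^2 + ||b||^2 - 2 a.b
    dist = sq[:, None] + sq[None, :] - 2.0 * embedding @ embedding.T
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def neighbor_score(embed_a, embed_b, k):
    """Assumed definition: mean fraction of k nearest neighbors shared by two embeddings."""
    nn_a = knn_indices(embed_a, k)
    nn_b = knn_indices(embed_b, k)
    shared = [len(np.intersect1d(a, b)) for a, b in zip(nn_a, nn_b)]
    return float(np.mean(shared)) / k


def label_score(embedding, labels, k):
    """Assumed definition: mean fraction of k nearest neighbors with the same label."""
    labels = np.asarray(labels)
    nn = knn_indices(embedding, k)
    same = labels[nn] == labels[:, None]
    return float(same.mean())


# Toy data: 200 cells, two embeddings, three made-up cell types.
rng = np.random.default_rng(0)
atac_embed = rng.normal(size=(200, 16))
rna_embed = atac_embed + 0.1 * rng.normal(size=(200, 16))
cell_types = rng.integers(0, 3, size=200)

for k in (10, 50, 100):
    print(k, neighbor_score(atac_embed, rna_embed, k), label_score(atac_embed, cell_types, k))
```

Under these assumptions both scores lie between 0 and 1, and higher values indicate that the evaluated embedding better preserves neighborhood structure and cell-type grouping.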

Examples

import xchrom as xc
metrics5 = xc.tl.denoise_nsls(
    cell_embedding_ad='./data/1_within_sample/m_brain_paired_rna.h5ad',
    input_folder='./data/1_within_sample/train_data',
    model_path='./data/1_within_sample/train_out/E1000best_model.h5',