xchrom.tl.crosscell_aucprc

xchrom.tl.crosscell_aucprc(cell_embedding_ad: str | Path | AnnData, input_folder: str | Path = './train_data', model_path: str | Path = './train_out/E1000best_model.h5', output_path: str | Path = './eval_out', cellembed_raw: str = 'X_pca', save_pred: bool = False, scatter_plot: bool = False) → dict

Evaluate cross-cell prediction performance on within-sample data by computing auROC and auPRC overall, per cell, and per peak.

Parameters:
  • cell_embedding_ad (str or Path or anndata.AnnData) – Path to the cell embeddings AnnData file, or an AnnData object. Provides the input cell embeddings for XChrom model prediction.

  • input_folder (str or Path) – Path to the training data folder, which should be generated by XChrom_preprocess.py. Must contain ‘splits.h5’, ‘ad_crosscell.h5ad’, ‘m_crosscell.npz’, ‘trainval_seqs.h5’.

  • model_path (str or Path) – Path to the trained model.

  • output_path (str or Path) – Path to the output folder.

  • cellembed_raw (str) – Key of the raw cell embeddings in the cell embeddings adata.

  • save_pred (bool) – Whether to save the prediction matrix in npy format.

  • scatter_plot (bool) – Whether to draw scatter plots of the per-cell and per-peak auROC and auPRC.
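Because input_folder must contain four specific files, it can be useful to verify them before starting an evaluation run. The following is a minimal sketch using only the standard library; the helper name missing_inputs is hypothetical and not part of XChrom, and only the file names are taken from the parameter description above.

```python
from pathlib import Path
from typing import List, Union

# File names listed in the input_folder parameter description.
REQUIRED_FILES = (
    "splits.h5",
    "ad_crosscell.h5ad",
    "m_crosscell.npz",
    "trainval_seqs.h5",
)


def missing_inputs(input_folder: Union[str, Path] = "./train_data") -> List[str]:
    """Return the required file names that are absent from input_folder.

    Hypothetical helper for illustration; XChrom may perform its own checks.
    """
    folder = Path(input_folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

An empty list means all four files are present; otherwise the returned names indicate what the preprocessing step still needs to produce.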

Returns:

Dictionary containing:

  • ‘overall_auroc’: overall auROC

  • ‘overall_auprc’: overall auPRC

  • ‘percell_auroc’: per-cell auROC

  • ‘percell_auprc’: per-cell auPRC

  • ‘perpeak_auroc’: per-peak auROC

  • ‘perpeak_auprc’: per-peak auPRC

Return type:

dict
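To make the three evaluation levels concrete, here is a dependency-free sketch of how overall, per-cell, and per-peak auROC and auPRC could be computed from a binary accessibility matrix and a matrix of predicted scores. This is not XChrom's implementation: the function names, the cells-by-peaks orientation, and the use of average precision as the auPRC estimate are all assumptions made for illustration. In practice, scikit-learn's roc_auc_score and average_precision_score would typically be used instead of the hand-written helpers.

```python
from typing import Dict, List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when only one class is present
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auprc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: precision at each threshold weighted by recall gain."""
    n_pos = sum(labels)
    if n_pos == 0:
        return float("nan")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = seen = 0
    prev_recall = ap = 0.0
    i = 0
    while i < len(order):
        # Consume every item tied at this score before evaluating the threshold.
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            tp += labels[order[j]]
            seen += 1
            j += 1
        recall = tp / n_pos
        ap += (recall - prev_recall) * (tp / seen)
        prev_recall = recall
        i = j
    return ap


def evaluate(truth: List[List[int]], pred: List[List[float]]) -> Dict[str, object]:
    """truth and pred are assumed to be cells x peaks nested lists."""
    cols_t = [list(c) for c in zip(*truth)]  # one column per peak
    cols_p = [list(c) for c in zip(*pred)]
    flat_t = [v for row in truth for v in row]
    flat_p = [v for row in pred for v in row]
    return {
        "overall_auroc": auroc(flat_t, flat_p),
        "overall_auprc": auprc(flat_t, flat_p),
        "percell_auroc": [auroc(t, p) for t, p in zip(truth, pred)],
        "percell_auprc": [auprc(t, p) for t, p in zip(truth, pred)],
        "perpeak_auroc": [auroc(t, p) for t, p in zip(cols_t, cols_p)],
        "perpeak_auprc": [auprc(t, p) for t, p in zip(cols_t, cols_p)],
    }
```

The overall metrics pool every cell-peak pair, the per-cell metrics score each row separately, and the per-peak metrics score each column separately. Rows or columns that contain only one class yield NaN because the metric is undefined there.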

Examples

>>> import xchrom as xc
>>> metrics1 = xc.tl.crosscell_aucprc(
...     cell_embedding_ad='./data/1_within_sample/m_brain_paired_rna.h5ad',
...     input_folder='./data/1_within_sample/train_data',
...     model_path='./data/1_within_sample/train_out/E1000best_model.h5',
...     output_path='./data/1_within_sample/eval_out',
...     cellembed_raw='X_pca',
...     save_pred=True,
...     scatter_plot=True
... )
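Once the metrics dictionary is returned, a compact summary is often handy. The sketch below is an illustration only: the summarize helper is hypothetical, and it assumes that each value is either a single number or a sequence of numbers (the documentation above does not specify the value types), so it handles both.

```python
from statistics import mean
from typing import Any, Dict


def summarize(metrics: Dict[str, Any]) -> Dict[str, float]:
    """Reduce each metric to one number: the value itself or the mean of a sequence."""
    summary = {}
    for key, value in metrics.items():
        try:
            values = [float(v) for v in value]  # sequence-like value
        except TypeError:
            values = [float(value)]  # scalar value
        values = [v for v in values if v == v]  # drop NaN entries
        summary[key] = mean(values) if values else float("nan")
    return summary
```

For example, summarize(metrics1) would give one number per key, which is convenient for logging or for comparing runs.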