xchrom.tl.crosscell_nsls
- xchrom.tl.crosscell_nsls(cell_embedding_ad: str | Path | AnnData, input_folder: str | Path = './train_data', model_path: str | Path = './train_out/E1000best_model.h5', cal_preddata: bool = True, output_path: str | Path = './eval_out', cellembed_raw: str = 'X_pca', celltype: str = 'celltype', save_pred: bool = False, plot_umap: bool = False)
Evaluate cross-cell prediction performance on within-sample data by calculating the neighbor score and label score (nsls) for the test cells. The model first predicts all data (excluding the cross-peak peaks); the test cells are then extracted to calculate the scores.
- Parameters:
cell_embedding_ad (str or Path or anndata.AnnData) – Path to the cell embeddings adata file, or the AnnData object itself. Provides the input cell embeddings for XChrom model prediction.
input_folder (str or Path) – Path to the training data folder, which should be generated by XChrom_preprocess.py. Must contain ‘splits.h5’, ‘ad_crosscell.h5ad’, ‘m_crosscell.npz’, and ‘trainval_seqs.h5’.
model_path (str or Path) – Path to the trained model.
cal_preddata (bool) – Whether to calculate the nsls scores from the predicted data. If False, the scores are calculated from the raw ATAC data instead.
output_path (str or Path) – Path to the output folder.
cellembed_raw (str) – Key of the raw cell embeddings in the cell embeddings adata, used to calculate RNA neighbors.
celltype (str) – Key of the cell type in the cell embeddings adata.
save_pred (bool) – Whether to save the prediction matrix in h5ad format.
plot_umap (bool) – Whether to plot the UMAP of the test cells.
- Returns:
metrics – Dictionary containing the neighbor score (k=10, 50, 100) and label score (k=10, 50, 100) for the test cells.
- Return type:
dict
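To make the two returned quantities concrete, the sketch below computes a neighbor score and a label score under one common definition: the neighbor score is the mean fraction of each cell's k nearest neighbors in one embedding that are also among its k nearest neighbors in a second embedding, and the label score is the mean fraction of a cell's k nearest neighbors that share its label. This is an assumption for illustration only and is not confirmed to match XChrom's internal implementation; the helper names `knn_indices`, `neighbor_score`, and `label_score` are hypothetical.

```python
import numpy as np


def knn_indices(X, k):
    """Return the indices of the k nearest neighbors (Euclidean) of each row, excluding the row itself."""
    X = np.asarray(X, dtype=float)
    # Dense pairwise squared distances; fine for a small illustration.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a cell is never its own neighbor
    return np.argsort(d2, axis=1, kind="stable")[:, :k]


def neighbor_score(emb_a, emb_b, k):
    """Mean fraction of k nearest neighbors shared between two embeddings (assumed definition)."""
    nn_a = knn_indices(emb_a, k)
    nn_b = knn_indices(emb_b, k)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlap))


def label_score(emb, labels, k):
    """Mean fraction of k nearest neighbors carrying the same label as the cell (assumed definition)."""
    labels = np.asarray(labels)
    nn = knn_indices(emb, k)
    same = labels[nn] == labels[:, None]
    return float(same.mean())


# Toy data: two well-separated groups of three cells each.
emb = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels = ["a", "a", "a", "b", "b", "b"]
print(neighbor_score(emb, emb, k=2), label_score(emb, labels, k=2))
```

Under this definition both scores lie in [0, 1], and higher values indicate that the compared neighborhoods (or labels) agree more closely.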
Examples
>>> import xchrom as xc
>>> metrics4 = xc.tl.crosscell_nsls(
...     cell_embedding_ad='./data/1_within_sample/m_brain_paired_rna.h5ad',
...     input_folder='./data/1_within_sample/train_data',
...     model_path='./data/1_within_sample/train_out/E1000best_model.h5',
...     output_path='./data/1_within_sample/eval_out',
...     cellembed_raw='X_pca',
...     celltype='pc32_leiden',
...     save_pred=True,
...     plot_umap=True
... )
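Because input_folder must contain four specific files, it can be convenient to check for them before starting an evaluation run. The following is a minimal sketch using only the standard library; the helper name `missing_inputs` is hypothetical and not part of XChrom, and the file names are the ones listed under input_folder above.

```python
from pathlib import Path

# File names that the input_folder parameter is documented to require.
REQUIRED_FILES = ("splits.h5", "ad_crosscell.h5ad", "m_crosscell.npz", "trainval_seqs.h5")


def missing_inputs(input_folder):
    """Return the required file names that are not present in input_folder (hypothetical helper)."""
    folder = Path(input_folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


# Example: report anything missing from the default training data folder.
missing = missing_inputs("./train_data")
if missing:
    print("Missing from ./train_data:", ", ".join(missing))
```

An empty list means all four files were found; the check says nothing about whether their contents are valid.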