xchrom.tl.calc_tf_activity
- xchrom.tl.calc_tf_activity(motif_dir: str | Path, background_fasta: str | Path, model_path: str | Path, ad_rna: AnnData, output_file: str | Path = PosixPath('/tf_activity.h5ad'), cell_embed_raw: str = 'X_pca_harmony', regenerate_motif_h5: bool = False, regenerate_bg_h5: bool = False, seq_len: int = 1344, seed: int = 20, **model_kwargs) → AnnData
Calculate motif insertion scores.
- Parameters:
motif_dir (Union[str, Path]) – The path to the directory containing motif insertion fasta files
background_fasta (Union[str, Path]) – The path to the background sequence fasta file
model_path (Union[str, Path]) – The path to the trained XChrom model weights file
ad_rna (anndata.AnnData) – scRNA-seq data; must contain the raw cell embedding in ad_rna.obsm
output_file (Union[str, Path], default='./tf_activity.h5ad') – Output file path for TF activity results
cell_embed_raw (str, default='X_pca_harmony') – The key name of raw cell embedding from ad_rna.obsm
regenerate_motif_h5 (bool, default False) – Whether to regenerate the motif insertion sequence h5 files. If False, the existing h5 files are used
regenerate_bg_h5 (bool, default False) – Whether to regenerate the background sequence h5 file. If False, the existing h5 file is used
seq_len (int, default 1344) – Sequence length for background sequence and motif insertion sequence
seed (int, default 20) – Random seed
**model_kwargs (dict) – Additional parameters passed to XChrom_model
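The parameters above imply a few preconditions: a motif directory, a background fasta file, a model weights file, and an embedding key present in ad_rna.obsm. A minimal sketch of a pre-flight check follows. The helper name `preflight_check` is hypothetical and not part of xchrom; it only assumes that `ad_rna.obsm` behaves like a mapping, so a plain dict works for testing.

```python
from pathlib import Path
from typing import List, Mapping, Union

PathLike = Union[str, Path]


def preflight_check(
    motif_dir: PathLike,
    background_fasta: PathLike,
    model_path: PathLike,
    obsm: Mapping,
    cell_embed_raw: str = "X_pca_harmony",
) -> List[str]:
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper for illustration only, not part of xchrom.
    """
    problems = []
    # motif_dir should be a directory holding the motif insertion fasta files.
    if not Path(motif_dir).is_dir():
        problems.append(f"motif_dir is not a directory: {motif_dir}")
    # background_fasta and model_path should both point at existing files.
    if not Path(background_fasta).is_file():
        problems.append(f"background_fasta not found: {background_fasta}")
    if not Path(model_path).is_file():
        problems.append(f"model_path not found: {model_path}")
    # The raw cell embedding key must be present in ad_rna.obsm.
    if cell_embed_raw not in obsm.keys():
        problems.append(f"'{cell_embed_raw}' not found in ad_rna.obsm")
    return problems
```

One possible use is to call `preflight_check(motif_dir, background_fasta, model_path, ad_rna.obsm)` before the real call and stop if the returned list is non-empty.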
- Returns:
TF activity results. X is the cells × motifs activity matrix. The results are also saved to output_file (‘./tf_activity.h5ad’ by default)
- Return type:
anndata.AnnData
Examples
>>> import xchrom as xc
>>> tf_act = xc.tl.calc_tf_activity(
...     motif_dir='./motif_fasta/',
...     background_fasta='./shuffled_peaks.fasta',
...     model_path='./best_model.h5',
...     ad_rna=m1d1_rna,
...     cell_embed_raw='X_pca_harmony',
...     regenerate_bg_h5=True,
...     regenerate_motif_h5=True,
...     seed=20
... )
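Because the returned X is a cells × motifs matrix, one simple follow-up is to rank motifs by their mean activity across cells. The sketch below is illustrative only: `rank_motifs_by_mean` is a hypothetical helper, not an xchrom function, and it works on any iterable of numeric rows. Passing `tf_act.X` and `tf_act.var_names` assumes that X is dense and that var_names holds the motif names, neither of which is stated above.

```python
from statistics import fmean
from typing import Iterable, List, Sequence, Tuple


def rank_motifs_by_mean(
    activity_rows: Iterable[Sequence[float]],
    motif_names: Sequence[str],
    top_n: int = 5,
) -> List[Tuple[str, float]]:
    """Rank motifs by mean activity across cells, highest first.

    Hypothetical helper for illustration only, not part of xchrom.
    Each row is one cell; each column is one motif.
    """
    rows = [[float(v) for v in row] for row in activity_rows]
    if not rows:
        return []
    n_motifs = len(motif_names)
    if any(len(row) != n_motifs for row in rows):
        raise ValueError("every row must have one value per motif name")
    # zip(*rows) transposes the matrix, yielding one column per motif.
    means = [fmean(col) for col in zip(*rows)]
    ranked = sorted(zip(motif_names, means), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]
```

Under those assumptions, `rank_motifs_by_mean(tf_act.X, tf_act.var_names)` would return up to five (motif, mean activity) pairs. A mean over all cells is only one possible summary; a per-cluster mean may be more informative for heterogeneous data.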