xchrom.tl.calc_tf_activity

xchrom.tl.calc_tf_activity(motif_dir: str | Path, background_fasta: str | Path, model_path: str | Path, ad_rna: AnnData, output_file: str | Path = PosixPath('/tf_activity.h5ad'), cell_embed_raw: str = 'X_pca_harmony', regenerate_motif_h5: bool = False, regenerate_bg_h5: bool = False, seq_len: int = 1344, seed: int = 20, **model_kwargs) → AnnData

Calculate motif insertion scores.

Parameters:
  • motif_dir (Union[str, Path]) – The path to the directory containing motif insertion fasta files

  • background_fasta (Union[str, Path]) – The path to the background sequence fasta file

  • model_path (Union[str, Path]) – The path to the trained XChrom model weights file

  • ad_rna (anndata.AnnData) – scRNA-seq data; must contain the raw cell embedding in ad_rna.obsm under the key given by cell_embed_raw

  • output_file (Union[str, Path], default='./tf_activity.h5ad') – Output file path for TF activity results

  • cell_embed_raw (str, default='X_pca_harmony') – The key in ad_rna.obsm that holds the raw cell embedding

  • regenerate_motif_h5 (bool, default False) – Whether to regenerate the motif insertion sequence h5 files. If False, existing h5 files are used

  • regenerate_bg_h5 (bool, default False) – Whether to regenerate the background sequence h5 file. If False, the existing h5 file is used

  • seq_len (int, default 1344) – Sequence length for background sequence and motif insertion sequence

  • seed (int, default 20) – Random seed

  • **model_kwargs (dict) – Additional parameters passed to XChrom_model
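
Because the function depends on three paths and on an embedding key in ad_rna.obsm, it can be convenient to check them before starting a long run. The following is a minimal sketch, not part of xchrom: the helper name check_tf_activity_inputs is hypothetical, and it only assumes that ad_rna.obsm supports the `in` operator, as a mapping does.

```python
from pathlib import Path
from typing import Any, List, Union


def check_tf_activity_inputs(
    motif_dir: Union[str, Path],
    background_fasta: Union[str, Path],
    model_path: Union[str, Path],
    ad_rna: Any,
    cell_embed_raw: str = "X_pca_harmony",
) -> List[str]:
    """Hypothetical pre-flight check (not an xchrom function).

    Returns a list of human-readable problems; an empty list means
    the inputs look usable.
    """
    problems = []
    # motif_dir is documented as a directory of motif insertion fasta files.
    if not Path(motif_dir).is_dir():
        problems.append(f"motif_dir is not a directory: {motif_dir}")
    # background_fasta and model_path are documented as single files.
    if not Path(background_fasta).is_file():
        problems.append(f"background_fasta not found: {background_fasta}")
    if not Path(model_path).is_file():
        problems.append(f"model_path not found: {model_path}")
    # The raw cell embedding must be present in ad_rna.obsm.
    if cell_embed_raw not in ad_rna.obsm:
        problems.append(f"'{cell_embed_raw}' is missing from ad_rna.obsm")
    return problems
```

If the returned list is non-empty, the messages can be printed or raised as an error before calc_tf_activity is called.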

Returns:

TF activity results. X is the activity matrix of cells × motifs. The results are saved to output_file (‘./tf_activity.h5ad’ by default)

Return type:

anndata.AnnData

Examples

>>> import xchrom as xc
>>> tf_act = xc.tl.calc_tf_activity(
...     motif_dir='./motif_fasta/',
...     background_fasta='./shuffled_peaks.fasta',
...     model_path='./best_model.h5',
...     ad_rna=m1d1_rna,
...     cell_embed_raw='X_pca_harmony',
...     regenerate_bg_h5=True,
...     regenerate_motif_h5=True,
...     seed=20
... )
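
Since the returned X is a cells × motifs activity matrix, one simple follow-up is to rank motifs by their mean activity across cells. The sketch below is illustrative only: rank_motifs_by_mean_activity is a hypothetical helper, not an xchrom function, and it assumes the matrix is dense (a list of rows or a 2-D array) with one motif name per column.

```python
from typing import List, Sequence, Tuple


def rank_motifs_by_mean_activity(
    activity: Sequence[Sequence[float]],
    motif_names: Sequence[str],
    top_n: int = 5,
) -> List[Tuple[str, float]]:
    """Hypothetical helper: mean activity per motif, highest first.

    `activity` is a dense cells x motifs matrix, iterated row by row.
    """
    n_cells = len(activity)
    if n_cells == 0:
        return []
    # Accumulate one column sum per motif.
    sums = [0.0] * len(motif_names)
    for row in activity:
        for j, value in enumerate(row):
            sums[j] += float(value)
    means = [(str(name), total / n_cells) for name, total in zip(motif_names, sums)]
    # Sort by mean activity, largest first, and keep the top_n entries.
    means.sort(key=lambda item: item[1], reverse=True)
    return means[:top_n]
```

Assuming motif names are stored in tf_act.var_names and tf_act.X is dense, a call might look like rank_motifs_by_mean_activity(tf_act.X, tf_act.var_names, top_n=10). A sparse X would need to be converted to a dense array first.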