xchrom.tr.train_XChrom
- xchrom.tr.train_XChrom(input_folder: str | Path, cell_embedding_ad: str | Path, out_path: str | Path = './train_out', bottleneck: int = 32, batch_size: int = 128, lr: float = 0.01, epochs: int = 1000, save_freq: int = 1000, trackscore: bool = False, celltype: str = 'celltype', seed: int = 20, train_split: float = 0.9, cellembed_raw: str = 'X_pca', verbose: Literal[0, 1, 2] = 1, print_scores: bool = False, **kwargs) -> Dict[str, Any]
Train an XChrom model.
- Parameters:
input_folder (Union[str, Path]) – Preprocessed data folder, should contain: trainval_seqs.h5, splits.h5, ad_trainval.h5ad, m_trainval.npz
cell_embedding_ad (Union[str, Path]) – scRNA-seq data file path containing raw cell embedding
out_path (Union[str, Path], default './train_out') – Output path
bottleneck (int, default 32) – Bottleneck layer size; should equal the dimension of the raw cell embedding
batch_size (int, default 128) – Batch size
lr (float, default 0.01) – Learning rate
epochs (int, default 1000) – Number of training epochs
save_freq (int, default 1000) – Model saving frequency
trackscore (bool, default False) – Whether to compute score metrics every epoch
celltype (str, default 'celltype') – Cell type label column name (used when trackscore=True)
seed (int, default 20) – Random seed
train_split (float, default 0.9) – Training set/validation set ratio
cellembed_raw (str, default 'X_pca') – Raw cell embedding key in cell embedding adata
verbose (int, default 1) – Training verbosity mode. 0=silent, 1=progress bar, 2=one line per epoch
print_scores (bool, default False) – Whether to print the ns and ls scores every epoch when trackscore=True
**kwargs (dict) – Additional parameters
- Returns:
Dictionary containing training history and model information
- Return type:
Dict[str, Any]
Examples
>>> import xchrom as xc
>>> history = xc.tr.train_XChrom(
...     input_folder='./data/1_within_sample/train_data/',
...     cell_embedding_ad='./data/1_within_sample/m_brain_paired_rna.h5ad',
...     cellembed_raw='X_pca',
...     out_path='./data/1_within_sample/train_out/',
...     trackscore=True,
...     celltype='pc32_leiden',
...     epochs=1000,
...     save_freq=1000,
...     verbose=0,           # silent mode, no progress bar
...     print_scores=False,  # whether to print ns, ls scores every epoch when trackscore=True
... )
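Two requirements from the parameter descriptions can be checked before a long run is started: input_folder should contain the four listed files, and bottleneck should match the dimension of the raw cell embedding. The sketch below is one possible pre-flight check. The helper name check_train_inputs and its arguments are hypothetical and not part of xchrom; only the file names come from the input_folder description.

```python
from pathlib import Path
from typing import List, Tuple, Union

# File names copied from the input_folder parameter description.
REQUIRED_FILES = (
    "trainval_seqs.h5",
    "splits.h5",
    "ad_trainval.h5ad",
    "m_trainval.npz",
)


def check_train_inputs(
    input_folder: Union[str, Path],
    embedding_shape: Tuple[int, int],
    bottleneck: int = 32,
) -> List[str]:
    """Hypothetical helper (not an xchrom function).

    Returns a list of human-readable problems; an empty list means
    both checks passed.
    """
    problems = []
    folder = Path(input_folder)

    # Check 1: every expected preprocessed file is present.
    for name in REQUIRED_FILES:
        if not (folder / name).is_file():
            problems.append(f"missing file: {name}")

    # Check 2: embedding dimension (second axis) equals the bottleneck size.
    n_dims = embedding_shape[1]
    if n_dims != bottleneck:
        problems.append(
            f"embedding has {n_dims} dimensions but bottleneck={bottleneck}"
        )

    return problems
```

Assuming cell_embedding_ad is a standard AnnData file, the shape could be obtained with anndata.read_h5ad(path).obsm['X_pca'].shape and passed as embedding_shape. An empty result suggests the inputs are consistent with the documented requirements, but it does not validate the contents of the files.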