xchrom.tl.bed_to_fasta
- xchrom.tl.bed_to_fasta(bed_input: str | Path | DataFrame, fasta_file: str | Path, output_file: str | Path, seq_len: int = 1344, stranded: bool = False)
Extract the sequences for the regions in a BED file from a reference genome and write them to a FASTA file.
- Parameters:
bed_input (str, Path, or DataFrame) – The path to the BED file, or a pandas DataFrame with 'chr', 'start', and 'end' columns.
fasta_file (str or Path) – The path to the reference genome FASTA file.
output_file (str or Path) – The path to the output FASTA file.
seq_len (int, default 1344) – The length, in base pairs, of each sequence to extract.
stranded (bool, default False) – Whether to consider strand information.
- Returns:
(seqs, coords) - The list of extracted sequences and the list of their coordinates.
- Return type:
tuple
Examples
# Convert a BED file to a FASTA file
seqs, coords = bed_to_fasta("peaks.bed", "genome.fasta", "output.fasta", seq_len=1344)

# Convert a BED DataFrame to a FASTA file
seqs, coords = bed_to_fasta(bed_df, "genome.fasta", "output.fasta", seq_len=1344)

# Take strand information into account
seqs, coords = bed_to_fasta("peaks.bed", "genome.fasta", "output.fasta",
                            seq_len=1344, stranded=True)
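
The reference above does not say how seq_len and stranded are applied to each region. The sketch below shows one plausible reading and is not xchrom's implementation. It assumes that each window of seq_len bases is centered on the region midpoint, that windows running past a chromosome end are skipped, and that minus-strand regions are reverse-complemented when stranded=True. The helpers extract_windows, reverse_complement, and format_fasta are hypothetical names, and the genome is a small in-memory dict instead of a FASTA file.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int, str]  # (chrom, start, end, strand)

# Hypothetical helpers for illustration only; none of these are xchrom functions.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_windows(
    regions: List[Region],
    genome: Dict[str, str],
    seq_len: int = 1344,
    stranded: bool = False,
) -> Tuple[List[str], List[Region]]:
    """Extract fixed-length windows centered on each region (assumed behavior)."""
    seqs: List[str] = []
    coords: List[Region] = []
    for chrom, start, end, strand in regions:
        mid = (start + end) // 2
        win_start = mid - seq_len // 2
        win_end = win_start + seq_len
        # Assumption: skip windows that run past either chromosome end.
        if win_start < 0 or win_end > len(genome[chrom]):
            continue
        seq = genome[chrom][win_start:win_end]
        # Assumption: minus-strand regions are reverse-complemented.
        if stranded and strand == "-":
            seq = reverse_complement(seq)
        seqs.append(seq)
        coords.append((chrom, win_start, win_end, strand))
    return seqs, coords


def format_fasta(seqs: List[str], coords: List[Region]) -> str:
    """Render sequences as FASTA text with coordinate headers."""
    lines: List[str] = []
    for seq, (chrom, start, end, strand) in zip(seqs, coords):
        lines.append(f">{chrom}:{start}-{end}({strand})")
        lines.append(seq)
    return "\n".join(lines) + "\n"


# Toy genome and regions (made up for the example).
genome = {"chr1": "ACGTACGTAAGGCCTTACGT"}
regions = [("chr1", 4, 12, "+"), ("chr1", 10, 18, "-")]

seqs, coords = extract_windows(regions, genome, seq_len=8, stranded=True)
fasta_text = format_fasta(seqs, coords)
print(fasta_text)
```

With stranded=False in this sketch, the second record would hold the forward-strand bases of the same window; the returned coordinates are the same in both cases.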