xchrom.tl.bed_to_fasta

xchrom.tl.bed_to_fasta(bed_input: str | Path | DataFrame, fasta_file: str | Path, output_file: str | Path, seq_len: int = 1344, stranded: bool = False)[source]

Extract sequences from BED file and write to FASTA file.

Parameters:
  • bed_input (str, Path, or DataFrame) – The path to the BED file, or a pandas DataFrame with 'chr', 'start', and 'end' columns.

  • fasta_file (Union[str, Path]) – The path to the reference genome FASTA file.

  • output_file (Union[str, Path]) – The path to the output FASTA file.

  • seq_len (int, default 1344) – The length of the sequences to extract.

  • stranded (bool, default False) – Whether to consider strand information.
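The page does not say how a BED interval is turned into a fixed-length sequence, nor exactly what stranded changes. The sketch below shows one common convention, offered only as an assumption: the window of seq_len bases is centred on the interval midpoint, and minus-strand records are reverse-complemented when stranded is true. The helpers resize_interval, reverse_complement, and extract are hypothetical names, not part of xchrom, and the genome is a plain dict rather than a FASTA file.

```python
# Illustrative sketch only: these helpers are hypothetical and are NOT xchrom's code.
# Assumption: windows are centred on the interval midpoint and clamped at position 0.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def resize_interval(start, end, seq_len=1344):
    """Return a (start, end) window of exactly seq_len bases around the midpoint."""
    mid = (start + end) // 2
    new_start = max(0, mid - seq_len // 2)
    return new_start, new_start + seq_len


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract(genome, chrom, start, end, seq_len=1344, strand="+", stranded=False):
    """Slice a fixed-length sequence from an in-memory genome dict.

    Returns the sequence and the (chrom, start, end) actually used.
    """
    s, e = resize_interval(start, end, seq_len)
    seq = genome[chrom][s:e]
    # Assumption: strand is only honoured when stranded=True.
    if stranded and strand == "-":
        seq = reverse_complement(seq)
    return seq, (chrom, s, e)


# Toy genome: a dict mapping chromosome name to sequence.
genome = {"chr1": "ACGT" * 10}
seq, coord = extract(genome, "chr1", 9, 13, seq_len=4, strand="-", stranded=True)
print(seq, coord)
```

With stranded left at False, the same call returns the forward-strand slice regardless of the strand value.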

Returns:

(seqs, coords) – The extracted sequences and their coordinates.

Return type:

tuple

Examples

# Convert BED file to FASTA file
seqs, coords = bed_to_fasta("peaks.bed", "genome.fasta", "output.fasta", seq_len=1344)

# Convert BED DataFrame to FASTA file
seqs, coords = bed_to_fasta(bed_df, "genome.fasta", "output.fasta", seq_len=1344)

# Consider strand information
seqs, coords = bed_to_fasta("peaks.bed", "genome.fasta", "output.fasta",
                            seq_len=1344, stranded=True)
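To inspect a file such as output.fasta after a call like the ones above, a small standard-library reader is enough. This is a minimal sketch: read_fasta is a hypothetical helper, not an xchrom function, and it makes no assumption about the header format, which this page does not specify.

```python
from pathlib import Path


def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples.

    Sequence lines that are wrapped over several lines are joined.
    """
    records = []
    header, chunks = None, []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: flush the previous one, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

One simple sanity check is to confirm that the number of records matches len(seqs) and that every sequence has the requested seq_len.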