Preprocessing

Note

This page is the API reference for the functions available in SPOT. For a more detailed walkthrough, please visit our Build your first project site.

Filtering using preprocessing.filter_guide_reads()

filter_guide_reads(gem_path, guide_prefix=None, output_path=None, binarilize=False, assign_pattern='max', filter_threshold=None)

Filter and process guide reads from a GEM file.

Parameters:
  • gem_path – Path to the input GEM file

  • guide_prefix – Optional prefix to filter guide names. Only guides starting with this prefix will be kept

  • output_path – Optional path to save filtered results. If None, returns the filtered DataFrame

  • binarilize – If True, set all bin counts to 1. Recommended for data with a large library size and high resolution

  • assign_pattern – How to handle bins with multiple guides. Can be ‘max’ (keep guide with max count), ‘drop’ (remove multi-guide bins), or ‘all’ (keep all guides)

  • filter_threshold – Minimum guide count. Bins whose guide count is below this value are filtered out

Returns:

Filtered pandas DataFrame if output_path is None, otherwise None
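
The three assign_pattern modes described above can be illustrated with a small pandas sketch. This is not SPOT's implementation: the helper name resolve_multi_guide_bins is hypothetical, and the column names (geneID, x, y, MIDCount) are an assumption about the GEM table layout that may not match your file.

```python
import pandas as pd


def resolve_multi_guide_bins(df, assign_pattern="max"):
    """Hypothetical stand-in that mimics the documented assign_pattern modes.

    Assumes one row per (bin, guide) pair, with the bin identified by the
    'x' and 'y' columns, the guide name in 'geneID' and its count in
    'MIDCount'. These column names are assumptions, not SPOT's API.
    """
    bin_keys = ["x", "y"]
    if assign_pattern == "all":
        # Keep every guide in every bin.
        return df.copy()
    if assign_pattern == "drop":
        # Remove bins that contain more than one distinct guide.
        n_guides = df.groupby(bin_keys)["geneID"].transform("nunique")
        return df[n_guides == 1].copy()
    if assign_pattern == "max":
        # Keep only the guide with the highest count in each bin.
        top_rows = df.groupby(bin_keys)["MIDCount"].idxmax()
        return df.loc[top_rows].sort_index().copy()
    raise ValueError(f"unknown assign_pattern: {assign_pattern!r}")


# Toy data: bin (0, 0) holds two guides, the other bins hold one each.
toy = pd.DataFrame(
    {
        "geneID": ["sgA", "sgB", "sgA", "sgC"],
        "x": [0, 0, 1, 2],
        "y": [0, 0, 1, 2],
        "MIDCount": [5, 2, 1, 3],
    }
)

kept_max = resolve_multi_guide_bins(toy, "max")    # sgB is dropped from bin (0, 0)
kept_drop = resolve_multi_guide_bins(toy, "drop")  # bin (0, 0) is removed entirely
kept_all = resolve_multi_guide_bins(toy, "all")    # nothing is removed
```

With the real function, the same choice is made through the assign_pattern argument, for example filter_guide_reads(gem_path, assign_pattern='drop'), where gem_path points to your own GEM file.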