High-performance genomic interval operations and bioinformatics file I/O on Polars DataFrames. Overlap, nearest, merge, coverage, complement, subtract for…
polars-bio

Overview

polars-bio is a high-performance Python library for genomic interval operations and bioinformatics file I/O, built on Polars, Apache Arrow, and Apache DataFusion. It provides a familiar DataFrame-centric API for interval arithmetic (overlap, nearest, merge, coverage, complement, subtract) and reading/writing common bioinformatics formats (BED, VCF, BAM, CRAM, GFF/GTF, FASTA, FASTQ).

Key value propositions:

- 6-38x faster than bioframe on real-world genomic benchmarks
- Streaming/out-of-core support for large genomes via DataFusion
- Cloud-native file I/O (S3, GCS, Azure) with predicate pushdown
- Two API styles: functional (pb.overlap(df1, df2)) and method-chaining (df1.lazy().pb.overlap(df2))
- SQL interface for genomic data via DataFusion SQL engine

When to Use This Skill