
Memory leaking in rewrite_images? #4720

@kuhnroyal


Description of the bug

I noticed memory building up in my application and traced it to the rewrite_images call. If I remove that call, the problem goes away.
Am I missing some cleanup step? I tried pymupdf.TOOLS.store_shrink(100), but that didn't help.

How to reproduce the bug

import gc
from pathlib import Path

import pymupdf


def main() -> None:
    dpi_threshold: int = 150
    dpi_target: int = 100
    quality: int = 50

    input_pdf: Path = Path("input.pdf")
    output_pdf: Path = Path("output.pdf")

    for i in range(10):
        with pymupdf.open(input_pdf) as doc:
            doc.rewrite_images(
                dpi_threshold=dpi_threshold,
                dpi_target=dpi_target,
                quality=quality,
            )

            save_opts = {
                "garbage": 4,  # Maximum garbage collection
                "deflate": True,  # Use deflate compression
                "clean": True,  # Clean up redundant objects
                "pretty": False,  # Don't pretty-print (saves space)
                "ascii": False,  # Don't use ASCII encoding (saves space)
                "expand": 0,  # Don't expand content streams
                "linear": False,  # Don't linearize (can increase size)
                "deflate_images": True,  # Compress images
                "deflate_fonts": True,  # Compress fonts
                "use_objstms": True,  # Use object streams for better compression
                "compression_effort": True,  # Use maximum compression effort
            }

            doc.save(
                output_pdf,
                **save_opts,
            )
            print(f"Done {i}")

    gc.collect()
    pymupdf.TOOLS.store_shrink(1000)
    print("Cleaned")


if __name__ == "__main__":
    main()

My test PDF has 6 pages with full-screen images.
I created a memray flame graph: memray-flamegraph-mem_test.py.99494.html
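
To put rough numbers on the growth without opening the flame graph, a minimal sketch along these lines could record the process's peak resident set size after each iteration. It assumes a Unix-like system (the stdlib `resource` module is not available on Windows), and the helper names `peak_rss_mib` and `track_peak_rss` are hypothetical, not part of PyMuPDF.

```python
import resource
import sys
from typing import Callable, List


def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_peak_rss(step: Callable[[], None], iterations: int = 10) -> List[float]:
    """Run `step` repeatedly and record the peak RSS after each run."""
    samples: List[float] = []
    for i in range(iterations):
        step()
        samples.append(peak_rss_mib())
        print(f"iteration {i}: peak RSS {samples[-1]:.1f} MiB")
    return samples
```

Passing a function that wraps the body of the loop above (open, rewrite_images, save) as `step` should show whether the peak levels off after the first iteration or keeps climbing. Peak RSS is a coarse signal and never decreases, so it only indicates that memory is retained, not where.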

PyMuPDF version

1.26.4

Operating system

MacOS

Python version

3.13
