@codeflash-ai bot commented Nov 22, 2025

📄 14% (0.14x) speedup for ExceptionInfo._group_contains in src/_pytest/_code/code.py

⏱️ Runtime : 1.52 milliseconds → 1.33 milliseconds (best of 114 runs)
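
The headline figure can be checked from the two runtimes. This is a minimal sketch that assumes speedup means original time divided by optimized time, minus one; that definition is inferred from the numbers and is not stated in the report.

```python
original_ms = 1.52   # runtime before the change, from the report
optimized_ms = 1.33  # runtime after the change, from the report

# Assumed definition: how much faster the optimized code is, relative to itself.
speedup = original_ms / optimized_ms - 1
print(f"{speedup:.2f}x, {speedup:.0%}")  # prints "0.14x, 14%"
```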

📝 Explanation and details

The optimized code achieves a 14% speedup through targeted micro-optimizations in two hot-path functions:

Key Optimizations:

  1. _stringify_exception optimization - This function showed significant improvement (34% faster based on profiler data):

    • Early return for common case: Most exceptions don't have __notes__, so the optimization checks if not notes: and returns immediately, avoiding the expensive join operation
    • Local variable caching: str(exc) is computed once and stored in exc_str to avoid repeated string conversion
    • Reduced list construction: When notes exist, the join operation is streamlined
  2. _group_contains optimization - Multiple micro-optimizations for this recursive function:

    • Global lookup elimination: isinstance and BaseExceptionGroup are cached locally to avoid repeated global namespace lookups in the tight loop
    • Pattern matching optimization: re.search is cached when match is present, reducing function lookups during iteration
    • Attribute access reduction: exc_group.exceptions is cached as excs to avoid repeated attribute access
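
The early-return idea described for _stringify_exception can be sketched roughly as follows. This is an illustration under assumptions, not the PR's diff: the names `stringify_with_join` and `stringify_early_return` are hypothetical, and the real pytest helper has additional handling that is left out here.

```python
def stringify_with_join(exc: BaseException) -> str:
    """Baseline shape (assumed): always build a list and join it."""
    notes = getattr(exc, "__notes__", [])
    return "\n".join([str(exc), *notes])


def stringify_early_return(exc: BaseException) -> str:
    """Optimized shape (assumed): skip the list and join when there are no notes."""
    notes = getattr(exc, "__notes__", None)
    exc_str = str(exc)  # converted once and reused below
    if not notes:
        # Common case: nothing to append, so return the message as-is.
        return exc_str
    return "\n".join([exc_str, *notes])


plain = ValueError("boom")
noted = ValueError("boom")
noted.__notes__ = ["first note", "second note"]

# Both shapes must agree for exceptions with and without notes.
for exc in (plain, noted):
    assert stringify_with_join(exc) == stringify_early_return(exc)

print(repr(stringify_early_return(noted)))  # 'boom\nfirst note\nsecond note'
```

Setting `__notes__` directly keeps the sketch runnable on Python versions without `add_note`.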

Performance Impact by Test Case:

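  • String matching on a single exception is about 6-14% faster (for example, test_group_contains_type_but_not_match gains 13.7%), attributed to the _stringify_exception optimizations; the two tests that use __notes__ gain only about 3%, consistent with them not taking the early return
  • Regex pattern matching over groups of 1000 exceptions shows the largest gains, 20-21% (494μs -> 408μs and 490μs -> 408μs), due to the combined optimizations
  • Type-only scans of flat groups with about 1000 exceptions gain 4.6-5.2% from reduced per-iteration overhead
  • Small type-only tests are mixed, ranging from 8.8% slower to 7.1% faster at roughly 1-2.5μs per call, as they don't exercise the string processing path; test_large_group_with_target_depth is 10.3-11.5% slower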

Why These Optimizations Work:
The profiler shows _stringify_exception and regex matching (re.search) consume 58% of total runtime in the original code. By optimizing the common path where exceptions lack notes and reducing global lookups in the recursive traversal, the code eliminates substantial overhead in exception handling workflows, particularly when processing large exception groups or performing pattern matching.
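
The local-binding technique described for the recursive traversal might look roughly like the sketch below. It is a stand-alone illustration, not the code in this PR: `group_contains` is a hypothetical free function, it matches against `str(exc)` only, and the depth semantics are assumed from the generated tests (direct members of the outer group are at depth 1).

```python
import re
import sys

if sys.version_info < (3, 11):
    # BaseExceptionGroup is built in from Python 3.11; older versions need the backport.
    from exceptiongroup import BaseExceptionGroup


def group_contains(exc_group, expected, match=None, target_depth=None, depth=1):
    """Return True if the group holds a matching exception (hypothetical sketch)."""
    if target_depth is not None and depth > target_depth:
        return False  # already below the requested depth
    # Bind hot-loop names once so the loop body uses local lookups.
    isinstance_ = isinstance
    group_type = BaseExceptionGroup
    search = re.search if match is not None else None
    for exc in exc_group.exceptions:
        if isinstance_(exc, group_type) and group_contains(
            exc, expected, match, target_depth, depth + 1
        ):
            return True
        if target_depth is not None and depth != target_depth:
            continue  # only members at the requested depth may match
        if not isinstance_(exc, expected):
            continue
        if search is not None and not search(match, str(exc)):
            continue
        return True
    return False


inner = BaseExceptionGroup("inner", [TypeError("bad value 42")])
outer = BaseExceptionGroup("outer", [ValueError("nope"), inner])
print(group_contains(outer, TypeError))                  # True
print(group_contains(outer, TypeError, target_depth=1))  # False
print(group_contains(outer, TypeError, r"\d+"))          # True
```

Local bindings save a global or builtin lookup on each iteration, which only matters when the loop body is as small as it is here.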

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 9 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 75.0% |
🌀 Generated Regression Tests and Runtime
import re
import sys

from _pytest._code.code import ExceptionInfo

if sys.version_info < (3, 11):
    # BaseExceptionGroup is built in from Python 3.11; older versions need the backport
    from exceptiongroup import BaseExceptionGroup



# Helper: Custom exceptions for testing
class MyError(Exception):
    pass


class AnotherError(Exception):
    pass


class YetAnotherError(Exception):
    pass


class CustomErrorWithNotes(Exception):
    def __init__(self, msg, notes=None):
        super().__init__(msg)
        if notes:
            self.__notes__ = notes


# Helper: Dummy ExceptionInfo instance (we only use _group_contains)
dummy_excinfo = ExceptionInfo(None, "", None, _ispytest=True)

# ----------------------
# 1. Basic Test Cases
# ----------------------


def test_single_exception_group_contains_type():
    # Single exception of correct type
    group = BaseExceptionGroup("eg", [MyError("fail")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None
    )  # 1.30μs -> 1.35μs (3.41% slower)


def test_single_exception_group_does_not_contain_type():
    # Single exception of wrong type
    group = BaseExceptionGroup("eg", [AnotherError("fail")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None
    )  # 1.17μs -> 1.20μs (2.91% slower)


def test_multiple_exceptions_group_contains_type():
    # Multiple exceptions, one is correct type
    group = BaseExceptionGroup("eg", [AnotherError("a"), MyError("b")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None
    )  # 1.50μs -> 1.40μs (7.07% faster)


def test_multiple_exceptions_group_none_match():
    # Multiple exceptions, none match
    group = BaseExceptionGroup("eg", [AnotherError("a"), YetAnotherError("b")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None
    )  # 1.28μs -> 1.30μs (1.08% slower)


def test_tuple_of_types():
    # Tuple of types, one matches
    group = BaseExceptionGroup("eg", [YetAnotherError("a")])
    codeflash_output = dummy_excinfo._group_contains(
        group, (MyError, YetAnotherError), None
    )  # 1.27μs -> 1.32μs (3.57% slower)


def test_match_string_present():
    # Match string is present in exception message
    group = BaseExceptionGroup("eg", [MyError("expected message")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, "expected"
    )  # 3.88μs -> 3.52μs (10.1% faster)


def test_match_string_absent():
    # Match string is not present
    group = BaseExceptionGroup("eg", [MyError("something else")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, "expected"
    )  # 3.70μs -> 3.30μs (12.1% faster)


def test_match_regex_pattern():
    # Match regex pattern
    group = BaseExceptionGroup("eg", [MyError("foo123bar")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, re.compile(r"\d{3}")
    )  # 4.67μs -> 4.38μs (6.71% faster)


def test_match_regex_pattern_not_found():
    group = BaseExceptionGroup("eg", [MyError("no digits here")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, re.compile(r"\d+")
    )  # 4.56μs -> 4.25μs (7.42% faster)


# ----------------------
# 2. Edge Test Cases
# ----------------------


def test_empty_exception_group():
    # An exception group cannot be empty: the constructor raises ValueError,
    # so _group_contains is never reached with an empty group
    import pytest

    with pytest.raises(ValueError):
        BaseExceptionGroup("eg", [])


def test_nested_exception_group_contains_type():
    # Nested group contains correct type
    inner = BaseExceptionGroup("inner", [MyError("fail")])
    outer = BaseExceptionGroup("outer", [AnotherError("nope"), inner])
    codeflash_output = dummy_excinfo._group_contains(
        outer, MyError, None
    )  # 2.46μs -> 2.34μs (5.13% faster)


def test_nested_exception_group_no_match():
    # Nested group does not contain correct type
    inner = BaseExceptionGroup("inner", [AnotherError("nope")])
    outer = BaseExceptionGroup("outer", [YetAnotherError("nope"), inner])
    codeflash_output = dummy_excinfo._group_contains(
        outer, MyError, None
    )  # 2.00μs -> 2.04μs (2.20% slower)


def test_deeply_nested_group_match():
    # 3-level nesting, match at deepest level
    inner2 = BaseExceptionGroup("inner2", [MyError("found me")])
    inner1 = BaseExceptionGroup("inner1", [AnotherError("nope"), inner2])
    outer = BaseExceptionGroup("outer", [YetAnotherError("nope"), inner1])
    codeflash_output = dummy_excinfo._group_contains(
        outer, MyError, None
    )  # 2.42μs -> 2.48μs (2.30% slower)


def test_target_depth_match():
    # Only match at specific depth
    inner = BaseExceptionGroup("inner", [MyError("fail")])
    outer = BaseExceptionGroup("outer", [inner])
    # Match at depth 2 (inner), not at depth 1 (outer)
    codeflash_output = dummy_excinfo._group_contains(
        outer, MyError, None, target_depth=2
    )  # 2.25μs -> 2.29μs (1.88% slower)
    codeflash_output = dummy_excinfo._group_contains(
        outer, MyError, None, target_depth=1
    )  # 1.01μs -> 1.09μs (7.09% slower)


def test_target_depth_no_match_due_to_depth():
    # Matching type but wrong depth
    inner = BaseExceptionGroup("inner", [MyError("fail")])
    outer = BaseExceptionGroup("outer", [inner])
    # Looking for match at depth 3 (doesn't exist)
    codeflash_output = dummy_excinfo._group_contains(
        outer, MyError, None, target_depth=3
    )  # 2.04μs -> 1.98μs (2.62% faster)


def test_target_depth_multiple_nested():
    # Multiple nested groups, match at correct depth
    g3 = BaseExceptionGroup("g3", [MyError("deep")])
    g2 = BaseExceptionGroup("g2", [g3])
    g1 = BaseExceptionGroup("g1", [g2])
    # Only at depth 3
    codeflash_output = dummy_excinfo._group_contains(
        g1, MyError, None, target_depth=3
    )  # 2.40μs -> 2.32μs (3.45% faster)
    codeflash_output = dummy_excinfo._group_contains(
        g1, MyError, None, target_depth=2
    )  # 1.28μs -> 1.33μs (3.46% slower)


def test_match_with_notes():
    # Exception with __notes__ attribute
    exc = CustomErrorWithNotes("msg", notes=["note1", "note2"])
    group = BaseExceptionGroup("eg", [exc])
    codeflash_output = dummy_excinfo._group_contains(
        group, CustomErrorWithNotes, "note2"
    )  # 4.28μs -> 4.14μs (3.34% faster)


def test_match_with_notes_absent():
    exc = CustomErrorWithNotes("msg", notes=["note1", "note2"])
    group = BaseExceptionGroup("eg", [exc])
    codeflash_output = dummy_excinfo._group_contains(
        group, CustomErrorWithNotes, "notfound"
    )  # 3.97μs -> 3.86μs (2.90% faster)


def test_group_contains_multiple_matches():
    # Multiple matching exceptions, should return True on first match
    group = BaseExceptionGroup("eg", [MyError("a"), MyError("b")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None
    )  # 1.18μs -> 1.23μs (4.63% slower)


def test_group_contains_match_with_tuple_and_regex():
    group = BaseExceptionGroup("eg", [YetAnotherError("abc123xyz")])
    codeflash_output = dummy_excinfo._group_contains(
        group, (MyError, YetAnotherError), r"\d{3}"
    )  # 4.72μs -> 4.46μs (5.85% faster)


def test_group_contains_wrong_type_and_match():
    group = BaseExceptionGroup("eg", [AnotherError("expected")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, "expected"
    )  # 1.18μs -> 1.30μs (8.80% slower)


def test_group_contains_type_but_not_match():
    group = BaseExceptionGroup("eg", [MyError("not matching")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, "expected"
    )  # 3.75μs -> 3.29μs (13.7% faster)


def test_group_contains_match_with_empty_string():
    # Empty string should match any string (re.search('', ...) always matches)
    group = BaseExceptionGroup("eg", [MyError("anything")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, ""
    )  # 3.90μs -> 3.60μs (8.37% faster)


def test_group_contains_match_with_none_and_empty_message():
    group = BaseExceptionGroup("eg", [MyError("")])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None
    )  # 1.27μs -> 1.27μs (0.237% faster)


def test_group_contains_with_non_exception_in_group():
    # A group cannot hold non-exception members: the constructor raises
    # ValueError, so _group_contains never sees such a group
    import pytest

    with pytest.raises(ValueError):
        BaseExceptionGroup("eg", [MyError("a"), 42])  # 42 is not an exception


# ----------------------
# 3. Large Scale Test Cases
# ----------------------


def test_large_flat_group_contains_type():
    # Large flat group, matching type at the end
    group = BaseExceptionGroup(
        "eg", [AnotherError(f"err{i}") for i in range(998)] + [MyError("found")]
    )
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None
    )  # 58.2μs -> 55.3μs (5.20% faster)


def test_large_flat_group_no_match():
    # Large group, no matching type
    group = BaseExceptionGroup("eg", [AnotherError(f"err{i}") for i in range(1000)])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None
    )  # 58.0μs -> 55.4μs (4.64% faster)


def test_large_nested_group_match_deep():
    # Large nested group, match at deepest level
    # Groups are immutable (exceptions is a tuple and cannot be empty),
    # so build the 10 levels from the inside out
    group = BaseExceptionGroup("g9", [MyError("deepest")])
    for i in range(8, -1, -1):  # wrap in g8 .. g0
        group = BaseExceptionGroup(f"g{i}", [group])
    codeflash_output = dummy_excinfo._group_contains(group, MyError, None)


def test_large_nested_group_no_match():
    # Large nested group, no matching error
    # Groups are immutable (exceptions is a tuple and cannot be empty),
    # so build the 10 levels from the inside out
    group = BaseExceptionGroup("g9", [AnotherError("not me")])
    for i in range(8, -1, -1):  # wrap in g8 .. g0
        group = BaseExceptionGroup(f"g{i}", [group])
    codeflash_output = dummy_excinfo._group_contains(group, MyError, None)


def test_large_group_with_target_depth():
    # Large group, match only at specific depth
    # Build a tree: depth 3, breadth 10
    level3 = [BaseExceptionGroup(f"g3-{i}", [MyError(f"match-{i}")]) for i in range(10)]
    level2 = [BaseExceptionGroup(f"g2-{i}", [g]) for i, g in enumerate(level3)]
    level1 = [BaseExceptionGroup(f"g1-{i}", [g]) for i, g in enumerate(level2)]
    group = BaseExceptionGroup("root", level1)
    # Should only match at depth 3
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None, target_depth=3
    )  # 7.97μs -> 8.89μs (10.3% slower)
    # Should not match at depth 2
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None, target_depth=2
    )  # 3.90μs -> 4.41μs (11.5% slower)


def test_large_group_multiple_matches():
    # Large group with many matches
    group = BaseExceptionGroup(
        "eg",
        [MyError(f"match-{i}") for i in range(500)]
        + [AnotherError("nope") for _ in range(500)],
    )
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, None
    )  # 1.43μs -> 1.35μs (5.39% faster)


def test_large_group_match_regex():
    # Large group, match by regex
    group = BaseExceptionGroup(
        "eg", [MyError(f"foo{i}") for i in range(999)] + [MyError("bar999")]
    )
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, r"bar\d+"
    )  # 494μs -> 408μs (21.3% faster)


def test_large_group_no_match_regex():
    group = BaseExceptionGroup("eg", [MyError(f"foo{i}") for i in range(1000)])
    codeflash_output = dummy_excinfo._group_contains(
        group, MyError, r"bar\d+"
    )  # 490μs -> 408μs (20.0% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
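
The output comparison mentioned in the comment above can be illustrated with a small harness. This is a hypothetical sketch, not Codeflash's actual mechanism: `outcome`, `outputs_match`, `search_direct`, and `search_compiled` are names invented for the example, and failures are compared by exception type only.

```python
import re


def outcome(func, args):
    """Run func(*args) and capture either its result or its exception type."""
    try:
        return ("ok", func(*args))
    except Exception as exc:  # compare failures by type, not by message
        return ("raised", type(exc))


def outputs_match(original, optimized, cases):
    """Return True only if both callables behave identically on every case."""
    return all(outcome(original, args) == outcome(optimized, args) for args in cases)


def search_direct(pattern, text):
    # Stand-in for the "original" code path.
    return re.search(pattern, text) is not None


def search_compiled(pattern, text):
    # Stand-in for the "optimized" code path.
    return re.compile(pattern).search(text) is not None


cases = [(r"\d{3}", "foo123bar"), (r"\d+", "no digits here"), ("(", "anything")]
print(outputs_match(search_direct, search_compiled, cases))  # True
```

The last case is an invalid pattern, so both functions raise the same error type and still count as matching.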

To edit these changes git checkout codeflash/optimize-ExceptionInfo._group_contains-mi9qvk3u and push.

@codeflash-ai bot requested a review from mashraf-222 November 22, 2025 03:43
@codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Nov 22, 2025