@codeflash-ai codeflash-ai bot commented Nov 22, 2025

📄 10% (0.10x) speedup for FormattedExcinfo.get_source in src/_pytest/_code/code.py

⏱️ Runtime : 325 microseconds → 296 microseconds (best of 69 runs)
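
For readers checking the arithmetic, the headline figure appears consistent with measuring the time saved against the optimized runtime. The sketch below is an inference from the numbers reported in this PR, not a formula taken from Codeflash documentation, and `speedup` is a hypothetical helper name.

```python
def speedup(original_us: float, optimized_us: float) -> float:
    """Relative speedup measured against the optimized runtime.

    Assumption: this definition is inferred from the figures in the report.
    A negative result means the optimized code was slower.
    """
    return original_us / optimized_us - 1.0


# Headline numbers from the report: 325 µs before, 296 µs after.
print(f"{speedup(325, 296):.1%}")  # prints 9.8%, reported as 10%

# One of the slower small cases: 2.12 µs before, 2.58 µs after.
print(f"{speedup(2.12, 2.58):.1%}")  # prints -17.8%, reported as 17.8% slower
```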

📝 Explanation and details

The optimized code achieves a 10% speedup by reducing redundant computations and replacing inefficient for-loops with faster list comprehensions and conditional checks.

Key optimizations applied:

  1. Avoided repeated len() calls: The original code called len(source) and len(source.lines) multiple times. The optimized version caches these values as source_len and reuses source.lines as source_lines, eliminating redundant attribute lookups and length calculations.

  2. Replaced for-loops with list comprehensions: The original code used explicit for-loops to append lines with prefixes:

    for line in source.lines[:line_index]:
        lines.append(space_prefix + line)

    The optimized version uses list comprehensions with extend():

    lines.extend([space_prefix + line for line in source_lines[:line_index]])

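    This is typically faster because a list comprehension appends with a dedicated bytecode instruction instead of looking up and calling lines.append on every iteration, and extend() then copies the finished list in a single call.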

  3. Added conditional checks to avoid unnecessary iterations: The optimized code adds if line_index > 0: and if line_index + 1 < source_len: checks to skip empty slice iterations, avoiding the overhead of creating and iterating over empty lists.
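
The three changes above can be sketched side by side. This is a minimal, standalone illustration rather than the code from the PR: `format_original` and `format_optimized` are hypothetical helper names, the `">"` marker and four-space prefix are taken from the expected outputs in the generated tests, and the bounds handling, `short` mode, and `excinfo` handling of the real `get_source` are left out.

```python
from typing import List

SPACE_PREFIX = "    "
MARKER_PREFIX = ">   "  # flow marker ">" plus three spaces, as in the test expectations


def format_original(source_lines: List[str], line_index: int) -> List[str]:
    """Loop-and-append version, mirroring the pattern quoted from the original."""
    lines: List[str] = []
    for line in source_lines[:line_index]:
        lines.append(SPACE_PREFIX + line)
    lines.append(MARKER_PREFIX + source_lines[line_index])
    for line in source_lines[line_index + 1:]:
        lines.append(SPACE_PREFIX + line)
    return lines


def format_optimized(source_lines: List[str], line_index: int) -> List[str]:
    """Sketch of the described changes; assumes 0 <= line_index < len(source_lines)."""
    source_len = len(source_lines)  # length computed once and reused
    lines: List[str] = []
    if line_index > 0:  # skip the empty leading slice
        lines.extend([SPACE_PREFIX + line for line in source_lines[:line_index]])
    lines.append(MARKER_PREFIX + source_lines[line_index])
    if line_index + 1 < source_len:  # skip the empty trailing slice
        lines.extend([SPACE_PREFIX + line for line in source_lines[line_index + 1:]])
    return lines


# Both versions should agree for every valid index.
sample = ["a = 1", "b = 2", "c = 3"]
for index in range(len(sample)):
    assert format_original(sample, index) == format_optimized(sample, index)
print(format_optimized(sample, 1))
```

The guards only skip work: an empty slice would contribute nothing anyway, so the output is unchanged. They do add a comparison per call, which may account for part of the slowdown on the very small inputs in the test results.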

Performance impact by test case:

  • Large-scale tests show the biggest gains (15-23% faster): These benefit most from the loop optimizations since they process many lines
  • Out-of-bounds tests improve (5-20% faster): Benefit from reduced redundant computations
  • Some small cases see modest gains (1-10% faster): There is less iteration overhead to remove
  • Many small inputs are slower (4-18%): Most two-line sources and the short-mode calls regress, likely because the added conditional checks and temporary lists cost more than they save at that size

The optimizations are particularly effective for larger source files where the reduced loop overhead and cached values provide substantial benefits, while maintaining identical functionality for all input scenarios.
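
To see how the trade-off depends on input size, a micro-benchmark along the following lines can be run locally. It is a sketch under stated assumptions: `with_loop`, `with_comprehension`, and `best_time` are hypothetical names, only the prefixing step is measured rather than `get_source` itself, and the timings will vary by machine and Python version, so no expected figures are given.

```python
import timeit

SPACE_PREFIX = "    "


def with_loop(source_lines):
    """Prefix each line using an explicit loop and append()."""
    lines = []
    for line in source_lines:
        lines.append(SPACE_PREFIX + line)
    return lines


def with_comprehension(source_lines):
    """Prefix each line using a list comprehension passed to extend()."""
    lines = []
    lines.extend([SPACE_PREFIX + line for line in source_lines])
    return lines


def best_time(func, source_lines, repeat=5, number=200):
    """Return the best per-call time in seconds over `repeat` timing runs."""
    timings = timeit.repeat(lambda: func(source_lines), repeat=repeat, number=number)
    return min(timings) / number


# Compare a two-line input with a 1000-line input, as in the generated tests.
for size in (2, 1000):
    data = [f"line {i}" for i in range(size)]
    assert with_loop(data) == with_comprehension(data)
    loop_t = best_time(with_loop, data)
    comp_t = best_time(with_comprehension, data)
    print(f"{size:>5} lines: loop {loop_t * 1e6:.2f} µs, comprehension {comp_t * 1e6:.2f} µs")
```

Taking the minimum over several repeats follows the same "best of N runs" convention used for the runtime reported at the top of this PR.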

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 22 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from typing import Literal

from _pytest._code.code import FormattedExcinfo

# imports
import pytest  # used for our unit tests


# Minimal stub for _pytest._code.Source
class Source:
    def __init__(self, source: str):
        # Split into lines, preserve empty lines
        self.lines = source.splitlines() or [""]

    def __len__(self):
        return len(self.lines)


# Minimal stub for ExceptionInfo
class ExceptionInfo:
    def __init__(self, exc: Exception):
        self._exc = exc

    @property
    def type(self):
        return type(self._exc)

    @property
    def value(self):
        return self._exc

    @property
    def tb(self):
        return None


_TracebackStyle = Literal["long", "short", "line", "no", "native", "value", "auto"]

# ------------------- UNIT TESTS -------------------


@pytest.fixture
def excinfo_fixture():
    # Fixture for ExceptionInfo
    return ExceptionInfo(ValueError("bad value"))


@pytest.mark.parametrize(
    "source_text, line_index, expected",
    [
        # Basic: Single line, default line_index (-1)
        ("print('hello')", -1, [">   print('hello')"]),
        # Basic: Multi-line, default line_index (-1)
        ("a = 1\nb = 2\nc = 3", -1, ["    a = 1", "    b = 2", ">   c = 3"]),
        # Basic: Multi-line, explicit line_index
        ("a = 1\nb = 2\nc = 3", 1, ["    a = 1", ">   b = 2", "    c = 3"]),
        # Basic: Empty string source
        ("", -1, [">   "]),
        # Basic: Single line, explicit line_index
        ("x = 42", 0, [">   x = 42"]),
    ],
)
def test_get_source_basic(source_text, line_index, expected):
    """Test basic scenarios with single and multi-line input."""
    fei = FormattedExcinfo()
    src = Source(source_text)
    codeflash_output = fei.get_source(src, line_index)
    result = codeflash_output  # 12.9μs -> 11.7μs (10.4% faster)
    # Check the parametrized expectation, which was otherwise unused
    assert result == expected


def test_get_source_short_flag():
    """Test short flag formatting."""
    fei = FormattedExcinfo()
    src = Source("a = 1\nb = 2\nc = 3")
    # Should only return the marked line, stripped and prefixed
    codeflash_output = fei.get_source(src, 1, short=True)
    result = codeflash_output  # 1.79μs -> 1.88μs (4.94% slower)


def test_get_source_none_source():
    """Test when source is None."""
    fei = FormattedExcinfo()
    # Should return fallback "???" source
    codeflash_output = fei.get_source(None, 0)
    result = codeflash_output  # 7.55μs -> 7.25μs (4.14% faster)


def test_get_source_out_of_bounds_high():
    """Test when line_index is too high."""
    fei = FormattedExcinfo()
    src = Source("x = 1\ny = 2")
    # line_index = 10 is out of bounds, fallback to "???"
    codeflash_output = fei.get_source(src, 10)
    result = codeflash_output  # 7.11μs -> 6.42μs (10.6% faster)


def test_get_source_out_of_bounds_negative():
    """Test when line_index is negative and source is None."""
    fei = FormattedExcinfo()
    codeflash_output = fei.get_source(None, -5)
    result = codeflash_output  # 6.33μs -> 6.02μs (5.18% faster)


def test_get_source_out_of_bounds_negative_with_source():
    """Test negative line_index less than -len(source)."""
    fei = FormattedExcinfo()
    src = Source("a\nb\nc")
    # line_index = -5, should fallback to "???"
    codeflash_output = fei.get_source(src, -5)
    result = codeflash_output  # 7.51μs -> 6.26μs (19.8% faster)


def test_get_source_empty_lines():
    """Test source with empty lines."""
    fei = FormattedExcinfo()
    src = Source("\n\n")
    codeflash_output = fei.get_source(src, -1)
    result = codeflash_output  # 2.75μs -> 2.63μs (4.45% faster)


def test_get_source_with_excinfo(excinfo_fixture):
    """Test appending exception info."""
    fei = FormattedExcinfo()
    src = Source("x = 1\ny = 2")
    codeflash_output = fei.get_source(src, 1, excinfo_fixture)
    result = codeflash_output


def test_get_source_with_excinfo_short(excinfo_fixture):
    """Test short mode with excinfo."""
    fei = FormattedExcinfo()
    src = Source("x = 1\ny = 2")
    codeflash_output = fei.get_source(src, 0, excinfo_fixture, short=True)
    result = codeflash_output


def test_get_source_large_scale():
    """Test with a large source (1000 lines)."""
    fei = FormattedExcinfo()
    lines = [f"line {i}" for i in range(1000)]
    src = Source("\n".join(lines))
    # Mark the last line
    codeflash_output = fei.get_source(src, -1)
    result = codeflash_output  # 47.2μs -> 38.4μs (22.9% faster)


def test_get_source_large_scale_middle():
    """Test large source, marking a middle line."""
    fei = FormattedExcinfo()
    lines = [f"line {i}" for i in range(1000)]
    src = Source("\n".join(lines))
    mid = 500
    codeflash_output = fei.get_source(src, mid)
    result = codeflash_output  # 44.9μs -> 38.8μs (15.7% faster)


def test_get_source_large_scale_short():
    """Test large source, short mode."""
    fei = FormattedExcinfo()
    lines = [f"line {i}" for i in range(1000)]
    src = Source("\n".join(lines))
    codeflash_output = fei.get_source(src, 999, short=True)
    result = codeflash_output  # 1.91μs -> 2.00μs (4.49% slower)


def test_get_source_unicode():
    """Test with unicode characters in source."""
    fei = FormattedExcinfo()
    src = Source("α = 1\nβ = 2\nγ = 3")
    codeflash_output = fei.get_source(src, 2)
    result = codeflash_output  # 2.77μs -> 3.15μs (11.9% slower)


def test_get_source_tab_indentation():
    """Test with tab-indented source."""
    fei = FormattedExcinfo()
    src = Source("\ta = 1\n\tb = 2")
    codeflash_output = fei.get_source(src, 1)
    result = codeflash_output  # 2.22μs -> 2.68μs (17.3% slower)


def test_get_source_trailing_newline():
    """Test source with trailing newline."""
    fei = FormattedExcinfo()
    src = Source("a = 1\nb = 2\n")
    codeflash_output = fei.get_source(src, -1)
    result = codeflash_output  # 2.87μs -> 2.61μs (9.81% faster)


def test_get_source_all_empty():
    """Test with all empty lines."""
    fei = FormattedExcinfo()
    src = Source("\n\n\n")
    codeflash_output = fei.get_source(src, -1)
    result = codeflash_output  # 2.77μs -> 2.74μs (1.06% faster)


def test_get_source_marker_constants():
    """Test that marker constants are used correctly."""
    fei = FormattedExcinfo()
    src = Source("foo\nbar")
    codeflash_output = fei.get_source(src, 0)
    result = codeflash_output  # 2.35μs -> 2.61μs (10.0% slower)


def test_get_source_chain_flag():
    """Test that chain flag does not affect get_source output."""
    fei = FormattedExcinfo(chain=False)
    src = Source("foo\nbar")
    codeflash_output = fei.get_source(src, 1)
    result = codeflash_output  # 2.24μs -> 2.59μs (13.6% slower)


def test_get_source_abspath_flag():
    """Test that abspath flag does not affect get_source output."""
    fei = FormattedExcinfo(abspath=False)
    src = Source("foo\nbar")
    codeflash_output = fei.get_source(src, 0)
    result = codeflash_output  # 2.37μs -> 2.64μs (10.2% slower)


def test_get_source_funcargs_flag():
    """Test that funcargs flag does not affect get_source output."""
    fei = FormattedExcinfo(funcargs=True)
    src = Source("foo\nbar")
    codeflash_output = fei.get_source(src, 1)
    result = codeflash_output  # 2.12μs -> 2.58μs (17.8% slower)


def test_get_source_truncate_locals_flag():
    """Test that truncate_locals flag does not affect get_source output."""
    fei = FormattedExcinfo(truncate_locals=False)
    src = Source("foo\nbar")
    codeflash_output = fei.get_source(src, 1)
    result = codeflash_output  # 2.19μs -> 2.59μs (15.3% slower)


def test_get_source_tbfilter_flag():
    """Test that tbfilter flag does not affect get_source output."""
    fei = FormattedExcinfo(tbfilter=False)
    src = Source("foo\nbar")
    codeflash_output = fei.get_source(src, 0)
    result = codeflash_output  # 2.22μs -> 2.59μs (14.2% slower)


def test_get_source_style_flag():
    """Test that style flag does not affect get_source output."""
    fei = FormattedExcinfo(style="short")
    src = Source("foo\nbar")
    codeflash_output = fei.get_source(src, 1)
    result = codeflash_output  # 2.23μs -> 2.52μs (11.4% slower)


def test_get_source_showlocals_flag():
    """Test that showlocals flag does not affect get_source output."""
    fei = FormattedExcinfo(showlocals=True)
    src = Source("foo\nbar")
    codeflash_output = fei.get_source(src, 1)
    result = codeflash_output  # 2.27μs -> 2.55μs (11.3% slower)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-FormattedExcinfo.get_source-mi9r5fom, commit your edits, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 22, 2025 03:51
@codeflash-ai codeflash-ai bot added the labels "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: Medium" (Optimization Quality according to Codeflash) Nov 22, 2025