@codeflash-ai codeflash-ai bot commented Nov 22, 2025

📄 6% (0.06x) speedup for ApproxMapping._repr_compare in src/_pytest/python_api.py

⏱️ Runtime : 11.4 milliseconds → 10.7 milliseconds (best of 155 runs)

📝 Explanation and details

The optimized code achieves a 6% speedup through several micro-optimizations that reduce function-call and attribute-lookup overhead:

Key Optimizations:

  1. Eliminated redundant list conversion in _compare_approx: Added type check to avoid converting message_data to list when it's already a list, saving unnecessary allocation.

  2. Optimized column width calculation: Replaced expensive max() function calls with simple comparison branches (if l0 > max0), reducing function call overhead in the tight loop that processes message formatting.

  3. Cached method references: Stored explanation.append as a local variable and pre-formatted the string template to avoid repeated attribute lookups during message formatting.

  4. Faster type checking: Replaced isinstance(x, Decimal) with type(x) is Decimal, an exact-type identity check that avoids the isinstance() call. Unlike isinstance, this check does not match subclasses of Decimal.

  5. Reduced attribute lookups in hot path: In _repr_compare, cached self.rel, self.abs, self.nan_ok as local variables and used direct __setitem__ calls instead of dictionary assignment.

  6. Optimized difference calculations: Eliminated nested max() calls by using temporary variables and direct comparisons for max_abs_diff and max_rel_diff updates.

  7. Streamlined message data construction: Used method references (__getitem__) to avoid repeated attribute lookups when building the final message data.
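
Optimizations 2 and 3 can be illustrated with a minimal sketch. This is not the code from `src/_pytest/python_api.py`: the function names, the `Row` alias, and the three-column row layout are assumptions chosen for illustration. It only shows the general pattern of replacing per-cell `max()` calls with comparison branches and caching a bound `append` method before a loop.

```python
from typing import List, Tuple

# Hypothetical row layout: (index, obtained, expected), all pre-rendered strings.
Row = Tuple[str, str, str]


def widths_with_max(rows: List[Row]) -> List[int]:
    """Baseline: one max() call per cell."""
    widths = [0, 0, 0]
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))
    return widths


def widths_with_branches(rows: List[Row]) -> List[int]:
    """Variant: plain comparisons instead of max() calls."""
    max0 = max1 = max2 = 0
    for c0, c1, c2 in rows:
        l0, l1, l2 = len(c0), len(c1), len(c2)
        if l0 > max0:
            max0 = l0
        if l1 > max1:
            max1 = l1
        if l2 > max2:
            max2 = l2
    return [max0, max1, max2]


def format_rows(rows: List[Row]) -> List[str]:
    """Build padded lines, looking up the append method only once."""
    w0, w1, w2 = widths_with_branches(rows)
    explanation: List[str] = []
    append = explanation.append  # cached bound method
    # Format string built once, e.g. "{:<8} | {:<3} | {:<12}"
    template = f"{{:<{w0}}} | {{:<{w1}}} | {{:<{w2}}}"
    for row in rows:
        append(template.format(*row))
    return explanation
```

Both width functions return the same result for any input; the variant only trades readability for fewer function calls, which is why the benefit shows up mainly when there are many rows to format.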

Performance Impact:
The gains are largest for big mappings with many mismatches: the test cases with 300-1000 elements and high mismatch rates run roughly 9-11% faster (10.7%, 9.35%, and 10.2%). Small mappings, and large mappings with few or no mismatches, stay within about ±5% of the original and are sometimes slightly slower, likely due to the overhead of the additional type checks. The optimizations pay off mainly when formatting and string operations on large comparison results dominate the runtime.
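
To get a feel for why small inputs show noise while large inputs show a clearer difference, a micro-benchmark along these lines can help. It is a sketch under assumptions: the two `running_max_*` functions and the `best_of` helper are hypothetical stand-ins, not the pytest code, and this is not how Codeflash produced the timings in this report. It uses only `timeit.repeat` from the standard library and takes the minimum over several batches.

```python
import timeit
from typing import Callable, List


def running_max_builtin(values: List[float]) -> float:
    """Track the maximum with a max() call per element."""
    best = float("-inf")
    for v in values:
        best = max(best, v)
    return best


def running_max_branch(values: List[float]) -> float:
    """Track the maximum with a plain comparison per element."""
    best = float("-inf")
    for v in values:
        if v > best:
            best = v
    return best


def best_of(
    func: Callable[[List[float]], float],
    values: List[float],
    repeat: int = 5,
    number: int = 200,
) -> float:
    """Return the fastest per-call time in seconds across `repeat` batches."""
    timings = timeit.repeat(lambda: func(values), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    for size in (3, 1000):
        data = [float(i % 17) for i in range(size)]
        t_builtin = best_of(running_max_builtin, data)
        t_branch = best_of(running_max_branch, data)
        print(f"n={size}: max() {t_builtin:.2e}s, branch {t_branch:.2e}s")
```

Absolute numbers depend on the interpreter and machine, so differences of a few percent on microsecond-scale calls should be read with caution.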

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 5144 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from decimal import Decimal

import pytest

from _pytest.python_api import ApproxMapping


# ---- UNIT TESTS FOR ApproxMapping._repr_compare ----

class DummyScalar:
    """A dummy scalar type to test type robustness."""

    def __init__(self, value):
        self.value = value

    def __float__(self):
        return float(self.value)

    def __eq__(self, other):
        return float(self) == float(other)

    def __sub__(self, other):
        return float(self) - float(other)

    def __repr__(self):
        return f"DummyScalar({self.value})"


def make_amap(expected, rel=None, abs=None, nan_ok=False):
    """Helper to construct an ApproxMapping with the right numeric types."""
    return ApproxMapping(expected, rel=rel, abs=abs, nan_ok=nan_ok)


# ---- BASIC TEST CASES ----


def test_identical_simple_dicts():
    """Test with two identical simple dicts: should yield no mismatches."""
    amap = make_amap({"a": 1.0, "b": 2.0})
    other = {"a": 1.0, "b": 2.0}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 13.4μs -> 13.6μs (1.45% slower)
    # No further lines after header except column header


def test_one_difference():
    """Test with one value differing."""
    amap = make_amap({"x": 1.0, "y": 2.0})
    other = {"x": 1.0, "y": 3.0}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 26.7μs -> 27.1μs (1.39% slower)


def test_multiple_differences():
    """Test with all values differing."""
    amap = make_amap({"a": 1.0, "b": 2.0})
    other = {"a": 2.0, "b": 4.0}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 29.4μs -> 28.1μs (4.74% faster)


def test_float_precision():
    """Test with values differing by a small float precision."""
    amap = make_amap({"p": 1.000001})
    other = {"p": 1.0}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 14.8μs -> 15.1μs (1.73% slower)


def test_decimal_values():
    """Test with Decimal values."""
    amap = make_amap({"d": Decimal("1.1")})
    other = {"d": Decimal("1.0")}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 32.2μs -> 32.1μs (0.461% faster)


# ---- EDGE TEST CASES ----


def test_empty_dicts():
    """Test with empty dicts."""
    amap = make_amap({})
    other = {}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 7.85μs -> 8.18μs (4.11% slower)


def test_all_nan():
    """Test with NaN values, nan_ok=False (default)."""
    amap = make_amap({"a": float("nan")})
    other = {"a": float("nan")}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 20.9μs -> 20.9μs (0.220% faster)


def test_inf_values():
    """Test with inf and -inf values."""
    amap = make_amap({"a": float("inf"), "b": float("-inf")})
    other = {"a": float("inf"), "b": float("inf")}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 19.0μs -> 18.7μs (1.42% faster)


def test_zero_expected_value():
    """Test with expected value of zero (tests division by zero for rel diff)."""
    amap = make_amap({"a": 0.0})
    other = {"a": 1.0}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 23.1μs -> 23.2μs (0.319% slower)


def test_unordered_keys():
    """Test with dicts having same keys in different order."""
    amap = make_amap({"x": 1.0, "y": 2.0, "z": 3.0})
    other = {"z": 3.0, "y": 2.0, "x": 1.0}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 31.2μs -> 30.5μs (2.36% faster)


def test_non_numeric_values():
    """Test with non-numeric values: should raise TypeError when constructing ApproxMapping."""
    with pytest.raises(TypeError):
        make_amap({"a": "foo", "b": 2.0})


def test_extra_keys_in_other_side():
    """Test with other_side having extra keys: should only compare matching keys by order."""
    amap = make_amap({"a": 1.0, "b": 2.0})
    other = {"a": 1.0, "b": 2.0, "c": 3.0}
    # Only first two keys compared
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 16.9μs -> 17.2μs (1.39% slower)


def test_missing_keys_in_other_side():
    """Test with other_side missing keys: should only compare up to length of expected."""
    amap = make_amap({"a": 1.0, "b": 2.0, "c": 3.0})
    other = {"a": 1.0, "b": 2.0}
    # Only first two keys compared, third is ignored
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 14.4μs -> 15.1μs (4.44% slower)


def test_nonstandard_scalar_type():
    """Test with a custom scalar type that implements numeric protocol."""
    amap = make_amap({"a": DummyScalar(1.0)})
    other = {"a": 1.0}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 12.8μs -> 12.9μs (0.713% slower)


# ---- LARGE SCALE TEST CASES ----


def test_large_identical_dict():
    """Test with a large dict of identical values."""
    N = 500
    amap = make_amap({i: float(i) for i in range(N)})
    other = {i: float(i) for i in range(N)}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 498μs -> 506μs (1.50% slower)


def test_large_all_different():
    """Test with a large dict where all values differ."""
    N = 300
    amap = make_amap({i: float(i) for i in range(N)})
    other = {i: float(i + 1) for i in range(N)}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 1.45ms -> 1.31ms (10.7% faster)


def test_large_half_different():
    """Test with a large dict where half the values differ."""
    N = 400
    amap = make_amap({i: float(i) for i in range(N)})
    other = {i: (float(i) if i % 2 == 0 else float(i + 1)) for i in range(N)}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 1.20ms -> 1.10ms (9.35% faster)


def test_large_sparse_mismatch():
    """Test with a large dict where only a few values differ."""
    N = 500
    amap = make_amap({i: float(i) for i in range(N)})
    other = {
        i: (float(i) if i != 123 and i != 456 else float(i + 10)) for i in range(N)
    }
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 515μs -> 520μs (0.984% slower)


def test_large_dict_zero_expected():
    """Test large dict with some expected values zero, to check rel diff inf handling."""
    N = 100
    amap = make_amap({i: 0.0 if i % 10 == 0 else float(i) for i in range(N)})
    other = {i: 1.0 if i % 10 == 0 else float(i) for i in range(N)}
    codeflash_output = amap._repr_compare(other)
    result = codeflash_output  # 158μs -> 154μs (2.48% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from decimal import Decimal

# imports
from _pytest.python_api import ApproxMapping


# -------------------------
# Unit tests for ApproxMapping._repr_compare
# -------------------------

# ----------- BASIC TEST CASES ------------


def test_basic_all_elements_match():
    """Test when all elements are equal (should return header only, no mismatches)."""
    expected = {"a": 1.0, "b": 2.0}
    actual = {"a": 1.0, "b": 2.0}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 13.4μs -> 13.4μs (0.283% slower)


def test_basic_one_mismatch():
    """Test with one mismatched element."""
    expected = {"x": 10.0, "y": 20.0}
    actual = {"x": 10.0, "y": 21.0}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 25.9μs -> 25.2μs (2.60% faster)


def test_basic_multiple_mismatches():
    """Test with multiple mismatched elements."""
    expected = {"a": 1.0, "b": 2.0, "c": 3.0}
    actual = {"a": 1.1, "b": 2.1, "c": 3.0}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 32.9μs -> 32.1μs (2.33% faster)


# ----------- EDGE TEST CASES ------------


def test_edge_empty_mapping():
    """Test with empty mappings."""
    expected = {}
    actual = {}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 7.79μs -> 7.92μs (1.63% slower)


def test_edge_all_mismatches():
    """Test where all elements mismatch."""
    expected = {"a": 1.0, "b": 2.0}
    actual = {"a": 2.0, "b": 3.0}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 30.2μs -> 29.0μs (4.02% faster)


def test_edge_zero_expected_value():
    """Test with expected value zero, which should set max_rel_diff to infinity."""
    expected = {"zero": 0.0}
    actual = {"zero": 1.0}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 21.2μs -> 21.5μs (1.64% slower)


def test_edge_nan_values_nan_ok_false():
    """Test with NaN values and nan_ok=False (should report mismatch)."""
    expected = {"x": float("nan")}
    actual = {"x": float("nan")}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 20.2μs -> 19.7μs (2.88% faster)


def test_edge_nan_values_nan_ok_true():
    """Test with NaN values and nan_ok=True (should NOT report mismatch)."""
    expected = {"x": float("nan")}
    actual = {"x": float("nan")}
    approx = ApproxMapping(expected, nan_ok=True)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 13.0μs -> 13.3μs (2.21% slower)


def test_edge_decimal_values():
    """Test with Decimal values."""
    expected = {"d": Decimal("1.000000")}
    actual = {"d": Decimal("1.000001")}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 21.3μs -> 21.8μs (2.24% slower)


def test_edge_keys_ordering():
    """Test that ordering of keys does not affect the result."""
    expected = {"x": 1.0, "y": 2.0, "z": 3.0}
    actual = {"z": 3.0, "y": 2.0, "x": 1.1}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 33.0μs -> 32.3μs (2.21% faster)


def test_large_scale_all_match():
    """Test with large mapping where all elements match."""
    expected = {i: float(i) for i in range(1000)}
    actual = {i: float(i) for i in range(1000)}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 991μs -> 1.01ms (1.83% slower)


def test_large_scale_single_mismatch():
    """Test with large mapping where only one element mismatches."""
    expected = {i: float(i) for i in range(1000)}
    actual = {i: float(i) for i in range(1000)}
    # Introduce a mismatch at index 500
    actual[500] = 12345.0
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 1.00ms -> 1.02ms (1.05% slower)


def test_large_scale_many_mismatches():
    """Test with large mapping where many elements mismatch."""
    expected = {i: float(i) for i in range(1000)}
    actual = {i: float(i + 1) for i in range(1000)}
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 4.76ms -> 4.31ms (10.2% faster)


def test_large_scale_decimal():
    """Test large mapping with Decimal values."""
    expected = {i: Decimal(str(i)) for i in range(100)}
    actual = {i: Decimal(str(i)) for i in range(100)}
    # Introduce mismatch
    actual[50] = Decimal("9999")
    approx = ApproxMapping(expected)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 149μs -> 150μs (0.416% slower)


def test_large_scale_nan_ok_true():
    """Test large mapping where all values are NaN and nan_ok=True."""
    expected = {i: float("nan") for i in range(100)}
    actual = {i: float("nan") for i in range(100)}
    approx = ApproxMapping(expected, nan_ok=True)
    codeflash_output = approx._repr_compare(actual)
    result = codeflash_output  # 154μs -> 154μs (0.023% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, `git checkout codeflash/optimize-ApproxMapping._repr_compare-mi9t0d69` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 22, 2025 04:43
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Nov 22, 2025