⚡️ Speed up function create_rename_keys by 119%
#395
+98
−73
📄 119% (1.19x) speedup for `create_rename_keys` in `src/transformers/models/deprecated/deta/convert_deta_swin_to_pytorch.py`

⏱️ Runtime: 8.31 milliseconds → 3.79 milliseconds (best of 250 runs)

📝 Explanation and details
The optimized code achieves a 119% speedup by eliminating redundant list operations and reducing attribute lookups in nested loops.
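As a quick sanity check on the headline number, the snippet below recomputes the speedup from the two reported runtimes. It assumes the percentage is defined as (old − new) / new, which is consistent with the figures quoted here but is not stated explicitly in this PR.

```python
# Runtimes reported in this PR, in milliseconds.
original_ms = 8.31
optimized_ms = 3.79

# Assumed definition: time saved relative to the optimized runtime.
speedup = (original_ms - optimized_ms) / optimized_ms

# Plain ratio of old runtime to new runtime, for comparison.
ratio = original_ms / optimized_ms

print(f"speedup: {speedup:.0%}, runtime ratio: {ratio:.2f}x")
```

Under that definition, a 119% speedup means the optimized function runs in a bit under half the original time.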
Key optimizations applied:
- Batch operations with `extend()` instead of individual `append()` calls: The original code made thousands of individual `list.append()` calls, each paying method-call overhead and occasionally triggering a list resize. The optimized version groups related keys into lists and adds each group with a single `extend()` call, so the number of list-method calls drops from one per key to one per group. The total work is still linear in the number of keys; only the per-call overhead shrinks.
- Cached attribute lookups: The original code repeatedly accessed `config.backbone_config.depths[i]` and `config.encoder_layers` / `config.decoder_layers` within loops. The optimized version caches these values (`depths = config.backbone_config.depths`, `depth_i = depths[i]`, etc.) to eliminate redundant attribute lookups.
- String prefix caching: In tight loops that generate many f-strings with the same prefixes, the optimized code pre-calculates common string prefixes (`src_prefix`, `tgt_prefix`) and reuses them, reducing string formatting overhead.

Why this leads to speedup:
- Individual `append()` calls each carry method-call overhead and a capacity check, plus an occasional resize
- Attribute lookups (`obj.attr.subattr`) traverse the object hierarchy each time

Performance characteristics from test results:
The optimization is particularly effective for this function because it processes hundreds to thousands of key mappings in nested loops, making the reduction in per-operation overhead highly impactful.
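The three techniques described above can be sketched on a toy example. This is not the actual diff from this PR: the key names, the `SUFFIXES` tuple, and the `SimpleNamespace` config are hypothetical stand-ins, and only the `config.backbone_config.depths` access pattern and the `depths`, `depth_i`, `src_prefix`, and `tgt_prefix` names follow the description.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the real model config.
config = SimpleNamespace(backbone_config=SimpleNamespace(depths=[2, 2, 6, 2]))

# Hypothetical parameter suffixes; the real script maps many more keys.
SUFFIXES = ("norm1.weight", "norm1.bias", "attn.proj.weight", "attn.proj.bias")


def rename_keys_naive(config):
    """One append() per key, re-reading config attributes on every iteration."""
    rename_keys = []
    for i in range(len(config.backbone_config.depths)):
        for j in range(config.backbone_config.depths[i]):
            for suffix in SUFFIXES:
                rename_keys.append(
                    (
                        f"backbone.layers.{i}.blocks.{j}.{suffix}",
                        f"model.stages.{i}.layers.{j}.{suffix}",
                    )
                )
    return rename_keys


def rename_keys_batched(config):
    """Same output, with cached lookups, cached prefixes, and one extend() per block."""
    rename_keys = []
    depths = config.backbone_config.depths  # attribute chain resolved once
    for i, depth_i in enumerate(depths):
        for j in range(depth_i):
            # Prefixes are formatted once per block instead of once per key.
            src_prefix = f"backbone.layers.{i}.blocks.{j}."
            tgt_prefix = f"model.stages.{i}.layers.{j}."
            rename_keys.extend(
                [(src_prefix + suffix, tgt_prefix + suffix) for suffix in SUFFIXES]
            )
    return rename_keys


# Both versions must produce identical key mappings in the same order.
assert rename_keys_naive(config) == rename_keys_batched(config)
```

The equality assertion at the end is the important part of any such refactor: the rewrite is only worth its speedup if the list of key pairs, including their order, is unchanged.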
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-create_rename_keys-miaeme4z` and push.