@DmitryNekrasov
Contributor

  • Replace direct list concatenation in ParserStructure.append() with ConcatenatedListView for improved efficiency.
  • Add ConcatenatedListView implementation to lazily combine two lists without creating a new collection.
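
For readers without the diff at hand, a lazy two-list view could look roughly like the sketch below. This is a hypothetical illustration, not the code from this PR: only the name `ConcatenatedListView` comes from the description above, and the real class may differ in its constructor, bounds checks, or iterator.

```kotlin
// Hypothetical sketch; the ConcatenatedListView in this PR may be implemented differently.
// Presents `first` followed by `second` as one read-only list without copying elements.
class ConcatenatedListView<T>(
    private val first: List<T>,
    private val second: List<T>,
) : AbstractList<T>() {
    // Computed once, so asking for the size never walks nested views.
    override val size: Int = first.size + second.size

    // Delegates to whichever underlying list holds the requested index.
    override fun get(index: Int): T =
        if (index < first.size) first[index] else second[index - first.size]
}
```

Building such a view is O(1) regardless of list sizes, whereas `operations + other.operations` copies both lists. The cost moves to `get`, which has to pass through every nested view on the way to the element.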

@DmitryNekrasov DmitryNekrasov self-assigned this Nov 3, 2025
@DmitryNekrasov
Contributor Author

@dkhalanskyjb Hello! What do you think about this idea?

@DmitryNekrasov DmitryNekrasov marked this pull request as draft November 3, 2025 14:33
@dkhalanskyjb
Collaborator

Hi! This may help with the first stage (building a parser before the normalisation), but normalisation has quadratic complexity, too, and it wouldn't benefit from the proposed approach, as the lists themselves will also need to be reconstructed.

We could extract the happy fast path where there are no adjacent numeric parser operations and simplify normalisation there. That is a common case, so it would be nice to provide brilliant performance there. I'm not yet convinced the common case can't be drastically improved by a better algorithm.

@dkhalanskyjb
Collaborator

Now that I think about it, it doesn't even fix the quadratic complexity of the initial stage. Concatenating n parsers will give us a binary tree with n leaves. We will need to traverse the tree at least once, and even a single enumeration of all these elements will have quadratic complexity: the depth n to access the first parser, n - 1 to access the second one, and so on. The construction of the new list does indeed become O(n), but then, each traversal is O(n^2).

Most parsers we concatenate are going to be single-element, so n parsers basically means n operations.
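
The traversal cost described above can be checked with a small instrumented experiment. The sketch below is hypothetical (`CountingConcatView`, `hops`, and `traversalHops` are illustrative names, not code from the PR): it left-nests n single-element lists, the way repeated appends would, and counts how many view nodes each lookup passes through.

```kotlin
// Hypothetical instrumentation; none of these names exist in the PR.
// Counts how many view nodes are visited across all lookups.
var hops = 0L

class CountingConcatView<T>(
    private val first: List<T>,
    private val second: List<T>,
) : AbstractList<T>() {
    override val size: Int = first.size + second.size

    override fun get(index: Int): T {
        hops++ // one more level of nesting visited
        return if (index < first.size) first[index] else second[index - first.size]
    }
}

// Concatenates n single-element lists one by one, then reads every element once.
fun traversalHops(n: Int): Long {
    var list: List<Int> = listOf(0)
    for (i in 1 until n) list = CountingConcatView(list, listOf(i))
    hops = 0
    for (i in 0 until n) list[i]
    return hops
}
```

In this sketch a full traversal of n leaves visits (n - 1)(n + 2) / 2 view nodes, so the count grows quadratically even though each concatenation was constant-time.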

@dkhalanskyjb
Collaborator

Yep, the initial stage also doesn't benefit from this. A quick run of a benchmark shows this:

Before the change:

Benchmark                        Mode  Cnt  Score   Error  Units
FormattingBenchmark.buildFormat  avgt   25  5.205 ± 0.090  us/op

After the change:

Benchmark                        Mode  Cnt  Score   Error  Units
FormattingBenchmark.buildFormat  avgt   25  7.830 ± 0.160  us/op

Here, less is better (the numbers 5.2 and 7.8 show how long an operation takes, in microseconds).

The benchmark itself is creating the datetime format used in Python:

     @Benchmark
     fun buildFormat(blackhole: Blackhole) {
         val v = LocalDateTime.Format {
             year()
             char('-')
             monthNumber()
             char('-')
             day()
             char(' ')
             hour()
             char(':')
             minute()
             optional {
                 char(':')
                 second()
                 optional {
                     char('.')
                     secondFraction()
                 }
             }
         }
         blackhole.consume(v)
     }

@DmitryNekrasov
Contributor Author

I have avoided O(n^2) time complexity for the creation of a naive serial parser:

open class ParserStructureConcatBenchmark {

    @Param("1", "2", "4", "8", "16", "32", "64", "128", "256", "512", "1024")
    var n = 0

    @Benchmark
    fun largeSerialFormat(blackhole: Blackhole) {
        val format = LocalDateTime.Format {
            repeat(n) {
                char('^')
                monthNumber()
                char('&')
                day()
                char('!')
                hour()
                char('$')
                minute()
                char('#')
                second()
                char('@')
            }
        }
        blackhole.consume(format)
    }
}

ParserStructure(operations + other.operations, other.followedBy)

Benchmark                                          (n)  Mode  Cnt         Score         Error  Units
ParserStructureConcatBenchmark.largeSerialFormat     1  avgt    5      1405.703 ±       7.771  ns/op
ParserStructureConcatBenchmark.largeSerialFormat     2  avgt    5      2661.292 ±      74.637  ns/op
ParserStructureConcatBenchmark.largeSerialFormat     4  avgt    5      5079.791 ±      14.050  ns/op
ParserStructureConcatBenchmark.largeSerialFormat     8  avgt    5     10676.345 ±      87.379  ns/op
ParserStructureConcatBenchmark.largeSerialFormat    16  avgt    5     22338.753 ±     325.866  ns/op
ParserStructureConcatBenchmark.largeSerialFormat    32  avgt    5     55037.234 ±     131.355  ns/op
ParserStructureConcatBenchmark.largeSerialFormat    64  avgt    5    136245.744 ±    4255.752  ns/op
ParserStructureConcatBenchmark.largeSerialFormat   128  avgt    5    402370.485 ±    4214.158  ns/op
ParserStructureConcatBenchmark.largeSerialFormat   256  avgt    5   1525375.832 ±   52333.725  ns/op
ParserStructureConcatBenchmark.largeSerialFormat   512  avgt    5   6293168.210 ±  158860.438  ns/op
ParserStructureConcatBenchmark.largeSerialFormat  1024  avgt    5  23487634.639 ± 1012301.386  ns/op

ParserStructure(ConcatenatedListView(operations, other.operations), other.followedBy)

Benchmark                                          (n)  Mode  Cnt        Score       Error  Units
ParserStructureConcatBenchmark.largeSerialFormat     1  avgt    5     1033.226 ±    15.759  ns/op
ParserStructureConcatBenchmark.largeSerialFormat     2  avgt    5     1971.151 ±     6.071  ns/op
ParserStructureConcatBenchmark.largeSerialFormat     4  avgt    5     3929.353 ±    19.891  ns/op
ParserStructureConcatBenchmark.largeSerialFormat     8  avgt    5     8286.685 ±    57.521  ns/op
ParserStructureConcatBenchmark.largeSerialFormat    16  avgt    5    16827.044 ±   120.837  ns/op
ParserStructureConcatBenchmark.largeSerialFormat    32  avgt    5    32460.756 ±  1708.745  ns/op
ParserStructureConcatBenchmark.largeSerialFormat    64  avgt    5    62202.661 ±   963.763  ns/op
ParserStructureConcatBenchmark.largeSerialFormat   128  avgt    5   120877.958 ±  1474.372  ns/op
ParserStructureConcatBenchmark.largeSerialFormat   256  avgt    5   245431.433 ±  4975.299  ns/op
ParserStructureConcatBenchmark.largeSerialFormat   512  avgt    5   483199.434 ±  2113.089  ns/op
ParserStructureConcatBenchmark.largeSerialFormat  1024  avgt    5  1038053.801 ± 28129.845  ns/op
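
One way to read the two tables is to look at how the score changes each time n doubles. The sketch below only does arithmetic on the scores reported above for n = 128 through 1024; the names `listCopyScores`, `listViewScores`, and `doublingRatios` are made up for this illustration.

```kotlin
// Scores in ns/op copied from the two tables above, for n = 128, 256, 512, 1024.
val listCopyScores = listOf(402370.485, 1525375.832, 6293168.210, 23487634.639)
val listViewScores = listOf(120877.958, 245431.433, 483199.434, 1038053.801)

// Ratio of each score to the previous one, i.e. the cost of doubling n.
fun doublingRatios(scores: List<Double>): List<Double> =
    scores.zipWithNext { previous, next -> next / previous }
```

Ratios close to 4 are what quadratic growth predicts when n doubles, and ratios close to 2 indicate linear growth. The list-copying variant stays near 4 in this range, while the `ConcatenatedListView` variant stays near 2.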

@DmitryNekrasov
Contributor Author

formatCreationWithAlternativeParsing (before caching)

Benchmark                             (n)  Mode  Cnt          Score          Error  Units
formatCreationWithAlternativeParsing    2  avgt    5       8810.577 ±       62.323  ns/op
formatCreationWithAlternativeParsing    3  avgt    5      24660.018 ±      194.929  ns/op
formatCreationWithAlternativeParsing    4  avgt    5      70321.838 ±     1128.198  ns/op
formatCreationWithAlternativeParsing    5  avgt    5     204549.353 ±     1720.009  ns/op
formatCreationWithAlternativeParsing    6  avgt    5     604779.515 ±     3666.215  ns/op
formatCreationWithAlternativeParsing    7  avgt    5    1830192.394 ±    18695.866  ns/op
formatCreationWithAlternativeParsing    8  avgt    5    5449726.801 ±    28945.062  ns/op
formatCreationWithAlternativeParsing    9  avgt    5   16326281.316 ±   153527.963  ns/op
formatCreationWithAlternativeParsing   10  avgt    5   49075976.571 ±  1297210.911  ns/op
formatCreationWithAlternativeParsing   11  avgt    5  148064885.743 ± 12308758.456  ns/op
formatCreationWithAlternativeParsing   12  avgt    5  454970041.600 ± 70482925.523  ns/op

formatCreationWithAlternativeParsing (after caching)

Benchmark                             (n)  Mode  Cnt      Score      Error  Units
formatCreationWithAlternativeParsing    2  avgt    5   7891.114 ±   13.541  ns/op
formatCreationWithAlternativeParsing    3  avgt    5  13171.110 ±   78.865  ns/op
formatCreationWithAlternativeParsing    4  avgt    5  18710.440 ±  991.032  ns/op
formatCreationWithAlternativeParsing    5  avgt    5  23920.832 ±  639.878  ns/op
formatCreationWithAlternativeParsing    6  avgt    5  28424.127 ±   88.730  ns/op
formatCreationWithAlternativeParsing    7  avgt    5  33483.529 ± 2935.929  ns/op
formatCreationWithAlternativeParsing    8  avgt    5  39414.801 ± 2181.813  ns/op
formatCreationWithAlternativeParsing    9  avgt    5  43943.165 ±  261.219  ns/op
formatCreationWithAlternativeParsing   10  avgt    5  48417.492 ± 2198.409  ns/op
formatCreationWithAlternativeParsing   11  avgt    5  53704.932 ±  844.236  ns/op
formatCreationWithAlternativeParsing   12  avgt    5  59041.803 ±  546.692  ns/op

formatCreationWithNestedAlternativeParsing (before caching)

Benchmark                                   (n)  Mode  Cnt            Score           Error  Units
formatCreationWithNestedAlternativeParsing    2  avgt    5       519390.642 ±      2809.698  ns/op
formatCreationWithNestedAlternativeParsing    3  avgt    5     27569772.303 ±    388683.519  ns/op
formatCreationWithNestedAlternativeParsing    4  avgt    5    110676487.984 ±   5760474.687  ns/op
formatCreationWithNestedAlternativeParsing    5  avgt    5  13426672641.800 ± 841426222.880  ns/op

formatCreationWithNestedAlternativeParsing (after caching)

Benchmark                                   (n)  Mode  Cnt       Score       Error  Units
formatCreationWithNestedAlternativeParsing    2  avgt    5   41034.554 ± 10912.763  ns/op
formatCreationWithNestedAlternativeParsing    3  avgt    5   65044.120 ±  7292.569  ns/op
formatCreationWithNestedAlternativeParsing    4  avgt    5   73721.035 ±  7653.916  ns/op
formatCreationWithNestedAlternativeParsing    5  avgt    5  100544.511 ±   400.049  ns/op
formatCreationWithNestedAlternativeParsing    6  avgt    5  118522.459 ± 30279.436  ns/op
formatCreationWithNestedAlternativeParsing    7  avgt    5  139539.603 ± 17732.571  ns/op
formatCreationWithNestedAlternativeParsing    8  avgt    5  143597.460 ±  1106.854  ns/op
formatCreationWithNestedAlternativeParsing    9  avgt    5  173742.102 ±   889.090  ns/op
formatCreationWithNestedAlternativeParsing   10  avgt    5  195273.334 ± 55181.847  ns/op
formatCreationWithNestedAlternativeParsing   11  avgt    5  211645.084 ±   553.810  ns/op
formatCreationWithNestedAlternativeParsing   12  avgt    5  216165.437 ±   284.256  ns/op

@DmitryNekrasov DmitryNekrasov marked this pull request as ready for review November 10, 2025 16:26
@dkhalanskyjb
Collaborator

It's cool that the quadratic complexity can be mitigated in some scenarios, and I still believe that it's currently contributing significantly to the runtime we actually observe.

That said, the PR as a whole looks very much like a Pyrrhic victory to me, as buildPythonDateTimeFormat is consistently a bit slower on my machine with the proposed changes than without them:

Before:
Benchmark                                                Mode  Cnt     Score     Error  Units
PythonDateTimeFormatBenchmark.buildPythonDateTimeFormat  avgt   20  5517.285 ± 155.130  ns/op

After:
Benchmark                                                Mode  Cnt     Score     Error  Units
PythonDateTimeFormatBenchmark.buildPythonDateTimeFormat  avgt   20  5703.280 ± 105.218  ns/op

(Less is better here)

Performance improvements are pointless if they negatively impact the actual use cases our users will encounter. Yes, the quadratic complexity was eliminated, but the increase in the resulting constant just seems too high.

I propose that we add all the common formats we already provide as benchmarks (LocalTime.Formats.ISO, UtcOffset.Formats.FOUR_DIGITS, DateTimeComponents.Formats.RFC_1123, etc.) and rely on them to check for the improvements. These formats are very representative of those that people are actually likely to write.

…ation

- Implement SerialFormatBenchmark to test repeated datetime format sequences.
- Implement PythonDateTimeFormatBenchmark to evaluate Python-compatible datetime formats.
- Implement ParallelFormatBenchmark to test creation of formats using nested and alternative parsing logic.
…tions` for consistency and clarity in `simplify` method logic
…ist construction and improved readability.
…me format!

Benchmark                                                       Mode  Cnt     Score     Error  Units
PythonDateTimeFormatBenchmark.buildPythonDateTimeFormat         avgt    5  4142.002 ± 374.247  ns/op
…aner handling of operation merging and reduce duplication.

Benchmark                                                       Mode  Cnt     Score    Error  Units
PythonDateTimeFormatBenchmark.buildPythonDateTimeFormat         avgt    5  3708.643 ± 29.908  ns/op
… usage by passing `ParserStructure` directly instead of separating operations and followedBy.
@DmitryNekrasov DmitryNekrasov force-pushed the dmitry.nekrasov/feature/571-concat-list-view branch from 9639740 to b6e80f8 Compare November 18, 2025 16:11
@DmitryNekrasov
Contributor Author

Current state

current branch

Benchmark                       Mode  Cnt      Score    Error  Units
buildFourDigitsUtcOffsetFormat  avgt   60    541.269 ±  1.467  ns/op
buildIsoDateTimeFormat          avgt   60   6995.245 ± 37.785  ns/op
buildIsoDateTimeOffsetFormat    avgt   60  11927.428 ± 27.203  ns/op
buildPythonDateTimeFormat       avgt   60   3534.148 ± 18.882  ns/op
buildRfc1123DateTimeFormat      avgt   60  11583.920 ± 56.766  ns/op

Benchmark                             (n)  Mode  Cnt      Score    Error  Units
formatCreationWithAlternativeParsing    2  avgt   20   3370.042 ±  4.170  ns/op
formatCreationWithAlternativeParsing    3  avgt   20   5279.642 ± 11.493  ns/op
formatCreationWithAlternativeParsing    4  avgt   20   7088.464 ± 11.943  ns/op
formatCreationWithAlternativeParsing    5  avgt   20   8928.868 ± 13.066  ns/op
formatCreationWithAlternativeParsing    6  avgt   20  10830.971 ± 16.790  ns/op
formatCreationWithAlternativeParsing    7  avgt   20  12684.456 ± 19.450  ns/op
formatCreationWithAlternativeParsing    8  avgt   20  14495.178 ± 31.157  ns/op
formatCreationWithAlternativeParsing    9  avgt   20  16045.135 ± 28.893  ns/op
formatCreationWithAlternativeParsing   10  avgt   20  17773.640 ± 51.527  ns/op
formatCreationWithAlternativeParsing   11  avgt   20  19232.090 ± 29.020  ns/op
formatCreationWithAlternativeParsing   12  avgt   20  21182.522 ± 28.883  ns/op

Benchmark                                   (n)  Mode  Cnt       Score     Error  Units
formatCreationWithNestedAlternativeParsing    2  avgt   20   27016.957 ±  57.653  ns/op
formatCreationWithNestedAlternativeParsing    3  avgt   20   47974.316 ±  51.903  ns/op
formatCreationWithNestedAlternativeParsing    4  avgt   20   53139.664 ± 553.336  ns/op
formatCreationWithNestedAlternativeParsing    5  avgt   20   72680.745 ±  66.997  ns/op
formatCreationWithNestedAlternativeParsing    6  avgt   20   79016.410 ±  84.175  ns/op
formatCreationWithNestedAlternativeParsing    7  avgt   20  100976.336 ± 186.882  ns/op
formatCreationWithNestedAlternativeParsing    8  avgt   20  105233.473 ± 169.936  ns/op
formatCreationWithNestedAlternativeParsing    9  avgt   20  124636.109 ± 208.281  ns/op
formatCreationWithNestedAlternativeParsing   10  avgt   20  131118.085 ± 121.851  ns/op
formatCreationWithNestedAlternativeParsing   11  avgt   20  154301.570 ± 252.804  ns/op
formatCreationWithNestedAlternativeParsing   12  avgt   20  157376.652 ± 320.162  ns/op

master

Benchmark                       Mode  Cnt      Score     Error  Units
buildFourDigitsUtcOffsetFormat  avgt   60    789.775 ±   3.311  ns/op
buildIsoDateTimeFormat          avgt   60  13424.931 ± 109.256  ns/op
buildIsoDateTimeOffsetFormat    avgt   60  30401.345 ±  41.154  ns/op
buildPythonDateTimeFormat       avgt   60   6174.762 ±  49.195  ns/op
buildRfc1123DateTimeFormat      avgt   60  22671.321 ±  24.196  ns/op

Benchmark                             (n)  Mode  Cnt          Score         Error  Units
formatCreationWithAlternativeParsing    2  avgt   20       7598.233 ±      14.639  ns/op
formatCreationWithAlternativeParsing    3  avgt   20      21236.078 ±      59.174  ns/op
formatCreationWithAlternativeParsing    4  avgt   20      57254.240 ±     102.590  ns/op
formatCreationWithAlternativeParsing    5  avgt   20     162057.430 ±     314.437  ns/op
formatCreationWithAlternativeParsing    6  avgt   20     479192.466 ±    1120.848  ns/op
formatCreationWithAlternativeParsing    7  avgt   20    1404977.271 ±    2664.719  ns/op
formatCreationWithAlternativeParsing    8  avgt   20    4214611.787 ±   11481.371  ns/op
formatCreationWithAlternativeParsing    9  avgt   20   12654319.685 ±   72087.313  ns/op
formatCreationWithAlternativeParsing   10  avgt   20   37741105.256 ±   70027.785  ns/op
formatCreationWithAlternativeParsing   11  avgt   20  114531032.986 ±  606545.228  ns/op
formatCreationWithAlternativeParsing   12  avgt   20  346368518.417 ± 4296012.144  ns/op

Benchmark                                   (n)  Mode  Cnt            Score           Error  Units
formatCreationWithNestedAlternativeParsing    2  avgt   20       402940.975 ±       509.028  ns/op
formatCreationWithNestedAlternativeParsing    3  avgt   20     25349633.611 ±     40886.039  ns/op
formatCreationWithNestedAlternativeParsing    4  avgt   20     87055812.456 ±    255905.781  ns/op
formatCreationWithNestedAlternativeParsing    5  avgt   20  11457637695.800 ± 183641973.861  ns/op
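
To make the comparison for the common formats easier to scan, the speedup can be computed as the master score divided by the current-branch score. The sketch below uses only the numbers from the tables above; `commonFormatScores` and `speedups` are illustrative names.

```kotlin
// (master, current branch) scores in ns/op, copied from the tables above.
val commonFormatScores = mapOf(
    "buildFourDigitsUtcOffsetFormat" to (789.775 to 541.269),
    "buildIsoDateTimeFormat" to (13424.931 to 6995.245),
    "buildIsoDateTimeOffsetFormat" to (30401.345 to 11927.428),
    "buildPythonDateTimeFormat" to (6174.762 to 3534.148),
    "buildRfc1123DateTimeFormat" to (22671.321 to 11583.920),
)

// Speedup factor per benchmark: values above 1.0 mean the branch is faster than master.
fun speedups(scores: Map<String, Pair<Double, Double>>): Map<String, Double> =
    scores.mapValues { (_, score) -> score.first / score.second }
```

On these numbers, every common format builds roughly 1.5x to 2.5x faster on the branch than on master.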

@DmitryNekrasov
Contributor Author

@dkhalanskyjb Hi! PR is ready for review.

@fzhinkin fzhinkin linked an issue Nov 19, 2025 that may be closed by this pull request
Collaborator

@dkhalanskyjb dkhalanskyjb left a comment

Is it actually ready? A new set of changes arrived after I'd started reviewing the code. Please let me know when you're no longer actively working on this.

}
firstOperation is NumberSpanParserOperation -> {
add(NumberSpanParserOperation(numberSpan + firstOperation.consumers))
addAll(operationsToMerge.drop(1))
Collaborator

drop(1) creates an additional new list, so I'd expect a manual iteration over the indices here instead.
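
A minimal sketch of the index-based alternative, assuming the goal is simply to append every element except the first. `addAllExceptFirst` is a hypothetical helper for illustration, not something in the PR or the standard library.

```kotlin
// Hypothetical helper: appends source[1], source[2], ... without building
// the intermediate list that source.drop(1) would allocate.
fun <T> MutableList<T>.addAllExceptFirst(source: List<T>) {
    for (i in 1 until source.size) {
        add(source[i])
    }
}
```

Another option with the same effect is `addAll(source.subList(1, source.size))`, since `subList` returns a view over the original list rather than a copy.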

} else {
if (currentNumberSpan != null) {
newOperations.add(NumberSpanParserOperation(currentNumberSpan))
currentNumberSpan = null
Collaborator

Idea: it's also possible to empty out unconditionalModifications at this point. It won't affect correctness, but it may save some time down the line.

@DmitryNekrasov
Contributor Author

Now it's ready

@dkhalanskyjb dkhalanskyjb self-requested a review November 20, 2025 13:20
@DmitryNekrasov
Contributor Author

SerialFormatBenchmark results after accumulatedOperations optimization.

Before:

Benchmark                                 (n)  Mode  Cnt         Score       Error  Units
SerialFormatBenchmark.largeSerialFormat     1  avgt   20      1419.379 ±     5.552  ns/op
SerialFormatBenchmark.largeSerialFormat     2  avgt   20      2699.420 ±    14.656  ns/op
SerialFormatBenchmark.largeSerialFormat     4  avgt   20      5076.431 ±    10.963  ns/op
SerialFormatBenchmark.largeSerialFormat     8  avgt   20     10624.101 ±    10.361  ns/op
SerialFormatBenchmark.largeSerialFormat    16  avgt   20     22255.467 ±    91.975  ns/op
SerialFormatBenchmark.largeSerialFormat    32  avgt   20     55256.940 ±   622.350  ns/op
SerialFormatBenchmark.largeSerialFormat    64  avgt   20    136084.051 ±   833.972  ns/op
SerialFormatBenchmark.largeSerialFormat   128  avgt   20    402579.029 ±   305.550  ns/op
SerialFormatBenchmark.largeSerialFormat   256  avgt   20   1516988.746 ±  1121.923  ns/op
SerialFormatBenchmark.largeSerialFormat   512  avgt   20   6178697.180 ± 10183.572  ns/op
SerialFormatBenchmark.largeSerialFormat  1024  avgt   20  23088691.763 ± 33341.007  ns/op

After:

Benchmark                                 (n)  Mode  Cnt       Score      Error  Units
SerialFormatBenchmark.largeSerialFormat     1  avgt   20    1024.999 ±    3.304  ns/op
SerialFormatBenchmark.largeSerialFormat     2  avgt   20    1955.393 ±    7.231  ns/op
SerialFormatBenchmark.largeSerialFormat     4  avgt   20    3633.342 ±    3.291  ns/op
SerialFormatBenchmark.largeSerialFormat     8  avgt   20    7200.159 ±   22.536  ns/op
SerialFormatBenchmark.largeSerialFormat    16  avgt   20   15434.221 ±  114.190  ns/op
SerialFormatBenchmark.largeSerialFormat    32  avgt   20   28727.602 ±  279.476  ns/op
SerialFormatBenchmark.largeSerialFormat    64  avgt   20   55723.290 ±   44.746  ns/op
SerialFormatBenchmark.largeSerialFormat   128  avgt   20  114686.151 ±  116.638  ns/op
SerialFormatBenchmark.largeSerialFormat   256  avgt   20  237912.418 ±  277.519  ns/op
SerialFormatBenchmark.largeSerialFormat   512  avgt   20  465611.494 ± 1977.705  ns/op
SerialFormatBenchmark.largeSerialFormat  1024  avgt   20  936852.380 ± 2241.462  ns/op

Development

Successfully merging this pull request may close these issues.

The performance of creating new datetime formats can be improved

3 participants