@arnaud-daroussin
Contributor

@arnaud-daroussin arnaud-daroussin commented Nov 6, 2025

Hi @novakov-alexey,

I'm completing issue #286 with Scala mutable collection serializers:

  • scala.collection.mutable.ArrayDeque
  • scala.collection.mutable.Buffer & scala.collection.mutable.ArrayBuffer
  • scala.collection.mutable.Queue
  • scala.collection.mutable.Map & scala.collection.mutable.HashMap
  • scala.collection.mutable.Set & scala.collection.mutable.HashSet

The serializers implement de/serialization for the collection traits (Buffer, Map, Set, etc.), and each serializer is exposed through 2 implicit functions: one for the trait (e.g. Buffer) and one for the default implementation (e.g. ArrayBuffer). This makes it possible to serialize any implementation of a collection trait, but there is a drawback at deserialization: the default implementation is always instantiated, whatever the original class was.

That's why I chose to implement a first-class ArrayDeque serializer: it keeps the benefits of an ArrayDeque over a "plain" ArrayBuffer.
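
To illustrate the trade-off, here is a minimal, self-contained sketch. It does not use the Flink API: `SimpleSerializer`, `IntBufferSerializer` and `RoundTrip` are hypothetical names invented for this example, and the element type is fixed to `Int` to keep it short. It only shows why a trait-level serializer loses the concrete collection class on the way back.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
import scala.collection.mutable

// Hypothetical stand-in for a serializer interface; NOT Flink's TypeSerializer.
trait SimpleSerializer[T] {
  def serialize(value: T, out: DataOutputStream): Unit
  def deserialize(in: DataInputStream): T
}

// Accepts any mutable.Buffer[Int], but only writes the size and the elements.
// No class name is stored, so deserialization can only rebuild the default
// implementation (ArrayBuffer).
object IntBufferSerializer extends SimpleSerializer[mutable.Buffer[Int]] {
  def serialize(value: mutable.Buffer[Int], out: DataOutputStream): Unit = {
    out.writeInt(value.size)
    value.foreach(v => out.writeInt(v))
  }

  def deserialize(in: DataInputStream): mutable.Buffer[Int] = {
    val size = in.readInt()
    val buffer = mutable.ArrayBuffer.empty[Int]
    buffer.sizeHint(size)
    var i = 0
    while (i < size) {
      buffer += in.readInt()
      i += 1
    }
    buffer
  }
}

object RoundTrip {
  // Serializes to an in-memory byte array and reads the value back.
  def roundTrip(value: mutable.Buffer[Int]): mutable.Buffer[Int] = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    IntBufferSerializer.serialize(value, out)
    out.flush()
    val in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray))
    IntBufferSerializer.deserialize(in)
  }
}
```

With this sketch, `RoundTrip.roundTrip(mutable.ArrayDeque(1, 2, 3))` returns the same elements, but in an ArrayBuffer rather than an ArrayDeque. A dedicated ArrayDeque serializer avoids that loss without having to write class names into the state.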

It may be possible to make this generic, as is done in CollectionSerializerSnapshot, but serializing class names, etc. increases the state size, so I prefer to keep things simple for this first version. Adding more simple implementations later may be a better strategy.

@novakov-alexey
Collaborator

Hi @arnaud-daroussin,

Thanks for another great PR.

May I ask what the use case is for having all these mutable collections in your Flink pipelines? Do they offer better performance than their immutable counterparts? Just curious.
