-
-
Notifications
You must be signed in to change notification settings - Fork 186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add diffing algorithm to maps #199
Add diffing algorithm to maps #199
Conversation
Oh wow, just wow! Is it Christmas?! This is the most exciting PR this repository has received from a 3rd party so far! 🥳 Danke so sehr Florian! Before merging, I'd like to discuss a little bit the API. Note that these are not requests to change the code by itself, I'm just curious about what you think here. 1. Now I'm also thinking about the API. Should it maybe be a free standing function? diff(a, b, on_added, on_updated, on_removed); 2. Also, alternatively, should we group the handler functions into an object? template <class AddedFn, class ChangedFn, class RemovedFn>
struct differ {
AddedFn added;
ChangedFn changed;
RemovedFn removed;
};
template <class AddedFn, class ChangedFn, class RemovedFn>
auto make_differ(AddedFn&& added, ChangedFn&& changed, RemovedFn&& removed)
-> differ<std::decay_t<AddedFn>, std::decay_t<ChangedFn>, std::decay_t<RemoveFn>> {
return {std::forward<AddedFn>(added), std::forward<ChangedFn>(changed std::forward<RemovedFn>(removed)};
}
...
diff(a, b, differ);
diff_with(a, b, added, removed, changed) { diff(a, b, make_differ(...))} 3. Have you looked into diffing vectors as well? API wise, I think 3 functions like that could be provided, excepting the "value" should maybe be a 4. One thing that would be amazing would be to add special integration to Lager such that when you have cursors that descend from a 5. Also, this is very much outside of the scope of this PR, but one could imagine in the future leveraging a similar traversal structure to produce: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the questions above, I just saw one issue with the implementation:
Inside champ
, you are accessing .first
and .second
on the values of the map. However, champ
itself is agnostic of whether the tree represents a Map or a Set. You can see how immer::detail::champ
is instantiated differently in immer::set<>
and immer::map<>
in order to achieve the two structures. I believe that you can achieve the desired genericity by using Equal{}
instead of a.first == b.first
and passing an additional function type to compare values (like we do for champ::equals
).
Extra points if you also add tests for immer::set
😄
But then again, thank you so much for this brilliant PR!
Thanks for the comments!
|
Regarding the last point @mvtec-richter, you do not need to add another template parameter to |
I implemented the usage of a template variable for Equals and EqualsValue and also added sets. |
@mvtec-richter thank you a lot for the latest changes! I've noticed that you have added a Wrt. the |
Of course I can remove the Equal template parameter from the interface. That will simplify the interface. Wrt. the However, if we have a free standing I have another question concerning the use of std::forward for lambdas. I wonder, if our usage in champ::diff is correct. Should the callback be passed as & instead of &&, because they are forwarded multiple times in the function? |
Hi!
Cool 👍
I agree 👍
Yes, I agree that the interface should be homogeneous, so one can write generic code that uses
You are correct in that it's incorrect :) Taking the lambdas in as && is fine, but the |
* also remove EqualValue from diff interface
Codecov Report
@@ Coverage Diff @@
## master #199 +/- ##
==========================================
+ Coverage 91.27% 91.47% +0.19%
==========================================
Files 104 104
Lines 9574 9791 +217
==========================================
+ Hits 8739 8956 +217
Misses 835 835
Continue to review full report at Codecov.
|
Thanks for your comments! |
I'm merging this and will work on polishing the details myself. Thank you very much for this work Florian! |
Hi @arximboldi ! I've been holding onto some pretty nasty architecture in my application, waiting and hoping for someone to extend the diff mechanism to I'm hitting the point in my app where it's getting silly to work around this, so I've decided to block further development on having the ability to diff immer vectors. Before jumping into implementation to contribute these diffs (which I've put off since it seems a bit intimidating...), I wonder if you have any thoughts/tips/suggestions, or on the outside chance you or someone else has already started to look into this? It sounds like the current map diff implementation would be a good starting point?
Thank you! |
Hi @khiner! Thank you for looking into this... that would be an incredibly useful contribution. In fact... I do miss this feature all the time! One of the reasons it hasn't been tackled though is that it's actually not easy work. It is definitely harder than HAMT diffing. In fact, I suspect that it would require some degree of original algorithmic work (potentially worth an academic paper though, in case that's an encouragement for you!). As a starting point, I would recommend getting some familiarity with how Relaxed Radix Balanced Tries work. You can check the first parts of the Immer paper. I also find L'Orange's master's thesis one of the clearest descriptions of this data-structure. Then we'd need to find some way to adapt some of the existing sequence diffing algorithms to leverage structural sharing, by operating somewhat recursively in the RRB-trie. You can look into this Rust library for a list of existing algorithms (and their implementation). I have this discussion in my notes also as it contains interesting pointers in that field. Let me know if there is some way I can further help! |
Hi @arximboldi , thanks for getting back on this and for the detailed notes and pointers! I see, it sounds like this is far from trivial, but in theory possible to achieve the same type of |
Yes, my intuition says we we should be able to achieve that! Big big reward indeed! Game changer for lots of applications. |
This is the implementation of a diff algorithm for immer::map. It compares two immer::map instances and calls respective lambda functions when keys were added or removed or when the value of a key changed. When data is shared among the two maps, the algorithm will not need to compare shared subtrees of the map.
We plan to use this for an undo feature.
The diff algorithm has following signature:
̂
map::diff(map const& other, AddedFn&& added, ChangedFn&& changed, RemovedFn&& Removed)
where the lambdas have the signature:
Usage:
This signature works for our usecase, however we are happy for further input and would like to help to upstream this feature.
This partially addresses: #120