Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Various thoughts about implementing undo/redo for TruFont #614

Open
wants to merge 12 commits into
base: wx
Choose a base branch
from

Conversation

justvanrossum
Copy link

Probably best to use this PR as a public discussion before merging.

The document is mostly a brainstorm session about things that could and should, and does not yet contain a very concrete guideline about what to do.

At the same time, BlackFoundry has been experimenting with an undo system in their fork (referenced in my doc). It would be great to learn from their experience as well.


- An edit action is about to happen (for example the selection will be
dragged)
- The undo manager is told the name of the operation (In this case
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And possibly more meta data, e.g. the concerned layer, so the user knows which drag he's undoing. Maybe this info can even be used for e.g. a visual indicator effect in the UI.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would only be needed if one can edit multiple layers at the same time. If that's not the case, one can only undo whatever happened in the active layer, so it should be clear what will happen.

@raphlinus
Copy link

raphlinus commented Dec 11, 2018

Apologies for swooping in, and for suggesting something that's a fairly major architectural change. I want to get my basic ideas for undo out there, and am happy to expand. I think my suggestion is a valid approach for undo, but also pays off in making load / save cleaner, providing a "quicksave log" (which can be recovered in the case of crash or sudden exit), and will ultimately scale much more nicely for collaborative editing. It is similar in some ways to xi (the emphasis on small deltas) but avoids all the complexity of operational transform or CRDT, which I think is overkill. It also avoids the complexity of scaling to large documents, which I think is not necessary.

My personal view on how editor should be structured is that the "source of truth" is a JSON-like value. I say JSON-like because it can be Python lists, dicts, etc.; it doesn't need to be serialized to JSON, though of course doing that gives you load/save almost for free. Then, all edits are represented as deltas on this object. For JSON values, a simple delta is a JSON path, then either "delete", "insert", or "replace", and in the latter two cases a value for the location. A JSON path can be actiual jsonpath, or it can just be a list of key (for dict) and number (for list); the details don't matter much. Also note this is pretty similar to the Firebase realtime database. A compound delta is a list of simple deltas.

So an edit doesn't directly mutate the objects, it is effectively a side-effect free function that returns a compound delta. Applying the delta gives you the new JSON-like value.

All objects reflected in the UI have two main methods. One is "create from JSON-like value". The other is "apply delta". The idea here is that "apply delta" leaves the object in the same state as creating from the updated JSON-like value.

UI classes don't interact directly with the undo manager. Rather, it looks something like this. When the user does an action through the UI, it creates a delta. The undo manager records that delta into a stack, along with the inverse delta (computing inverses is easy - inserts become deletes and vice versa). The app framework also invokes "apply delta" on the UI objects, which ultimately updates the screen. The undo action pops a delta from the stack, and invokes "apply delta" with the inverse delta. Similarly, redo just applies the delta again.

Writing deltas to a log gives you "quick save" functionality almost for free. Note that saving doesn't require serialization code in UI objects - the source of truth is the JSON-like value. Loading of course uses the "create from JSON-like value" method.

Obviously this structure can be the basis of collaborative editing, but that requires additional logic for detecting and resolving conflicts (doing merges of some kind).

Having explicit deltas is useful for performance. For example, if there are multiple panes, let's say one for editing a component, one for showing glyphs, another for previewing text, each pane can respond to the "apply delta" and lazily invalidate only the stuff that's changed (I'll note that this involves keeping track of a dependency graph for component->glyph deps).

Note that (nearly) all UI state is encoded in the JSON-like value. That especially includes selection. It doesn't need to include ultra fine grained state like the current mouse position in a drag gesture though. It's also the case that you don't want to check selection state into a version control, so at some point there needs to be explicit filtering of what gets saved and what doesn't. I might refer people to LibrePCB for a modern instance of careful design of a textual save format that's version control friendly; their FOSDEM talk is a good introduction.

It's possible to do undo (and quick-save and collaboration) using a model of direct mutation of object state, and integrating it tightly with an undo manager. That's the classical way a lot of software is written. But I think it'll be tedious and there are a lot of things that can go wrong.

I know I've just sketched things out in a pretty abstract way here. There are other details I've been thinking of that are somewhat orthogonal to undo (for example, using UUID for graph structure such as component references). I'm happy to expand if there are questions.

@adrientetar
Copy link
Member

adrientetar commented Dec 11, 2018

Thanks for the comment. Agree that incremental undo is doable & desirable (I started to experiment with it in my C# app). One thing is the need to bubble-up the change, as there are various caches (e.g. there is a cached GUI path object in the model's Path - that should be cleared when there's a point change. Similarly when a change occurs the Glyph should be timestamped for change, etc. so the change goes up the tree all the way to the glyph, which stores the undo actions).

I had started playing with a change record, e.g. if you do point.x = 25 it'll create a change record such as this:

interface IChange {
  Time { get; } = new Unix.CurrentTime();

  Undo() {
    Debug.Assert(_obj.value == _newValue);
    _obj.value = _newValue;
    _obj.Parent?.ApplyChange(this, addToStack=false);
  }

  Redo() {
    ...
  }
}

public struct PointXChange : IChange

I guess the difference with your suggestion is you wanna implement the undo/redo logic in the objects themselves and not in the change class ?

I was thinking of storing a reference to the object such that I don't have to go down the tree (e.g., if you apply a change from the Glyph's undo stack you would have to know which path was this change made in and which point? if you store a reference to the changed point you can go up the tree again instead of down).

For collections, something like INotifyCollectionChanged's NotifyCollectionChangedEventArgs is all that's needed to stash changes made to a collection (it has Action - move/replace/delete etc. -, NewItems, NewStartingIndex, OldItems, OldStartingIndex which can model any change to a collection).

When the user does an action through the UI, it creates a delta.

Note that you can also perform actions as part of executing a script, that's why I thought of a model where point.x = 25 creates a change record internally and sends it up to the undo manager (so no need to explicit care for undo on the user's part). But I understand your suggestion is to calculate the list of all changes to be made and then apply them?

It'd be cool if you can add a simple code sketch, given the the tree of objects (Glyph -> Path -> Point changes get stored inside glyph). As I see it there are two kinds of changes, value change and collection change.

I'll watch the talk you linked, thansk.

docs/undo.md Outdated
In the case of an outline editor, a lot of data will simply be copied, as
it’s not always possible to simply “perform the reverse operation”.
Transformations are a good example of that. But also point behavior: when
a handle is implicitly being moved because it has a “smooth” connection
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no implicit movement. Every movement is performed by an algorithm. So every movement can be recorded. YOu don't even have to round in your data class, you can round in your display pipeline.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I'll reformulate.

But what about transformations? If one rotates an entire glyph, would one record the changes to all points? What I was trying to say there is that recording the inverse transformation is problematic, as it will cause rounding errors (even) in floating point numbers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I thought you meant rounding as in the font editor rounds values to the grid. My bad.

I agree that doing floating point arithmetic in undo isn't great - that's why I rather record oldValue and newValue.

If one rotates an entire glyph, would one record the changes to all points?

I'd do that. Also keeps the set of possible operations simpler

docs/undo.md Outdated Show resolved Hide resolved
@YvesDup
Copy link

YvesDup commented Dec 12, 2018

first of all, the following implementation affects only to the layer of a glyph. Other glyph modifications will be done later with the same model.

This work was done in three parts:

1 - A new class UndoRedoMgr as a classical undoredo manager (URM) with two stacks: a undo stack and a redo stack

    A new action is pushed on the undo stack (on the top). An action is a tuple of 4 items: an operation, a function to run when undoes, a function to run when redoes and extra datas.

        function to run is stored as a functools.partial object
        extra datas are are the two snapshots of the layer - before and after operation. These datas are used in the functools.partial objects and to write unit tests.

    When undoes, action is pop from the undo stack and pushes to the redo stack only if undo function execution is correct (Else a DialogBox message will be display and a cleaning of stacks will done). The execution function will be controlled by a context manager (Not implemented yet)

    When redoes action, action moves from redo stack to undo stack after a succeed execution of redo function

2 - There is a creation of a new class UndoRedoGlyph which is

    a subclass of tfont.glyph

    and adding a UndoRedoMgr object as attribut

When opening a font, each glyph is loaded in UndoRedoGlyph. The UndoRedoMgr will create only when the glyph is really open in TruFont

3 - Each target action in the code (as transformation, alignment, move, select path …) is added to the URM with a decorated function « layer_decorate_undoredo » that runs the following instructions :

    Get a snapshot of the layer before operation

    Run the operation (this is the decorated function)

    Get an another snapshot of the modified layer

    Create an action and add it to the UMR

Sometimes action can be placed in two different functions (as a move selection operation with mouse). We can use two joined decorators, « prepare_layer_decorate_undoredo  » to initiate the mechanism, and « perform_layer_decorate_undoredo » to finalize it

So if a part of code to be « undoredoable » is inside a big function, I isolate these instructions in a new target function. And add it the appropriate decorator.

A traditional approach is also possible, inserting code at the good place around the « action » instructions. This code is the same as describe below, in the first decorator « layer_decorate_undoredo  »

The main advantage of the decorator is that the code is never modified and do not add new bug. But a decorator is hard to develop

The UndoRedoMgr class and all associated decorators are located in the objects.undoredomgr.py file.

The UndoRedoGlyph class is located in the objects.undoredoglyph.py file

Samples of implementation could be find in controls.propertiesView.py (a single decorator) and drawning.pentools.py (the twin decorators)

We talked with Jeremy and Samuel about the idea to include an UMR directly in tfont but we do not think that the perfect place, because there is no undo/redo function in font system.

At this time, my point of view is that work is a solid approach that must be finalize as soon as possible. But we have to write more unit tests on this part to improve the solution.

Decorator solution give us ability to enable or disable quickly the UMR, so we can code first a « new TruFont function » and connect it later with the UMR.

@YvesDup
Copy link

YvesDup commented Dec 12, 2018

Possible optimizations:

First is about an option regarding the URM decorators of TruFont . All of them are distinct functions but could be rewrite in a single decorator as a class (with magic call method). This option has to harden the solution. It could develop later when all functionalities of TruFont are done.

Second is about to reduce the structure of extra datas in the UMR. For each action we stored two snaphots of datas (layer before and layer after). But It is possible to stored only one snashot and using that of previous action - or next action - when undoes or redoes.
So all the snapshots could be in the dedicated sequence (list or set, I do not know yet) of UMR

@justvanrossum
Copy link
Author

@YvesDup thank you for your detailed message about your current work. I think indeed it needs to be avoided that two snapshots are taken upon every operation.

Right now, the only other comment I have is that I would not use the word "Redo" in class and module names. Undo implies Redo, so "UndoManager" should be good enough. Likewise, I'd call the module undoManager.py, to stick with the naming conventions used in the project.

@justvanrossum
Copy link
Author

@raphlinus thank you so much for sharing your ideas. I need to get my head around how this would work in a practical implementation, but I think I understand the main ideas. I'm considering making a toy implementation to study your proposal in more detail.

@raphlinus
Copy link

@justvanrossum Certainly. And apologies for not following up to Adrien's questions earlier; I've been slammed with other things (it'll hopefully be clear soon why), and writing code samples would take a fair amount of time to prepare. I think writing a toy implementation is an excellent idea, and I'd be more than happy to take a look at it, or answer specific questions - this week is shaping up to be better than last to focus.

@adrientetar
Copy link
Member

adrientetar commented Dec 17, 2018

A JSON path can be actiual jsonpath,

Okay reading this again I now understood that jsonpath is a thing (kinda like xml queries?), there's also jsonpatch which is a standard syntax for JSON deltas. So the deltas could be like a jsonpatch starting at the glyph level

I'm considering making a toy implementation to study your proposal in more detail.

I'll be trying it also.

@raphlinus
Copy link

On closer look, I think JSON Pointer is probably more appropriate than jsonpath, we don't need a rich query language, just a way to identify particular points in the tree. Also, I think jsonpatch could be useful if we want to serialize the undo records to disk (for example, being able to undo after quitting and restarting) but that serialization is not essential.

@justvanrossum
Copy link
Author

On closer look, I think JSON Pointer is probably more appropriate than jsonpath [...]

That seems the same (or similar) to the jsonpatch paths that Adrien referred to.

However, for a Python implementation I think it makes more sense to represent a path as a tuple of path elements, saving the need for string parsing.

I like the overall design of jsonpatch, and I think I'll use that as a model for my experiments.

@adrientetar
Copy link
Member

Here's a demo with Path/Point and Path has an undo stack and a move(dx) method.

It works the only thing is now instead of writing

for point in path.points:
    point.x += 2

One writes

changes = []
for ix, point in enumerate(path.points):
    changes.append(Change(
        "replace", f"/points/{ix}", {"x": point.x + 2}
    ))
path.apply(changes)

There must be a way to simplify syntax with some side effects e.g. annotating the first example

# path.changes = [] (initial state)
for ix, point in enumerate(path.points)
    # add '/ix' to incoming changes
    point.x = 2
    # push change {'x': 2}

# apply changes

@justvanrossum
Copy link
Author

justvanrossum commented Dec 19, 2018

So I started to write a toy implementation and then got a little too excited and went a bit farther: undoManager.py [superceded by jundo]

It's based on a mix of things I learned from @raphlinus' comment right here and from jsonpatch.com (Thanks @adrientetar for the link.)

My implementation is based on proxy objects, through which modifications are done to the model objects. The model object can be completely unaware of any undo logic.

Here's a small example:

    model = [1, 2, 3, {"a": 123}]
    um = UndoManager()
    proxy = um.setRootObject(model)
    # Before any modification can be done, a new change set needs
    # to be created:
    um.newChangeSet(title="replace list item")
    proxy[1] = 2000
    assert model[1] == 2000
    um.undo()
    assert model[1] == 2
    um.redo()
    assert model[1] == 2000
    um.newChangeSet(title="replace nested dict item")
    proxy[3]["a"] = 456
    assert model[3]["a"] == 456
    um.undo()
    assert model[3]["a"] == 123

(The module's __main__ section has a more elaborate example with a glyph and contours.)

What is missing still (and I'm a bit struggling with how to approach it) is how to (specify how to) instantiate custom objects from a JSON-like tree. Right now, if a custom object is deleted from the model, the undo data will simply keep a reference to the actual object. That works fine, but may need reconsideration when the change sets need to be serialized for one reason or the other.

Efficiency is something I thought about, but it wasn't high on my priority list. It remains to be seen whether this approach is viable for outline editing, where one often changes many points at the same time with a drag or a transformation.

In the ChangeSet object there are some optimizations possible. For example it could prune multiple replacements to the same child object and only keep the last one.

There is a test suite as well [also superceded by jundo]

@anthrotype
Copy link
Member

how to instantiate custom objects from a JSON-like tree

you could using attrs to define classes in combination with cattrs to structure/unstructure them (like the trufont's tfont library does for implementing serialization/deserialization).

@YvesDup
Copy link

YvesDup commented Dec 21, 2018 via email

@justvanrossum
Copy link
Author

I continued with my undo manager, and it is now available in its own repo: https://github.com/justvanrossum/jundo

I really like described the system @raphlinus described.

In my code I focussed less on the "single-source-of-truth" concept (although my code does not preclude taking that route), and more on "how can this be exposed to a Python programmer in a convenient form". Also: "how can undo be non-intrusively tacked onto an existing object model".

I also focussed less on pure JSON-like objects, and tried to be Pythonic, by adding support for sets, and by allowing non-string keys for dicts.

Currently, objects that get deleted from the model will live on the undo stack as-is. They are not serialized at all. That is fine for a live undo system, but not when either the undo stack needs to be serialized to disk, or when it is to be used for a quick-save feature as described by Raph.

Serializing can easily be done with attrs/cattrs, as @anthrotype pointed out. The UndoManager class will need to grow some hooks to make this possible, though.

I've tried to be thorough with my test suite, but the whole things has not been used in a real situation, so we'll have to see how well it performs.

As a larger thought/question: so far I've mostly been thinking about using an undo manager object for relatively isolated bits of data, for example, "a glyph", "font info", "kerning". But I'm now thinking of into the direction of using one undo manager for a whole object tree, and somehow from the UI allow to undo the topmost undo action for what is currently active/visible. So, for example, even if glyph "a" was changed last, if we are now editing glyph "b", we would only see undo for that glyph. The undo stack is then no longer properly a stack, as actions can be reversed in a different order than they were initially applied. (This has some implications for addition/deletion of objects, but I think it can be figured out what can be done safely.)

Merry Christmas everybody!

@davelab6
Copy link
Contributor

@justvanrossum awesome! Would you like to make a quick update to the md doc on this pr referencing your project?

@adrientetar what's needed to merge this?

Happy holidays to all!

@justvanrossum
Copy link
Author

Yeah, the md definitely needs some updating to reflect the latest discussions. Will do that some time next week.

@adrientetar
Copy link
Member

The undo stack is then no longer properly a stack, as actions can be reversed in a different order than they were initially applied.

But then stuff like canUndo() is no longer constant time i.e. you have to start looking inside the list of changes to even know what's undoable in the current context.

Or every object in the tree stores undo changes relevant to itself. e.g. if you have a changeset where glyph width changes and a layer path changes then glyphs stores its change and layer stores its change. Note if you do this you loose total order between changes (as opposed to a single undo stack in glyph) and you probably need to keep track of which change belongs to which changelist (e.g. timestamp and a number to enumerate the changes as they were in the original changelist).

I also think non-linear undo history could be confusing to the user. Suppose I change the width of the glyph then I make edits to a layer (possibly many edits over a long time). Then I switch to another layer, and I hit undo. Changes I made to the other layer don't apply so the undo would probably revert my glyph width change from long ago (note how we need to order and group changes together here, so to know what to revert). Isn't that confusing? Generally things that software does for a user should be clear & easy to understand.

I'm inclined to skip this at least for now. Note there are papers on non linear undo or undo in a collaborative environment.

@adrientetar
Copy link
Member

My notes on the delta based undo: https://gist.github.com/adrientetar/aff333972a927c3d0a2c641c27ad605a

@davelab6
Copy link
Contributor

davelab6 commented Jan 19, 2019

I had a call with @adrientetar, Jeremie Hornus, @raphlinus, and @justvanrossum just now to discuss this; looks like this doc is ready for merging and @BlackFoundry will soon make a PR with their 'snapshot based' undo system, and then look to implement the 'data delta' based approach, perhaps first for glyph objects :)

@justvanrossum
Copy link
Author

About non-linear undo: note that if each glyph (or layer) has its own undo stack, there already is something that can be called non-linear undo. But that isolates the glyphs from one another, and you cannot have an undo action that applies to multiple glyphs as one unit.

What I'm proposing (but it's rather pie in the sky, as I don't yet have an idea how to implement it in a scalable way) is that one looks at the subset of changes to objects that are currently in view. Editing glyph 'b'? We see the latest undo action that involves 'b'. Editing kerning? We see the latest undo action that involves kerning. From the user's perspective this wouldn't be noticeably different from how most font editors behave today.

So I think this can be done in a way that is not confusing for the user. Usability is of course a main priority. From an implementation standpoint the challenge would be: how to efficiently find the topmost undo item for a specific subset of the model data tree.

If this can be made to work, there will be a central stream of changes for the whole font project, which can be advantageous if we want to extend to a collaborative system. It also makes an arbitrarily-fine-grained change notification system possible. For example: "tell me about changes in any glyph", or "tell me about changes to the width of glyph 'a'", or "tell me about width changes in any glyph". Again, a bit pie in the sky perhaps, but who knows — superficially this sounds very useful to me.

@BlackFoundry
Copy link

PR done on TruFont and Tfont repositories.
Could you please review @justvanrossum before @adrientetar can accept and merge?
many thanks

@justvanrossum
Copy link
Author

Done: #628 (comment) Btw the other relevant PR is trufont/tfont#18. Cross-referencing FTW :)

@davelab6
Copy link
Contributor

davelab6 commented Feb 1, 2019 via email

@justvanrossum justvanrossum mentioned this pull request Feb 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants