Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When tracing, sanitize buffer between lookup applications #2468

Closed
simoncozens opened this issue Jun 12, 2020 · 14 comments · Fixed by #3027
Closed

When tracing, sanitize buffer between lookup applications #2468

simoncozens opened this issue Jun 12, 2020 · 14 comments · Fixed by #3027

Comments

@simoncozens
Copy link
Collaborator

Given a font with the following feature code:

lookup lookup1 { sub uni0061 by uni0062; } lookup1;
lookup lookup2 {
    sub uni0061' lookup lookup1;
    sub uni0062' by uni0061;
} lookup2;

feature rlig { sub uni0061' lookup lookup2 uni0062' lookup lookup2; } rlig;

running hb-shape -V test.otf abc returns:

trace: start GSUB stage	buffer: [uni0061=0|uni0062=1]
trace: start lookup 3	buffer: [uni0061=0|uni0062=1]
trace: end lookup 3	buffer: [uni0062=0|uni0061=1]
[uni0062=0+600|uni0061=1+600]

Reporting on the application of lookups 1 and 2 would also be useful for debugging.

@simoncozens
Copy link
Collaborator Author

I don't know if it's the cleanest approach, but this patch does what I want:

diff --git a/src/hb-ot-layout-gpos-table.hh b/src/hb-ot-layout-gpos-table.hh
index 5fd848fe..b23b9177 100644
--- a/src/hb-ot-layout-gpos-table.hh
+++ b/src/hb-ot-layout-gpos-table.hh
@@ -2709,9 +2709,11 @@ template <typename context_t>
   const PosLookup &l = c->face->table.GPOS.get_relaxed ()->table->get_lookup (lookup_index);
   unsigned int saved_lookup_props = c->lookup_props;
   unsigned int saved_lookup_index = c->lookup_index;
+  if (!c->buffer->message (c->font, "start lookup %d", lookup_index)) return false;
   c->set_lookup_index (lookup_index);
   c->set_lookup_props (l.get_props ());
   bool ret = l.dispatch (c);
+  (void) c->buffer->message (c->font, "end lookup %d", lookup_index);
   c->set_lookup_index (saved_lookup_index);
   c->set_lookup_props (saved_lookup_props);
   return ret;
diff --git a/src/hb-ot-layout-gsub-table.hh b/src/hb-ot-layout-gsub-table.hh
index 612720e6..ff64c5d6 100644
--- a/src/hb-ot-layout-gsub-table.hh
+++ b/src/hb-ot-layout-gsub-table.hh
@@ -1587,9 +1587,11 @@ template <typename context_t>
   const SubstLookup &l = c->face->table.GSUB.get_relaxed ()->table->get_lookup (lookup_index);
   unsigned int saved_lookup_props = c->lookup_props;
   unsigned int saved_lookup_index = c->lookup_index;
+  if (!c->buffer->message (c->font, "start lookup %d", lookup_index)) return false;
   c->set_lookup_index (lookup_index);
   c->set_lookup_props (l.get_props ());
   bool ret = l.dispatch (c);
+  (void) c->buffer->message (c->font, "end lookup %d", lookup_index);
   c->set_lookup_index (saved_lookup_index);
   c->set_lookup_props (saved_lookup_props);
   return ret;

@behdad
Copy link
Member

behdad commented Jun 15, 2020

Problem is, descended lookups happen at specific points. They are not a sweep over the buffer. Right now our messaging only happens before / after entire sweeps.

The reason I refused to add message calls in more useful places, eg. at the point that actual substitutions happen, is that the buffer is in a state at that time that the contents look junk to the receiver of the message. Same reason applies here. Because the buffer is in the middle of writing output to its "out buffer". That is not expose in the public API.

@simoncozens
Copy link
Collaborator Author

OK. So maybe we cannot do this in Harfbuzz with the current buffer->message implementation. I guess there are three ways forward:

  • Give up - the current tracing is the best we can do.
  • Make the tracing system separate from the buffer object. (But then what do we "hang" the message_func onto?)
  • Do more tracing, but have buffer->message only send the buffer to the receiver when it is safe to do so. i.e.
bool
hb_buffer_t::message_impl (hb_font_t *font, bool buf_viewable, const char *fmt, va_list ap)
{
  char buf[100];
  vsnprintf (buf, sizeof (buf), fmt, ap);
  return (bool) this->message_func (buf_viewable && this, font, buf, this->message_data);
}

I am always in favour of providing as much debugging information as possible, so if it was up to me I would use the third option, but of course it breaks backwards compatibility and it just looks a little hacky. I don't mind personally - I have another plan for what I want to do if we can't do this.

@behdad
Copy link
Member

behdad commented Jun 17, 2020

I agree with you on all those points. Specially, triggering callbacks when things change in the buffer, ie. upon substitutions, would be immensely useful.

Note that currently when we send messages during positioning, the positions in the buffer are not useful to the callback since they are in an internal format. Ie. you cannot render them.

I like your third option in a way. But also feels yucky. Let's keep exploring.

Another option or two:

  • Right now, hb_buffer_get_glyph_infos and hb_buffer_get_glyph_positions never fail. The latter modifies the buffer if buffer doesn't have positions. Ouch. If someone calls that from the callback while we are in susbtitution phase, that screws up shaping. We can detect if buffer is in internal state and return nullptr. That can ONLY happen in message callback, so is no change to the API. I'm actually 100% sure we should do that! The discrepancy re positioning above will remain.

  • If making the callback, we can bring the buffer into a good state. But that can be O(N^2) or more expensive. I don't like that.

@behdad
Copy link
Member

behdad commented Jun 17, 2020

The buffer serialize API also would need to refuse to serialize in those cases.

@simoncozens
Copy link
Collaborator Author

Fixing the buffer to be in a good state when calling the callback seems like the cleanest solution so far; it would enable tracing into substitutions, and give a buffer that works the way clients expect. O(n^2) is bad, but I think people should expect debugging runs to be non-performant - like running Python with a profiler on.

@simoncozens
Copy link
Collaborator Author

We can detect if buffer is in internal state and return nullptr.

I can't work out how to do this. It doesn't seem to be have_output or has_separate_output().

we can bring the buffer into a good state

I can't work out how to do this either.

@behdad
Copy link
Member

behdad commented Jun 19, 2020

I can't work out how to do this. It doesn't seem to be have_output or has_separate_output().

Yes we'll add a new bit somewhere. Depends on how we want to do it:

  1. Always disable buffer access in the middle of a pass. That would be something of a new bit that is set in clear_output and reset in swap_buffers/remove_output. Is going to be hard to verify we get this right in every call site.

  2. Only return null if buffer is in a state that looks inconsistent to the outside world. That is something as simple as !have_output || (out_info == info && out_len == idx).

I think I like the second approach better. We can improve on it later by bringing the buffer in good state and address the limitation.

I can't work out how to do this either.

Quite possibly around calling the callback we we copy buffer arrays into new arrays and temporarily set those on the buffer. I can do this when we get to it. Keeping that extra array around so we don't realloc it each time is also doable, but I'm fine not even doing that initially!

So yeah, I think we can get this working for substitutions. The issue with positioning is still outstanding. But yeah, I'm excited to see what you will do with this feature!

@simoncozens
Copy link
Collaborator Author

But yeah, I'm excited to see what you will do with this feature!

It basically gives me everything I need to write a step-by-step OpenType Layout debugger. Which a very useful thing. :-)

@behdad
Copy link
Member

behdad commented Jun 19, 2020

It basically gives me everything I need to write a step-by-step OpenType Layout debugger. Which a very useful thing. :-)

Of course! That what I had always designed this for, so I don't have to write that debugger! If you demo that I will add messages at every positive action in the buffer.

@simoncozens
Copy link
Collaborator Author

simoncozens commented Jun 19, 2020

Here's the demo: justvanrossum/fontgoggles#62

@norbusan
Copy link

norbusan commented May 5, 2021

Thanks a lot, that looks very promising!!

behdad added a commit that referenced this issue Jun 10, 2021
@behdad
Copy link
Member

behdad commented Jun 10, 2021

Another option or two:

  • Right now, hb_buffer_get_glyph_infos and hb_buffer_get_glyph_positions never fail. The latter modifies the buffer if buffer doesn't have positions. Ouch. If someone calls that from the callback while we are in susbtitution phase, that screws up shaping. We can detect if buffer is in internal state and return nullptr. That can ONLY happen in message callback, so is no change to the API. I'm actually 100% sure we should do that! The discrepancy re positioning above will remain.

I just did this.

  • If making the callback, we can bring the buffer into a good state.

Seeing if I can do this now...

@behdad
Copy link
Member

behdad commented Mar 22, 2022

I'm continuing this in #3495

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants