Wondering what might cause these spacing/kerning issues in a PDF

Hello,
I am currently reading a PDF book, and in the text, there are strange jumblings of the letters like the ones in this screencap:

I don't think things like the collision between /t and /e can be explained by human error. A typist normally couldn't mess up the kerning that is in the font, and at other places in the text, the kerning seems to work. There is perhaps some technical issue that caused this. If you have encountered similar errors when typesetting books, please share what caused them and if it was something with the font or with some other software.

Comments

  • never seen this before. Looks like some output mess; not all te instances are affected. I don’t believe the fault is in the font.
  • Vasil Stanev
    edited March 2022
    @Andreas Stötzner Could be. I had a hunch that the font might have been tinkered with to add some Latin glyphs for transliteration of Sanskrit terms (like the analaya above), and that this might have caused problems on export. This is only guesswork on my part, however.
  • My $10 are on the PDF renderer freestyling. I only recently found out that e.g. the Firefox built-in PDF renderer is in fact a full-fledged JS implementation...
  • @Johannes Neumeier
    The screencap is from Adobe Reader; I opened the book in MS Edge, and the error was there too. And here it is in Adobe Illustrator:

  • John Butler
    Other viewers to try in Windows would include PDF-Xchange Viewer by Tracker Software, which I use as a daily driver instead of Acrobat, and Inkscape PDF import. And something by “Foxit,” I guess.
  • Some of the font spec PDFs from Adobe look like this, too. I thought that happens when the font that is used to render the PDF has different metrics than the one the PDF was written with. I remember that a long time ago, font embedding was rather hit and miss and I saw replaced fonts in printouts a lot.
  • Well IIRC, in early Acrobat/PDF, it was much more common for fonts to not be embedded. File size was a bigger issue, so embedding seemed like a luxury when the built-in substitution fonts would just be used when someone else didn’t have the PDF’s fonts already installed.

    So it’s kinda funny and irritating that some of those official Adobe PDFs from the early ’90s often look terrible now.
  • I’ve seen these kinds of things happen in PostScript fonts that contained components. We always decompose all components before generating final PS/CFF font files.
  • Mark Simonson
    edited March 2022
    The font in the image first posted is Adobe's Garamond Premier Pro. The metrics seem to be intact (aside from the two glitches), but ligatures are disabled. I doubt the problem lies in the font (unless it's been modified).
  • I see 3 glitches:

    yo ucan
    ho tel
    kn ow

    In all 3 cases, /o is involved.
  • On the off chance this was a free book, I looked it up and found it at https://www.pdfdrive.com/the-art-of-disappearing-dhamma-talks-e17589758.html

    Opening the page in Adobe Illustrator shows there are no space characters between the words. Instead, the book is built with extra tracking values added to the glyphs next to where a space is needed. Why the book was saved this way, I haven't a clue, but every now and again throughout the book, the extra tracking is either added or removed in the middle of words, which creates the visibly bad letterspacing.
  • Mark Simonson
    edited March 2022
    It appears to have been generated using QuarkXPress:

    endstream
    endobj
    434 0 obj
    <</Author(Ajahn Brahm)/CreationDate(D:20110823122036Z)/Creator(QuarkXPress\(R\) 7.5)/EBX_PUBLISHER/Wisdom#20Publications/GTS_PDFXConformance(PDF/X-1a:2001)/GTS_PDFXVersion(PDF/X-1:2001)/ModDate(D:20120206175227-06'00')/Producer(QuarkXPress\(R\) 7.5)/Title(The Art of Disappearing)/Trapped/False/XPressPrivate(%%DocumentProcessColors: Cyan Magenta Yellow Black\n%%EndComments)>>
    endobj
    xref
  • ...

    Opening the page in Adobe Illustrator shows there are no space characters between the words. Instead, the book is built with extra tracking values added to the glyphs next to where a space is needed. Why the book was saved this way, I haven't a clue, but every now and again throughout the book, the extra tracking is either added or removed in the middle of words, which creates the visibly bad letterspacing.
    There is a reason why Illy should never be used for opening PDFs. What you are seeing is a problem with Illustrator, not the PDF. Try just copying that paragraph and pasting it into a text editor.

    If that PDF was used for print, an editor should have caught the problem.
  • Cory Maylett
    edited March 2022
    Mike Wenzloff said:
    There is a reason why Illy should never be used for opening PDFs. What you are seeing is a problem with Illustrator, not the PDF. Try just copying that paragraph and pasting it into a text editor.

    If that PDF was used for print, an editor should have caught the problem.

    I didn't adequately explain myself. The problem is obviously in the PDF, since the same odd spacing shows up both in various PDF readers and in Illustrator. In the unaffected blocks of text, actual space characters are used instead of tracking values.

    Illustrator just chooses to display the problem as a tracking issue where it substitutes tracking values for the missing space characters. Affinity Publisher and Designer, for what it's worth, run into the PDF problem too, but Affinity seems to choose simply to close up the missing spaces rather than substituting approximated tracking.

    An oddity, as you mentioned, is that the problem doesn't show up when the text is copied from the PDF and pasted into a code editor, but it does show up in MS Word, which, unlike a code editor, attempts to display some of the formatting.

    This observation doesn't solve the question of where the problem in the PDF originated, but it does help narrow down the possibilities.

    What originally caused the problem in the PDF, I don't know. As @Mark Simonson found, the document was originally laid out in QuarkXPress, so it might trace back to that. However, Quark has had over 30 years to work out these glitches, so I suspect there's more to it. I suspect the problem has something to do with some unknown step in the procedure from the original Quark layout to the final downloaded PDF.
  • I took a look at the PDF. The problems are rare and inconsistent; most of the time a word is fine, but very occasionally it shows the problem. When I cut and paste (from OS X Preview to Sublime Text), there are no spaces between the words. In the PDF, all the lines are laid out with PDF "TJ" operators, with every single glyph positioned individually on the line. This seems like a very unusual way to lay out text in PDF. There are very few runs of "naturally" positioned text (Tj operator). So I am inclined to blame QuarkXPress.
  • Perhaps the PDF was created via a PostScript-to-PDF process and something went wrong?
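
For anyone who wants to see the mechanism discussed above in miniature, here is a rough Python sketch of how word gaps can live in a TJ array as position adjustments instead of space characters. It is only an illustration under stated assumptions: the arrays and the -200 threshold are made up and not taken from the book's content stream, and `infer_text` is a hypothetical helper, not part of any PDF library. What it relies on from the PDF specification is that numbers in a TJ array are in thousandths of a unit of text space and are subtracted from the current position, so in horizontal writing a negative number opens a gap.

```python
from typing import List, Union

# One element of a PDF TJ array: either a string of glyphs or a numeric
# position adjustment in thousandths of a unit of text space.
TJItem = Union[str, int, float]


def infer_text(tj_array: List[TJItem], gap_threshold: float = -200) -> str:
    """Rebuild readable text from a TJ-style array that has no space glyphs.

    A number at or below gap_threshold is treated as a word gap. The
    threshold is a hypothetical heuristic chosen for this illustration.
    """
    parts = []
    for item in tj_array:
        if isinstance(item, str):
            parts.append(item)
        elif item <= gap_threshold:
            # Negative adjustments move the next glyph to the right,
            # so a large negative value looks like a space on the page.
            parts.append(" ")
        # Smaller adjustments are ordinary kerning and add no character.
    return "".join(parts)


# Hypothetical arrays, not copied from the book's content stream.
intended = ["y", "o", "u", -250, "c", "a", "n"]
glitched = ["y", "o", -250, "u", "c", "a", "n"]

print(infer_text(intended))
print(infer_text(glitched))
```

The first call prints "you can" and the second prints "yo ucan": moving the single large adjustment one slot to the left is enough to reproduce the kind of glitch listed earlier in the thread, without any change to the font or its kerning.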