Can stolen fonts be found on the web based on Bezier data of their outlines?
Vasil Stanev
Posts: 775
A non-designer asked me this question while trying to figure out what my job was.
Comments
-
Most web pages send their text to a browser as character codes which are rendered in a font installed on the computer viewing the web page. These cannot give you the data for any font which is not already installed on your computer.
There are some websites which use WOFF format fonts which are sent to the browser in order to render a page exactly as the designer specified. In theory this data could be intercepted and copied.
So I think the answer to your question is yes: in some circumstances WOFF font data can be intercepted, but it would be complicated to do so.
And of course the font data can be retrieved from PDF files which have embedded font data.
-
Technically it's easy, you could do the following:
1. get the font URL from the HTML page. You'll need to parse the CSS @font-face rules
2. download the font
3. convert font
4. open the font in fontTools
5. make a hash for each glyph's outlines in the downloaded font
6. Same as step 5 but for a local font you want to match
7. compare the hashes between fonts
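The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the function names (`font_urls`, `outline_hash`, `glyph_hashes`, `shared_outline_ratio`) are made up for this example, the CSS parsing is a naive regex rather than a real CSS parser, and the hash only catches outlines that are exactly identical. fontTools can usually open WOFF directly (WOFF2 needs the brotli module), so the conversion step may not be necessary. Hashes are compared by value rather than by glyph name, since web fonts often have their glyph names stripped or changed.

```python
import hashlib
import re

# Naive patterns: good enough for a sketch, not a full CSS parser.
FONT_FACE_RE = re.compile(r"@font-face\s*\{[^}]*\}", re.IGNORECASE)
URL_RE = re.compile(r"url\(\s*['\"]?([^'\")]+?)['\"]?\s*\)", re.IGNORECASE)


def font_urls(css_text):
    """Step 1: collect every url(...) found inside @font-face rules."""
    urls = []
    for rule in FONT_FACE_RE.findall(css_text):
        urls.extend(URL_RE.findall(rule))
    return urls


def outline_hash(pen_commands):
    """Step 5: hash one glyph outline given as (operator, operands) pairs."""
    canonical = repr(list(pen_commands)).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def glyph_hashes(font_path):
    """Steps 4-5: open a font with fontTools and hash every glyph outline."""
    # Imported lazily so the pure helpers work without fontTools installed.
    from fontTools.pens.recordingPen import RecordingPen
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    glyph_set = font.getGlyphSet()
    hashes = {}
    for name in font.getGlyphOrder():
        pen = RecordingPen()
        glyph_set[name].draw(pen)
        hashes[name] = outline_hash(pen.value)
    return hashes


def shared_outline_ratio(hashes_a, hashes_b):
    """Step 7: fraction of outlines in font A that also occur in font B."""
    if not hashes_a:
        return 0.0
    pool = set(hashes_b.values())
    return sum(h in pool for h in hashes_a.values()) / len(hashes_a)
```

Empty glyphs such as the space hash identically in every font, so a real comparison would probably want to skip them before computing the ratio.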
I don't recommend doing this. Most EULAs probably forbid downloading fonts locally. Also, what are you going to say to someone who's stolen fonts? You've technically scraped their web pages which is also a legal grey area.
People who pirate stuff wouldn't purchase it in the first place. It is best to enjoy life instead.
-
A less intrusive approach:
1. Take a screenshot of the site which is using suspect fonts
2. swap the fonts using something like https://fontdragr.js.org/
3. Take a screenshot of the site using the swapped fonts
4. Compare the two images by counting pixel differences.
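The pixel-counting step could look something like this. It is a sketch, not a finished tool: `count_pixel_differences`, `difference_ratio` and `load_pixels` are names invented here, the `tolerance` parameter is an assumption added to absorb small antialiasing differences, and loading the screenshots assumes the third-party Pillow library is installed.

```python
def count_pixel_differences(pixels_a, pixels_b, tolerance=0):
    """Count pixels whose channels differ by more than `tolerance`."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("screenshots must contain the same number of pixels")
    differing = 0
    for pa, pb in zip(pixels_a, pixels_b):
        if any(abs(ca - cb) > tolerance for ca, cb in zip(pa, pb)):
            differing += 1
    return differing


def difference_ratio(pixels_a, pixels_b, tolerance=0):
    """Fraction of pixels that differ; 0.0 means the screenshots match."""
    if not pixels_a:
        return 0.0
    return count_pixel_differences(pixels_a, pixels_b, tolerance) / len(pixels_a)


def load_pixels(path):
    """Read a screenshot into a flat list of (R, G, B) tuples using Pillow."""
    from PIL import Image  # third-party; imported lazily

    with Image.open(path) as img:
        raw = img.convert("RGB").tobytes()
    return [tuple(raw[i:i + 3]) for i in range(0, len(raw), 3)]
```

A ratio close to zero between the two screenshots would suggest the swapped-in font renders the same as the suspect one. Both screenshots need the same viewport size for the comparison to mean anything.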
Could be automated using something like Selenium WebDriver.
-
Marc Foley said: People who pirate stuff wouldn't purchase it in the first place. It is best to enjoy life instead.
BTW to circumvent any legal aspects you could run the software on a server in a country that's not a signatory to any of that. But anyway: Ethical > Legal
-
FWIW, nearly EVERY recent release at MyFonts generally shows up for free download within 24-48 hours, having been reverse engineered from the web font preview system there, as if there is a bot designed just to do this.
-
Interesting.
And I guess if that does NOT happen to your font that's in fact the worst news. :->
-
It is the worst news. 3 months to get the font in MyFonts and within 2 hours I see it on vkontakte, a little later on some Arab site, and the genie is out of the bottle, no pun intended. The current MF model is dead to me. As I explained elsewhere.
-
Yes, that's bad.
But what I meant is: if "nearly every recent release" is getting pirated but yours isn't, it must be a really bad design. :-/
-
Stuart Sandler said: FWIW, nearly EVERY recent release at MyFonts generally shows up for free download within 24-48 hours, having been reverse engineered from the web font preview system there, as if there is a bot designed just to do this.
I've noted this about MyFonts twice; the last time was in March of this year. Emailed them multiple times. They just ignore it.
-
Interesting thread! I'm working on exactly such a tool that finds and identifies webfonts. (That is, when I'm not working on one of the other 107 side projects.)
Reliably fingerprinting fonts (even when Bézier data has changed) while staying within the webfont's EULA didn't turn out to be the biggest problem. Doing this for a meaningful number of websites, new or old, is where the challenge lies. Luckily (?) it's a problem that can be easily solved by throwing hardware, and thus money, at it.
-
@Roel Nieskens meet @Lars Schwarz, who's already created such a tool.
-
@Lars Schwarz @Stuart Sandler Nice, is it public? I can't find a link from Lars' profile.
-
Even nicer would be to have per-purchase meta info in the font and see not where your pirated fonts show up, but where the ripped-off versions originate from. Of course that's a moot point for any sales through retailers or for rips based on tracing outlines instead of outright copying.
-
I've long thought that any sold copy of a font should have a serial number (and at least people who don't know how to remove/change it would leave a trail).
-
I'd even suggest that the unique serial number be embedded through a steganographic process during the sale, making it nearly impossible to locate and change the serial number.
Much of the future, I'm afraid, lies in something akin to the Adobe Fonts model: never really getting direct access to the font at all except via a highly controlled, synced, obscured, temporary and conditional rental tied directly to software or an online account that keeps track of everything. It's not that this kind of thing can't be cracked, but casual theft and sharing become much more difficult and risky.
-
I remember Chuck Davis of Letterhead Fonts did a dynamic serialization of font data in realtime during purchase with FontGuard - https://www.letterheadfonts.com/piracy/fontguard.php
@Roel Nieskens I assumed Lars would be responding to this thread directly soon.
-
I don't know if embedding serials will solve anything but here's how you could do it. Scan for straight vertical or horizontal lines in an unimportant glyph like notdef or logicalnot and encode redundant points along the lines in a sort of barcode pattern. You'd need a certain minimum length of line to pull this off. In cases like grunge fonts where no long lines exist you might have to inject a benign looking glyph like a blockdraw character or something. Potential problem: web fonts are often subset. I think underscore is often included in subsets. Maybe | or [ ? The redundant points shouldn't cause rendering issues. If you can manage a line that's over 128 units in length, you could encode 64 bits (unsigned) which would allow serials up to 18,446,744,073,709,551,615. I'm not confident that that number is correct!
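A minimal sketch of the barcode idea, under assumptions that are not part of the description above: each bit gets a slot of 2 font units along the line, a redundant on-curve point at the odd offset in that slot means 1, no point means 0, and bit 0 sits nearest the start of the line. The names `encode_serial` and `decode_serial` are invented for this example. It also checks the arithmetic: the largest unsigned 64-bit value is 2^64 - 1, which is 18,446,744,073,709,551,615.

```python
BITS = 64
SPACING = 2  # font units per bit slot (an assumption for this sketch)


def encode_serial(serial, start=0):
    """Return offsets along the line where redundant points are inserted."""
    if not 0 <= serial < 2 ** BITS:
        raise ValueError("serial must fit in 64 unsigned bits")
    return [start + SPACING * i + 1 for i in range(BITS) if (serial >> i) & 1]


def decode_serial(offsets, start=0):
    """Recover the serial from the offsets of the redundant points."""
    serial = 0
    for offset in offsets:
        slot, remainder = divmod(offset - start - 1, SPACING)
        if remainder != 0 or not 0 <= slot < BITS:
            raise ValueError(f"offset {offset} is not a valid bit position")
        serial |= 1 << slot
    return serial


MAX_SERIAL = 2 ** BITS - 1  # 18,446,744,073,709,551,615
```

With this layout the highest possible offset is 127, so the points for all 64 bits fit strictly inside a line a little over 128 units long, leaving its real endpoints untouched. Whether the extra collinear points survive a given export or optimization pipeline would need to be checked per tool.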
-
You could also use TrueType instructions to embed any (encoded) data among the assembly code. Or through entries in a bogus lookup table. Fun stuff!
Plenty of ways to do this, but if it isn't accompanied by the threatening language in the Fontguard thing (and isn't actually followed up on), I wonder if this'd really accomplish anything. A friendly nudge to the user of the pirated font might accomplish more.
-
Ray Larabie said: Scan for straight vertical or horizontal lines ....
Reminds me of the time I encoded my initials into the node coordinates of a font for a client that didn't allow me to advertise my contribution. :-)
-
Great ideas all around but it comes back to the FontGuard thing. How could a foundry or distributor announce to its customers, "we've encoded the font on the fly with your name, so if it gets leaked, we're coming after you legally," without harming the buyer/seller relationship? Is there a good way of messaging this?
-
If you're enough of a sleazebag you could always deploy the cheery forked-tongue language that's de rigueur these days. Something like this:
"
Great news! We've personalized your font with your name, just so in case a copy of it ends up in the wrong hands we'll be right there to make sure everybody is protected. And don't forget to sign up for the $99/month* piracy insurance!
* Rate subject to change no earlier than five calendar days after enrollment.
"
And in case you think I'm kidding:
-
Lots of things come with serial numbers, so I don't think using them would create customer relation problems. Most software I use has serial numbers, which also, in some ways function as account numbers. If I want free updates, discounts or change my email address, I'll often need to supply these numbers.
A major impediment to this, however, is that most type foundries are small operations that sell most of their fonts through various distributors. Most of us aren't in the position to encode each copy of a font with a unique and obfuscated serial number, and none of the distributors are set up to sell serialized copies anyway.
Another problem is recourse. Taking legal action against someone for illegally using a pirated $30 font just isn't practical. Anyway, US copyright laws for fonts are so bassackwards that I'm not sure any longer if legal recourse is even possible.
Assuming it is possible, it would take a coordinated effort to serialize individual copies of fonts among all the distributors, then for them to collectively go after the free font sites that make rampant piracy easy. A few multi-million-dollar judgments against some of the free font distributors would put most out of business and force the others to take serious steps to keep commercial fonts off their sites.
Even then, however, enforcement would be difficult. Take Pirate Bay, for example — the site that provides a way to illegally distribute movies, music, games and most everything else via BitTorrent. They get shut down and simply move to a different server at a different location. It's been a Whack-a-Mole situation for years.
At best, I think the problem could be substantially mitigated, but not eliminated, at least not without a radical rethink of the way fonts are distributed, like, for example, the way that Adobe has done.
-
One potential problem with serial numbers or other per-user personalization in fonts is that people use font fingerprints for version matching across users.
Suitcase Fusion has their FontSense IDs, intended to match fonts. I think it would be desirable if the “same” font still had a FontSense match even when there is a different serialization/personalization involved. At the very least, foundries should be aware of whether their chosen solution causes their every font to show up as “different” to FontSense.
Note that FontSense is based on checksums of particular font data/tables, but not of everything in the entire font. Therefore it is at least conceivable one could have a serialization scheme that does not mess with the FontSense ID.
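To illustrate the general principle only (this is not the FontSense algorithm, which isn't described here), a fingerprint computed over a chosen subset of tables stays stable when a serial is stored in a table outside that subset. The function name `table_fingerprint`, the choice of SHA-256, and the table tags in the usage note are assumptions made for this sketch; `tables` is simply a mapping from table tag to raw table bytes.

```python
import hashlib


def table_fingerprint(tables, included_tags):
    """Hash only the selected tables, in sorted tag order."""
    digest = hashlib.sha256()
    for tag in sorted(included_tags):
        data = tables.get(tag, b"")
        # Include tag and length so table boundaries cannot blur together.
        digest.update(tag.encode("ascii"))
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()
```

For example, two copies whose `name` tables differ (say, because a license ID was written there) would still match on a fingerprint over `("cmap", "glyf")`, while a serial hidden in the outlines themselves would change it.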
-
I had this idea years ago: Just claim that there is a unique ID (or even the user's credit card number) hidden in the font in a way that's impossible to detect. No need to actually do it. Probably a bad idea to lie to your users, though. The whole font business, such as it is, is based mostly on trust.
-
Mark Simonson said:
The whole font business, such as it is, is based mostly on trust.
Most font buyers don't read the EULA, and have no idea who made the font. To me the font business is based mostly on fashion.
Not lying however is a personal decision (and a good one). Although this does include not ending your prices with a "9" for example.
-
What I mean is, when someone buys a license to a font, they receive font files. I am trusting them not to share them, since there is very little I can do to stop them and almost no way to know if they do.
-
Lately I've been thinking about releasing my next typeface exclusively on Adobe Fonts and letting people who want extended licenses contact me directly. Seems like a good way to limit piracy.
-
@Mark Simonson But knowing full well that many of them share, we still don't stop selling them.
-
Right, I'm depending on the users who pay. I do all I can to keep them happy. I don't waste time on the freeloaders.