Proof of concept: Geheimsprache
yanone
Posts: 132
Want to put this up for discussion: http://yanone.de/typedesign/code/geheimsprache/
Is it of any use?
Comments
-
it's a fun idea! if you're interested in crypto and want to learn about how to break your own implementation, i suggest you look at the matasano crypto challenges.
i suspect the value in geheimsprache is primarily in obfuscating email addresses and telephone numbers so they won't be crawled but can be viewed by regular folks. the challenge is that the 'attacker' has access to the rebuilt font (and the table), but even if they didn't and only had access to the ciphertext, you're still victim to the predictability of the language you're communicating in. sign up for the challenge, you'll be surprised how fun this is. seriously!
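To make the "predictability of the language" point concrete, here is a minimal sketch. It assumes the scrambled font behaves like a fixed one-to-one letter substitution; the `make_substitution` helper, the seed, and the sample sentence are all made up for illustration and are not taken from the Geheimsprache code. Because a fixed substitution preserves letter frequencies, the most frequent symbol in the ciphertext is simply the image of the most frequent plaintext letter, which gives an attacker a first foothold without ever seeing the font.

```python
from collections import Counter
import random
import string


def make_substitution(seed):
    """Hypothetical stand-in for a shuffled font encoding:
    a fixed one-to-one mapping between lowercase letters."""
    rng = random.Random(seed)
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    rng.shuffle(shuffled)
    return dict(zip(letters, shuffled))


def encrypt(text, table):
    """Substitute each letter; leave spaces and punctuation untouched."""
    return "".join(table.get(ch, ch) for ch in text.lower())


def most_common_symbol(text):
    """Return the most frequent alphabetic character in the text."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return counts.most_common(1)[0][0]


# Illustrative sample text (an assumption, not real site content).
plaintext = ("the attacker sees only the scrambled text, "
             "yet the letters keep their frequencies")
table = make_substitution(seed=1)
ciphertext = encrypt(plaintext, table)

# Frequencies survive the substitution, so the top symbol leaks a mapping.
top_plain = most_common_symbol(plaintext)
top_cipher = most_common_symbol(ciphertext)
print(top_cipher, "stands for", top_plain)
```

With longer texts the same counting, extended to pairs of letters and common short words, is usually enough to recover most of the table.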
i suspect you're right that most crawlers won't go to the effort, but your friends might groan when they copy and paste and try to send an email to ҸÏ×╥╡ĪңЬOõңOÏaO╇ĪĪΞÏңЬụ.
i think if i were publishing a blog that contained sensitive topics or a diary i wanted my friends to be able to read, but that i wanted to be uncrawlable, i might go in for this, but i might also be irritated a few years down the road if the only copy is an Internet Archive post of gibberish when my server goes belly up. who knows. i could easily see publishers thinking this was a great idea for stymying instapaper-like products until the support emails came in.
-
Or simply a way to create better looking captchas.
-
What my brother just told me is making the idea look really stupid:
If you print that page to a PDF, you can select and copy/paste that text and it will copy as human readable text, not the transmitted garbage.
Tried in Acrobat Reader and Mac’s Preview.app.
Do they have OCR built in?
-
That's awesome. There's probably a much simpler explanation than OCR. Just a guess, but I'd assume that PDFs are being encoded with something like the CID table or the glyph names rather than Unicode values.
-
Mmmmm... interesting...... let's see....
- In the scrambled webfont, the Unicode values are shuffled (but the glyph names remain correct).
- In the generated PDF, the subsetted/embedded font is correctly encoded again.
How can that happen?
I'm guessing that the software that creates the PDF is using the glyph names to re-encode the auto-generated subsetted/embedded font that ends up inside the PDF.
I doubt it has anything to do with OCR.
If you try shuffling both the Unicode values AND the glyph names, the PDF-generating algorithm should not be able to re-encode the font... I guess....
-
Yes, scramble the glyph names as well.
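The glyph-name guess from the posts above can be illustrated with a toy model. This is only a sketch under stated assumptions: a "font" is reduced to a dictionary from transmitted character to glyph name, `AGL` is a three-entry stand-in for a glyph-name-to-Unicode list in the style of the Adobe Glyph List, and `extract_text` is a hypothetical model of a PDF text extractor that trusts glyph names. It does not describe how Acrobat Reader or Preview.app are actually implemented.

```python
# Tiny stand-in for a glyph-name -> Unicode lookup (Adobe Glyph List style).
AGL = {"a": "a", "b": "b", "c": "c"}


def extract_text(transmitted, cmap):
    """Hypothetical extractor: look up each character's glyph name and
    map the name back to Unicode; fall back to the raw character when
    the name is not recognised."""
    return "".join(AGL.get(cmap[ch], ch) for ch in transmitted)


# Scrambled code points, but honest glyph names (assumed first setup).
honest_names = {"x": "a", "q": "b", "z": "c"}

# Scrambled code points AND meaningless glyph names (the suggested fix).
scrambled_names = {"x": "glyph001", "q": "glyph002", "z": "glyph003"}

transmitted = "qxz"  # what the page sends; it renders as "bac"

recovered = extract_text(transmitted, honest_names)
still_hidden = extract_text(transmitted, scrambled_names)

print(recovered)     # glyph names give the readable text away
print(still_hidden)  # with renamed glyphs only the garbage comes back
```

In this model the readable text leaks purely through the glyph names, which is why renaming them closes that particular route. It would not stop actual OCR on the rendered page.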