Add support for font files #2516

Closed
larsenwork opened this issue Jul 21, 2015 · 20 comments

@larsenwork

There are many (and popular) open source font projects on GitHub, but currently font projects get classified as Shell, Python, etc. projects because the source files for the fonts aren't recognised.

There are two formats primarily used for open source fonts:

  • UFO where the files have a .glif extension example
  • SFDIR where the files have a .glyph extension example

Type: would it be possible to add "font" as a type similar to how e.g. "programming" is a type - if not then I'm unsure which type category they belong to?

@glyphobet

+1

@arfon
Contributor

arfon commented Jul 22, 2015

Type: would it be possible to add "font" as a type similar to how e.g. "programming" is a type - if not then I'm unsure which type category they belong to?

Linguist currently defines 'data', 'programming', 'markup', 'prose', or 'nil': https://github.com/github/linguist/blob/master/lib/linguist/languages.yml#L3

I agree none of these are a great fit but it also doesn't feel right to define a font type. Personally I'm leaning towards data but this would mean that it wouldn't be included in the language stats for the repository (and also wouldn't be indexed by search). It would stop them being misclassified as Shell though...

@larsenwork
Author

@arfon What about a graphic category (could include svg's, png's and whatnot too) ?

@larsenwork
Author

And just to clarify - the actual font files aren't currently being classified, but you often have some shell/python or other files in your font project too that get recognised.

@arfon
Contributor

arfon commented Jul 22, 2015

And just to clarify - the actual font files aren't currently being classified, but you often have some shell/python or other files in your font project too that get recognised.

OK thanks for the clarification.

Through the BlobHelper class Linguist knows a little about images already (including PNGs).

What about a graphic category (could include svg's, png's and whatnot too) ?

I'm not in love with that idea to be honest but it's definitely not a terrible suggestion 😄. It would be helpful for me to know what you would like to see accomplished here. Is it the ability to discover font files on GitHub, have them accurately rendered/syntax-highlighted on GitHub, contribute to the language bar statistics or all of the above?

There's been some prior discussion on SVGs here and here which might be useful for context.

@larsenwork
Author

Rendering + syntax highlighting isn't that important as you rarely edit and look at the glyph data "directly" but almost always use a font editor of sorts.

  • It's the ability to discover/browse
  • contribute to language bar statistics.

You could easily make the case that fonts and thus font files are mini programs so the programming type could be fitting too.

What I'd also like (but that's another issue in itself) is to have categories/tags of sorts, to make it easy to find all font projects regardless of what language/format they are in.

@davelab6

Off the top of my head:

extension  file type
.mf        metafont native format
.sfd       fontforge native format
.sfdir     fontforge native directory format
.glyph     fontforge native directory glyph format
.xgf       xgridfit, fontforge xml hinting markup
.glyphs    glyphs native format
.ufo       robofont native directory format
.glif      ufo glyph format
.plist     apple metadata format used by ufo
.vfb       fontlab v1-5 native format
.fog       fontographer native format
.ttx       fontTools xml representation of opentype binary data
.otf       opentype binary
.ttf       truetype/opentype binary
.ttc       truetype/opentype binary
.dfont     mac binary
.svg       svg web font format
.woff      w3c web font format
.woff2     w3c web font format
.eot       microsoft web font format
.vtp       microsoft native font format

cc @twardoch who may know more

@glyphobet

extension  file type
.asfd      FontForge compressed SFD
.pfa       Adobe Printer Font ASCII
.pfb       Adobe Printer Font Binary
.afm       Adobe Font Metrics (probably legacy)
.suit      Classic Mac OS font "suitcase"
.ps        PostScript

There are more here, but a lot are legacy: http://fileinfo.com/filetypes/font

@larsenwork
Author

@davelab6 wouldn't it be a good idea to discriminate between "source" files (type: programming), and then "binaries" (ttf, otf etc.)?

Also not sure if e.g. ufo and sfdir can be recognised as they are technically just folders?

@grzegorzrolek

I don’t think it would make much sense to put the multitude of source files that font binaries are built with these days under a common type font. Not only are these, after all, quite distinct, if somewhat competing, ‘languages’, be it proper languages like MetaFont or PostScript, XML-based wrappers, plain lists of data, binaries in some form or another, or whatever… Such a label is also more of a ‘resource’ type than a source file type.

It all makes it similar to how regular software projects are built, and as with other projects each and every source file should be classified by the particular ‘language’ it is written in, not by the type of target product it happens to be compiled into. That’s because particular people, or for that matter particular font editors, write code with particular languages. It’s only the various target font binaries that, if stored in the repo, could have their type recognised as font, just as various image files Arfon has pointed to have their type recognised as image.

@davelab6

I agree with @grzegorzrolek but if GitHub is unwilling to recognise each item in the table, a single group item for all would be better than nothing :)

GitHub also refuses to add the SIL Open Font License to the license picker dropdown because fonts are now too niche for their scale of operations :(

@grzegorzrolek

Well, I don’t think there’s a need to recognise all formats listed, as only a fraction of these are really distinct or meaningful as far as Linguist is concerned. Most of those listed are:

  • more or less static data contained in popular wrapper formats (.glif, .plist, .glyphs, .svg, .ttx, etc.)
  • directories of other types of files, including others on this list (.ufo, .sfdir)
  • proprietary source binaries (.fog, .vfb, etc.)
  • the industry target binaries, or the ‘font resource’ blobs (.ttf, .otf, .woff’s, .eot, .pfa or .pfb, .dfont, .suit, etc.).

This leaves us with a handful of file types for Linguist to consider, really. It’s MetaFont (.mf), FontForge sources (.sfd and thus .glyph, but these are still more like custom data-containers), and various PostScript-derived formats in their readable form (let’s say .ps’s, if PostScript as such isn’t supported already). I would also add Adobe’s OpenType feature file to the list (commonly .fea), as well as various Apple’s AAT input files (.atif, or the .mif, .jif, .kif family), or the SIL Graphite Description Language (.gdl). That, I think, would be about it for the time being, but I might have missed something.
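
The grouping above could be sketched as a simple lookup table. This is only an illustration in Python of the buckets described in this comment: the bucket labels and the `bucket_for` helper are made up for the example and are not part of Linguist.

```python
from pathlib import PurePosixPath

# Buckets follow the grouping in the comment above.
# The bucket labels themselves are hypothetical names chosen for this sketch.
BUCKETS = {
    "wrapped data": {".glif", ".plist", ".glyphs", ".svg", ".ttx"},
    "directory": {".ufo", ".sfdir"},
    "proprietary source binary": {".fog", ".vfb"},
    "target binary": {".ttf", ".otf", ".woff", ".woff2", ".eot",
                      ".pfa", ".pfb", ".dfont", ".suit"},
    "candidate language": {".mf", ".sfd", ".glyph", ".ps", ".fea",
                           ".atif", ".mif", ".jif", ".kif", ".gdl"},
}


def bucket_for(path):
    """Return the bucket label for a path's extension, or None if unlisted."""
    suffix = PurePosixPath(path).suffix.lower()
    for bucket, extensions in BUCKETS.items():
        if suffix in extensions:
            return bucket
    return None
```

Only the last bucket would be of interest to Linguist under this reading; everything else is either data in a generic wrapper, a directory, or a binary.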

If more font projects used ‘proper’ build systems with more of those ‘source’ kinds of files, maybe it would help these kinds of projects be distinct enough to make their sources more interesting to parse and make statistics of. For now, though, it looks like most font-production people either aren’t savvy enough or they just don’t care, and simply dump into a commit whatever they happen to have in their working directory, including duplicated data scattered through different formats, intermediate files, various blobs and binaries… all of which of course does not necessarily make it any more useful or interesting, most often to the contrary.

@larsenwork
Author

@grzegorzrolek please keep off-topic rants out of this 😄

To me:

  • Human-readable source files such as .glyph, .glyphs, .glif, ... are what I think is most important to recognise since they represent source data that others can inspect and edit.
  • Binaries (.ttf, .otf, etc) are not that interesting - GitHub is about the code (to me)
  • Folders (.ufo, .sfdir etc) are not that interesting either. If they were to be the only ones recognised, a font project with only one weight and a couple of scripts would be recognised by the scripts.

@arfon
Contributor

arfon commented Jul 28, 2015

Thanks for the extra information @larsenwork and @grzegorzrolek. I've been struggling to know exactly how to proceed here but your last comment @larsenwork has cleared up some of my thinking.

Firstly, Linguist is primarily a (programming) language detection library. Being able to detect the syntax contained within a file is important when trying to render the contents of the file in the browser (i.e. with the correct syntax-highlighting).

While we do some rudimentary stuff to identify other types of files e.g. images this information isn't really surfaced anywhere in GitHub.com (i.e. as a browse/discover experience to find 'images').

To your points @larsenwork (in reverse order):

  • Folders (.ufo, .sfdir etc) are not that interesting either. If they were to be the only ones recognised, a font project with only one weight and a couple of scripts would be recognised by the scripts.

Unfortunately I think this is a non-starter. Linguist works on a file-by-file basis and so we can't do anything smart with a folder, only the (file) contents within.
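
To illustrate the file-by-file constraint: a classifier that only ever sees individual file paths can key on each file's own extension, and a `.ufo` or `.sfdir` directory name is just another path component. The following is a Python sketch, not Linguist's actual code; `classify_blob`, `classify_tree` and the language names in the table are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical extension table; the language names are invented for this sketch.
EXTENSION_TO_LANGUAGE = {
    ".glif": "UFO glyph",
    ".glyph": "FontForge glyph",
    ".sh": "Shell",
    ".py": "Python",
}


def classify_blob(path):
    """Classify one file by its own extension, ignoring parent directories."""
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_TO_LANGUAGE.get(suffix)


def classify_tree(paths):
    """Classify every file independently; there is no per-directory step."""
    return {path: classify_blob(path) for path in paths}
```

With a table like this, `MyFont.ufo/glyphs/a.glif` is classified through its `.glif` extension, while the enclosing `MyFont.ufo` folder never reaches the classifier at all.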

  • Binaries (.ttf, .otf, etc) are not that interesting - GitHub is about the code (to me)

We actively avoid doing anything with things like binary blobs. Given that they're not human-readable or index-able by our search infrastructure I don't think we can do anything with binary font files here either.

  • Human-readable source files such as .glyph, .glyphs, .glif, ... are what I think is most important to recognise since they represent source data that others can inspect and edit.

This makes the most sense to me. I'm still most in favour of defining some new Languages for these last few types in languages.yml and marking them as data which would make them syntax-highlight correctly (see how XML files look for example) but this doesn't solve the issue of not having them be discoverable in GitHub explore.

You could easily make the case that fonts and thus font files are mini programs so the programming type could be fitting too.

I hear you 😄 - we could mark them as programming thus making them first class citizens in the GitHub experience.

Finally, for completeness, you can currently find font files with searches like this:

/ cc @github/languages @pchaigno @larsbrinkhoff in case they have strong opinions here

@larsenwork
Author

@arfon sounds good
If they are marked as data will it count in the language statistics? This is probably most important to me - see e.g. https://twitter.com/larsenwork/status/625981377310248960 why.

ttf files
otf files

Didn't know this but not really useful though :)

@arfon
Contributor

arfon commented Jul 28, 2015

If they are marked as data will it count in the language statistics? This is probably most important to me - see e.g. https://twitter.com/larsenwork/status/625981377310248960 why.

No, unfortunately not. I meant to write that in my last comment. Perhaps the best option is to go with programming.

@aroben
Contributor

aroben commented Jan 12, 2016

Another option is to classify .glyph/.glif as markup. That is included in the repository language statistics. And I think it's a better fit than programming since these files don't contain executable code.
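
The effect of the type on the language bar, as described in this thread, could be modelled roughly as follows. This is a Python sketch only: `COUNTED_TYPES` and `language_stats` are hypothetical names, the language name "Glyph" is invented, and sizes are assumed to be in bytes. The thread states that data is left out of the statistics while programming and markup are counted; treating every other type as excluded is an assumption of the sketch.

```python
from collections import defaultdict

# Per the thread: programming and markup count towards the statistics,
# data does not. Anything else is treated as excluded here (an assumption).
COUNTED_TYPES = {"programming", "markup"}


def language_stats(files):
    """Return the percentage share per counted language.

    files: iterable of (language, language_type, size_in_bytes) tuples.
    """
    totals = defaultdict(int)
    for language, language_type, size in files:
        if language_type in COUNTED_TYPES:
            totals[language] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {lang: 100 * size / grand_total for lang, size in totals.items()}


# The same hypothetical repository with the glyph sources typed two ways.
as_data = [("Glyph", "data", 9000), ("Shell", "programming", 1000)]
as_markup = [("Glyph", "markup", 9000), ("Shell", "programming", 1000)]
```

Under these assumptions, `language_stats(as_data)` reports the repository as entirely Shell, whereas `language_stats(as_markup)` attributes most of it to the glyph sources, which is the difference the font authors in this thread care about.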

@inoas

inoas commented Sep 24, 2016

👍

@stale

stale bot commented Nov 6, 2018

This issue has been automatically marked as stale because it has not had activity in a long time. If this issue is still relevant and should remain open, please reply with a short explanation (e.g. "I have checked the code and this issue is still relevant because ___."). Thank you for your contributions.

@stale stale bot added the Stale label Nov 6, 2018
@stale

stale bot commented Nov 20, 2018

This issue has been automatically closed because it has not had activity in a long time. Please feel free to reopen it or create a new issue.
