I'm working on a Python library that attempts to report on language support given a character set.
From an amateur perspective, it seems like language support is composed of:
My library focuses on the former aspect (as it's more easily detected and isn't file-format dependent). I've gotten to the point where I want to fill in some built-in character sets, and I have a couple of rather pedantic questions to float to those of you who actually curate language support lists.
I'd also be happy to hear how anyone else addresses this and, realizing that it may be quite subjective and more nuanced than I imagine, what other issues may come up in developing a programmatic interface.
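
For context, here is a minimal sketch of the kind of check I mean: given a character set, report which languages have all of their required characters present. This is not the library's actual API. The names REQUIRED_CHARS, supported_languages and missing_characters are hypothetical, and the two character lists are illustrative placeholders rather than curated data.

```python
# Hypothetical per-language required characters (illustrative, not curated).
REQUIRED_CHARS = {
    "English": set("abcdefghijklmnopqrstuvwxyz"),
    "German": set("abcdefghijklmnopqrstuvwxyzäöüß"),
}


def supported_languages(charset):
    """Return the languages whose required characters all appear in charset."""
    available = set(charset)
    return sorted(
        language
        for language, required in REQUIRED_CHARS.items()
        if required <= available  # subset test: nothing required is missing
    )


def missing_characters(charset, language):
    """Return the required characters for language that charset lacks."""
    return sorted(REQUIRED_CHARS[language] - set(charset))


if __name__ == "__main__":
    ascii_lower = "abcdefghijklmnopqrstuvwxyz"
    print(supported_languages(ascii_lower))
    print(missing_characters(ascii_lower, "German"))
```

A plain subset test like this treats support as all-or-nothing, which is exactly where the subjective questions come in: which characters count as required for a language, and whether partial coverage should be reported at all.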