Can MS Word files be converted from ASCII fonts to Unicode fonts?

I am an American unable to read any Indic languages. A translator has prepared a large Hindi file by using Kundli type, which is ASCII based. Because the translation will be published and presented in a variety of forms (both as a printed book and on internet platforms), I suspect that down the line a Unicode file would be more reliable and flexible. Are there ways to convert MS Word files formatted in ASCII fonts to Unicode fonts? (Note that the translator wrote, "I can't adopt a Unicode fonts because I don't know to type in Unicode fonts. Unicode fonts are used only on internet only; I've never heard [of it being used] in book publishing." Can those familiar with Hindi keyboards confirm?)


  • This is not only about fonts but about how the Hindi text in your Word file is encoded. Most sites that discuss this make the two sound like the same thing.

    The Kundli font isn't truly ASCII based; it uses its own custom encoding, which is shared by some other fonts (Kruti Dev etc.). It may impersonate ASCII, though, in the sense that the file stores ordinary ASCII character codes and the font draws Devanagari shapes for them.

    Higher-end publishing in India often uses Unicode fonts. They are not only for the internet. Modern apps used in publishing, whether Word, Adobe InDesign, or QuarkXPress, are all specifically designed around Unicode and assume fonts are using Unicode. With proper Unicode text you can get Hindi spellcheckers and the like in these apps.

    There are many apps and utilities to convert Kundli encoding to Unicode.

    However, all the converters I have seen only work on text, not on Word files.

    There are other folks on this forum far more knowledgeable than I about Indic languages, so with any luck they may have a better solution—and be able to correct any inaccuracies in my writeup.
  • Converting custom encodings for Indic scripts involves complexities beyond simple character-to-character conversion. For example, the order of characters has to change to match Unicode's logical (i.e. reading) encoding order: a vowel sign drawn to the left of its consonant is typically stored before the consonant in such legacy fonts, but after it in Unicode. SIL has a text encoding conversion utility called TECkit that was designed to support such complex conversion mappings. SIL also provides the SILConverters utility, which can be used in conjunction with TECkit to convert Word documents.
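
    To make the reordering point concrete, here is a minimal Python sketch. The four-entry mapping table is made up for illustration and is not the actual Kundli table, and the function name is my own. It ignores conjuncts, reph, and everything else a real TECkit mapping would have to handle; it only shows why a vowel sign typed before its consonant has to be moved after it in Unicode.

```python
# Hypothetical legacy-code -> Unicode table (illustrative only, NOT the real Kundli mapping).
LEGACY_TO_UNICODE = {
    "d": "\u0915",  # DEVANAGARI LETTER KA
    "r": "\u0924",  # DEVANAGARI LETTER TA
    "k": "\u093E",  # DEVANAGARI VOWEL SIGN AA (drawn to the right of the consonant)
    "f": "\u093F",  # DEVANAGARI VOWEL SIGN I (drawn to the left of the consonant)
}
SHORT_I = "\u093F"


def legacy_to_unicode(legacy: str) -> str:
    """Map legacy codes to Unicode and move the short-i sign into logical order."""
    out = []
    pending_i = False  # True while a short-i sign is waiting for its consonant
    for ch in legacy:
        uni = LEGACY_TO_UNICODE.get(ch, ch)  # unknown characters pass through unchanged
        if uni == SHORT_I:
            # Visual order: the sign was typed first. Hold it back for now.
            pending_i = True
            continue
        out.append(uni)
        if pending_i:
            # Logical order: the sign follows the character it belongs to.
            out.append(SHORT_I)
            pending_i = False
    if pending_i:
        out.append(SHORT_I)  # stray sign at end of input; keep it rather than drop it
    return "".join(out)
```

    With this toy table, the legacy sequence "fd" comes out as KA followed by the short-i sign, which is the order Unicode expects, whereas a plain character-for-character substitution would leave the sign in front of the consonant.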

    Of course, once the encoding is changed, you can no longer use the Kundli fonts. But there are plenty of Unicode-encoded Indic fonts available these days.

    Btw, in the future, you might want to make Unicode encoding a requirement for translation services.
  • Many thanks to Thomas and Peter, for helping a blind man to see. Your biblical names are somewhat at odds, but you share a scriptural message. As to the future, the problem is that there's far too little of it.