I don't have time to write as long a response here as I would like, but here are some thoughts:
1. When talking about colour in typography, it is important to distinguish between multiple colours applied to text and multiple colours applied to individual letters. These are different uses, with different roles in typography and different implications for reading. I don't think Mark made this distinction: he conflates chromatic typography (use of colour in the design and arrangement of text) with chromatic fonts (use of colour in the design of glyphs). When he treats colour as analogous to things like bold and italic in order to make a case for chromatic fonts, these categories get confused. Colour is already an aspect of typography, especially on the Web where it doesn't cost anything, and has been for a long time without involving chromatic fonts.
2. One of the things on which all readability research agrees, as does common sense, is that higher contrast between positive figure and negative ground aids reading, while lower contrast makes reading harder. This is why 'black' and 'white' remain the standard of readable text reproduction, despite the freedom and low cost of using different combinations on the Web. When people do deviate from this, as in the recent trend for grey text on many websites, readability suffers (especially for those of us whose eyes are no longer young).
3. Maintaining strong contrast between figure and ground requires consistent density in the elements that make up the former, which is why a poorly inked page or some kinds of antialiasing of text on screen impede readability: they degrade the letter image by making some parts of it lighter than other parts, reducing their contrast with the ground. It isn't difficult to see how changing the colour of some parts of the letter has the same effect, which is why I consider polychromatic fonts to be appropriate only for display typography at large sizes, i.e. at sizes at which we do not rely on the complete letter image for recognition, but rather on edge mapping.
4. As has been demonstrated by Denis Pelli and other researchers, we read within spatial frequency channels determined by the characteristic spatial frequencies of scripts and text styles. Changes in spatial frequency in text require retuning of spatial frequency channels, which slows and interferes with reading. It isn't difficult to see how colouring parts of a letter in different ways may alter spatial frequencies, in combination with the reduced stroke density and contrast discussed above. This is not to say that one couldn't make polychromatic fonts that stick within a single spatial frequency channel, but being aware of the need and strategies to accomplish it would need to be part of the design process. It's something we take for granted when designing text typefaces for typical, single-colour, high contrast reproduction, because we've inherited evolved letterforms and weight/proportion relationships that work and which we tend not to question. It seems to me that an intelligent approach to designing polychromatic fonts would start by questioning those things, rather than just applying colour in fancy ways to letterforms that evolved in monochromatic media.
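To put a rough number on the contrast point in 2 above, here is a minimal sketch using the relative luminance and contrast ratio formulas from WCAG 2.x. That metric is an assumption brought in for illustration, not something taken from the readability research itself, and the grey value is a hypothetical stand-in for the kind of grey body text mentioned above.

```python
def _linearise(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) triple: 0.0 is black, 1.0 is white."""
    r, g, b = (_linearise(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(figure, ground):
    """WCAG contrast ratio between two colours, from 1.0 (none) to 21.0."""
    l1 = relative_luminance(figure)
    l2 = relative_luminance(ground)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


BLACK = (0, 0, 0)
WHITE = (255, 255, 255)
MID_GREY = (119, 119, 119)  # hypothetical grey text colour, "#777"

print("black on white:", round(contrast_ratio(BLACK, WHITE), 2))
print("grey on white: ", round(contrast_ratio(MID_GREY, WHITE), 2))
```

Black on white comes out at the maximum of the scale, and the grey at well under a quarter of that. The same function could be run on each colour used within a polychromatic glyph to see how far its lighter parts fall towards the ground.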
Yes, that's right. The first version of Zapfino for Apple was very small on the body, in order to accommodate the very tall ascenders of the taller style variants. This was done in consultation with Apple, and they shipped the font, but then they got a lot of complaints from people saying that the font was too small relative to other fonts. So Apple decided to change the UPM so that everything would scale larger.
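The arithmetic behind "change the UPM so that everything would scale larger" can be sketched like this. It assumes the outline coordinates are left as they are and only the units-per-em value changes. The numbers are hypothetical and are not Zapfino's actual values.

```python
def units_to_pixels(font_units, units_per_em, pixels_per_em):
    """Scale a distance in font units to pixels at a given ppem size."""
    return font_units * pixels_per_em / units_per_em


X_HEIGHT_UNITS = 300  # hypothetical x-height in font units
PPEM = 24             # hypothetical rendering size, pixels per em

# Same outline, two hypothetical UPM values: the smaller UPM renders larger.
for upm in (2048, 1000):
    height = units_to_pixels(X_HEIGHT_UNITS, upm, PPEM)
    print(f"UPM {upm}: x-height is {height:.2f} px at {PPEM} ppem")
```

Since the em is what gets mapped to the requested size, reducing the number of units in the em makes every unit, and so every glyph, proportionally bigger at the same nominal size.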
I am assuming that Microsoft is accommodating legacy systems in sticking with their power of 2 recommendation?
Not just legacy systems. The optimisations in the rasterisers are current for DirectWrite as well as the older GDI environments. Basically, Microsoft's view is that since the optimisations improve things, even if the effect of the improvements is smaller and smaller, why would you ditch the recommendation?
That said, their position has shifted. When we started making fonts for Microsoft in the late 1990s, a power-of-2 UPM was a procurement requirement. Now it is still a recommendation, but no longer a requirement, and I have occasionally shipped them a font that, for one reason or another, had a non-power-of-2 UPM.
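For anyone wanting to check a UPM value against the recommendation, here is a minimal sketch. The shift-based scaling illustrates one commonly cited reason a power of 2 can be cheaper to divide by. It is an assumption for illustration and not a description of what the DirectWrite or GDI rasterisers actually do. The function names are hypothetical.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...: a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def scale_by_shift(font_units, upm, ppem):
    """Integer scaling of font units to pixels, replacing division by a shift.

    Only valid when upm is a power of two; the result is rounded down.
    """
    if not is_power_of_two(upm):
        raise ValueError("shift-based scaling needs a power-of-2 UPM")
    shift = upm.bit_length() - 1  # e.g. 2048 -> 11
    return (font_units * ppem) >> shift


for upm in (1000, 1024, 2048):
    print(upm, "power of 2" if is_power_of_two(upm) else "not a power of 2")

# The shift gives the same result as integer division by the UPM.
print(scale_by_shift(1400, 2048, 16), (1400 * 16) // 2048)
```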