CJK Unified Ideographs Extension B

I tried to paste 21629 [CJK Unified Ideographs Extension B]. It previewed OK, but all content after the character was stripped upon posting.

Comments

  • Judy Safran-Aasen
    edited August 2016
    I've tried to paste it in as well, using IE and Edge on Windows 10, and it's not working there either. Could it be a font fallback issue, or is that range not supported here?
  • Denis Moyogo Jacquerye
    edited August 2016
    Pasting it in Safari and Chrome on OS X doesn’t seem to work either.

  • KP Mawhood
    edited August 2016
    Is this the same issue as using-emoji-cuts-my-post-off? I found it odd that I could preview, but not post.
    …the database’s encoding does not support characters not in Unicode’s BMP and cuts off posts as a result.

  • Yes @Katy Mawhood - the BMP is uXXXX, i.e. four hex digits. Anything with 5 hex digits like 21629 (I think you mean u21629) is non-BMP. That basically means most of the recent large Unicode additions, like CJK Extension B or emoji, are non-BMP, as the BMP essentially filled up a while ago. Only the odd one or two standalone characters can still be added to the BMP, not whole blocks of new stuff.
  • Assuming the database in question is MySQL, this looks like the MySQL utf8 vs utf8mb4 encoding issue.
  • Hin-Tak Leung
    edited August 2016
    Not necessarily - I fixed a bug in the Linux kernel a few years ago in the NLS handling of non-ASCII file names, and the Linux kernel does not use MySQL :smile:

    Unicode 2.0(?) is BMP only - so it is perfectly legitimate for anything to claim to be 'Unicode compliant' and yet not support non-BMP characters.
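
The BMP point and the MySQL theory in this thread can be checked with a minimal Python sketch. It shows that U+21629 lies outside the BMP and needs 4 bytes in UTF-8, which is more than the 3 bytes per character that MySQL's legacy `utf8` charset can store. The helper names are hypothetical, and `truncate_at_first_non_bmp` only simulates the symptom reported here (it assumes the character itself is dropped along with everything after it); it is not the forum software's actual code, and the thread has not confirmed which database is in use.

```python
BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def is_bmp(ch: str) -> bool:
    """Return True if the single character lies in the BMP (U+0000..U+FFFF)."""
    return ord(ch) <= BMP_MAX


def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def truncate_at_first_non_bmp(text: str) -> str:
    """Simulate (as an assumption) a store that drops everything from the
    first non-BMP character onward, as the posts in this thread describe."""
    for i, ch in enumerate(text):
        if not is_bmp(ch):
            return text[:i]
    return text


ideograph = chr(0x21629)  # the CJK Extension B character from the first post
post = "before " + ideograph + " after"

print(is_bmp(ideograph))                      # False
print(utf8_len(ideograph))                    # 4
print(repr(truncate_at_first_non_bmp(post)))  # 'before '
```

If the backend does turn out to be MySQL, the usual remedy is to move the affected columns and the connection to `utf8mb4`, which stores up to 4 bytes per character.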