Wednesday, August 24, 2005

Unicode Hebrew and Greek

There has been some discussion of Unicode Hebrew in comments on Wayne's posting "Who hates divorce?" Unicode Hebrew and Greek certainly cause problems in e-mail, because not all e-mail programs recognise Unicode, although it has been the international standard on the Internet for many years (Eudora is a particular culprit here). It seems to me that there is much less of a problem using Unicode in web-based programs like Blogger - as long as the program supports Unicode properly, and it seems that Blogger does.

But this does require that those reading the blog are using browsers which support Unicode Hebrew and Greek. This should not be a serious problem for most readers, for Microsoft Internet Explorer (5 or 6, on Windows 95 and later) and Mozilla Firefox (my recommendation, a free download for Windows, Mac OS X, Linux and Solaris) offer full support for display of Unicode Hebrew and Greek. However, some people may be reading this blog with systems and browsers which do not support Unicode.

Users also need appropriate fonts, and this is where there may be a problem. The default fonts for this blog do not support Hebrew or Greek. My system (Firefox on Windows XP) substitutes fonts which do support these scripts, but not always in an ideal way. It seems to use Arial for Hebrew consonants and vowels, but not accents, and for monotonic Greek letters as used in modern Greek. But it uses a different substitute font for Hebrew accents and for polytonic Greek letters, i.e. anything with a breathing mark, an iota subscript, a diaeresis, or a grave or circumflex accent. So I see a mixture of fonts, which is rather ugly, and also rather small for Hebrew, but readable. But other systems may not substitute so well.
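To see which kinds of characters a given text contains, and so where font substitution is likely to kick in, something like the following Python sketch may help. The function names `classify` and `summarize` are my own invention, and the code point ranges are simply the Unicode blocks for Hebrew, Greek and Coptic, and Greek Extended. Block membership is only a rough proxy for what a browser's font fallback actually does.

```python
import unicodedata

def classify(ch):
    """Return a rough label for one character, based on its code point."""
    cp = ord(ch)
    if 0x0591 <= cp <= 0x05AF:
        return "Hebrew accent"            # cantillation marks
    if 0x05B0 <= cp <= 0x05C7 and unicodedata.category(ch) == "Mn":
        return "Hebrew point"             # vowel points, dagesh, shin/sin dots
    if 0x05D0 <= cp <= 0x05EA:
        return "Hebrew consonant"
    if 0x1F00 <= cp <= 0x1FFF:
        return "Greek Extended (polytonic)"
    if 0x0370 <= cp <= 0x03FF:
        return "Greek and Coptic (basic)"
    return "other"

def summarize(text):
    """Count how many characters of each kind appear in text."""
    counts = {}
    for ch in text:
        label = classify(ch)
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Running `summarize` on a pasted verse gives a quick idea of whether it needs a font with accent or polytonic support, or whether a basic font such as Arial is likely to cope.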

I note that of the fonts on Windows XP, Tahoma offers full support for polytonic Greek. It may be worth modifying the template for this blog so that this is the first suggested substitute font. Tahoma also supports unaccented Hebrew, but for good quality accented Hebrew either SBL Hebrew or Ezra SIL is needed - and although these are free downloads (and should work well in all applications, not only in Office 2003, in a fully updated Windows XP) they will not be on most readers' systems.

For test purposes, here are some Unicode Hebrew and Greek texts:

Fully pointed and accented Hebrew (thank you, Tim):

בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ

Enlargement insertion by Wayne: We should be able to display these Hebrew characters so they are larger:

בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ

Increasing display size can be done in word processing programs, as well as in blog posts (for Blogger and Blogspot, use the Compose mode when creating or editing posts to view the Font and font size buttons).

Hebrew pointed but not accented:

בְּרֵאשִׁית בָּרָא אֱלֹהִים אֵת הַשָּׁמַיִם וְאֵת הָאָרֶץ

Consonantal Hebrew only:

בראשית ברא אלהים את השמים ואת הארץ
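The three Hebrew versions above differ only in which combining marks they keep, so the plainer ones can in principle be derived from the fully accented text. Here is a minimal Python sketch, with function names of my own choosing. One assumption to note: the meteg (U+05BD) is formally a "point" in Unicode, but the pointed-only sample above appears to omit it, so this sketch removes it together with the accents.

```python
import re

# Cantillation accents U+0591..U+05AF, plus meteg U+05BD (an assumption, see above).
ACCENTS = re.compile("[\u0591-\u05AF\u05BD]")
# Vowel points, dagesh, rafe, shin/sin dots and similar combining marks.
POINTS = re.compile("[\u05B0-\u05BC\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]")

def strip_accents(text):
    """Keep consonants and vowel points, drop the accents."""
    return ACCENTS.sub("", text)

def strip_points_and_accents(text):
    """Keep only the consonantal text (plus spaces and punctuation)."""
    return POINTS.sub("", ACCENTS.sub("", text))

# The first word of Genesis 1:1, written with escapes so the code points are explicit:
# bet, sheva, dagesh, resh, tsere, alef, shin, shin dot, hiriq, tipcha, yod, tav
word = "\u05d1\u05b0\u05bc\u05e8\u05b5\u05d0\u05e9\u05c1\u05b4\u0596\u05d9\u05ea"
pointed_only = strip_accents(word)
consonants_only = strip_points_and_accents(word)
```

This could be useful for anyone who, like Tyler in the comments, prefers to post Hebrew without pointing for the sake of readers whose systems cannot display it.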

Fully accented polytonic Greek (fully composed, according to Internet recommendations):

Ἐν ἀρχῇ ἦν ὁ λόγος, καὶ ὁ λόγος ἦν πρὸς τὸν θεόν, καὶ θεὸς ἦν ὁ λόγος.

Unaccented Greek:

Εν αρχη ην ο λογος, και ο λογος ην προς τον θεον, και θεος ην ο λογος.
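"Fully composed" here means Unicode Normalization Form C (NFC), in which each letter and its diacritics are stored as a single precomposed character where one exists. The following Python sketch uses the standard `unicodedata` module to compose a text and also to derive unaccented Greek from accented Greek. The function names are my own.

```python
import unicodedata

def compose(text):
    """Return text in fully composed form (NFC)."""
    return unicodedata.normalize("NFC", text)

def strip_greek_diacritics(text):
    """Remove breathings, accents and iota subscripts, keeping the base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", bare)

# alpha followed by a combining smooth breathing becomes one character, U+1F00
composed_alpha = compose("\u03b1\u0313")

# "En archei" with breathings, circumflex and iota subscript, written with escapes
accented = "\u1f18\u03bd \u1f00\u03c1\u03c7\u1fc7"
unaccented = strip_greek_diacritics(accented)
```

Note that `strip_greek_diacritics` removes every combining mark, so it would also strip Hebrew points and accents if given Hebrew text.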

I would be very interested to hear, in comments on this, from anyone who has serious problems reading any of these texts.

UPDATE: In the first comment to this post, Tyler Williams points us to the Greek Unicode Tables on Brandon Wason's Novum Testamentum blog. And Brandon's post points us toward Rod Decker's helpful webpage on Hebrew and Greek Unicode fonts.

At Wed Aug 24, 07:02:00 AM, Blogger Tyler F. Williams said...

Looks great on my computer (XP) in Firefox, but in IE the pointing on the pointed Hebrew and the polytonic Greek doesn't show up properly. I've had the same issue with my blog. I tend to enter the Greek and Hebrew now without pointing, though that still doesn't work on all platforms. Brandon Wason and I have had a long email exchange about the best way to include Hebrew and Greek on blogs that I will probably blog on soon. He has an excellent Unicode Greek chart available here.

At Wed Aug 24, 07:10:00 AM, Blogger Wayne said...

Peter, first, thanks a lot for creating this helpful post.

I think my browser response is probably the same as Tyler's. I get square boxes for the accented Greek vowels in IE. Those vowels look perfect in Firefox.

If you or others have ideas for getting IE or other browsers to properly display all the Hebrew and Greek characters, including pointing and accents, that would be a big help to bloggers and their visitors.

At Wed Aug 24, 07:21:00 AM, Blogger Wayne said...

I have now taken time to vary the settings under Tools / Internet Options / Languages (and Fonts) in Internet Explorer. Even when I add Hebrew and Greek languages, and when I specify Hebrew or Greek default fonts, I still get the square boxes for the accented Hebrew and Greek in IE.

At Wed Aug 24, 07:24:00 AM, Blogger Trevor Jenkins said...

Using Firefox on Linux both the Hebrew and Greek look fine to me.

At Wed Aug 24, 08:12:00 AM, Blogger Jim said...

I use IE and have the same result as Tyler and Wayne. Curiously, when I got the post via RSS in Thunderbird (my RSS reader) I didn't get any boxes but I did get odd spacing - except for the unaccented Greek and Hebrew, which came out fine.

At Wed Aug 24, 08:35:00 AM, Blogger Brandon Wason said...

As far as I can tell, the only way to fix this problem in IE is to use HTML entities (e.g., &8067;) and actually tell the browser via CSS what fonts to use.

At Wed Aug 24, 08:37:00 AM, Blogger Brandon Wason said...

My example should actually look like: &#8067; (which displays as ᾃ)
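As a rough illustration of the approach Brandon describes, here is a Python sketch that turns every non-ASCII character into a decimal numeric character reference and wraps the result in a span naming the fonts to try. The function names are hypothetical, and whether this actually cures the square boxes in IE is untested here.

```python
import html

def to_numeric_references(text):
    """Replace each non-ASCII character with a decimal reference such as &#8067;."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)

def span_with_fonts(text, fonts):
    """Wrap text in a span whose inline CSS lists the fonts in order of preference."""
    family = ", ".join("'%s'" % name for name in fonts)
    return '<span style="font-family: %s">%s</span>' % (
        family,
        to_numeric_references(text),
    )

# Brandon's example character, U+1F83, is 8067 in decimal
reference = to_numeric_references("\u1f83")

# html.unescape reverses the conversion, which is handy for checking
round_trip = html.unescape(reference)

# alef wrapped with the Hebrew fonts mentioned in the post
snippet = span_with_fonts("\u05d0", ["SBL Hebrew", "Ezra SIL", "Tahoma"])
```

The font list is passed in rather than fixed, since readers without SBL Hebrew or Ezra SIL will fall back to whatever comes later in the list.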

At Wed Aug 24, 08:48:00 AM, Blogger Wayne Leman said...

Changing the default font for the blog template to Tahoma now displays all the Greek properly in both IE and Firefox. The problem with display of Hebrew accented characters remains in IE. Perhaps we will have to code the font changes in the HTML code (which is fully accessible to us within Blogspot), as Brandon notes.

At Wed Aug 24, 03:43:00 PM, Blogger Peter Kirk said...

I note that the following part of this posting, as well as the UPDATE at the end, was added by someone else, presumably Wayne:

We should be able to display these characters so they are larger:

בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ

Increasing display size can be done in word processing programs, as well as in blog posts (for Blogger and Blogspot, use the Compose mode when creating or editing posts to view the Font and font size buttons).

Useful information, Wayne, but don't be shy to attribute it to yourself! And I don't think it works in comments.


