Sunday, August 10th, 2008
What I did this afternoon.
Sunday, August 10th, 2008 03:36 pm

[michael@epicyclic ~]$ hebrew_grep שלום hebrew
./hebrew/iso8859-8: שלום-שלום
./hebrew/utf-8: שָׁלוֹם־שָׁלוֹם
./hebrew/windows-1255: שָׁלוֹם־שָׁלוֹם
./hebrew/utf-8.sans-nikkud: שלום
[michael@epicyclic ~]$

(Except that it actually reverses the output so it looks correct on gnome-terminal, which does not handle bidirectional text.)
Now I finally have a grep utility that can search for Hebrew text regardless of whether the files it's searching are encoded in UTF-8, Windows-1255 or ISO-8859-8; whether the text is logically ordered (UTF-8, Windows-1255) or visually ordered (ISO-8859-8); whether nikkud (vowel points) are included or not; and whether geresh (׳), gershayim (״) and maqaf (־) are correctly encoded, or approximated by ASCII ', " and - respectively.
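For anyone wondering how such matching could work, here is a minimal Python sketch of the general idea. It is not the actual hebrew_grep source; the names `normalize`, `candidates` and `matches` are hypothetical, and it assumes that visually ordered lines can be handled by also testing the reversed line, which only holds for lines that are purely Hebrew (embedded numbers or Latin text would come out backwards, and a line containing the search term spelled backwards would match too).

```python
import unicodedata

# Map Hebrew geresh, gershayim and maqaf to their common ASCII stand-ins.
PUNCT = str.maketrans({"\u05F3": "'", "\u05F4": '"', "\u05BE": "-"})

def normalize(text: str) -> str:
    """Strip Hebrew points and accents, and fold Hebrew punctuation to ASCII."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        c for c in decomposed
        # Drop combining marks (nikkud, cantillation) in the Hebrew block only.
        if not ("\u0591" <= c <= "\u05C7" and unicodedata.category(c) == "Mn")
    ]
    return "".join(kept).translate(PUNCT)

def candidates(raw: bytes):
    """Yield normalized readings of a raw line: each encoding, both directions."""
    for encoding in ("utf-8", "cp1255", "iso8859_8"):
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        norm = normalize(text)
        yield norm        # logical order
        yield norm[::-1]  # visual order, assuming a purely Hebrew line

def matches(needle: str, raw: bytes) -> bool:
    """True if the normalized needle occurs in any reading of the raw line."""
    target = normalize(needle)
    return any(target in reading for reading in candidates(raw))
```

Calling `matches("שלום", line)` on each raw line of a file, read in binary mode, would then give grep-like behaviour across all three encodings.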
(And all that on a (still) empty stomach.)
Why didn't I do this years ago?