Sunday, August 10th, 2008

lethargic_man: (blue!)
[michael@epicyclic ~]$ hebrew_grep שלום hebrew
./hebrew/iso8859-8: שלום-שלום
./hebrew/utf-8: שָׁלוֹם־שָׁלוֹם
./hebrew/windows-1255: שָׁלוֹם־שָׁלוֹם
./hebrew/utf-8.sans-nikkud: שלום
[michael@epicyclic ~]$ 
(Except that it actually reverses the output so it looks correct on gnome-terminal, which does not handle bidirectional text.)

Now I finally have a grep utility that can search for Hebrew text regardless of whether the files it's searching are encoded in UTF-8, Windows-1255 or ISO-8859-8; whether the text is logically ordered (UTF-8, Windows-1255) or visually ordered (ISO-8859-8); whether nikkud (vowel points) are included or not; and whether ׳, ״ and ־ are correctly encoded, or approximated by ASCII ', " and -.
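For the curious, here is a rough Python sketch of the kind of normalisation involved. It is not the actual hebrew_grep script; the function names (normalize_hebrew, decode_line, line_matches) are made up for illustration, and it assumes the encoding of each file is already known, that every ISO-8859-8 file is in visual order, and that lines are pure Hebrew so that naively reversing a visual line gives back logical order.

```python
import unicodedata

# Assumption: anything in ISO-8859-8 is stored in visual order.
VISUAL_ENCODINGS = {"iso8859_8"}

# Fold the proper Hebrew punctuation onto its common ASCII stand-ins.
PUNCT_TO_ASCII = str.maketrans({
    "\u05f3": "'",   # geresh
    "\u05f4": '"',   # gershayim
    "\u05be": "-",   # maqaf
})


def normalize_hebrew(text):
    """Drop Hebrew combining marks (nikkud, cantillation) and fold punctuation."""
    stripped = "".join(
        ch for ch in text
        if not ("\u0591" <= ch <= "\u05c7" and unicodedata.category(ch) == "Mn")
    )
    return stripped.translate(PUNCT_TO_ASCII)


def decode_line(raw, encoding):
    """Decode one line of bytes and return it in logical order."""
    text = raw.decode(encoding)
    if encoding in VISUAL_ENCODINGS:
        # Naive reversal: only right for lines with no embedded
        # left-to-right runs (numbers, Latin text).
        text = text[::-1]
    return text


def line_matches(needle, raw, encoding):
    """True if the normalised needle occurs in the normalised line."""
    return normalize_hebrew(needle) in normalize_hebrew(decode_line(raw, encoding))
```

Calling line_matches("שלום", raw_line, "cp1255") on each line of a Windows-1255 file would then find the word whether or not the file carries nikkud. Because both the search term and the line go through the same normalisation, a pointed search term works too. A real tool would also have to guess the encoding and deal with mixed-direction lines, which this sketch ignores.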

(And all that on a (still) empty stomach.)

Why didn't I do this years ago?
