[personal profile] lethargic_man
Way back in the early 1990s, fed up with a genealogy program whose database kept corrupting (and which also didn't have an option to print family trees parsimoniously), I decided to write my own family tree plotter. Initially I wrote it in BBC Basic V for the Acorn Archimedes, outputting a Draw file; but as I left my parents' place and realised there were no Acorns anywhere else, I rewrote it first in the mid 1990s into Ada (stop laughing), and then (in 2000 or later) into Perl.

Originally, the program was quite small and of limited capability, and I wrote it as one big script in functional programming style, applying none of the software engineering methodologies I used at work, because it didn't really need them. However, the program grew organically (if slowly), until it had become quite hard to follow. When I decided to type up and digitise all my genealogical information in late 2004, so I could wrap it up in a single multimedia presentation to distribute to family members, I decided first to reengineer the program to give it some structure.

So far so good, but to do the job properly would really have involved starting again from scratch and reimplementing the existing functionality, a rather daunting prospect, effort-wise. Instead, I chose to go down the easier route of refactoring the code bit by bit until I was reasonably satisfied with it. The downside was that certain suboptimal decisions I had taken earlier were embedded deep in the program's infrastructure, and I now simply had to continue living with them.


One of these concerned the method of producing PostScript (which the program does en route to outputting PDF). There are CPAN modules to output PostScript from Perl, but I was not aware of these when I started, and chose instead to write out simple PostScript directly. Moreover, when faced with having to lay out a large number of entries the same way, I chose to do so programmatically in PostScript rather than simply describing each one in turn, which would have resulted in a large file. (The fact I was still using floppies at this stage was a major factor in my preferring such parsimony.)
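
To illustrate the parsimony argument, here is a minimal sketch in Python (not the program's actual Perl code): the layout logic is defined once as a PostScript procedure, and each entry then costs one short line of output. The names `render`, `ps_escape` and the `/entry` procedure, the choice of Helvetica, and the `(x, y, label)` tuple layout are all assumptions made for the example.

```python
def ps_escape(text: str) -> str:
    """Escape backslashes and parentheses for a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def render(entries) -> str:
    """Return PostScript source for an iterable of (x, y, label) tuples."""
    lines = [
        "%!PS-Adobe-3.0",
        "/Helvetica findfont 10 scalefont setfont",
        # Hypothetical procedure: expects x y (label) on the stack.
        # "3 1 roll" moves the label below x and y so moveto can consume them.
        "/entry { 3 1 roll moveto show } bind def",
    ]
    for x, y, label in entries:
        # Each entry is a single procedure call rather than repeated drawing code.
        lines.append(f"{x} {y} ({ps_escape(label)}) entry")
    lines.append("showpage")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render([(72, 700, "Smith (b. 1900)"), (72, 680, "Jones")]))
```

The trade-off is the one the rest of this post runs into: once text only ever reaches the page as an argument to a hand-written procedure, swapping in a library that wants to print the text itself is no longer straightforward.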

None of this was really a problem until I came to want to represent non-Latin-1 characters. PostScript has no support for UTF-8 built in, and I haven't been able to find a simple way of adding it on the Web. I could get UTF-8 support by using one of the CPAN libraries mentioned above, but because I'm outputting text by passing it as parameters to a function rather than simply printing it directly, such libraries are incompatible with my program.

For a while I put up with simply downconverting accented characters, describing one branch of my family, for example, as having come from "Bialystok" rather than "Białystok" (ł in Polish is pronounced like the English w). Last night, having discovered through correspondence with a distant relative that some of my ancestors came from Biržai in northeast Lithuania, I decided to have another crack at the problem. (This was not strictly necessary; I could have stuck with putting the Yiddish name, Birzh, on the tree, as I had Rokiškis (Rokeshik) and Kupiškis (Kupishok) too.)
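
For reference, the downconversion workaround can be sketched as follows. This is a Python illustration under stated assumptions, not the code the program actually uses: the function name `downconvert` and the small `EXTRA` table are made up for the example. Most accented letters split into a base letter plus a combining mark under Unicode normalisation, but ł has no such decomposition and needs an explicit mapping.

```python
import unicodedata

# Letters with no Unicode decomposition need an explicit entry
# (assumed table, deliberately not exhaustive).
EXTRA = str.maketrans({"ł": "l", "Ł": "L"})


def downconvert(text: str) -> str:
    """Strip accents so the text fits in plain ASCII where possible."""
    decomposed = unicodedata.normalize("NFKD", text.translate(EXTRA))
    # Drop the combining marks left behind by the decomposition.
    return "".join(c for c in decomposed if not unicodedata.combining(c))


if __name__ == "__main__":
    print(downconvert("Białystok"))
    print(downconvert("Rokiškis"))
```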

A way out came to me after it occurred to me I don't need UTF-8: all of the accented characters on my family tree as it stands can be safely handled with Latin-2. And so, after several hours of googling (during the course of which I unexpectedly ran into a utility written by [livejournal.com profile] ewx in 1996) I managed to end up with some PostScript (ripped mercilessly out of the output of ogonkify), which allowed me to do so.
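
The string side of the Latin-2 approach can be sketched like this, again in Python as an illustration rather than the program's own Perl. It assumes the font has already been re-encoded with a Latin-2 encoding vector on the PostScript side (the part borrowed from ogonkify's output, not shown here); the function name `ps_latin2_string` is hypothetical. Each character is converted to its ISO 8859-2 byte, and anything outside printable ASCII is written as an octal escape so the file itself stays 7-bit clean.

```python
def ps_latin2_string(text: str) -> str:
    """Return text as a PostScript string literal holding ISO 8859-2 bytes."""
    # Raises UnicodeEncodeError for characters outside Latin-2 (e.g. Hebrew).
    raw = text.encode("iso8859_2")
    out = []
    for byte in raw:
        ch = chr(byte)
        if ch in "()\\":
            out.append("\\" + ch)          # escape PostScript string delimiters
        elif 32 <= byte < 127:
            out.append(ch)                 # printable ASCII passes through
        else:
            out.append(f"\\{byte:03o}")    # everything else as \ooo octal
    return "(" + "".join(out) + ")"


if __name__ == "__main__":
    print(ps_latin2_string("Białystok"))   # (Bia\263ystok)
    print(ps_latin2_string("Biržai"))      # (Bir\276ai)
```

Because the result is still an ordinary PostScript string, it can be passed as a parameter to an existing procedure unchanged, which is what makes this a workable kludge here.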

But still, it's a kludge—a workaround—not a solution; and I still can't handle Hebrew characters. One day I'm going to have to solve that problem too. The question is whether I end up doing it by another workaround, or by completely rewriting my PostScript outputting code.