2009-06-19

Semi-automating piecewise translation of texts through Google Translate

I have to port a horribly old program (1992?, Turbo Pascal for DOS)… and in German, to make it funnier… to an only-slightly-less-horribly-old environment (Delphi 4, circa 1999). To think that I was complaining about how outdated Borland's C++ Builder 5 felt… oh my.

I guess/hope/fear that later there will be a more modern target. But that will be easy, given that by then I should understand everything and the program itself is not complicated.

The thing is proving to be interesting in a couple of ways. One of them is the translation to English (necessary to understand the scarce code comments, the procedure names, and the user interface). I started translating some selected pieces through Google Translate, but even the text encoding has problems, so I first had to prepare and correct the files (some of which even switched encodings halfway through).
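A minimal sketch of how such mixed files could be normalized line by line. It assumes the legacy parts are in DOS code page 437, which is only a guess for a German Turbo Pascal program (CP850 is another candidate), and it relies on iconv being installed. A line that is already valid UTF-8 passes through untouched; anything else is converted from the legacy code page. Legacy bytes that happen to form valid UTF-8 would slip through, so the result still needs a look.

```shell
#!/bin/sh
# Assumption: the legacy parts are in DOS code page 437; override with
# LEGACY_ENCODING=CP850 (or similar) if the umlauts come out wrong.
legacy=${LEGACY_ENCODING:-CP437}

to_utf8() {
    # IFS= and -r keep whitespace and backslashes intact; the -n test
    # picks up a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if printf '%s' "$line" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            printf '%s\n' "$line"    # already valid UTF-8: pass through
        else
            printf '%s' "$line" | iconv -f "$legacy" -t UTF-8
            printf '\n'              # restore the newline that read removed
        fi
    done
}
```

It would be used as a plain filter, for example `to_utf8 < UNIT1.PAS > UNIT1.utf8.pas` (both file names are made up).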

Also, the indentation is terrible, as I have come to expect from Borland's editors, more so in the hands of careless programmers like the one who wrote the original program, judging from the interdependencies between the units and the general organization of the code. Nasty, gory.

And it's procedural, with some object bits thrown in only to be able to use Turbo Pascal's user interface facilities. And, of course, there is absolutely no separation between the UI and the real meat of the program.

Which means I get to rewrite the whole thing for Delphi / Windows. Which brings me back to the German translation thing.

I could choose between spending some boring hours looping through the "select something - copy - paste into Google Translate - select result - copy - select destination - paste" routine, … or … trying to whip up something…

So now I have a shell script which gets a string, escapes it with Perl, sends it to the Google Translate API with curl, and prints the result. Then I tried feeding strings to that script via OnMyCommand in the editor, but that was too much mousing. Quicksilver: too much keying. Script menu: even more mousing (I seemed to remember it allowed shortcuts, but no; maybe that was in OS 9?). FastScripts Lite: like Script menu, but it allows shortcuts. Bingo.
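The escaping step does not strictly need Perl. As a rough alternative (not what my script does), every byte of the text can be percent-encoded with od, tr and sed. The function name urlencode is made up for this sketch. Encoding the unreserved characters too makes the URL about three times longer than necessary, but it decodes to the same text.

```shell
#!/bin/sh
# Percent-encode every byte of the first argument.
urlencode() {
    # od dumps the bytes as hex pairs, tr strips od's spacing and line breaks,
    # and sed puts a % in front of each pair.
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \t\n' | sed 's/../%&/g'
}

urlencode 'Datei nicht gefunden'
```

The result could then go into the q= parameter of the request in place of the Perl-escaped string.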

But while I was preparing the AppleScript, I found that BBEdit can run Unix filters directly, so I could avoid AS (I'm afraid BBEdit's dictionary and I never really liked each other). It worked, but so much selecting with the trackpad also got on my nerves a bit, so with that excuse for going the final mile, I finally did get into AS and made a small script which finds the limits of the sentence in which the cursor is currently positioned, and feeds that to the Unix filter.

So now it is just a matter of pointing and clicking with the right hand and pressing the shortcut with the left, and the affected sentence is replaced by the translated one. BBEdit makes it a bit easier, but really I could have done the same with QS or FastScripts in any semi-AppleScriptable editor, even in Terminal I guess (maybe it could even be made application-independent). Makes me wonder if I could do the same in any other OS… (not the translating thing, of course, but the point+click+autoselect thing, and/or making everything work together).

Here are both scripts, in case someone finds them useful. Just quick and dirty hacks though!

Translator:
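#!/bin/bash
# Translate the German text in the file "$1" to English via Google Translate.

orig=$(cat "$1")

# URL-escape the text. Passing it as an argument keeps quotes, ^ and $ in the
# text from being interpreted as Perl or shell code.
escaped=$(perl -MURI::Escape -e 'print uri_escape($ARGV[0]);' "$orig")
result=$(curl -s "http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=de%7Cen&callback=foo&context=bar&q=$escaped" | grep -o ":\".*\"}")
# Strip the leading :" and the trailing "} that grep leaves around the text.
l=${#result}
result2=${result:2:$((l-4))}

# Print the original followed by its translation, without a trailing newline.
printf '%s -- %s' "$orig" "$result2"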
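The extraction step can be tried without touching the network. This sketch runs it on a made-up response. The field name and the overall shape of the sample are assumptions reconstructed from the grep pattern in the script, not captured output, and extract_translation is a hypothetical helper that does the grep and the substring trimming in one sed call.

```shell
#!/bin/sh
# Hypothetical response; the field name and shape are assumptions.
sample='foo("bar",{"translatedText":"File not found"},200,null,200)'

extract_translation() {
    # Greedy match: keep what sits between the first :" and the last "}.
    sed -n 's/^[^:]*:"\(.*\)"}.*$/\1/p'
}

printf '%s\n' "$sample" | extract_translation
```

Like the original, this is quick and dirty: escaped quotes or entities inside the translated text are passed through as they are.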

Autoselector:
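tell application "BBEdit"
    tell text of window 1
        -- Start from the cursor position. There is no bounds check, so the
        -- cursor must sit between two delimiters.
        set initPos to characterOffset of selection

        -- Walk left to the opening delimiter: a string quote (') or a comment brace ({).
        set i to initPos
        set theChar to (character i as string)
        repeat until theChar is "'" or theChar is "{"
            set i to i - 1
            set theChar to (character i as string)
        end repeat
        set i to i + 1 -- step past the delimiter itself

        -- Walk right to the closing delimiter: ' or }.
        set j to initPos
        set theChar to (character j as string)
        repeat until theChar is "'" or theChar is "}"
            set j to j + 1
            set theChar to (character j as string)
        end repeat
        set j to j - 1 -- step back before the delimiter

        -- Select everything between the two delimiters.
        select (characters i thru j)
    end tell
    -- Replace the selection with the output of the translator script.
    run unix filter POSIX file "path/to/gtranslate.sh"
end tell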
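
As for doing the autoselect part elsewhere: the boundary search itself does not depend on BBEdit. Here is a rough sketch of the same walk-left, walk-right logic as a shell function around awk, working on a single line of text and a 1-based cursor offset. The name span_at is made up, and unlike the AppleScript it stops at the ends of the line instead of running off them.

```shell
#!/bin/sh
# span_at TEXT POS: print the text between the delimiters around offset POS.
span_at() {
    printf '%s\n' "$1" | awk -v pos="$2" -v q="'" '{
        n = length($0)
        i = pos
        # Walk left until an opening delimiter or the start of the line.
        while (i >= 1 && substr($0, i, 1) != q && substr($0, i, 1) != "{") i--
        j = pos
        # Walk right until a closing delimiter or the end of the line.
        while (j <= n && substr($0, j, 1) != q && substr($0, j, 1) != "}") j++
        # Print what lies strictly between the two delimiters.
        print substr($0, i + 1, j - i - 1)
    }'
}

span_at "WriteLn('Datei nicht gefunden');" 15
```

Hooking this up to a mouse click and a shortcut is the editor-specific part, which this sketch leaves out.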
