The first problem that I was running into is that xgettext (and
xgettext.pl -- the xgettext from the Perl Maketext package) simply were not extracting strings from our templates. The template-code-scanner resolves that issue now. Note that this isn't a full "Template Toolkit parser" -- instead, it "just" scans for what looks like an invocation of the "text()" function.
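As a rough illustration of that kind of scan (not the actual scanner; this is a Python sketch, and the regular expression and the name extract_strings are assumptions made for the example), looking for text() calls with a quoted literal argument could be done like this:

```python
import re

# Hypothetical pattern: text('...') or text("...") with a literal first argument.
TEXT_CALL = re.compile(
    r"""\btext\(\s*                          # function name and opening parenthesis
        (?P<quote>['"])                      # opening quote character
        (?P<msgid>(?:\\.|(?!(?P=quote)).)*)  # literal body, backslash escapes allowed
        (?P=quote)                           # matching closing quote
    """,
    re.VERBOSE | re.DOTALL,
)


def extract_strings(source):
    """Return the string literals passed as the first argument to text()."""
    return [match.group("msgid") for match in TEXT_CALL.finditer(source)]


if __name__ == "__main__":
    template = "[% text('Invoice') %] <b>[% text(\"Due date\") %]</b>"
    for msgid in extract_strings(template):
        print(msgid)
```

Because this only matches something that looks like a call, it knows nothing about template syntax: a text() call whose argument is a variable or a composed string is silently skipped.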
The second problem I was running into is that the prior script required a database connection with an up-to-date loaded schema. From there, it would extract a few known-to-contain-translatable-strings tables. The list of known tables had bit-rotted though, because there are more tables now. Requiring an active database connection isn't practical for the purpose of that tool (which might be used by translators or during the release process), so I replaced it with another scanner.
The third problem I was running into is that xgettext wasn't extracting the strings from our Perl code either, and that
xgettext.pl isn't programmed to extract the function name we use (it wants l(), loc() or some others, but not our text()). Additionally, we have a largish number of translatable strings in comments, because the source uses string composition and interpolation which xgettext can't deal with.
So, I ended up implementing a scanner for our Perl code too, as I estimated that replacing all text() calls with l() or loc() calls and replacing the string interpolations with actual translatable strings would be a lot more work.
Last but not least, I ran into a problem with character encodings in de.po on the 1.4 branch. :-(
To prevent errors like that from happening in the future, I implemented a test in 'master' to verify that gettext's msgfmt utility doesn't fail its checks. (Note, this discards any of its warnings!)
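For illustration only (the real test lives in the project's own test suite; the function names and the injectable runner below are assumptions for this Python sketch), such a check could boil down to running msgfmt with --check and looking only at its exit status:

```python
import subprocess


def msgfmt_check_command(po_path):
    """Build the msgfmt invocation used to validate one .po file."""
    # --check enables msgfmt's format, header and domain checks;
    # -o /dev/null throws away the compiled catalog, we only want the verdict.
    return ["msgfmt", "--check", "-o", "/dev/null", po_path]


def po_file_is_valid(po_path, run=subprocess.run):
    """Return True when msgfmt exits with status 0 for the given file.

    Only the exit status is inspected, so warnings printed by msgfmt
    are discarded. The 'run' parameter is a hypothetical hook that
    allows the check to be exercised without msgfmt installed.
    """
    result = run(msgfmt_check_command(po_path), capture_output=True, text=True)
    return result.returncode == 0
```

A test would then loop over all .po files and fail on the first one for which po_file_is_valid() returns false.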
--
Bye,
Erik.
Robust and Flexible. No vendor lock-in.