Work done on translation infrastructure

Hi all,

A week ago, I found that we weren't including all strings for translation in our translation lexicon -- I was missing important terms such as "Asset".

Since then, I have done the following (summary):

* Wrote a scanner to extract translatable strings from our templates in UI/ and templates/

* Wrote a scanner to extract strings from our SQL files

* Wrote a scanner to extract strings from our Perl files (ouch!?)

* Wrote a test (t/05-po-checks.t) to validate our PO/POT files

* Wrote instructions for coders to help translators get the job done (http://ledgersmb.org/community-guide/community-guide/development/coding-guidelines/coding-translation)

* Wrote instructions for translators on how to get started (http://ledgersmb.org/community-guide/community-guide/translating)

Review of the documents in the last two bullets would be highly appreciated!

The first problem I ran into was that xgettext (and xgettext.pl -- the xgettext from the Perl Maketext package) simply was not extracting strings from our templates. The template-code-scanner resolves that issue now. Note that this isn't a full "Template Toolkit parser" -- instead, it "just" scans for what looks like an invocation of the "text()" function.
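To illustrate the "just scan for text()" approach, here is a minimal sketch in Python. It is not the actual scanner; the regular expression, the function name extract_strings and the sample template are assumptions made up for this example, and it only handles a quoted literal as the first argument.

```python
import re

# Hypothetical pattern: finds text('...') or text("...") and captures the literal.
# Backslash-escaped characters inside the literal are allowed.
TEXT_CALL = re.compile(
    r"""
    \btext\(\s*                 # the function name and opening parenthesis
    (?:
        '((?:[^'\\]|\\.)*)'     # single-quoted literal
      |
        "((?:[^"\\]|\\.)*)"     # double-quoted literal
    )
    """,
    re.VERBOSE,
)


def extract_strings(template_source):
    """Return the string literals passed as the first argument to text()."""
    found = []
    for match in TEXT_CALL.finditer(template_source):
        single, double = match.groups()
        found.append(single if single is not None else double)
    return found


# Made-up template fragment, for illustration only.
sample = "<h1>[% text('Asset') %]</h1>\n<label>[% text(\"Account Number\") %]</label>"
print(extract_strings(sample))  # prints ['Asset', 'Account Number']
```

Because this is pattern matching rather than parsing, it will also pick up anything that merely looks like a text() call, which matches the trade-off described above.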

The second problem I ran into was that the prior script required a database connection with an up-to-date schema loaded. From there, it would extract the contents of a few tables known to contain translatable strings. The list of known tables had bit-rotted though, because there are more tables now. Requiring an active database connection isn't practical for the purpose of that tool (which might be used by translators or during the release process), so I replaced it with another scanner.

The third problem I ran into was that xgettext wasn't extracting our strings and that xgettext.pl isn't programmed to recognize the function name we use (it wants l(), loc() or some others, but not our text()). Additionally, we have a largish number of translatable strings in comments, because the source uses string composition and interpolation, which xgettext can't deal with.

So, I ended up implementing a scanner for our Perl code too, as I estimated that replacing all text() calls with l() or loc() calls and replacing the string interpolations with actual translatable strings would be a lot more work.

Last but not least, I ran into a problem with character encodings in de.po on the 1.4 branch. :-(

To prevent errors like that from happening in the future, I implemented a test in 'master' to verify that gettext's msgfmt utility doesn't fail its checks. (Note, this discards any of its warnings!)
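For readers who want to run a similar check by hand, here is a rough sketch in Python. It is not the code in t/05-po-checks.t; the function names are invented for this example, and it assumes GNU gettext's msgfmt is on the PATH and that /dev/null exists (so a Unix-like system).

```python
import subprocess


def msgfmt_command(po_path):
    """Build the msgfmt invocation (hypothetical helper for this sketch)."""
    # --check enables msgfmt's format, header and domain checks;
    # -o /dev/null throws away the compiled catalog, since only the verdict matters.
    return ["msgfmt", "--check", "-o", "/dev/null", po_path]


def po_file_is_valid(po_path):
    """Return True when msgfmt exits with status 0 for the given PO file."""
    result = subprocess.run(
        msgfmt_command(po_path),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,  # warnings are discarded, only the exit status counts
    )
    return result.returncode == 0
```

Calling po_file_is_valid("de.po") would then report whether msgfmt accepted the file. As in the real test, anything msgfmt prints as a warning is ignored.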

Bye,

Erik.

http://efficito.com -- Hosted accounting and ERP.

Robust and Flexible. No vendor lock-in.