Lingua-JA-Regular reviews


Lingua-JA-Regular (0.09)

Perl programmers who have to deal with Japanese might need something to transform half-width and full-width characters, or irregular kanji, and might come across this module in a search for "zenkaku" or "hankaku".

Unfortunately, though, this module is not usable as it stands. First of all, it doesn't install correctly. If you're thinking of installing it, note that this module uses Dan Kogai's Jcode module, and it fails "make test" because it doesn't declare that dependency. If you install Jcode before Lingua::JA::Regular, it will pass all its tests and install correctly.

Note also that the module is based on the EUC-JP encoding. In particular, the functions for dealing with CP932 just convert the CP932 characters into ASCII forms rather than into the corresponding Unicode symbols.
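For comparison, here is a minimal sketch of what mapping a CP932 character to its Unicode symbol can look like with the core Encode module. It does not use Lingua::JA::Regular at all, and the byte pair 0x87 0x40 is just an example I picked, not anything taken from the module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bytes 0x87 0x40 are the CP932 encoding of CIRCLED DIGIT ONE, one of
# the vendor-extension characters (example chosen for illustration).
my $bytes = "\x87\x40";

# decode() turns the CP932 bytes into a Perl character string.
my $chars = decode('cp932', $bytes);

printf "U+%04X\n", ord $chars;   # prints U+2460
```

Once the text is a decoded character string, it can be written out in whatever encoding is needed with Encode's encode().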

This module also has functions for dealing with fullwidth spaces, but these are obsolete if you use Perl's internal encoding, since \s matches fullwidth spaces (U+3000, called "IDEOGRAPHIC SPACE" in the Unicode documentation). There may have been justification for using EUC-JP when the original version of the module came out, since version 0.01 of this module came out before Perl 5.8.1, but if you have to deal with a lot of Japanese character codes nowadays you'd be better off switching to UTF-8.
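A quick way to check the point about \s is a sketch like the following. It assumes the text is already a decoded Perl character string; the sample string is made up for illustration.

```perl
use strict;
use warnings;

# U+3000 IDEOGRAPHIC SPACE, written as an escape so the encoding of
# this source file does not matter.
my $text = "\x{3000}Tokyo\x{3000}";

# On a decoded character string, \s matches U+3000.
my $matched = $text =~ /\s/ ? 1 : 0;

# Trim leading and trailing whitespace, fullwidth spaces included.
(my $trimmed = $text) =~ s/^\s+|\s+$//g;

print "matched: $matched\n";   # matched: 1
print "trimmed: $trimmed\n";   # trimmed: Tokyo
```

With raw EUC-JP or CP932 bytes the same regex would not see a fullwidth space, which is why decoding first matters.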

The documentation is also minimal, unfortunately, and the module has no facility for setting the required character set other than looking at $ENV{HTTP_USER_AGENT}, which is weird. I'd recommend not using this module unless it's greatly updated.