Unicode-Japanese reviews



Unicode-Japanese (0.46) ****

The XS version of the module builds and installs without problems on Cygwin, and it supports a huge number of encodings. For modern Perls (5.8.1 and later), though, it isn't clear that these are still necessary, because the core Encode module does this work.
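As a rough illustration of what Encode covers, here is a minimal sketch using only core modules. The sample string and the three encodings are my own choices for the example, not anything taken from Unicode::Japanese.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "日本" (U+65E5 U+672C), written with escapes so this file stays ASCII.
my $chars = "\x{65E5}\x{672C}";

# Encode the character string into three common Japanese byte encodings.
my %bytes = map { $_ => encode($_, $chars) } qw(euc-jp cp932 iso-2022-jp);

# Decode each one again and check that the characters survive the round trip.
for my $enc (sort keys %bytes) {
    my $back = decode($enc, $bytes{$enc});
    printf "%-12s %2d bytes, round trip %s\n",
        $enc, length $bytes{$enc}, $back eq $chars ? 'ok' : 'FAILED';
}
```

Whether this replaces every conversion Unicode::Japanese offers (the mobile-phone encodings, for instance) is something I have not checked.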

The module seems to be built around byte strings rather than character strings, and I was surprised to get a non-UTF-8-flagged string of UTF-8 from the routines the first time I tried it.
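For anyone unsure what the flag difference means in practice, the following sketch (core Perl only, independent of Unicode::Japanese) compares a string of UTF-8 bytes with the same text decoded into characters. The variable names are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The three UTF-8 bytes that encode the single character U+30D0 (katakana BA).
my $bytes = "\xE3\x83\x90";

# Decoding turns the bytes into a one-character, UTF-8-flagged string.
my $chars = decode('UTF-8', $bytes);

printf "bytes: flagged=%s length=%d\n",
    (utf8::is_utf8($bytes) ? 'yes' : 'no'), length $bytes;
printf "chars: flagged=%s length=%d\n",
    (utf8::is_utf8($chars) ? 'yes' : 'no'), length $chars;
```

If a routine hands back the unflagged form, functions such as length and regex character classes will operate on bytes until the string is decoded.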

I wrote a simple testing program for the module whose core function is like this:

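use strict;
use warnings;
use Unicode::Japanese;

# Apply the named conversion method to $text and report the result.
sub testit {
    my ($text, $method) = @_;
    my $uj = Unicode::Japanese->new($text);
    # Call the method by name; no string eval is needed.
    my $ujout = $uj->$method;
    my $ujtext = $ujout->getu;
    my $ujlen = $ujout->strlen;
    print "Method $method on input '$text' gives output '$ujtext' with length $ujlen\n";
}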

my $str = 'バカデンジャラス';
print Unicode::Japanese->new($str)->getu, "\n";
testit ($str,'z2h');
my $halfwidth = Unicode::Japanese->new($str)->z2h->getu;
testit ($halfwidth, 'kata2hira');
testit ($str, 'kata2hira');

In short order, I found two things which I thought strange. First, kata2hira doesn't convert halfwidth katakana into hiragana. Second, the strlen function is a nice idea, but it gives each halfwidth katakana a width of two instead of one. So these functions are of limited use. For width calculations, I recommend using Unicode::EastAsianWidth instead.
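Both behaviours I expected can be approximated without the module. The sketch below is only that, a sketch: kata_to_hira and display_width are hypothetical helper names of mine, and for the width I use Perl's built-in East_Asian_Width property (which assumes a reasonably recent Perl) rather than Unicode::EastAsianWidth, so that it runs with core modules alone.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFKC);

# Hypothetical helper: NFKC folds halfwidth katakana (and their voicing
# marks) into fullwidth katakana; tr then shifts U+30A1..U+30F6 down to
# the matching hiragana U+3041..U+3096.
sub kata_to_hira {
    my ($s) = @_;
    $s = NFKC($s);
    $s =~ tr/\x{30A1}-\x{30F6}/\x{3041}-\x{3096}/;
    return $s;
}

# Hypothetical helper: count two columns for Wide and Fullwidth
# characters and one column for everything else.
sub display_width {
    my ($s) = @_;
    my $wide = () = $s =~ /[\p{Ea=W}\p{Ea=F}]/g;
    return length($s) + $wide;
}

binmode STDOUT, ':encoding(UTF-8)';

my $full = "\x{30D0}\x{30AB}";           # fullwidth katakana "baka"
my $half = "\x{FF8A}\x{FF9E}\x{FF76}";   # the same word in halfwidth katakana

print kata_to_hira($half), "\n";
printf "fullwidth: %d columns, halfwidth: %d columns\n",
    display_width($full), display_width($half);
```

Note that NFKC also rewrites other compatibility characters, such as fullwidth Latin letters, so this is not a drop-in replacement for kata2hira, and the width count ignores zero-width combining marks.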

Another thing which surprised me is that the module conflates Shift-JIS and CP932, calling the latter Shift_JIS. It would probably be better just to tell people to use the name "CP932" instead of "Shift-JIS", since some people might actually want "real" Shift-JIS rather than CP932.
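To see why the distinction can matter, here is a small sketch using Encode, which offers both "cp932" and "shiftjis" as separate encodings. The test character is my own choice: the circled digit one (U+2460) is a vendor extension that, as far as I know, exists in CP932 but not in the stricter JIS-based table.

```perl
use strict;
use warnings;
use Encode qw(encode);

# CIRCLED DIGIT ONE, one of the NEC extension characters.
my $circled_one = "\x{2460}";

my %hex;
for my $enc (qw(cp932 shiftjis)) {
    # With no CHECK argument, Encode silently substitutes a replacement
    # for any character the target encoding cannot represent.
    my $bytes = encode($enc, $circled_one);
    $hex{$enc} = uc unpack 'H*', $bytes;
    printf "%-8s -> %s\n", $enc, $hex{$enc};
}
```

If the two lines of output differ, text that round-trips under one name will be damaged under the other, which is the reason to name the encoding precisely.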

Four and a half stars: ★★★★½