The XS version of the module works and installs without problems on Cygwin, and it seems to have a huge number of encodings available. But for modern Perls (5.8.1 and later), it isn't clear that these are still necessary, because the Encode module does this work.
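To illustrate the point about Encode, here is a minimal sketch of converting Japanese text between encodings using only modules that ship with Perl. The helper name `convert_bytes` and the sample string are made up for this example. The encoding names `euc-jp` and `cp932` are ones Encode provides.

```perl
use strict;
use warnings;
use utf8;                        # the source below contains literal Japanese text
use Encode qw(decode encode);

# Hypothetical helper: re-encode a byte string from one encoding to another
# by decoding to Perl characters and then encoding again.
sub convert_bytes {
    my ($bytes, $from, $to) = @_;
    my $chars = decode($from, $bytes);
    return encode($to, $chars);
}

my $text  = 'カタカナ';                                  # character string
my $eucjp = encode('euc-jp', $text);                     # EUC-JP bytes
my $sjis  = convert_bytes($eucjp, 'euc-jp', 'cp932');    # CP932 bytes
my $back  = decode('cp932', $sjis);                      # characters again

print $back eq $text ? "round trip ok\n" : "round trip failed\n";
```

Encode also has a `from_to` function which does the same conversion in place on a byte string.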
The module seems to work in terms of encoded byte strings, and the first time I tried it I was surprised to get back a string of UTF-8 bytes without Perl's UTF-8 flag set.
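For readers unfamiliar with the distinction, the following sketch shows the difference between UTF-8 bytes and a UTF-8-flagged character string in plain Perl. It does not use Unicode::Japanese at all, and the helper name `describe` is made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: report how Perl sees a string.
sub describe {
    my ($s) = @_;
    return sprintf 'length=%d flagged=%s',
        length($s), utf8::is_utf8($s) ? 'yes' : 'no';
}

my $bytes = "\xE3\x82\xAB";            # UTF-8 bytes for KATAKANA LETTER KA
my $chars = decode('UTF-8', $bytes);   # one character, U+30AB

print describe($bytes), "\n";          # length=3 flagged=no
print describe($chars), "\n";          # length=1 flagged=yes
```

Functions such as `length` and regex character classes only behave as expected for Japanese text on the second kind of string.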
I wrote a simple testing program for the module whose core function is like this:
sub testit {
    my ($text, $method) = @_;
    my $uj = Unicode::Japanese->new($text);
    # Call the method by name directly; no string eval is needed.
    my $ujout = $uj->$method;
    my $ujtext = $ujout->getu;
    my $ujlen = $ujout->strlen;
    print "Method $method on input '$text' gives output '$ujtext' with length $ujlen\n";
}
my $str = 'バカデンジャラス';
my $halfwidth = Unicode::Japanese->new($str)->z2h->getu;
testit ($halfwidth, 'kata2hira');
testit ($str, 'kata2hira');
In short order, I found two things which I thought strange. First, kata2hira doesn't convert halfwidth katakana into hiragana. Second, the strlen function is a nice idea, but it gives each halfwidth katakana a width of two instead of one. So these functions aren't as useful as they could be. For width calculations, I recommend using Unicode::EastAsianWidth instead.
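As a rough sketch of the behaviour I expected, both operations can be approximated with core Perl alone. This is not how Unicode::Japanese or Unicode::EastAsianWidth work internally. The helper names `kata_to_hira` and `display_width` are made up, and the code assumes a Perl recent enough to support the `East_Asian_Width` property in regexes. Note that NFKC also folds other compatibility characters, such as fullwidth ASCII, and that the width function simply counts every non-wide character as one column.

```perl
use strict;
use warnings;
use utf8;                              # literal Japanese text in the source
use Unicode::Normalize qw(NFKC);

# Hypothetical helper: fold halfwidth katakana to fullwidth with NFKC,
# then shift the katakana range U+30A1..U+30F6 onto hiragana U+3041..U+3096.
sub kata_to_hira {
    my ($s) = @_;
    $s = NFKC($s);
    $s =~ tr/\x{30A1}-\x{30F6}/\x{3041}-\x{3096}/;
    return $s;
}

# Hypothetical helper: display width in columns, counting Wide and
# Fullwidth characters as 2 and everything else as 1.
sub display_width {
    my ($s) = @_;
    my $wide = () = $s =~ /[\p{East_Asian_Width=Wide}\p{East_Asian_Width=Fullwidth}]/g;
    return length($s) + $wide;
}

binmode STDOUT, ':encoding(UTF-8)';
my $half = 'ｶﾞｲﾄﾞ';                    # halfwidth katakana, 5 code points
my $full = 'ガイド';                   # fullwidth katakana, 3 code points
print kata_to_hira($half), "\n";
print display_width($half), " ", display_width($full), "\n";
```

With this approach the halfwidth string occupies one column per code point, including the separate voicing marks.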
Another thing which surprised me is that it conflates Shift-JIS and CP932, calling the latter Shift_JIS. It would probably be better just to tell people to use the name "CP932" instead of "Shift-JIS", since some people may actually want "real" Shift-JIS rather than CP932.
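The distinction is not just a matter of naming. The sketch below uses Encode, which keeps the two apart under the names `shiftjis` and `cp932`, to show two places where I believe they differ: a vendor extension character which only CP932 can encode, and a byte pair which decodes to different characters. The helper name `can_encode` is made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: true if every character of $chars has a mapping in $enc.
sub can_encode {
    my ($enc, $chars) = @_;
    my $copy = $chars;    # encode() may modify its input when CHECK is set
    return eval { encode($enc, $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# CIRCLED DIGIT ONE is an NEC extension present in CP932 but not in JIS X 0208.
my $circled_one = "\x{2460}";
for my $enc (qw(shiftjis cp932)) {
    printf "%-8s can encode U+2460: %s\n",
        $enc, can_encode($enc, $circled_one) ? 'yes' : 'no';
}

# The same two bytes decode to different characters under the two names
# (WAVE DASH versus FULLWIDTH TILDE).
my $bytes = "\x81\x60";
for my $enc (qw(shiftjis cp932)) {
    printf "%-8s decodes 81 60 to U+%04X\n", $enc, ord(decode($enc, $bytes));
}
```

Text which round-trips cleanly through one of the two encodings may therefore be altered or rejected by the other.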
Four and a half stars: ★★★★½