CPAN Ratings: Lingua-StopWords reviews
Lingua::StopWords provides lists of stopwords for several languages: short, common words like "to" or "and" that should be ignored in searches. I installed Lingua::StopWords for use in a web log parser, to remove uninteresting words from the parser's list of search keywords. As far as I can see it works pretty well. Some words it doesn't eliminate include "us" and "can", but since these could mean "United States" or "can of Coca-Cola", perhaps they are borderline cases.
The module's documentation gives an example that uses "grep" to remove the stopwords from a list, which I copied into my program. Although this is very simple, it would be preferable if it were provided as a function or method in the module itself.
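For illustration, here is a minimal sketch of the kind of helper meant above, built on the "grep" idiom from the module's documentation. The name remove_stopwords and the sample keywords are hypothetical and not part of Lingua::StopWords; only getStopWords comes from the module. The sketch also assumes the stopword lists are stored in lowercase.

```perl
use strict;
use warnings;
use Lingua::StopWords qw( getStopWords );

# Hypothetical helper, NOT part of Lingua::StopWords: returns the words
# from @words that are not stopwords in the given language.
sub remove_stopwords {
    my ($language, @words) = @_;

    # getStopWords returns a hash reference whose keys are the stopwords.
    # The guard covers the case where no list comes back for $language.
    my $stopwords = getStopWords($language)
        or die "No stopword list for language '$language'";

    # lc() so that "The" is treated like "the" (assumes lowercase lists).
    return grep { !$stopwords->{ lc $_ } } @words;
}

# Sample search keywords, made up for this example.
my @keywords = remove_stopwords('en', qw( how to install the perl module ));
print join(' ', @keywords), "\n";
```

Passing the language code as the first argument keeps the helper usable with the module's other word lists, although I have only tried the idiom with English.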
Disclaimer: I have only used the English words part of this module. It provides lists for a lot of other languages but I didn't use them, so please consider this a review of the English-language parts only.
(If you would like to make a comment about this review, please address it to me at bkb@cpan.org. Please try to be civil.)
Ben Bullock - 2011-08-22T17:50:49