Lingua-StopWords
(0.09)
Lingua::StopWords provides lists of stopwords for several languages: short words like "to" or "and" which should be ignored in searches. I installed Lingua::StopWords for use in a web log parser, to remove uninteresting words from its list of search keywords. As far as I can see it works pretty well. Some words it doesn't eliminate include "us" and "can", but since these could be "United States" or "can of Coca-Cola", perhaps they are borderline cases.
The module's documentation provides an example which uses "grep" to remove the stopwords from a list, and I copied it into my program. Although this is very simple, it would be preferable if it were provided as a function in the module itself.
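A minimal sketch of such a wrapper might look like the following. Only getStopWords comes from the module; the name remove_stopwords and its argument order are hypothetical, and the lowercasing step assumes that the entries in the stopword lists are lowercase.

```perl
use strict;
use warnings;
use Lingua::StopWords qw( getStopWords );

# Hypothetical helper, NOT part of Lingua::StopWords: takes the hash
# reference returned by getStopWords and a list of words, and returns
# the words that are not in the stopword list.
sub remove_stopwords {
    my ($stopwords, @words) = @_;
    # Lowercase each word for the lookup only (assumption: the list
    # entries are lowercase); the original spelling is what is returned.
    return grep { !$stopwords->{ lc $_ } } @words;
}

# 'en' selects the English list.
my $stopwords = getStopWords('en');
my @keywords  = remove_stopwords($stopwords, qw( i am the walrus ));
print join(' ', @keywords), "\n";
```

Passing the hash reference in, rather than calling getStopWords inside the helper, means the list is built once and can be reused for every line of a log file.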
Disclaimer: I have only used the English part of this module. It provides lists for a lot of other languages, but I didn't use them, so please consider this a review of the English-language part only.
(If you would like to make a comment about this review, please address it to me at bkb@cpan.org. Please try to be civil.)
Ben Bullock - 2011-08-22T17:50:49

