This distribution provides a very helpful way to quickly start working with Solr from Perl.
You can add multiple documents to a Solr index, run basic queries against the index, and do various other useful things.
However, after a little use, the module's limitations quickly become apparent.
The WebService::Solr object relates to a specific Solr core. It does not provide a higher-level abstraction relating to a Solr server. So, if you want to atomically switch two cores (perhaps one contains an updated index you have built in the background) using Solr's SWAP command, you need to build the request from scratch yourself using LWP.
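Building that request yourself is not hard, just inconvenient. Here is a minimal sketch of hitting Solr's CoreAdmin SWAP action directly with LWP; the host, port and core names are placeholders for your own setup:

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua  = LWP::UserAgent->new;

# SWAP atomically exchanges the two named cores on the server.
my $url = 'http://localhost:8983/solr/admin/cores'
        . '?action=SWAP&core=live&other=rebuild';

my $response = $ua->get($url);
die 'SWAP failed: ', $response->status_line
    unless $response->is_success;
```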
If you want to search for a given string in either of two fields, for example a title and a description, WebService::Solr::Query offers no way to do this. You need to build the query yourself, taking care to escape the values that WebService::Solr::Query normally escapes for you. As of version 0.21, an open ticket recognises this design flaw.
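In practice that means something like the following sketch: escaping Lucene's special characters by hand and interpolating the values into a raw query string. The escaping regex covers the characters I know Lucene treats specially, and the field names are purely illustrative:

```perl
use strict;
use warnings;

# Escape Lucene query syntax characters that
# WebService::Solr::Query would normally handle for us.
sub escape_lucene {
    my $value = shift;
    $value =~ s{([+\-&|!(){}\[\]^"~*?:\\])}{\\$1}g;
    return $value;
}

my $term  = escape_lucene('perl & solr');

# An OR across two fields, built as a raw query string.
my $query = qq{title:"$term" OR description:"$term"};
```

The resulting string can then be passed to the search method in place of a WebService::Solr::Query object.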
In summary, I would recommend this module for experimentation with Solr and for basic use. Make sure you account for its limitations when trying to do anything a little more advanced.
Traditionally, programs read information from a file by opening a file handle, reading data into memory, then closing the file handle. This module takes a different approach: it provides a scalar variable that magically corresponds to a file's content on disk without reading any new data into memory.
If you find yourself working with files of large or unknown size, this module saves you from worrying about how much memory Perl might allocate.
Behind the scenes, the module uses mmap on Unix to do its magic. It uses similar facilities on Windows and VMS, making it portable to the platforms most Perl developers use.
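To give a feel for the interface, here is a short sketch assuming the File::Map-style map_file call: the scalar reflects the file's contents on disk, and nothing is slurped into memory up front. The file path is a placeholder:

```perl
use strict;
use warnings;
use File::Map qw(map_file);

# $contents now behaves like a scalar holding the whole file,
# but the data stays on disk and is paged in as needed.
map_file my $contents, '/var/log/huge.log';

# Read-only access works like any other string operation:
my $lines = () = $contents =~ m{\n}g;
print "file has $lines lines\n";
```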
I like this module because it combines the easiness of File::Slurp (or setting $/ to undef for those who like minimal dependencies) and the efficiency of other interfaces to mmap, such as Sys::Mmap, along with the benefit of better portability.
Sometimes I like to review code on paper rather than on-screen. a2pdf does the best job of rendering Perl code in print of anything I've tried: it generates pleasant looking PDF files with line numbers, page numbers, document headings and syntax highlighting using Perl::Tidy.
If your code lives in multiple files, you have to write a small wrapper script to get useful output, and you end up with one PDF per file, but neither point inconveniences me much. Everything else about a2pdf works wonderfully.
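The wrapper I have in mind is only a few lines. This is a hypothetical sketch of my own, not part of a2pdf, and the --output-file option is from memory, so check a2pdf's help output before relying on it:

```perl
use strict;
use warnings;
use File::Find;

# Collect every .pl/.pm file under lib/ and render each one to PDF.
my @files;
find( sub { push @files, $File::Find::name if /\.p[lm]$/ }, 'lib' );

for my $file (@files) {
    ( my $pdf = $file ) =~ s{/}{-}g;    # lib/Foo/Bar.pm -> lib-Foo-Bar.pdf
    $pdf =~ s{\.p[lm]$}{.pdf};
    system( 'a2pdf', "--output-file=$pdf", $file ) == 0
        or warn "a2pdf failed on $file\n";
}
```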
A few weeks ago I was playing with some large XML files and thought I'd learn how SAX worked. I wrote some code to keep track of where I was in the tree and process what I found. But my code looked messy: what I really wanted was something that would give me small chunks of the document to operate on, not something where I had to keep track of my position in the document.
This reminded me of XML::Twig, which I've used in the past. Whereas SAX handlers perform an action on encountering certain features in a document, XML::Twig returns fragments of the document tree that you can interrogate or manipulate. You can operate on the twigs, the small parts of the document tree, using an intuitive syntax, or locate them with XPath.
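The chunk-at-a-time style looks like this in practice. A handler fires for each matching element, you work on that small twig, then purge it so memory stays flat; the element and file names here are made up for illustration:

```perl
use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once per <item> element as the document streams past.
        item => sub {
            my ( $t, $item ) = @_;
            print $item->first_child_text('title'), "\n";
            $t->purge;    # free the processed part of the tree
        },
    },
);

$twig->parsefile('feed.xml');
```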
I've used this module for both complex and simple XML processing and it does the job better than anything else I've found. When I encountered a bug in the module, the author released a fixed version within a few days of me logging a failing test in RT. Impressive!
I've just found myself with a list of 12,000 URLs to retrieve. I remembered the slowness of fetching a few hundred using LWP, and I remembered messing with POE and finding it too much like hard work.
I also remembered a lightning talk at YAPC::Europe about HTTP::Async. The talk made the module look easy to use and fast. After a quick scan through the module's documentation, I'm using its Polite subclass and happily downloading the URLs at a fair rate.
The module's interface extends LWP, which I already know, and it's making my slow ADSL connection and old laptop feel even more antiquated than usual. HTTP::Async does its job well: parallelisation without having to think.
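For the record, my usage is roughly the following: queue up the requests, then collect responses as they complete. The slots value and URL list are illustrative:

```perl
use strict;
use warnings;
use HTTP::Async::Polite;
use HTTP::Request;

my @urls = ('http://example.com/');    # in my case, 12,000 of these

# slots controls how many requests are in flight at once;
# the Polite subclass also spaces out requests to the same host.
my $async = HTTP::Async::Polite->new( slots => 20 );
$async->add( HTTP::Request->new( GET => $_ ) ) for @urls;

while ( my $response = $async->wait_for_next_response ) {
    print $response->request->uri, ' => ', $response->code, "\n";
}
```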