Just two days ago we published a paper on open source Enterprise Search tools such as Lucene/Solr and Xapian/Flax. It asked whether these tools are now comparable with the proprietary products from the likes of Autonomy and Microsoft FAST.
It's a very hot topic at the moment, and Matt Asay (VP Business Development, Alfresco) covers Lucene/Solr in particular in his CNET blog.
At the risk of oversimplification, the answer is more or less "yes", but adoption of these powerful tools can be held back because companies need to invest time in learning how to integrate them. Then there is the question of who to call when you need support later on.
These problems are being addressed by companies whose business model is to provide those implementation, customization and technical support services. For Lucene/Solr, the leading name is Lucid Imagination, based in San Mateo, California. One of their customers is Comcast Interactive Media, the division of the CableTV/ISP giant that specialises in online media. Comcast Interactive Media's view is that Lucene/Solr has 80% of the features of rival proprietary search products (and they didn't need the other 20%).
For Xapian, the equivalent source of services and support is the Flax team. They are local to us in Cambridge (UK) and are very actively developing their Flax Search Service.
In June of this year, In-Q-Tel, the technology investment arm of the CIA, invested an undisclosed amount in Lucid Imagination. I guess that if anyone knows a thing or two about searching massively large datasets, it's the intelligence agencies! Both Lucene/Solr and Xapian/Flax are demonstrating that they are capable of scaling to more than 100 million documents.
The other problem with Enterprise Search engines is that it is hard to see the value until after you have integrated the service and can see the results on an actual document search. We're now in the final testing stages of our next release (v8.0.0) and are able to see that for ourselves. We've developed plug-ins for both search engines, and are building up a rich picture of the strengths of each.