GSA vs Lucene

Here is an interesting article comparing the benefits of the proprietary Google Search Appliance (GSA) to a free (at least in the sense of not paying a vendor) solution based on Lucene: Why a project switched from Google Search Appliance to Zend_Lucene.

As with most open source options, a key factor is how far the sponsoring organization is willing to invest in its own people in place of vendor costs. Many smaller companies may determine that the vendor cost is lower than what it would take to maintain a cadre of internal people to develop, deploy and support a solution. Some bigger companies will let inertia (or a preference for outsourced talent) rule, and so pass on the real savings that investing in their own human capital would allow.

SearchBlox also sells a purportedly Lucene-based solution comparable to the GSA for indexing web sites. A free edition that allows indexing up to 25,000 pages is also available. Of course, the GSA enterprise edition will index far more than just web sites, so the comparison breaks down at a certain point.

But for those willing to invest in the people to “roll their own,” I think Apache Solr is the real competition for the GSA. I’ve seen some impressive work done around Solr and have no doubt it could be used to implement a really serious alternative to the GSA. In fact, there are a number of sites using Drupal as a web application front end for Solr search.