Bryan Ruby


Thoughts, Words, and Deeds from Sioux Falls, SD


Publishers benefit from Google News

An interesting misfire of the EU's Article 11 (according to Google). Article 11 is a provision of the proposed EU Copyright Directive that would restrict much of the caching and inclusion of content that Google does on its News pages.

According to Google:

Then there's Article 11. We reiterate our commitment to supporting high-quality journalism. However, the recent debate shows that there’s a fundamental misunderstanding of the value of headlines and snippets—very short previews of what someone will find when he or she clicks a link. Reducing the length of the snippets to just a few individual words or short extracts will make it harder for consumers to discover news content and reduce overall traffic to news publishers.

Let me illustrate this with an example. Every year, we run thousands of experiments in Search. We recently ran one in the EU to understand the impact of the proposed Article 11 if we could show only URLs, very short fragments of headlines, and no preview images. All versions of the experiment resulted in substantial traffic loss to news publishers.

Google Panda Killed CMS Report's Aggregation

Over Memorial Day weekend, I decided to pull the plug on the CMS-related news feeds we were streaming into Planet CMS. One of CMS Report's biggest strengths has always been pointing people in the right direction in their search for content management systems. Knowing that one site couldn't support all the stories that needed to be written about CMSs, we began to rely more heavily on a news aggregator within our Drupal CMS to provide you with links and excerpts to articles written elsewhere. I did all this with good intentions, but Google apparently disagrees.

Google constantly changes its search and ranking algorithms, in part to weed out sites that lack original, quality content. One such change, Google Panda, does this in part by penalizing sites that artificially inflate their onsite content by using the content of others. Sites that aggregate content from other sites get hit pretty hard in Google's search rankings. I thought I was in the clear by providing only a short excerpt and not the full content of each article, but the drop in referrals from Google Search over time tells me otherwise.

The Chris Pliakas presentation on Search Lucene in Drupal

While I was at DrupalCon last week, Chris Pliakas tweeted that he had used screenshots from CMS Report in his Apache Lucene presentation. I'm always flattered when this site gets noticed for something we're apparently doing right. In this particular case, we're using the contributed Drupal module Search Lucene API for our search engine as well as for faceted search and content recommendations (recommended links).

If you had talked to me a few years ago, I would have told you that the Search module that comes with the Drupal CMS is all a site like mine needs. After I became a beta tester for the Acquia Network and its implementation of Apache Solr, called Acquia Search, my opinion quickly changed. I'm now convinced that an enterprise-quality search engine is truly something that can make or break your website. If you run a smaller Drupal site and feel that Solr or Acquia Search is overkill or out of your price range, Search Lucene API may be the answer you've been looking for all this time.

The actual name of Chris' DrupalCon presentation is: "Build a Powerful Site Search with the User-Friendly, Easy-to-Install Search Lucene API Module Suite". The video of his presentation can be viewed at Archive.org and has been embedded above. Screenshots from CMSReport.com can be seen in the time frame from 19 minutes to 21 minutes.

Testing the water with Acquia Search for Drupal

Acquia used the first day of DrupalCon DC, as well as their corporate site, to announce the availability of their new service, Acquia Search, via a public beta program. Acquia Search is "based on the powerful Lucene and Solr technologies from the Apache project" and "creates a rich index of your site content". While Apache Lucene and Apache Solr are "free" and open source, the implementation and maintenance of these products can be rather daunting. Acquia wishes to solve this complexity problem by offering Solr search as a service in their Acquia Network.

Before the beta was available to the public, CMSReport.com was invited by Jacob Singh to join the private beta program to test and review Acquia Search. I have only been using Acquia Search for a week, so I still have some learning to do in order to take full advantage of the advanced configuration options in Apache Solr. Although I'm new to Apache Solr, I have to say that from a website owner's perspective the implementation of Acquia Search was extremely easy. After I signed up for the service on the network, implementing Acquia Search within the Acquia Drupal CMS was just a matter of activating the appropriate modules and waiting for my content to be indexed by the server. Acquia Search works straight "out of the box" and I couldn't have asked for anything simpler.