Search Appliance
Review: Google Mini 2.2
3/19/2007
By David Nagel
In addition, Google Mini 2.2 is supposed to generate sitemaps automatically for use on Google.com. However, that functionality has not yet been integrated into the appliance, despite what Google's promotional materials say. By some accounts, the feature might appear in an interim update in the near future.
The Mini's specific administration and reporting features include:
- Specify URLs and URL patterns to crawl (or restrict);
- Set a crawl schedule (continuous or fixed-duration), crawler access (user names and passwords for sites), proxy servers, and HTTP headers/agent name;
- Prevent recrawling on duplicate hosts;
- Create rules for identifying document dates;
- Set a host load schedule (for concurrent crawls), including downtimes;
- Roll back the index (reverting to automatically generated index snapshots);
- Force reindexing ("freshness tuning");
- Create collections (setting and restricting URL patterns for collections);
- Create custom front ends;
- Create OneBox modules (described above);
- Crawl status reports, diagnostics, and queues;
- Statistics on MIME types crawled;
- Serving and system status;
- Search reports;
- Search and event logs;
- E-mail notifications; and
- Miscellaneous administration features, such as LDAP configuration, SSL settings, certificate authorities, and SNMP configuration.
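
To make the first item on that list a little more concrete, here is a minimal sketch of how "crawl" and "restrict" URL patterns typically interact. It is an illustration only: the pattern lists, the example.edu host, and the `should_crawl` helper are all hypothetical, and the shell-style wildcards used here are Python's, not necessarily the Mini's own pattern syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns, invented for this example.
# These are Python shell-style wildcards, NOT the Mini's own syntax.
FOLLOW_PATTERNS = ["http://www.example.edu/*"]
DO_NOT_CRAWL_PATTERNS = ["*/private/*", "*.zip"]


def should_crawl(url: str) -> bool:
    """Return True if url matches a follow pattern and no restrict pattern."""
    included = any(fnmatchcase(url, p) for p in FOLLOW_PATTERNS)
    excluded = any(fnmatchcase(url, p) for p in DO_NOT_CRAWL_PATTERNS)
    return included and not excluded


for candidate in (
    "http://www.example.edu/admissions/index.html",
    "http://www.example.edu/private/grades.html",
    "http://www.example.edu/files/archive.zip",
    "http://other.example.com/",
):
    print(candidate, should_crawl(candidate))
```

The point of the sketch is simply that a restriction wins over an inclusion: a URL is crawled only if it falls inside the allowed set and outside every excluded pattern.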


Hardware
Of course, the Google Mini isn't all software. It's a complete hardware/software appliance. But you won't read too much about the system's hardware features. For one thing, they're almost irrelevant. This box provides all the inputs you need to get the job done (including a monitor port, two RJ-45 jacks, and various other types of data connections, seen below).

As far as the guts are concerned, that information is just plain unavailable. Google says the machine runs on "standard" PC hardware and won't say anything else about it. I suppose I could eventually take a blowtorch to my unit to find out what's inside it, but, for now, I prefer to leave it in pristine condition.
The real question about the hardware, anyway, is whether it has the muscle to do what it's supposed to do. And the answer is yes. It'll pull up results in a fraction of a second; it'll generate reports quickly (although you might need to refresh the browser manually to see the finished reports in a reasonable amount of time); and it can crawl with multiple concurrent connections. Using four concurrent connections, I saw crawl rates of two to 16 pages per second, with an average of about seven pages per second.
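
To put those crawl rates in perspective, here is a rough back-of-the-envelope sketch that turns pages per second into total crawl time. The three rates are the ones I observed; the 50,000-page site size is a hypothetical figure picked only for illustration, and the calculation assumes a steady rate, which a real crawl won't sustain.

```python
def crawl_hours(pages: int, pages_per_second: float) -> float:
    """Return the hours needed to crawl `pages` at a steady rate."""
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    return pages / pages_per_second / 3600


# Rates observed in this review (pages per second, four connections).
OBSERVED_RATES = {"slowest": 2, "average": 7, "fastest": 16}

# Hypothetical site size, chosen only for illustration.
SITE_PAGES = 50_000

for label, rate in OBSERVED_RATES.items():
    print(f"{label}: {crawl_hours(SITE_PAGES, rate):.1f} hours")
```

Under those assumptions, a full crawl at the average rate would take roughly two hours, and even the slowest observed rate would finish within a working day.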