Did you know that you can use Google’s power to create custom search engines and then use them on your site or whenever you need them? Well, if you didn’t, then this post will try to provide some info about this feature.
To create a Google custom search, there are some simple steps you have to follow. First of all, you have to go to the Custom Search website and click on the “Create a Custom Search Engine” button. The next screen will ask you some things about your search engine:
- Search Engine Name
- Search Engine Description
- Search Engine Language
- What do you want to search
This is where you decide where your search engine will look for results. There are 2 options:
  - Entire web
  - Only sites I select
- Select some sites
Here you put the sites your search engine will look into. Put one site per line. You can also use wildcards for searching entire domains (e.g. *.domain.com), entire sites (e.g. www.domain.com/*) or parts of sites (e.g. www.domain.com/*2009* for all pages with 2009 in the URL).
- Select an edition
This is where you specify the type of search engine to use. There are 2 options here:
  - Standard Edition: Free with ads
  - Business Edition: Starts at $100 per year, no ads on results pages.
The next screen will let you try out your search engine. If the results are what you expected, press the “Finish” button and you are done. Your search engine is ready. A list of all your search engines will be displayed, and you can visit any of them by clicking on its name.
Some Custom Search Engines Worth Visiting
Here is a list of some neat engines we found and one that we created for you:
- WordPress Plugin Repository – Searches wordpress.org plugin repository, for lack of a proper search engine
- Wordpress Search – Search for sites relevant to WordPress development and deployment.
- Drupal Developer Search Engine – Search engine for Drupal developers to search many Drupal related sites.
- Search Drupal Modules – Ever had trouble finding that Drupal module you know just has to exist? Someone must have already created it… well now you can find it!
- Twitter Search – A Twitter Search Engine.
- Java Search Engine – Java Programmers and Technical Search.
- JSLSE - Our custom search engine. This one searches more than 10 JavaScript Libraries sites.
Beyond Simple CSEs
After you create your custom search engine, you can use various tools to personalize it. For example, you can change its look and feel, add annotations, link the search engine to your AdSense account and earn money, promote some sites to appear higher than others for specific queries, refine the results, and invite up to 100 friends to contribute to your search engine. Usage statistics are provided for your CSE too.
If you invite someone to contribute, they will be able to add refinements to your CSE and use the Google Marker bookmarklet, which allows the user to add refinements and labels to pages. You will be able to use their refinements as well.
The Control Panel is a great way to customize your CSEs but if you want to go further, you should learn how to use the XML or TSV custom search format.
Defining CSE Specifications
A basic specifications file looks like this:
<CustomSearchEngine volunteers="false" keywords="climate &quot;global warming&quot; &quot;greenhouse gases&quot;" language="en" visible="false" encoding="UTF-8">
  <Title>RealClimate</Title>
  <Description>Science behind global warming and climate change.</Description>
  <Context>
    <BackgroundLabels>
      <Label name="_cse_hwbuiarvsbo" mode="FILTER"/>
      <Label name="_cse_exclude_hwbuiarvsbo" mode="ELIMINATE"/>
    </BackgroundLabels>
  </Context>
  <LookAndFeel nonprofit="false"/>
</CustomSearchEngine>
and includes the following elements:
- CustomSearchEngine
- Title
- Description
- Context
  - BackgroundLabels
    - Label
- LookAndFeel
There are also some optional elements that you can add to the specification. These are:
- SubscribedLinks
Enables subscribed links in your search results page. Subscribed links are special results that you create for a set of pre-defined queries. They are a way to directly answer your users’ questions in the results page.
- AdSense
Associates the search engine with your Google AdSense account, so you can make money with your custom search engine.
- Enterprise Account
If you upgraded to Google Site Search, it lists your contact information. You can change the attribute values to update your information.
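If you would rather generate the specification than write it by hand, the following is a minimal Python sketch that rebuilds the basic example above with the standard xml.etree.ElementTree module. The function name build_spec and its parameters are made up for illustration and are not part of any Google API. One practical benefit is that the serializer escapes the double quotes inside the keywords attribute for you.

```python
import xml.etree.ElementTree as ET


def build_spec(title, description, keywords, cse_id):
    """Build a CustomSearchEngine element shaped like the basic example.

    build_spec and cse_id are illustrative names, not Google's API.
    """
    # Multi-word keywords are wrapped in double quotes, as in the example.
    keyword_attr = " ".join(f'"{k}"' if " " in k else k for k in keywords)
    root = ET.Element("CustomSearchEngine", {
        "volunteers": "false",
        "keywords": keyword_attr,
        "language": "en",
        "visible": "false",
        "encoding": "UTF-8",
    })
    ET.SubElement(root, "Title").text = title
    ET.SubElement(root, "Description").text = description
    context = ET.SubElement(root, "Context")
    labels = ET.SubElement(context, "BackgroundLabels")
    ET.SubElement(labels, "Label",
                  {"name": f"_cse_{cse_id}", "mode": "FILTER"})
    ET.SubElement(labels, "Label",
                  {"name": f"_cse_exclude_{cse_id}", "mode": "ELIMINATE"})
    ET.SubElement(root, "LookAndFeel", {"nonprofit": "false"})
    return root


spec = build_spec(
    "RealClimate",
    "Science behind global warming and climate change.",
    ["climate", "global warming", "greenhouse gases"],
    "hwbuiarvsbo",
)
# Quotes inside the keywords attribute are written as &quot; entities.
xml_text = ET.tostring(spec, encoding="unicode")
print(xml_text)
```

Parsing xml_text back gives the original keywords string with plain double quotes, so the escaping is only a detail of the file on disk.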
Selecting Sites to Search With XML
If you want to create a simple CSE that includes results from 2 to 10 sites, then the Control Panel is what you are looking for. But if you want to use 50, 100 or more sites in your search engine, the best thing to do is to list all the sites in a file and upload it. You can create annotations files using 3 formats:
- OPML – Outline Processor Markup Language
- TSV – Tab Separated Values
- XML
The most powerful but also the most complex format is XML. You can do anything that the CSE API allows you to do with this format. TSV is easier but restricts some features. OPML is the easiest format to use, since it lets you reuse annotations files you already have, but it offers fewer features than the other 2 formats.
The OPML Format
OPML is a type of XML format that was originally developed for defining ordered lists of elements or outlines, but it is now also commonly used for web feeds. If you have OPML files from some feed aggregators, you can upload the OPML file without bothering with typing each site. Custom Search grabs the value of the OPML attribute htmlUrl and adds it to the list of sites to search. You can upload multiple OPML files for each of your search engines.
Example OPML File:
<opml version="1.0">
  <head>
    <title>Bicycles</title>
    <dateCreated>Fri Mar 14 23:21:11 PDT 2008</dateCreated>
    <dateModified>Fri Mar 14 23:21:11 PDT 2008</dateModified>
  </head>
  <body>
    <outline type="rss" text="Road Bikes" xmlUrl="http://www.google.com/exampleurl.opml" htmlUrl="http://www.google.com/sampleurl1.opml"/>
    <outline type="rss" text="Mountain Bikes" xmlUrl="http://www.google.com/exampleurl2.opml" htmlUrl="http://www.google.com/sampleurl2.opml"/>
  </body>
</opml>
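Before uploading an OPML export, it can help to preview which URLs would be picked up. The sketch below is a rough local check in Python, not Google's actual importer: it simply collects the htmlUrl attribute of every outline element, as described above. The function name html_urls is made up for illustration.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the example OPML file from this post.
OPML = """<opml version="1.0">
  <head><title>Bicycles</title></head>
  <body>
    <outline type="rss" text="Road Bikes"
             xmlUrl="http://www.google.com/exampleurl.opml"
             htmlUrl="http://www.google.com/sampleurl1.opml"/>
    <outline type="rss" text="Mountain Bikes"
             xmlUrl="http://www.google.com/exampleurl2.opml"
             htmlUrl="http://www.google.com/sampleurl2.opml"/>
  </body>
</opml>"""


def html_urls(opml_text):
    """Return the htmlUrl of every outline element that has one."""
    root = ET.fromstring(opml_text)
    return [o.get("htmlUrl") for o in root.iter("outline") if o.get("htmlUrl")]


sites = html_urls(OPML)
for site in sites:
    print(site)
```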
The TSV Format
The easiest format to create is TSV. All you have to do is open your spreadsheet editor and allocate a column for each of the fields. Save the file with a .tsv extension and upload it. The TSV file must follow a fixed layout. The 2 required fields are:
- URL
- Label
and there are 3 optional fields:
- Comment
- Score
- Custom field – This one is a field that you can add for your reference since it does not affect the search engine. To create one, you must prefix it with “A=”. For example A=Contributor
Example TSV file:
URL	Label	Comment	A=Contributor
www.cancer.gov/cancertopics/types/liver/*	_cse_Ansi-stoubiq	government site	John
www.medicinenet.com/liver_cancer/*	_cse_Ansi-stoubiq	site on symptoms	Bill
www.webmd.com/hw/cancer/*	_cse_Ansi-stoubiq	great site for patients!	John
www.oncologychannel.com/*/treatment	_cse_Ansi-stoubiq		Steve
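If your list of sites lives in a script rather than a spreadsheet, a file like the one above can be produced with Python's standard csv module. This is only a sketch under the field layout described in this section; the names FIELDS, ROWS and to_tsv are made up for illustration. Missing optional fields are written as empty columns so that the later columns stay aligned.

```python
import csv
import io

FIELDS = ["URL", "Label", "Comment", "A=Contributor"]

ROWS = [
    {"URL": "www.cancer.gov/cancertopics/types/liver/*",
     "Label": "_cse_Ansi-stoubiq",
     "Comment": "government site",
     "A=Contributor": "John"},
    # No Comment here: DictWriter fills the gap with an empty string.
    {"URL": "www.oncologychannel.com/*/treatment",
     "Label": "_cse_Ansi-stoubiq",
     "A=Contributor": "Steve"},
]


def to_tsv(rows):
    """Serialize rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


tsv_text = to_tsv(ROWS)
print(tsv_text)
```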
The XML Format
To get the most out of your CSE, you should use the XML format. There are 3 ways to use this format:
- One Annotations file per CSE
- One Annotations file for all your CSEs
- Context files with inline annotations
You can start with any of these, since you can always switch later by copying and pasting the annotations. The following is an example of XML annotations. It covers roughly the same URL patterns as the TSV example in the previous section, except for custom attributes, which are available only in the TSV format. Note the labels: only the first annotation carries the search engine label _cse_Ansi-stoubiq, while the other three carry _cse_exclude_Ansi-stoubiq. This file therefore tells Custom Search to include the cancer.gov pages but exclude, for instance, everything under www.webmd.com/hw/cancer/*:
Example XML File:
<Annotations>
  <Annotation about="www.cancer.gov/cancertopics/types/liver/*">
    <Label name="_cse_Ansi-stoubiq"/>
    <Comment>government site</Comment>
  </Annotation>
  <Annotation about="www.medicinenet.com/liver_cancer/">
    <Label name="_cse_exclude_Ansi-stoubiq"/>
    <Comment>site on symptoms</Comment>
  </Annotation>
  <Annotation about="www.webmd.com/hw/cancer/*">
    <Label name="_cse_exclude_Ansi-stoubiq"/>
    <Comment>great sites for patients!</Comment>
  </Annotation>
  <Annotation about="www.oncologychannel.com/*/treatment">
    <Label name="_cse_exclude_Ansi-stoubiq"/>
  </Annotation>
</Annotations>
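Since switching formats is mostly mechanical, here is a small Python sketch that turns TSV rows into the XML annotations structure shown above. It assumes only the URL, Label and Comment columns; the function name tsv_to_annotations is made up for illustration, and ET.indent needs Python 3.9 or newer.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Two rows in the TSV layout from the previous section (tab-separated).
TSV = (
    "URL\tLabel\tComment\n"
    "www.cancer.gov/cancertopics/types/liver/*\t_cse_Ansi-stoubiq\tgovernment site\n"
    "www.oncologychannel.com/*/treatment\t_cse_Ansi-stoubiq\t\n"
)


def tsv_to_annotations(tsv_text):
    """Convert TSV rows into an Annotations element tree."""
    root = ET.Element("Annotations")
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        ann = ET.SubElement(root, "Annotation", {"about": row["URL"]})
        ET.SubElement(ann, "Label", {"name": row["Label"]})
        # Comment is optional, so skip it when the column is empty.
        if row.get("Comment"):
            ET.SubElement(ann, "Comment").text = row["Comment"]
    return root


annotations = tsv_to_annotations(TSV)
ET.indent(annotations)  # pretty-print; available from Python 3.9
print(ET.tostring(annotations, encoding="unicode"))
```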
Hosting Your Annotations File On Your Own Server
Google allows you to host annotations on your own server. This lets you use more than 5000 annotations, generate them with scripting languages like PHP, and update them as frequently as you want. To host your annotations, you have to create the file, upload it to your host and tell the CSE where to find it. You can also link annotation files together, up to 50 files, to use a large number of annotations.
Example External Annotations:
<GoogleCustomizations>
  <Include type="Annotations" href="http://www.yoursite.com/cse_bacon_annotations.xml" />
</GoogleCustomizations>
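As a rough illustration of generating such files with a script, the Python sketch below splits a long list of URL patterns into chunks and builds one file with an Include element per chunk. Several things here are assumptions rather than documented behavior: the chunk size of 5000, the idea that one file may carry several Include elements, the file naming scheme and the example.com patterns. Only the 50-file cap and the Include syntax come from this post.

```python
import xml.etree.ElementTree as ET

MAX_PER_FILE = 5000  # assumed chunk size, not a documented limit
MAX_FILES = 50       # cap on linked files mentioned in this post


def chunk(items, size):
    """Split items into consecutive lists of at most size elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_include_file(hrefs):
    """Build a GoogleCustomizations element with one Include per href."""
    if len(hrefs) > MAX_FILES:
        raise ValueError(f"too many annotation files: {len(hrefs)}")
    root = ET.Element("GoogleCustomizations")
    for href in hrefs:
        ET.SubElement(root, "Include", {"type": "Annotations", "href": href})
    return root


# Hypothetical data: 12000 made-up URL patterns.
patterns = [f"www.example{i}.com/*" for i in range(12000)]
chunks = chunk(patterns, MAX_PER_FILE)

# Hypothetical naming scheme for the hosted annotation files.
hrefs = [f"http://www.yoursite.com/annotations_{n}.xml"
         for n in range(1, len(chunks) + 1)]
include_xml = ET.tostring(build_include_file(hrefs), encoding="unicode")
print(include_xml)
```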
Tweak The Ranking of Your CSE
You can also change the way your CSE ranks the results. This can be done by using keywords, weighted labels and scores. While keywords and weighted labels are defined in the Context, scores are defined in the annotations file.
Keywords
With keywords, you can quickly change the results. Your CSE will boost pages that contain your keywords.
Keywords example:
<CustomSearchEngine volunteers="false" keywords="asana &quot;yoga postures&quot;">
</CustomSearchEngine>
Labels
You can use two kinds of labels: search engine labels and refinement labels. Search engine labels determine which sites should be covered by the search engine. Refinement labels, on the other hand, are visible to your users and show up as links.
Search Engine Labels example:
<BackgroundLabels>
  <Label name="_cse_hwbuiarvsbo" mode="FILTER"/>
  <Label name="_cse_exclude_hwbuiarvsbo" mode="ELIMINATE"/>
</BackgroundLabels>
Refinement Label example:
<Facet>
  <FacetItem title="Lectures">
    <Label name="lectures" mode="BOOST" weight="0.8">
      <Rewrite>lecture OR lectures</Rewrite>
    </Label>
  </FacetItem>
</Facet>
Whether a site is promoted, demoted, or excluded depends on the search engine label it is associated with. A search engine label can have the following modes:
- ELIMINATE
Excludes sites tagged with this label from your search engine.
- FILTER
Includes only sites tagged with this label, and excludes everything else.
- BOOST
Shows sites tagged with this label higher in the results.
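To make the three modes concrete, here is a toy Python model of how they could act on a list of results. It is a deliberate simplification for intuition only and says nothing about how Google actually ranks pages; the function apply_modes, the sites and the label names are all made up.

```python
def apply_modes(results, site_labels, label_modes):
    """Toy model of ELIMINATE, FILTER and BOOST on an ordered result list.

    results:     sites in their original order
    site_labels: site -> labels the site is tagged with
    label_modes: label -> "ELIMINATE", "FILTER" or "BOOST"
    """
    def has(site, mode):
        return any(label_modes.get(label) == mode
                   for label in site_labels.get(site, ()))

    # FILTER only restricts results if some label actually uses that mode.
    filter_active = "FILTER" in label_modes.values()
    kept = [
        site for site in results
        if not has(site, "ELIMINATE")
        and (not filter_active or has(site, "FILTER"))
    ]
    # sorted() is stable, so boosted sites move up and the rest keep order.
    return sorted(kept, key=lambda site: not has(site, "BOOST"))


results = ["a.com", "b.com", "c.com", "d.com"]
site_labels = {
    "a.com": ["include"],
    "b.com": ["include", "spam"],
    "c.com": ["include", "favorite"],
    "d.com": [],
}
label_modes = {"include": "FILTER", "spam": "ELIMINATE", "favorite": "BOOST"}

ranked = apply_modes(results, site_labels, label_modes)
print(ranked)
```

In this model b.com is dropped by ELIMINATE, d.com is dropped because it lacks the FILTER label, and c.com moves ahead of a.com because of BOOST.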
Weights
Weights let you define how much a label should promote or demote a tagged site. Weights can range from -1.0 to +1.0, which gives you fairly fine-grained control over sites. A positive weight emphasizes sites tagged with the label, while a negative weight de-emphasizes them.
A weighted label example:
<BackgroundLabels>
  <Label name="_cse_hwbuiarvsbo" mode="FILTER" weight="0.65"/>
  <Label name="_cse_exclude_hwbuiarvsbo" mode="ELIMINATE"/>
</BackgroundLabels>
A very useful attribute of the Label element is “top”. You can manually set the order of your rankings this way:
<Label name="best_resource" mode="FILTER" top="3"/>
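If you generate labels from a script, it is easy to let a weight slip outside the allowed range. The Python sketch below builds a Label element and rejects weights outside -1.0 to +1.0, the range given above. The helper name weighted_label is made up for illustration, and the check reflects only what this post states about weights.

```python
import xml.etree.ElementTree as ET


def weighted_label(name, mode, weight=None):
    """Build a Label element, validating the optional weight."""
    attrs = {"name": name, "mode": mode}
    if weight is not None:
        # Range taken from this post: -1.0 to +1.0 inclusive.
        if not -1.0 <= weight <= 1.0:
            raise ValueError(f"weight {weight} is outside -1.0..1.0")
        attrs["weight"] = str(weight)
    return ET.Element("Label", attrs)


label = weighted_label("lectures", "BOOST", 0.8)
print(ET.tostring(label, encoding="unicode"))

try:
    weighted_label("lectures", "BOOST", 1.5)
except ValueError as err:
    print("rejected:", err)
```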
Tagging Sites
You can tag your annotations with your labels this way:
<Annotations>
  <Annotation about="webcast.berkeley.edu/*" score="1">
    <Label name="cse_university_boost_highest"/>
    <Label name="cse_bicycles_exclude"/>
    <Label name="cse_hamsters_filter"/>
  </Annotation>
</Annotations>
Scores
Scores are used to order sites that share the same label. In the following example, the more specific a pattern is, the higher its score:
<Annotations>
  <Annotation about="*.edu/*" score="0.0001">
    <Label name="vision_label"/>
  </Annotation>
  <Annotation about="*.ucsd.edu/*" score="0.7">
    <Label name="vision_label"/>
  </Annotation>
  <Annotation about="*.vision.ucsd.edu/*" score="1">
    <Label name="vision_label"/>
  </Annotation>
</Annotations>
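When an annotations file grows, it gets hard to see how the scores within one label relate to each other. The following Python sketch reads an annotations file and lists the patterns of each label from the highest score to the lowest. It is a local sanity check, not a model of Google's ranking; the function name patterns_by_label is made up, and treating a missing score as 0 is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The scores example from this post.
XML = """<Annotations>
  <Annotation about="*.edu/*" score="0.0001">
    <Label name="vision_label"/>
  </Annotation>
  <Annotation about="*.ucsd.edu/*" score="0.7">
    <Label name="vision_label"/>
  </Annotation>
  <Annotation about="*.vision.ucsd.edu/*" score="1">
    <Label name="vision_label"/>
  </Annotation>
</Annotations>"""


def patterns_by_label(xml_text):
    """Group (score, pattern) pairs by label, highest score first."""
    grouped = defaultdict(list)
    for ann in ET.fromstring(xml_text).iter("Annotation"):
        # Assumption: an annotation without a score counts as 0 here.
        score = float(ann.get("score", "0"))
        for label in ann.iter("Label"):
            grouped[label.get("name")].append((score, ann.get("about")))
    return {name: sorted(pairs, reverse=True)
            for name, pairs in grouped.items()}


ordered = patterns_by_label(XML)
for name, pairs in ordered.items():
    print(name)
    for score, pattern in pairs:
        print(f"  {score:<8} {pattern}")
```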
I hope this info was of some value to you. In the second part of this article, we will cover some special aspects of Custom Search Engines like search suggestions, appending queries, special results, subscribed results, and look and feel. We will also create a custom search engine that uses all the features covered in the 2 articles.