
About Search Engines
Search engines use software "bots" to scout the Web and assemble databases, or indexes, of Web pages. When you enter a query at a search engine Web site, your input is checked against the search engine's keyword indexes, and the best matches are returned to you as hits. A search engine has two parts: a "robot," "crawler," or "spider" that travels to pages on the Web and assembles an expansive index, and a program that receives your search request, compares it to the index entries, and returns results to you.
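The two parts described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine is built: the pages in PAGES are hypothetical stand-ins for what a crawler might have fetched, and the function names build_index and search are made up for this example.

```python
from collections import defaultdict

# Hypothetical pages standing in for what a crawler might have fetched.
PAGES = {
    "http://example.com/a": "Search engines index Web pages",
    "http://example.com/b": "A spider travels to pages on the Web",
    "http://example.com/c": "Libraries publish search guides",
}

def build_index(pages):
    """Part one: map each lowercased word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Part two: return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)  # sorted only to keep the output stable

INDEX = build_index(PAGES)
print(search(INDEX, "web pages"))
```

The query is never compared against the pages themselves, only against the index, which is why a search can be answered quickly even when the index covers a very large number of pages.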
Keyword Searching
This is the most common way to search the Web. It works by matching the text of your query against the text indexed in the search engine's database. Authors often specify the keywords they want search engines to use; when they do not, or when those keywords are not enough, the search engine's spider looks for additional information on the page. Words that appear towards the top of a document and words that are repeated throughout the document are more likely to be judged as important. Some search engines index every word on every page; others index only part of the document.
Lycos, for example, indexes the first 20 lines of text, in addition to the title, headings, subheadings and the hyperlinks to other sites. Full-text indexing systems pick up every word in the text except commonly occurring stop words such as "a," "an," "the," "is," "and," "or," and "www." AltaVista uses full-text indexing without excluding the common articles "a," "an," and "the." Some search engines discriminate between upper case and lower case; others store all words without reference to capitalization. Some search engines use the information contained in the markup tags of the coded Web page, while others do not. Search engines also vary in how they deal with singular and plural words, verb tenses, and stemming. For example, if you enter the word "dog," should you receive "dogma" as a hit? Even in light of these shortcomings and potential points of confusion, keyword searching via search engines can be very powerful.
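The choices described above (whether to drop stop words, whether to respect capitalization, and whether "dog" should match "dogma") can be illustrated with a small Python sketch. The stop word list is the one quoted in the text; the function names index_terms and matches, and the simple whitespace splitting that ignores punctuation, are assumptions made for this example and do not describe any specific search engine.

```python
# Stop words quoted in the text above.
STOP_WORDS = {"a", "an", "the", "is", "and", "or", "www"}

def index_terms(text, drop_stop_words=True, case_sensitive=False):
    """Return the list of terms an engine might keep for one page."""
    words = text.split()
    if not case_sensitive:
        words = [w.lower() for w in words]
    if drop_stop_words:
        words = [w for w in words if w.lower() not in STOP_WORDS]
    return words

def matches(query, terms, whole_word=True):
    """Whole-word matching compares terms exactly; otherwise any term
    that starts with the query counts as a hit."""
    if whole_word:
        return query in terms
    return any(term.startswith(query) for term in terms)

terms = index_terms("The dogma of the Web is an open question")
print(terms)
print(matches("dog", terms, whole_word=True))   # exact terms only
print(matches("dog", terms, whole_word=False))  # prefix matching
```

With whole-word matching, a search for "dog" does not return the page; with prefix matching it does, because "dogma" begins with "dog." This is one reason the same query can produce different hits on different search engines.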
Search Engines
Alta Vista
Northern Light
AllTheWeb
HotBot
Lycos
Meta Search Engines
These Web searching tools operate in much the same way as other search engines, except that they contain no indexed database of their own. When you submit a query, the meta search engine sends copies of the query to other search engines and returns their results. Search engines use differing algorithms to determine which pages are the most relevant matches for your search. Meta search engines are useful because they provide a way to find common results among the sets of returned pages and a means of comparing and evaluating individual search engines. Although meta search engines are helpful, almost none of the freely available ones search Google, a search engine that indexes one of the largest numbers of pages on the Web.
Webcrawler
Copernic Basic 2001
Ixquick
Metacrawler
Vivisimo
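The way a meta search engine forwards a query and looks for common results can be sketched as follows. This is a simplified illustration under stated assumptions: engine_a, engine_b, and engine_c are hypothetical stand-ins that return fixed lists rather than contacting real search engines, and the ranking rule (pages found by more engines come first) is one plausible choice, not the method used by any of the tools listed above.

```python
from collections import defaultdict

# Hypothetical stand-ins for real engines: each returns a ranked list of URLs.
def engine_a(query):
    return ["http://example.com/1", "http://example.com/2", "http://example.com/3"]

def engine_b(query):
    return ["http://example.com/2", "http://example.com/4", "http://example.com/1"]

def engine_c(query):
    return ["http://example.com/2", "http://example.com/5"]

def meta_search(query, engines):
    """Send the query to every engine and merge the results.

    URLs returned by more engines rank first; ties are broken by the
    best (lowest) position any engine gave the URL, then by URL.
    """
    found_by = defaultdict(set)
    best_rank = {}
    for name, engine in engines.items():
        for rank, url in enumerate(engine(query)):
            found_by[url].add(name)
            best_rank[url] = min(rank, best_rank.get(url, rank))
    ordered = sorted(found_by, key=lambda u: (-len(found_by[u]), best_rank[u], u))
    return [(url, sorted(found_by[url])) for url in ordered]

ENGINES = {"A": engine_a, "B": engine_b, "C": engine_c}
results = meta_search("library hours", ENGINES)
for url, sources in results:
    print(url, sources)
```

Listing which engines returned each page shows both uses mentioned above: pages found by several engines stand out as common results, and the per-engine lists make it easy to compare how the individual engines handled the same query.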
Multimedia Search Engines
Using a general search engine can often yield frustrating results when you are trying to find a specific type of multimedia, such as image or audio files. Fortunately, there are a variety of multimedia search engines on the Web.
Searches the about.com databases for all kinds of useful clip art
Finds audio, video and images on the Web, including MP3 files
Seeks relevant pictures from more than 150 million images, one of the most comprehensive image searching tools on the Web
Allows searchers to look for MIDI (Musical Instrument Digital Interface) files
Browses by artist, date or filename, or keyword search for MP3 (MPEG Layer 3, compressed audio) files
Specialized Search Engines
It can often be useful to use a search engine that looks specifically for one type of document or for specialized information. For example, if you were looking for medical information, a standard search engine might leave you sifting through hundreds of Web sites, while a search engine that specializes in medical information would likely yield more relevant results.
Scirus is a product developed by Elsevier Science, concentrating solely on collecting scientific content. It searches both the open Web and membership sources such as ScienceDirect, MEDLINE on BioMedNet, and Beilstein on ChemWeb. Scirus features an array of scientific information in data and chart form.
The German Environmental Information Network search engine gathers public affairs information distributed across Web sites run by public institutions in Germany, such as environmental authorities, and agencies and ministries at the federal levels. It functions as an information broker for environmental information in Germany.
SportsSearch is a specialized directory of sports Web sites.
Artcyclopedia is an index of hundreds of museum sites and image archives. Visitors can search for where the works of over 5,500 different artists can be viewed online.
MedHunt uses both humans and Web crawling to build its index of medical information. Searches can be narrowed by region and a French interface is available.
More Search Engine Information
WSU Libraries Search Engine Guides