There are two ways a user talks to a web database - through a type-in box and by selecting choices from a directory structure. This section will examine the more common of the two, the search engine, and will help you to learn the differences between search engines (because, just as the databases are different, so are the search engines) and how to become an effective searcher.
The four main steps in using a search engine are: translating your search request into search terms that you type, combining the search terms to give the computer more information about your request, examining the results presented, and re-searching if needed. This cycle can be repeated as necessary until you have found what you are looking for, or until you decide that you have exhausted this search engine and move on to another tool.
Current technology is not good enough to take your search need - "I want to buy a new computer" - and turn it into a search request. The burden falls on the user to make the search request as simple and specific as possible. Here are the steps to do that:
Once you have done these two steps, you have the basis for searching - a handful of very specific facets, which if combined, make up the whole of your search request. However, there are still some things to consider when selecting your search terms.
It always helps to come up with synonyms for each facet in your search request. (In our example, we could expand "buy" to "buy, purchase, order"; "Pentium-based PC" to "PC, Computer, Personal Computer, IBM-Compatible, 586, Pentium-based, Pentium"; and "mail order" to "catalog, mail order, store, shop".) The more synonyms you can come up with for each facet, the better chance the computer will have of matching your search request. Roget's Thesaurus can be very handy during this step.
One caveat about synonyms - try to have about the same number for each facet, unless you want to "weight" a term. We'll talk more about weighting in a little bit.
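As a rough sketch of the bookkeeping involved, you could keep each facet and its synonyms in a small table and check how many synonyms each one has. The facet names and the synonym_counts helper below are made up for illustration; only the synonym lists come from the example above.

```python
# Facets from the running example, each with its list of synonyms.
# The dictionary keys are illustrative labels, not anything a search engine sees.
facets = {
    "buy": ["buy", "purchase", "order"],
    "pc": ["PC", "Computer", "Personal Computer", "IBM-Compatible",
           "586", "Pentium-based", "Pentium"],
    "mail order": ["catalog", "mail order", "store", "shop"],
}

def synonym_counts(facets):
    """Return how many synonyms each facet has."""
    return {name: len(terms) for name, terms in facets.items()}

counts = synonym_counts(facets)
for name, n in counts.items():
    print(f"{name}: {n} synonyms")
```

Here the "pc" facet has roughly twice as many synonyms as the others, so it is the one most likely to dominate unless you trim it or add terms elsewhere.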
Be wary of choosing words that have more than one meaning (like China/china or Polish/polish). The computer can't tell the meanings apart, because it doesn't know the context of the terms. Avoid these terms if at all possible.
If a word is spelled differently in different dialects of English (American, British, Canadian, etc.), you will need to include all variations for an inclusive search. Using only one spelling of a word like color/colour or catalog/catalogue will tend to select pages from one country or the other. If you want pages from both countries, you'll need to include both spellings in your search. Similarly, being aware of the different terms used in each country will help you search successfully. If this is a concern, look at Britspeak, a UK->US and US->UK dictionary, for some help.
Truncation lets you match multiple forms of a word (like woman and women) with a single search term. You indicate it with a symbol (for example, wom*n). Search engines use different symbols for truncation, but AskScott (or the HELP screen on the search engine) will tell you which symbols to use when you are constructing your search. Some engines, like Lycos, only allow "right-truncation", meaning truncation at the end of the word (such as child$ for children, child, childlike, childhood). Try to use the simplest form of each word in your search facet, followed by a truncation symbol.
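One way to make sure you don't forget a spelling is to expand your term list from a table of variants. The sketch below assumes a tiny hand-made table (SPELLING_VARIANTS) and a hypothetical helper called with_variants; a real table would be much longer.

```python
# Small hand-made table of US -> UK spellings (illustrative only).
SPELLING_VARIANTS = {
    "color": "colour",
    "catalog": "catalogue",
}

def with_variants(terms):
    """Return the terms plus any known alternate spelling, without duplicates."""
    # Make the table work in both directions (US -> UK and UK -> US).
    both_ways = dict(SPELLING_VARIANTS)
    both_ways.update({uk: us for us, uk in SPELLING_VARIANTS.items()})

    expanded = []
    for term in terms:
        for candidate in (term, both_ways.get(term.lower())):
            if candidate and candidate not in expanded:
                expanded.append(candidate)
    return expanded

print(with_variants(["catalog", "mail order"]))
```

Terms with no known variant, like "mail order", pass through unchanged.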
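To see what a truncation symbol does, here is a minimal sketch that treats * as "any run of characters", using Python's standard fnmatch module. The choice of * is an assumption for this example; each search engine has its own symbol, and the word list is made up.

```python
from fnmatch import fnmatchcase

def matches(pattern, words):
    """Return the words that match a pattern where * stands for any run of characters."""
    return [w for w in words if fnmatchcase(w, pattern)]

words = ["woman", "women", "child", "children", "childhood", "chill"]

# Symbol in the middle of the word.
print(matches("wom*n", words))

# Right-truncation: the symbol comes at the end of the word.
print(matches("child*", words))
```

Notice that "chill" is not picked up by child*, because everything before the symbol still has to match exactly.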
By indicating a relationship between your search terms, you can help the computer rank the pages in an order that is more relevant to you. You can do this through phrase searching, Boolean logic, pseudo-Boolean logic, and term weighting.
Phrase searching is the most powerful of the combination techniques, and you should always use it if possible. When you are creating your search terms, if there are words that usually go together in a phrase, you can indicate that by placing them in quotes (such as "PC-Compatible computer" or "Scott Nicholson"). It's almost always quotes that are used - but you'll want to check the AskScott search page or the HELP screen of the database to make sure. Nothing will improve the relevance of your results more than using phrases to search when appropriate.
Don't let the name fool you - this is not difficult. Boolean logic means placing the words AND, OR, and NOT between your search terms to tell the web database how the terms relate to each other. Not all web databases allow you to do Boolean searching, but you can look at the AskScott entry screen for a quick review of a database's terminology.
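To see why a phrase is more accurate than the same words typed separately, consider this small sketch. The two sample pages and both helper functions are invented for illustration, and real engines match text in more sophisticated ways.

```python
def has_all_words(text, words):
    """True if every word appears somewhere in the text, in any order."""
    tokens = text.lower().split()
    return all(w.lower() in tokens for w in words)

def has_phrase(text, phrase):
    """True only if the words appear together, in order, as a phrase."""
    return phrase.lower() in text.lower()

pages = [
    "Scott Nicholson wrote this searching tutorial",
    "Scott Smith met Jane Nicholson at the library",
]

for page in pages:
    print(has_all_words(page, ["Scott", "Nicholson"]),
          has_phrase(page, "Scott Nicholson"))
```

Both pages contain the two words, but only the first contains the phrase, so the phrase search leaves out the page that is not about the person you want.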
AND - By putting the word AND between two terms, you are telling the database "I want to find pages that have both of these terms. The page is not useful to me if it's only got one or the other." In our example, we want pages with "buy" AND "PC". If it's just got "buy", it's not worthwhile. If it's just got "PC" but nothing about purchasing it, we don't want it.
OR - The term OR is used to indicate a broader search. In our example, we would be happy with pages that have "buy" OR "purchase" OR "catalog" OR "mail order". We also will be happy with "PC" OR "Computer" OR "Pentium".
If we are looking for a new computer, we could use:
buy OR purchase OR "mail order" OR catalog
AND
pc OR computer OR pentium
We used ORs between the synonyms for each facet, and AND between the different facets. Notice that "mail order" is in quotes - which means it is a phrase. Go over this again if you need to - it is an important concept, and is the main method for searching a database that allows Boolean searching.
Grouping by using parentheses can also be helpful. In the above example, we will want to use parentheses like this:
(buy OR purchase OR "mail order" OR catalog) AND (pc OR computer OR pentium)
Notice that each pair of parentheses groups one facet. Always use parentheses to help you and the computer communicate properly.
NOT - NOT can be used in special circumstances, such as when you have no choice but to use a word with more than one meaning. If you want to look up things that are Polish (i.e. from Poland), and you don't want anything on shoe polish, you can use (Polish NOT shoe). Notice that you always group the NOT in parentheses.
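If you keep your facets as lists of synonyms, the grouped search above can be put together mechanically. This is only a sketch: build_query is a hypothetical helper, and it assumes the engine accepts AND, OR, parentheses, and quoted phrases.

```python
def build_query(facets):
    """OR the synonyms inside each facet, then AND the facets together."""
    groups = []
    for synonyms in facets:
        # Multi-word synonyms are phrases, so they go in quotes.
        quoted = [f'"{s}"' if " " in s else s for s in synonyms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)

query = build_query([
    ["buy", "purchase", "mail order", "catalog"],
    ["pc", "computer", "pentium"],
])
print(query)
```

Each inner list is one facet, so adding a new facet to the search is just a matter of adding another list.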
Before continuing, make sure you understand when you use AND (for combination and narrowing; between different concepts) and when you use OR (to broaden a search; for synonyms). These are ESSENTIAL concepts to Boolean searching.
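If it helps, you can think of each term as standing for the set of pages that contain it. The sketch below uses a made-up index of page numbers; AND, OR, and NOT then behave like set intersection, union, and difference. Real databases are far larger, but the logic is the same idea.

```python
# Hypothetical index: each term maps to the set of page numbers that contain it.
index = {
    "buy": {1, 2, 5},
    "purchase": {3},
    "pc": {1, 3, 4},
    "polish": {6, 7, 8},
    "shoe": {7},
}

# AND narrows: only pages that have both terms.
both = index["buy"] & index["pc"]

# OR broadens: pages that have either term.
either = index["buy"] | index["purchase"]

# NOT removes: pages with the first term but without the second.
not_shoe = index["polish"] - index["shoe"]

# (buy OR purchase) AND pc
combined = (index["buy"] | index["purchase"]) & index["pc"]

print(sorted(both), sorted(either), sorted(not_shoe), sorted(combined))
```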
Many of the search engines allow "natural language" searching, which uses two symbols, + and -, for pseudo-Boolean operations.
If you include a + before a search term or a phrase, it means that the term MUST be in all sites that are returned. It's very similar to a Boolean AND. You might use this if you wanted to find pages about an individual, such as: +"Scott Nicholson" library science searching tutorial HTML. That would gather all of the pages with the phrase "Scott Nicholson" on them, and then move the pages with the other terms to the top of the list.
If you include a - before a search term or phrase, it means that the term must NOT appear in any of the sites that are returned. It's similar to the Boolean NOT.
The final thing you can do to combine your terms is to weight them. By "weighting", you're telling the search engine that some terms are more important to you than other terms. In our example, we might feel that "Pentium" is more important than the rest of the terms: the others still matter, but we really need to make sure the page has "Pentium". Different engines have different techniques for weighting.
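Here is a rough sketch of how an engine might treat + and - terms: drop pages that are missing a + term or that contain a - term, then rank the rest by how many of the plain terms they contain. The function name, the sample pages, and the simple substring matching are all assumptions for this example, not a description of any particular engine.

```python
def pseudo_boolean_search(query_terms, pages):
    """query_terms: non-empty strings, optionally prefixed with + or -.
    pages: dict of page name -> text. Returns page names, best match first."""
    required = [t[1:].lower() for t in query_terms if t.startswith("+")]
    excluded = [t[1:].lower() for t in query_terms if t.startswith("-")]
    optional = [t.lower() for t in query_terms if t[0] not in "+-"]

    ranked = []
    for name, text in pages.items():
        body = text.lower()
        if any(term not in body for term in required):
            continue  # a + term is missing
        if any(term in body for term in excluded):
            continue  # a - term is present
        score = sum(term in body for term in optional)
        ranked.append((score, name))

    # Highest score first; ties keep their original order.
    ranked.sort(key=lambda pair: -pair[0])
    return [name for _, name in ranked]

pages = {
    "a": "Scott Nicholson teaches library science",
    "b": "Scott Nicholson searching tutorial for library science students",
    "c": "A searching tutorial in HTML",
    "d": "Scott Nicholson sells shoe polish",
}
query = ["+scott nicholson", "library", "science", "searching",
         "tutorial", "html", "-polish"]
results = pseudo_boolean_search(query, pages)
print(results)
```

Page "c" is dropped because it lacks the required phrase, page "d" is dropped because of the excluded term, and page "b" moves above page "a" because it matches more of the remaining terms.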
If you have a higher number of synonyms for one search facet, that facet will carry more "weight" in the relevance ranking. Thus, if there is a facet of the search that you feel needs to be brought out, enter more synonyms for that term. This doesn't always hold true, but it's a good rule of thumb.
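To see why extra synonyms add weight, imagine an engine that ranks a page simply by counting how many of your search terms it contains. That ranking rule is an assumption made for this sketch (real engines differ), and the score function and sample pages are invented.

```python
import re

def score(text, facets):
    """Count how many search terms, across all facets, appear as words in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return sum(term.lower() in words
               for terms in facets.values()
               for term in terms)

# The "pc" facet has twice as many synonyms as the "buy" facet.
facets = {
    "buy": ["buy", "purchase"],
    "pc": ["pc", "computer", "pentium", "586"],
}

buy_page = "Buy or purchase by mail"
pc_page = "Pentium computer, a 586 PC"

print(score(buy_page, facets), score(pc_page, facets))
```

A page that covers only the "pc" facet can score up to 4, while a page that covers only the "buy" facet can score at most 2, so the facet with more synonyms pulls harder on the ranking.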
When you get your results, don't give up hope. I usually look at the first 50 listings - and if there's nothing at all about my topic, I'll try a different search or a different search engine. The more time you spend on term selection and combination, the less time you should have to spend on results analysis. When looking at your results, glance at the WWW addresses - pages with similar addresses are probably from the same site, and can be skipped over once you've seen the site. Pages that are titled "Re: XXXXXXXX" are probably from an e-mail archive and usually should be skipped over. The best type of page to look for is one that has other links on the same topic - that is the best way to explore. Another good term to look for is FAQ (frequently asked questions) for an introduction to a topic. If you don't find what you are looking for, then you must either change your search or move to a different database.
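If you save a list of result addresses, the "similar addresses" check can be done mechanically by grouping them by host name. This is a minimal sketch using Python's standard urllib.parse module; the addresses are placeholders and group_by_site is a hypothetical helper.

```python
from urllib.parse import urlparse

def group_by_site(urls):
    """Group result addresses by host name, keeping the original order."""
    sites = {}
    for url in urls:
        host = urlparse(url).netloc.lower()
        sites.setdefault(host, []).append(url)
    return sites

urls = [
    "http://www.example.com/pc/index.html",
    "http://www.example.com/pc/prices.html",
    "http://shop.example.org/catalog.html",
]
sites = group_by_site(urls)
for host, pages in sites.items():
    print(host, len(pages))
```

Once you have looked at one page from a host, the others in the same group are the ones you can probably skip.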
If you want to change the search, the best way to do so is to examine pages that are close to what you want. If you can find one page that's similar to what you are looking for, read over that page for phrases, synonyms, or search facets you could add to your search. Many times, other pages will jog your memory for another term to add.
A good rule of thumb is this - if the page selection you get is too broad, you need to add more terms with AND. If the selection you get is too narrow, you need to remove search facets or change some ANDs to ORs. If there are lots of articles on a facet that ignore the other facets, remove some synonyms from that facet. However, the fewer search terms you have, the more pages you will have to go through to find what you want.
Some search engines (such as Excite) have a button you can push to "find more pages like this one". By using this feature, you can narrow down to a group of relevant documents quickly. This is a good feature when you are having a lot of trouble coming up with synonyms - if you can find one page you like, you can easily find others.
Search Engines are the most common way to access the data in a Web Database. The basic steps to using one are:
All contents of this page are copyright 1996 by Scott Nicholson.