If you have not read the introductory page, please do so before proceeding.
In this document, you can learn what types of information are available on the Internet, how that information has been made searchable through web databases, the role that AskScott plays in helping you select the proper database, and how you can get better results from your searches.
The World Wide Web is the single most extensive information source today. It has millions of authors from almost every country. Over the past few years, the WWW has brought together easy access to web pages and other information sources like Gopher, FTP, Usenet news, and Telnet. The interesting thing about this is that, for the most part, the information is free once you have access to the Internet.
This data is placed on the web and funded for different reasons. There are five main types of web pages.
Not always. As you can see, there is no filter between the author and the published work. The editor, who in print provides a level of quality control, does not exist for most WWW pages. It is important to learn who created the information so that you can judge the reliability of the data. (Of course, printed information has the same problem. However, the lower cost of publishing on the web invites more problems in this area.)
While this is nothing to worry about, it is something to keep in mind.
The Internet was designed to survive a war by having no centralized computer. Therefore, there is no centralized database like the catalog in a library. The need for an organized tool to search this mass of changing information is met by several "web databases".
The majority of these databases are created and maintained by computer programs called "robots" or "spiders". These programs search the web for new pages and record information about each page. Some databases save every word on a page - these are known as "full-text" databases. Other web databases remove common words (those on a "stop list") in order to save disk space. Still others just examine the page for the "most important" words (known as keywords).
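The three indexing approaches described above can be compared in a small sketch. Everything here is illustrative: the stop list is a hypothetical, very short one, and counting word frequency is only an assumed stand-in for however a given spider decides which words are "most important".

```python
import re
from collections import Counter

# Hypothetical stop list; a real database would use a much longer one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def full_text_terms(text):
    """Full-text indexing: keep every word on the page."""
    return tokenize(text)


def stop_list_terms(text):
    """Stop-list indexing: drop common words to save space."""
    return [word for word in tokenize(text) if word not in STOP_WORDS]


def keyword_terms(text, limit=3):
    """Keyword indexing: keep only the most frequent non-stop words.

    Frequency is an assumption made for this sketch, not a rule that
    any particular web database is known to follow.
    """
    counts = Counter(stop_list_terms(text))
    return [word for word, _ in counts.most_common(limit)]


page = "The stew of the day: eggplant stew with eggplant and a salad"
print(full_text_terms(page))
print(stop_list_terms(page))
print(keyword_terms(page, limit=2))
```

The full-text index keeps the most terms and the keyword index the fewest, which is the trade-off between completeness and disk space mentioned above.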
Some of the databases are created by humans, however. While a spider may find new pages, a person actually examines each page and writes a review, assigns words from a list (known as a "controlled vocabulary") to the document, and might also select important keywords from the document.
Again, it varies from database to database. Some databases have a "search engine" attached. A search engine asks the user to type in one or more terms and then searches the database for words that match the user's request. Search engines offer various levels of sophistication, from those that just search for the words entered to those that will also look for synonyms of the words that were typed in. The common factor of all search engines is that the user types in the terms desired.
The other main access method is through a subject tree. If words from a controlled vocabulary (a list of predetermined terms) were assigned to a document (usually by a human), then the user can browse the tree to find works on a topic. If you haven't seen a subject tree, take a look at Yahoo now. (Use your browser's back button to return).
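To make the idea of a subject tree concrete, here is a minimal sketch of one as a nested dictionary. The headings, page names, and the `browse` helper are all hypothetical; they are not taken from Yahoo or any other real database.

```python
# Hypothetical subject tree. Each controlled-vocabulary heading holds
# narrower headings ("children") and the pages a person filed under it.
SUBJECT_TREE = {
    "History": {
        "children": {
            "American History": {
                "children": {},
                "pages": ["us-civil-war.html"],
            },
            "European History": {
                "children": {},
                "pages": ["french-revolution.html"],
            },
        },
        "pages": ["history-overview.html"],
    },
}


def browse(tree, path):
    """Follow a list of headings down the tree and return the pages filed there."""
    node = {"children": tree, "pages": []}
    for heading in path:
        # A KeyError here means the heading is not in the vocabulary.
        node = node["children"][heading]
    return node["pages"]


print(browse(SUBJECT_TREE, ["History", "American History"]))
```

Note that the user never types a search term here. The user only chooses among headings that someone has already defined, which is the main difference from a search engine.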
That's a complicated question. Different methods of database creation and database searching are good for different search topics. Studies in library and information science have shown that search engines and subject trees each have their benefits. One of the web databases claims that it is better because it records every word on the page. However, if you wanted to find all pages about AMERICAN HISTORY, the search would be much easier either in a database that picked out only the most important words or in a subject tree.
Thus, the best web database depends upon your search topic.
This is where AskScott comes into play. If you come with a search request, AskScott will lead you through a question and answer session to the database that is the best starting point for your search.
There are two main reasons:
AskScott helps you find the database you need, and gives you advice about its operation.
AskScott will be updated as databases grow and change. New databases will be examined to see where in the current hierarchy they fall, and old databases will be reclassified if need be. What you are using today may change tomorrow, and AskScott will help you use the best tool for your search need.
If you are having trouble with your search, there are two things you can do. You can move to a different database and try your search there, or you can examine your search strategy.
The strategy you use in a search engine is the key to finding the information you want. On each page where a search engine is recommended, hints have been provided to remind you of the aspects of that search engine that are different from other engines.
Almost every web database lists results in "relevance-ranked" order. This means that the program will list the web pages that best match your request near the top of the list. One consequence is that you should enter as many "search terms" as possible. The more search terms you can come up with, the better your search will be.
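A very rough sketch of relevance ranking follows. It assumes the simplest possible scoring rule, one point for each search term found on the page. Real web databases use their own, more elaborate formulas, so treat the `relevance` and `rank` functions and the sample pages as hypothetical.

```python
def relevance(page_text, search_terms):
    """Score a page by how many of the search terms appear on it."""
    words = set(page_text.lower().split())
    return sum(1 for term in search_terms if term.lower() in words)


def rank(pages, search_terms):
    """Return the names of matching pages, best match first."""
    scored = [(relevance(text, search_terms), name) for name, text in pages.items()]
    matching = [(score, name) for score, name in scored if score > 0]
    # Highest score first; ties are broken alphabetically by page name.
    return [name for score, name in sorted(matching, key=lambda pair: (-pair[0], pair[1]))]


pages = {
    "a.html": "eggplant stew recipe with garlic",
    "b.html": "beef stew recipe",
    "c.html": "eggplant gardening tips",
}

print(rank(pages, ["stew"]))
print(rank(pages, ["eggplant", "stew", "recipe"]))
```

With the single term "stew", pages a.html and b.html tie and the engine has no way to tell which one you want. With three terms, a.html scores highest and rises to the top, which is why entering more terms helps.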
Each search engine is different, and the advice on the individual page will help you, but an important operator to know is AND (or the + sign, depending on the database). If I enter eggplant AND stew, the program will look for pages that contain both eggplant and stew anywhere on the page. However, if I enter a phrase, such as "eggplant stew", then the program will look for pages that have those two words in a row (again, check the individual engine to see how to enter a phrase). So, if I wanted to find recipes for stew that uses eggplant as an ingredient, I would enter "eggplant AND stew". If I wanted recipes for eggplant stew, I'd enter the phrase "eggplant stew". Other ways of creating phrases may be with the operators NEAR, ADJ (for adjacent), or W (for with).
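The difference between an AND search and a phrase search can be sketched as follows. The function names and sample pages are made up for illustration, and the matching is deliberately naive (whole words, split on spaces).

```python
def words(text):
    """Lowercase the text and split it into a list of words."""
    return text.lower().split()


def matches_and(text, terms):
    """AND search: every term must appear somewhere on the page."""
    page = set(words(text))
    return all(term.lower() in page for term in terms)


def matches_phrase(text, phrase):
    """Phrase search: the words must appear in a row, in the same order."""
    page = words(text)
    target = words(phrase)
    size = len(target)
    return any(page[i:i + size] == target for i in range(len(page) - size + 1))


page_one = "beef stew with roasted eggplant"
page_two = "classic eggplant stew"

print(matches_and(page_one, ["eggplant", "stew"]))   # both words are present
print(matches_phrase(page_one, "eggplant stew"))     # but not side by side
print(matches_phrase(page_two, "eggplant stew"))     # here they are adjacent
```

The AND search accepts both pages, while the phrase search accepts only the second one.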
Another useful strategy involves the NOT operator (written as - in some engines). Since the word "polish" could be a thing you do to shoes or a thing from Poland, NOT helps when a search term has more than one meaning. Let's say I want to learn about things from Poland - I would enter "polish NOT shoe".
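Here is a minimal sketch of what a NOT search does, using a hypothetical `matches_not` function and made-up page text. It only compares whole words, which is an assumption of this sketch rather than the behavior of any particular engine.

```python
def matches_not(text, wanted, unwanted):
    """NOT search: the wanted word must appear and the unwanted word must not."""
    page = set(text.lower().split())
    return wanted.lower() in page and unwanted.lower() not in page


poland_page = "polish folk music festival"
shoe_page = "how to polish a leather shoe"

print(matches_not(poland_page, "polish", "shoe"))  # kept
print(matches_not(shoe_page, "polish", "shoe"))    # filtered out
```

Because this sketch matches whole words only, a page that says "shoes" instead of "shoe" would slip through. Excluding more than one term, or a truncated term where the engine allows it, covers such cases.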
A good thing to know when searching is truncation. Truncation symbols allow you to find similar forms of the same word. For example, suppose you were looking for information on discipline. If you searched for "discipline", you'd only find works with that exact term. However, if you searched for "disciplin$" (assuming that $ is the truncation symbol in this database), you'd find discipline, disciplinary, disciplinarian, and any other words that start with "disciplin".
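Truncation amounts to a "starts with" test, as in this small sketch. The `matches_truncation` function is hypothetical, and the `$` symbol is only the example symbol used above; each database picks its own.

```python
def matches_truncation(term, word, symbol="$"):
    """Return True if the word matches the term, honoring a trailing truncation symbol."""
    if term.endswith(symbol):
        stem = term[:-len(symbol)]
        return word.lower().startswith(stem.lower())
    # Without the symbol, only the exact word matches.
    return word.lower() == term.lower()


candidates = ["discipline", "disciplinary", "disciplinarian", "disciple"]

print([w for w in candidates if matches_truncation("discipline", w)])
print([w for w in candidates if matches_truncation("disciplin$", w)])
```

The exact term finds only "discipline", while the truncated term also finds "disciplinary" and "disciplinarian". "disciple" is not found either way because it does not start with "disciplin".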
A similar tool to truncation is the wildcard. Wildcards help you identify other forms of a word in a different way. If I were looking for works on females, I might search on the term "wom*n" (assuming that * is the wildcard symbol). That would find the words woman, women, womyn, and any other words that start with "wom" and end with "n".
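A wildcard search can be sketched by turning the search term into a regular expression. This sketch assumes that the wildcard stands for any run of characters, as the description above suggests. Some engines instead let it stand for exactly one character, so check the individual engine. The `matches_wildcard` function is hypothetical.

```python
import re


def matches_wildcard(term, word, symbol="*"):
    """Return True if the word matches the term, where the symbol stands for any characters."""
    # Escape the literal pieces, then join them with "match anything".
    pieces = [re.escape(piece) for piece in term.lower().split(symbol)]
    pattern = ".*".join(pieces)
    return re.fullmatch(pattern, word.lower()) is not None


candidates = ["woman", "women", "womyn", "wombat", "man"]

print([w for w in candidates if matches_wildcard("wom*n", w)])
```

Unlike truncation, the wildcard sits in the middle of the term, so both the beginning and the end of the word must match.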
This is a case of either not entering enough search terms or entering terms with multiple meanings. Either come up with synonyms for the terms you are searching or use the NOT command to make your search more precise. In general, the more search terms you can come up with, the faster you'll find relevant pages.
If this happens, you need to limit your search. Think about what you are REALLY looking for, and enter terms having to do with more unusual aspects. You can also try using a "phrase" instead of an AND for your search (see the above section for details).
This next section discusses the creation of AskScott. If you are ready to search, you can go to the AskScott home page now.
While I could try to come up with some goofy acronym (like Search-engine, Client-Oriented Tool Thingy), I won't. :) It's called AskScott because it simulates a Q&A session with a person - and as it's a little bit of my thoughts...not to mention it satisfies my ego. ;) So, if you want to know a little bit of my thoughts about the search engines, just Ask Scott!
Dr. Scott Nicholson, Assistant Professor and Bibliominer (bibliomining is data mining for libraries) at your service. I'm working with a team of Syracuse University students and alumni to keep AskScott up to date. It's hosted at the Information Institute of Syracuse.
Most of it comes from the Help and FAQ screens of the web databases. Some comes from literature, some from original research, and some just comes from personal experience with the web databases. AskScott is a collection point for this scattered information.