This guide was written by Charles Harris (with lots of help)


Contents

  1. WHAT IS THIS FAQ?
  2. DISCLAIMER
  3. WHY USE THE INTERNET AT ALL?
  4. HOW CAN I FIND…?
    1. How Can I Find Specific files, texts, multimedia or people?
    2. How Can I Find Specific information?
    3. How Can I Find More General Background Information?
  5. HOW CAN I FIND THINGS FASTER?
  6. SHOULD I PAY FOR INFORMATION?
  7. WHERE CAN I GET FURTHER HELP?
  8. HOW CAN I VALIDATE WHAT I FIND?
    1. How reliable is the Net?
    2. What can I do about it?
  9. WHAT ABOUT THE FUTURE?
  10. URLS FOR A RAINY DAY – Loads of useful links for research of all kinds
  11. END CREDITS

Awards for the Internet Search FAQ

The Control Voice “You’re Neat Award”
Britannica Internet Guide Award


1. WHAT IS THIS FAQ?

Although this website was compiled
originally for writers, it has become increasingly clear
that this FAQ (Frequently Asked Questions) list is of use
to anyone who wants to find their way around the Net.

It grew out of a cry for help that I sent out, in
desperation. As a professional writer, I wanted
information of a variety of types. One day I might want
specific dates, another day just background information. I
wanted to know if I could use the Internet to find these
different types of information quickly and reliably. And I
wanted to know which of the many different bits of the
Internet would be good for which different type of search.

However, the vast majority of books, articles and Usenet
postings do not address the question from the point of
view of the user, and tend to be obsessed with either
vague surfing or searching out free software. The last
thing I wanted was yet more software.

I was pleased to receive a number of responses to that
original cry for assistance – useful and supportive
answers, which gradually became the foundation of this
FAQ.

I discovered that the Internet has many useful parts that
the average Google user never finds. For example, have you
ever tried the discussion groups that are technically
called Usenet? There are over
20,000 on every subject you can imagine, filled with
useful information and people to contact for research.
They can be found in Google under More/Even more…/Groups.

The FAQ tries to look at the Net from the point of view
of the user. So it is divided into the kinds of questions
that Net searchers might have. It also includes “worked
examples” where possible, to clarify the methods that can
be used.

There is also a list of useful links (or URLs), URLs For A Rainy Day, which includes
many of those mentioned in the main text and a load more.

I haven’t tried to explain what all the technical terms
mean (eg: URL, ARCHIE, FTP…). These are very adequately
explained in a thousand postings, books and magazines. The
problem is knowing which to use in which circumstances.

All suggestions and comments are welcome. If sending information on an
existing link, please include the link’s title in the FAQ as
well as the address and any other information that you feel
will be useful. If suggesting a new link, please note that I generally
only include sites that have a wide range, and usually only
those which compile links to other sites or will in some
other way help Internet searchers.

Please be aware that the examples of searches on this page are included just to show the method; the links have not been kept up-to-date. You should find up-to-date links (where available) on our links page, URLS for a Rainy Day.


2. DISCLAIMER

URLS, e-mail addresses, etc, are generally included on
the recommendation of satisfied users. They are passed on
herewith without prejudice! I try to check them out to
ensure that they are relevant and useful but make no
guarantees that they are still there, or in fact ever
were. The contributors and I take no responsibility for
any loss, damage or waste of time in using them or indeed
for anything else. Sorry.


3. WHY USE THE INTERNET AT ALL?

3.1 If you want to use the Net effectively, you need to
be prepared for what it can and can’t do.

The Internet is not a substitute for a good library. The
Internet can be very frustrating. The Internet is very
variable. The Internet is not well indexed. And the
Internet is not comprehensive. So is it worth using at
all? Well…

3.2 The Internet is an additional source of information,
which often can’t be found, or isn’t as up-to-date,
elsewhere.

“Searching for data on Internet can be frustrating but
what you find often can’t be found in a library — the
same is true in reverse. I didn’t stop using the library
when I started using the Internet.” (writer Laurence A.
Moore)

3.3 The Internet is convenient, and supplies information
in usable form.

“One handy thing about Internet research is that when I’m
done, the results are on my computer. With the library,
the best I can do is photocopy what I find, or bring the
books home and type the data in.

“Looking out the window above my computer, I see birds
and autumn-coloured trees and calm, quiet, gently-falling
rain. As soon as I send this, I’m going to bring a mug of
fresh coffee back from the kitchen and take off on
Internet. Can’t do that at my local library!” (Laurence A.
Moore)

3.4 However, the Internet has to be worked at. The
“superhighway” is still substantially under construction.
As one writer put it: “the Internet is an enormous library
in which someone has turned out the lights and tipped the
index cards all over the floor.” (Or, variously, “Like
trying to work off the librarian’s notes after discarding
the card catalogue,” Allen Schaaf.)

3.5 Be realistic and focused about what
you want to find. Do you want a precise fact, or more
general background material? How will you know when
you’ve found enough information – or when to stop
trying? Faced with the enormous size of the Net, it’s
tempting to believe that the ideal link is just around
the next corner, but some types of information simply
aren’t there, while other information may exist on the
Net, but be extremely difficult to locate. Sometimes, to
be honest, there are easier ways: a phone call, the
local bookshop, a friend of a friend.

Nevertheless, the more you learn about the
Internet, the more you become aware of what it can and
can’t do. The most difficult way to approach the
Internet is when you already have a large and urgent
piece of research to conduct. Better to check out small
areas of it without stress, for a few minutes at a time,
on a regular basis. Give yourself a chance to play about
with the Net when the pressure is off, so that when the
pressure is on you can find what you need quickly and
efficiently.


4. HOW CAN I FIND…?

What’s the best and most efficient way to look for what I
need? (Here we look at some ways of finding the different
kinds of information that’s on the Net.)

4.1 How can I find Specific Files, Texts, Media (images,
sounds, etc) or People?

4.1.1 How can I find a specific file by name?

The more precise you can be with your search, the better.
So if you have a precise filename, you’ve got the best
chance of finding what you want.

Many search engines and meta-search engines now have
facilities for searching for software files, etc. Try Google, for example, or
many of the others listed in URLs For A Rainy Day.

There are many books, articles, etc, on the Internet
which show how to search for specific filenames, using
Archie, etc, so this is not dealt with further in the FAQ.
However, researchers rarely have a precise, or even
imprecise, filename. So….

4.1.2 How can I find a specific text?

There are an increasing number of web and FTP sites which
hold public domain copies of a wide range of classic
texts, song lyrics, etc. Some links are given in URLs For A Rainy Day. You can
also link to some of these via: http://dspace.dial.pipex.com/jane.dorner/jd_links.htm.

There are history archives on the Internet and a number
of libraries on the Net. For example, David Brager
suggests the Library of Congress’ American Memory section
http://rs6.loc.gov/ – “Large collections of primary source and archival material
relating to American culture and history.”

In addition, increasing numbers of search engines will
allow you to search across a number of search engines for
specific items such as lyrics. One such is OnlineSpy. They are
called “metasearch” engines.

4.1.3 How can I find a specific image, sound or
movie clip?

Many “metasearch” engines, such as OnlineSpy (see above) will allow you to
search for images specifically – or even sounds or movie
clips. You may however need to be very precise with the
terms you search with (see next section for how to use
search engines with precision).

One particularly useful site is Image Surfer, recently
developed by Yahoo. This is a search engine which you can
search by category or using search terms, but instead of
giving its answers in text form it produces a series of
small thumbnail images. Much the most useful image
searcher I’ve yet seen, Image Surfer’s capacity is still
small, but Yahoo promise it will grow in size. Well worth
checking out.

ImageFinder gives you a number of different databases to search for a
variety of types of image – eg: the Smithsonian
Photographic Collection or Columbia University Image and
Video Catalog.

Useful for both pictures and sound is the
search engine HotBot
http://www.hotbot.com which provides tick boxes to allow your search
to include still images, video or audio sound clips, or
even shockwave animations. Said to be one of the best
MP3 search engines at the moment.

4.1.4 How can I find specific people?

There are many resources on the Net that can help you
locate and even make contact with specific people – famous
or not, individuals or companies. Whether they’ll be of
any use to you will depend on a number of factors, not
least geographical.

As with so much on the Internet, the vast majority of
resources are devoted to the USA. So there’s little
difficulty in finding directories and databases with
look-up or even reverse look-up facilities covering just
about every member of the US population, alive or dead.

(Particularly intriguing, in passing, is Ancestry.com which
among its useful resources for genealogical research
allows you to find the social security number and other
details of any dead American…. and then offers a
facility to write a letter! Do they know of some postal
service that we don’t?)

More wide-ranging are the directories of email addresses.
However these are far from all-inclusive, even assuming
your target has an email address. Some Internet Service
Providers, such as CompuServe and AOL, used to provide a
look-up service which included all subscribers (and
probably still do), but only for other subscribers, as I
understand it.

For the rest, directories such as BigFoot rely on
finding email addresses of those who have web-pages or
post regularly to newsgroups. By no means does this
include everybody. Expect to have to try a number of sites
before you find a lead.

In Urls For A Rainy Day there are numerous search
facilities, including a number of meta-search
engines, people searchers and reference sites which
offer specific people-finding databases. Particularly
useful are those such as Langenberg which
have links to many different “people” sites on one page.

There are also databases devoted to certain types, eg: politicians.

Organisations are generally easier to find through a
search engine. But even then it is not always easy –
especially if the organisation doesn’t have a web page of
its own. However, David Brager tells of one very useful
site. If you know an organisation or individual’s domain
name (ie: the bit of the web address before .com, .co.it
or whatever) you can use it to find all kinds of details,
from contact e-mail and snail-mail addresses to phone
numbers at http://www.vservers.com/before/dnscheck/.

Whether looking for people or organisations, in difficult
cases you may need to try the more refined methods for
finding information by using Search Engines, or posting
questions on Newsgroups or Mailing Lists (as described in
the next sections).

4.2 How can I find Specific Information?

(eg: dates and places. Or questions like: “what is a…?”
“who is…?”)

4.2.1 Search engines
are popular for this. You type in a key word or phrase
(such as Spain, or Spanish Civil War) and wait to see what
they provide.

The popularity of search engines on the Net can be
changeable. When I started this FAQ there was no clear
winner. Then Alta Vista appeared, and for some time beat all the
others hands down. Since then Google has taken over
at the top. Google has many strong points, including
simplicity, a lack of adverts and the ability to check its
own “cache” of pages if the page you’re looking for has
temporarily disappeared. But no search engine is perfect
and different people have their different favourites. You
can find many other good search engines, each with its own
particular strengths on our Links Page.

No search engine covers 100% of the Net.
Indeed the latest I hear is that the very best cover no
more than 16% of existing web-pages and are struggling to
keep up.

The trick with using a search engine is to know what
each is good for and to look carefully at the hints and
tips that they offer. For example some engines will only
search for a precise phrase if you put it in quotes – such
as: “Spanish Civil War.”

Planning is necessary for any search. Do
some advance work with a Thesaurus and list a fair number
of relevant search terms. Remember that search engines
aren’t like “Find” facilities on word processors. So you
can afford a scattergun approach, trying a number of
related words at the same time in case one of them hits
home. For example: in starting a search for items on
dealing with tiredness you might type the following
related terms into the search box: fatigue overwork
tired exhausted exhaustion sleep.

Most search engines treat key words as potential parts of
words. So in the above example fatigue will also
find documents containing the word fatigued, and
sleep will find sleepy, sleepless
and sleeping pill. But while exhaust
might have found exhausted and exhaustion
it has been avoided so as not to pull out articles on car
engines and pollution!
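
As a rough illustration of the partial-word matching described above, the Python sketch below treats each key word as the possible start of a longer word. The sample documents and the function name `matches_any` are made up for this example, and real search engines differ in how (and whether) they match partial words.

```python
import re

def matches_any(keywords, text):
    """Return the keywords that begin at least one word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return [k for k in keywords if any(w.startswith(k.lower()) for w in words)]

# Two invented documents: one relevant, one about cars.
docs = {
    "health": "Feeling fatigued? Sleepless nights lead to exhaustion.",
    "garage": "Replacing a car exhaust to cut pollution.",
}

terms = ["fatigue", "overwork", "tired", "exhausted", "exhaustion", "sleep"]
for name, text in docs.items():
    print(name, matches_any(terms, text))

# The shorter stem "exhaust" would also pull in the garage page.
print(matches_any(["exhaust"], docs["garage"]))
```

Under these assumptions the chosen terms hit only the health page, while the bare stem "exhaust" also catches the page about cars, which is why it was left out of the search.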

If you find you’ve got too many articles, you can often
make the search more specific by adding words you want to
see (eg: overwork) or conversely specifying
terms that you don’t want to see (eg: Seattle).

Often you do this by using symbols (+ and -) or logical
terms (AND and NOT). Check the rules
for the search engine you’re using. Some require
wild-cards such as * to stand for missing letters. And
many of them allow increasingly sophisticated ways of
refining your search, by suggesting useful key words or
popular web pages.
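
To make the idea of + and - terms and wild-cards more concrete, here is a small Python sketch of how such a filter might behave. It is only an illustration: the function names are invented, whole-word matching is an assumption, and each real search engine has its own rules, so check its help page.

```python
import fnmatch
import re

def parse_query(query):
    """Split a query into required (+), excluded (-) and ordinary terms."""
    required, excluded, optional = [], [], []
    for term in query.lower().split():
        if term.startswith("+") and len(term) > 1:
            required.append(term[1:])
        elif term.startswith("-") and len(term) > 1:
            excluded.append(term[1:])
        else:
            optional.append(term)
    return required, excluded, optional

def term_in(term, words):
    """True if any whole word matches the term; * stands for missing letters."""
    return any(fnmatch.fnmatchcase(w, term) for w in words)

def accepts(query, text):
    """Keep a page only if every + term is present and no - term is."""
    words = re.findall(r"[a-z]+", text.lower())
    required, excluded, optional = parse_query(query)
    if any(not term_in(t, words) for t in required):
        return False
    if any(term_in(t, words) for t in excluded):
        return False
    # With no + terms, at least one ordinary term must match.
    return bool(required) or any(term_in(t, words) for t in optional)

print(accepts("sleep* +overwork -seattle", "Sleepless through overwork"))
print(accepts("sleep* +overwork -seattle", "Sleepless in Seattle from overwork"))
```

In this sketch "sleep*" matches "sleeping" but plain "sleep" does not, which mirrors the engines that insist on a wild-card for missing letters.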

Note: you may not get access to the hint pages
unless you’re accessing directly via the search engine’s
home page.

In addition to using the search engines’ own help pages,
you can find a brisk and useful guide to the top search
engines and how they work from the Web Search Cheat Sheet
www.colosys.net/search/.

Click here to read an actual example of a search using different search engines.

See also Developments on the Net

As I said, each search site has its particular
strengths.

Nick Tompkins writes to tell us about Google. “I am currently
training employees at a major TV company, so I have been
showing them a few search engines – running searches for
the same subject matter to see how much relevant material
is found. Google came out tops and a producer found his
own name on a site selling videos of a film he made a few
years ago. He went straight to the legal dept to see if
they owed him any money…..”

Look in particular for the “search within results” link
at the foot of Google’s results page, which gives you a
chance to narrow down your search, if your first search
pulled out too many sites, or too many that weren’t
relevant.

Pat Marcello adds, “I also like the fact that Google
archives web pages, so if a page is offline, I can still
get to the information.”

Mike Casswell: “The best feature of all, in Alta Vista, is the
Advanced Query Page, which is a different page (linked
from the Simple Query page). This has a number of clever
search tools. I often use ‘near’ which is both simple and
powerful. There is also a help page for the Advanced Query
syntax.”

However, TJ had mixed feelings about search engines: “I
find that using a keyword or Yahoo gets me much more than
I wanted. For some reason, I feel as if all I have to do
is type in a subject and I’ll find everything referenced
on that one subject. Doesn’t happen that way, does it?”

If you want to find a lot of search facilities in one
place, Ellie Kuykendall says, “I very much like http://www.beaucoup.com.
There are over 600 search engines on one page…everything
from government to e-mail addresses. Takes a bit of time
to load, but I use it all the time.”

4.2.2 Meta- and Multi- Search Engines

“Meta” search engines use various techniques to search
across a number of engines at once. They can often be
customised for different types of search allowing you to
select which search engines you want to use, and many
offer a number of specialist categories, including many
databases that are not covered by normal search engines.
One excellent meta-search engine is Profusion http://www.profusion.com/.
Profusion combines search results in a single list
(avoiding duplication). It’s fast and easy to use, and can
also check that the links are still live.
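
How a meta-search engine combines its results is not described here, so the following Python sketch is only one plausible approach: it merges ranked lists from several engines, drops duplicates, and favours pages that rank well in more than one list. The scoring rule (adding 1/rank), the function name and the example addresses are all assumptions made for illustration, not Profusion's actual method.

```python
def merge_results(result_lists):
    """Combine ranked URL lists into one list without duplicates."""
    scores = {}
    for results in result_lists:
        for rank, url in enumerate(results):
            # Crude normalisation: treat "page" and "page/" as the same address.
            key = url.rstrip("/")
            entry = scores.setdefault(key, {"url": url, "score": 0.0})
            entry["score"] += 1.0 / (rank + 1)
    # Highest combined score first; ties keep the order in which pages were seen.
    ranked = sorted(scores.values(), key=lambda e: -e["score"])
    return [e["url"] for e in ranked]

engine_a = ["http://example.org/spain", "http://example.org/war", "http://example.org/history"]
engine_b = ["http://example.org/war/", "http://example.org/franco"]
print(merge_results([engine_a, engine_b]))
```

The page found by both engines rises to the top of the combined list, and it appears only once.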

Alvaro Ramirez recommends the Meta Crawler: “It may not
be very accurate, but fast… it sure is!” http://www.metacrawler.com/

Often confused with meta-search engines are “multi-”
search engines. I bet you’re confused already. Essentially
they do the same job, sending your search terms to a
number of search engines at once, but they don’t try to
combine them. Instead they display the results from each
search engine in separate windows. Two excellent
multi-search engines are Search Spaniel http://www.searchspaniel.com/
and theinfo.com http://www.theinfo.com.

When you enter your search terms, they open a new window
for each of the search sites individually. Beware, this
can be a bit overwhelming if you’ve selected all possible
sites!

They also make no attempt to combine search results, so
you have to be prepared for a fair bit of sifting, but
that can be an advantage in some cases, as different
search engines rate sites in very different ways. So this
approach is useful for those more difficult searches,
where your search terms may be less easy to narrow down.

For example, I tried to use theinfo.com to find a site
relating to the feature film “Go”. Now being both a verb
and a game the word “Go” is likely to appear on a million
pages, even capitalised, so I wasn’t surprised to find
zilch on the first attempt. I closed the myriad new search
windows that theinfo.com had opened (and selected rather
fewer sites!) and tried “Go AND cinema”. Some metasearch
engines have difficulties with search terms that use
expressions such as AND, OR, +, – etc. In this case, some
of the sites used by theinfo.com still came up with
nothing useful, but others put the film “Go” at the very
top.
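
The multi-search idea can be sketched in a few lines of Python: the same search terms are turned into one query address per selected engine, and each address is then opened separately. The engine names, addresses and parameter names below are hypothetical placeholders, not the ones theinfo.com or Search Spaniel really use.

```python
from urllib.parse import urlencode

# Hypothetical engines and query parameter names, for illustration only.
ENGINES = {
    "engine-one": ("https://search.example.com/find", "q"),
    "engine-two": ("https://search.example.net/query", "terms"),
}

def build_search_urls(query, selected=None):
    """Return one search address per selected engine for the same query."""
    names = selected if selected is not None else list(ENGINES)
    urls = {}
    for name in names:
        base, param = ENGINES[name]
        urls[name] = base + "?" + urlencode({param: query})
    return urls

for name, url in build_search_urls("Go AND cinema").items():
    print(name, url)
```

Passing each address to the standard library's `webbrowser.open` would open one window or tab per engine, which is roughly the experience described above. Selecting fewer engines keeps the number of windows manageable.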

Note: If I’d looked carefully enough I’d also have
noticed they have a set of specialist search sites,
including a category for “Movies”. There are ten
specialist categories – good, but not as wide-ranging as
Search Spaniel which offers 23 as well as a “personalised”
list (not that I’ve ever managed to get that feature to
work!).

Theinfo.com doesn’t pretend to be the only search engine
you’ll ever use. Like any search source, you need to have
the right kind of query. There is even a page on the site
that explains when it’s useful to use theinfo.com as
opposed to other sites, an excellent feature I wish more
search engines would adopt: http://www.theinfo.com/about/whyuse.html

Most of us tend to be lazy and stick with just a couple
of search engines that we are used to. One bonus that
comes with using a multi-search engine is that you get a
chance to see new engines you may never have seen before,
and to catch up on the latest advances of those you may
not have touched for months (or even years!). Thus it can
be a quick way to test the strengths and weaknesses of
different search engines against each other.

Thus both Multi- and Meta- search engines can be
extremely useful, but be aware that they don’t always
provide all the facilities of the original search engines.
For example, searching recently on the title of my own new
project – “Paradise Grove” – I found our own film website
on Alta Vista but not on Profusion (which uses Alta Vista)
because Profusion treated the capital letters differently!
However, these are minor drawbacks. On the whole, the good
meta/multi-search engines are well worth using.

For more metasearch engines click here.

But the Internet offers far more resources than just
search engines…

4.2.3 Usenet newsgroups
are one of the Internet’s best kept secrets. The majority
of Internet users have never heard of them but they can be
extremely useful for asking specific questions,
researching general topics, discussing issues and keeping
in touch with a community of interest of any kind.

The World Wide Web is by no means all there is to the
Internet. Indeed there are other areas which can be just
as useful for finding information, and perhaps the most
important of these is Usenet. There are currently 24,000+
newsgroups, covering just about every subject you can
imagine, and a few you probably can’t! Explore them here

Beth Porter says: “Post messages in classy newsgroups and
fora. Can be dodgy, but it’s paid off for me quite a few
times.”

And Yvonne Hewett: “I use the Net for research by the
simplest method possible: going into the list of
Newsgroups and searching it for the topic I’m interested
in, and then posting to the group.

“I’ve found that the Net is like most places where there
are people with expertise: if I approach them properly and
ask intelligent questions, the answers are usually
forthcoming. If answers aren’t, I often get pointers to
people who are in the know. And like any other research,
it takes time and patience to work through the masses of
non-indexed information.”

This can lead to some very precise areas of research.
Usenet groups are grouped under categories or regions,
with subcategories and often sub-sub-categories – such as
Arts & Entertainment, Society or Alternative (Other).
So for example, soc.history.medieval
gives you ongoing discussions on the Middle Ages. Alt
tends to be more radical and critical than the other
categories – thus under Other, you’ll find alt.atheism.
Most groups also have their own FAQ.
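
Because newsgroup names are built from a category followed by subcategories separated by dots, they are easy to pick apart. This small Python sketch, with invented function names, shows the structure using the groups mentioned above.

```python
def split_newsgroup(name):
    """Break a newsgroup name into its top-level category and subcategories."""
    parts = name.lower().split(".")
    return {"category": parts[0], "subcategories": parts[1:]}

def group_by_category(names):
    """Arrange newsgroup names under their top-level category."""
    tree = {}
    for name in sorted(names):
        tree.setdefault(split_newsgroup(name)["category"], []).append(name)
    return tree

print(split_newsgroup("soc.history.medieval"))
print(group_by_category(["soc.history.medieval", "alt.atheism"]))
```

Grouping a long list of names this way gives a quick overview of which categories are worth exploring.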

You can search one or more newsgroups using either AltaVista (selecting
“search Usenet”) or Google (click the criss-cross square icon top right and select
“Groups” – not “News”). You’ll also find that Google
Groups includes many groups set up solely under Google.

If you don’t know which newsgroup(s) will be best for
your subject, then try putting an appropriate query into
Google Groups or AltaVista and noting what newsgroups come
up.

Once you’ve located useful groups, you can make things
easier by subscribing to an appropriate newsgroup using
specialised software such as Free Agent or the “News” facility built into many email
clients, such as Thunderbird, Opera or Windows Mail.
Subscribing in this way is best if you want to keep up
with ongoing discussions and post your own questions.

Click here if you want to read an instructive example of
searching Usenet for researching a specific question.

In addition, almost all newsgroups have a FAQ (Frequently
Asked Questions list) which can be a mine of useful
information, or at least tell you if you’re in the right
place. FAQs can be searched for at Infinite Ink
or downloaded from ftp sites ftp://ftp.uu.net/usenet/news.answers
and ftp://rtfm.mit.edu.

IMPORTANT NOTE: some newsgroups don’t take well to being
“used” by strangers. If you’re thinking of posting a
question, lurk for a little while first, to check out the
prevailing mood and make sure your question hasn’t already
been answered in the group’s own FAQ. Avoid assuming that
other users are only there to provide you with free
answers to your questions. And do make sure that your
subject line is a useful guide to what you’re asking:
“Information wanted” is not so effective as “Who won the
Battle of Bosworth Field?”

4.2.4 Mailing lists can be helpful in similar ways.

For those who haven’t met mailing lists yet, they are the
equivalent of newsgroups, but you receive all the postings
(or a digest of them) by email. There are even more
mailing lists than Usenet groups, and some are very highly
specialised indeed.

Excellent places to start searching for appropriate
mailing lists are Topica,
which also allows you to read the lists, messages and
discussions on-line, Liszt
and Windweaver Web Resources. For more resources check on our links page.

Alternatively, you can obtain a list of mailing lists by
sending an email with the single word HELP in the body of
the email, to mail-server@sri.com
or mail-server@rtfm.mit.edu.
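
For anyone who prefers to script such a request, the message itself is trivial to build. The Python sketch below only constructs the email and does not send it; the sender address is a placeholder, and whether these mail servers still respond is not something this example can promise.

```python
from email.message import EmailMessage

def build_help_request(sender, server_address):
    """Build an email whose body is the single word HELP."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = server_address
    msg.set_content("HELP")
    return msg

# "you@example.com" is a placeholder for your own address.
request = build_help_request("you@example.com", "mail-server@rtfm.mit.edu")
print(request)
```

Actually sending it would need your own outgoing mail server, for example through the standard library's `smtplib`.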

4.3 How Can I Find More General Background Information?

As the searches become wider and less specific, the
Internet becomes more tricky to use.

4.3.1 Newsgroups and mailing lists
remain useful nonetheless – as above – for posting
specific queries, obtaining FAQs, or just lurking and
seeing what ideas crop up.

Jane Dorner: “Quite honestly the best thing is to join a
newsgroup dedicated to the research subject in hand and
trawl that until you find what you’re looking for.”

4.3.2 Gopher, etc.

Few writers mentioned Archie, Veronica, Gopher, WAIS or any
other use of FTP or Gopherspace. However, before I go much
further I should say that this could be for a good reason.
As Steve Hunt writes, “For all practical purposes they are
dead.

“I think all the Veronica and Archie servers are down, and
the only gopher I know of that is still running is the gopher at University of
Minnesota, probably because they created the gopher
protocol.”

I suspect that this too has now gone the way of all other
gophers, but if you do find one that works, or are just
interested in ancient history, this is how it used to work
in the early days:

Marnie Froberg researched police corruption using Archie
(for ftp file site searching) and Veronica (for searching
worldwide gophers) and WAIS based search engines
(TradeWave Galaxy and Harvest both of which “run WAIS in
the background”).

“AmyWriter” however found great success with Gopher:
“Through Gopher, I’ve downloaded some great files that go
beyond what the encyclopaedia has, e.g. for Haiti, I got
info on all the political stuff that is happening NOW from
news articles, white papers, etc. This is info that would
be dated in the encyclopaedia.

“Basically, I go into Gopher and type in, “Jamaica,” for
example. This brings up a list of reports on many topics
of interest which I scan and select and then print out.
For example, there might be a college professor’s report
on current Jamaican politics.”

Gopherspace could give very quick and informative answers
to queries, but gradually disappeared as more
organisations moved over to the Web. However, gophers
covered a large number of databases that were not on the
Web and which contained a wealth of information and texts.
You access gopher space using dedicated gopher software or
from a Web browser by typing the gopher’s address (it
starts with gopher:// instead of http://) usually followed
by a port number (typically 70) as in gopher://gopher.ic.ac.uk/70.
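
For the curious, a gopher address can be taken apart with Python's standard URL tools. This is a sketch under stated assumptions: it expects the port, when given, to follow the host after a colon as in other URL schemes, it treats the first character of the path as the item type, and the host name used is a made-up example. A gopher client then sends the selector followed by a line ending to the server.

```python
from urllib.parse import urlsplit

DEFAULT_GOPHER_PORT = 70

def parse_gopher_url(url):
    """Split a gopher address into host, port, item type and selector."""
    parts = urlsplit(url)
    if parts.scheme != "gopher":
        raise ValueError("not a gopher URL: " + url)
    path = parts.path
    # First character after the slash is the item type; "1" (a menu) if absent.
    item_type = path[1] if len(path) > 1 else "1"
    selector = path[2:] if len(path) > 2 else ""
    return {
        "host": parts.hostname,
        "port": parts.port or DEFAULT_GOPHER_PORT,
        "item_type": item_type,
        "selector": selector,
    }

def request_bytes(selector):
    """The request a client sends: the selector followed by CR LF."""
    return selector.encode("ascii") + b"\r\n"

# gopher.example.org is a hypothetical host used only for illustration.
info = parse_gopher_url("gopher://gopher.example.org:70/1/docs")
print(info, request_bytes(info["selector"]))
```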

Gopher search engines are called Veronica
or Jughead. Veronica was the more recent.

A typical Veronica search (on the single word “uranium”)
brought 146 items within seconds. Some of these (again
typically) were out-of-date or the connections didn’t
work. The rest gave me everything I might have wanted to
know about uranium, from its elemental properties to the
latest uranium mining figures from various parts of the
world.

Gophers were generally run by universities and government
departments, so seemed to be best for academic and
governmental type searches, although there was some quite
non-academic stuff there as well.

4.3.3 Web directories may be better for
vaguer background research than search engines because
they allow you to follow through a line of thought on a
root and branch principle. Try clicking on the appropriate
“branch” of Yahoo (for example) and then narrowing down: http://www.yahoo.com.
However, Yahoo can look rather limited at times, with a
relatively small database. To an extent this is a problem
with all directories, which can’t be compiled as
automatically as search engines, and so tend to be smaller
and less up-to-date.

Galaxy http://galaxy.einet.net/galaxy.html
is much more clearly laid out than Yahoo – easier to see
where the different sub-headings are, and seems generally
a better choice at the moment.

It should be noted, though, that the distinction between
search engines and directories is becoming increasingly
blurred. Many of each now have a “web search” option,
links and directory-like services – check out Google and Alta Vista.

4.3.4 Value-Added Guides are often more
helpful than directories. They offer fewer links, but
pre-select them to filter out the dross. In addition,
their descriptions are generally more detailed than
ordinary directories and search engines. Clearinghouse,
for example, http://www.clearinghouse.net/
provides topical guides to the Internet. They aren’t as
comprehensive but provide value-added descriptive and
evaluative information ideal for researchers.

Encyclopedias have a similar function, allowing you to
browse topics at leisure. One of the best is Wikipedia, a free
encyclopedia which allows all users to edit any page, thus
aiming to use the pooled expertise of millions of surfers,
generally to good effect.

Those Encyclopaedia Britannica people also run an
excellent value-added combined search engine
and directory. It’s actually quite good, although
they don’t seem to be able to decide whether to charge for
it – it started off as a paid-for service, then went free,
and is now back to subscription only (although you can
take a free trial). Probably best to try the free sites
first before you shell out your money.

4.3.5 Search engines, however, appear to
grow less useful as the query becomes vaguer. Much of the
problem lies in knowing how to phrase the key words. On
wider background searches these can become confusingly
all-encompassing.

4.3.6 WebRings

WebRings are an interesting new development which could
be useful for general browsing.

One of the most difficult things to
duplicate on the Net is the ability to browse around a
subject, slowly but thoroughly building up a solid base
of knowledge. Somehow it’s a great deal easier in a
physical library where you can find lengthy books on
specific topics. Web pages have a tendency to be lighter
in content than most books, and following links can be a
remarkably hit and miss affair.

With WebRings, groups of sites on a topic
are linked together so that you can move easily through
the sites, forwards and backwards or even at random. In
theory, a WebRing should also provide a certain
guarantee of quality. You can search for WebRings at
http://www.webring.org/. Worth checking out to see if there’s
something on your chosen subject.

4.3.7 Blogs

Blogs vary enormously. They are a blend of
on-line newsletter and discussion group – sites where
individuals and/or subscribers can post news stories,
links, discussion points or the detailed minutiae of
their life. Some are riveting, some are less so, and the
range of interests is idiosyncratic. The form broke
through most decisively during the 2003 Iraq War, when
bloggers could give more up-to-date and, let’s face it,
considerably less censored news than any of the
official news services on either side.

To poke around among the bloggers try
looking at the Blogs listed at:

Eatonweb
Blog Portal
Blogger

Look for the option to list by categories,
or search on a particular search term. In addition,
should you feel the urge to set up your own Blog, the
last two sites listed above will help you do it for
free.

4.3.8 Proximity Searching

A more unusual tool for research comes from NameBase. In
addition to a database of useful articles in a number of
fields, particularly social, political and commercial, you
can perform a “proximity search”. Search on a name, and
their database creates a “network diagram”, linking a wide
range of related names, grouped according to how close or
frequent the link is found to be.

It’s easier to use than to describe, so try it out.

4.3.9 The “Deep” or “Invisible” Web

When conducting a search on the Net, the main search
engines can be very good – but be aware that there is a
great deal of the Web that they are totally unable to
search. This has been called the “invisible web” or the
“deep web”. It exists for a number of reasons. Most
important is that many excellent databases cannot be
searched by the search engines’ automatic software (“web
spiders”), either because the spiders cannot access the
databases, because otherwise ordinary webpages are
constructed in ways that interfere with the workings of
the spiders or for other technical reasons.

Frames and dynamic pages interfere with the way that the
web spiders access information and return useful addresses
for the search engines to use. And the text content of
images or Adobe pdf files cannot be examined. In addition,
some databases simply won’t work with spiders or refuse
access.

For example, you can’t search on a phone number in Alta
Vista and get an answer from Anywho.com.
Financial databases, newspaper archives, government
information, almost every kind of resource or database is
affected by the “invisible web” problem. If you rely
solely on search engines, you are limiting your resources
to a tiny fraction of what is out there.

The best way to deal with this is to have a good supply of
databases which are specific to your subject. These will
lead you faster and more surely to the information you
need than any search engine, which will on current
estimates only cover 1/500th of the 500 billion pages now
on the Web.
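
To put that estimate into plain numbers, here is the arithmetic as a minimal Python sketch. The two figures are simply the ones quoted above (they are estimates, not measurements), and the variable names are my own.

```python
# Figures quoted in the text above: estimates, not measurements.
TOTAL_PAGES = 500_000_000_000        # "500 billion pages"
COVERAGE_DENOMINATOR = 500           # a search engine covers "1/500th"

# Integer division keeps the result exact.
pages_covered = TOTAL_PAGES // COVERAGE_DENOMINATOR
pages_missed = TOTAL_PAGES - pages_covered

print(f"Pages a search engine covers: {pages_covered:,}")
print(f"Pages it cannot see:          {pages_missed:,}")
```

On those figures a search engine reaches about one billion pages and misses the other 499 billion.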

Many of the links on our links page
will return information that search engines cannot find.

Other useful strategies include searching Usenet and other discussion
groups, posting queries on discussion groups, and using expert resources.

There are a slowly increasing number of sites for
researching the invisible web. Two of the best include:

www.invisibleweb.com – developed by Intelliseek
www.completeplanet.com – 20,000 approx invisible web databases

Further information on the invisible web (and links) can
also be found at these two sites:

http://www.searchenginewatch.com/sereport/00/08-deepweb.html

http://websearch.about.com/internet/websearch/library/searchwiz/bl_invisibleweb_apra.htm

Thanks to Steve Hunt for pointing out the invisible web
issue.


5. HOW CAN I FIND INFORMATION FASTER?

5.1 First, there’s the obvious: get a
faster modem, or an extra-fast connection like an ISDN
line, ADSL or cable-modem connection. Or upgrade your
processor, RAM and video RAM. However, these cost money,
and you’re still at the mercy of a slow connection
somewhere the other side of the world.

5.2 If you don’t need pictures, then set
your browser to load web-pages without them.
Unfortunately, there are still some sites which are
virtually unusable when displaying text only. Luckily, not
many.

5.3 Less obvious is
the question of efficiency. The Net is so large that it
takes time to get to know any one subject area – to suss
out some databases you can trust, assess which sources are
best for which kinds of information. You can make on-line
life easier for yourself if you focus on relatively small
subject areas for relatively long periods of time. It’s
more difficult if your work or inclinations lead you to
research civil engineering one day, single parents the
next…

5.4 Books and Bookmarks

Ultimately, you can’t beat a good set of URLs in a
well-maintained (and backed-up(!)) bookmark or favorite
list. Some of the best URLs come from experience. Others
can be culled from books, newspapers and magazines.

Beth Porter: “Get hold of Computer Life’s Road Map to the
WWW, which is sweetly laid out in category globules
[Sports, Media & Entertainment, Politics]; there’s
also the Internet White Pages, published by IDG Books
[Godin & McBride] … more URL’s than you’ve had hot
dinners.”

5.5 Some popular sites now have one or
more “mirror sites” in other parts of the world, carrying
the identical information. If accessing one of these,
choose the mirror site in a time zone which is likely to
be least in demand, eg: between 11pm and 7am local time.
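
The mirror-site rule of thumb in 5.5 can be sketched in a few lines of Python. This is only an illustration: the mirror names and their offsets from UTC are made up, the offsets are fixed (so daylight saving is ignored), and the 11pm to 7am window is just the one suggested above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mirrors: (name, offset from UTC in hours). Not real sites.
MIRRORS = [
    ("London mirror", 0),
    ("New York mirror", -5),
    ("Tokyo mirror", 9),
]

def is_quiet_hour(local_hour):
    """True between 11pm and 7am local time, the window suggested above."""
    return local_hour >= 23 or local_hour < 7

def quiet_mirrors(now_utc, mirrors=MIRRORS):
    """Return the names of mirrors whose local time is in the quiet window."""
    quiet = []
    for name, utc_offset_hours in mirrors:
        local_time = now_utc + timedelta(hours=utc_offset_hours)
        if is_quiet_hour(local_time.hour):
            quiet.append(name)
    return quiet

# Arbitrary example moment: 14:00 UTC, which is 09:00 at UTC-5 and 23:00 at UTC+9.
example = datetime(2000, 1, 15, 14, 0, tzinfo=timezone.utc)
print(quiet_mirrors(example))
```

In practice you would pass `datetime.now(timezone.utc)` instead of the fixed example time.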

5.6 Consider going on-line at more
expensive times of day (if you have to pay for phone
calls) or using Internet providers with better bandwidth
and modem/user ratios. The extra cost may well be
outweighed by the greater efficiency and faster access
times. Talking of money….


6. SHOULD I PAY FOR INFORMATION?

6.1 OUTERNET.

Richard Broke: “One of the problems of the Internet is
that it is free! So, basically, you get what you (don’t)
pay for – much of the time. The Outernet is the name given
to pure knowledge databases which are subscription only.
Probably the biggest is called Dialog (sometimes aka
Knowledge Index).

“Because they are selling data, these outfits are
reliable (by which I mean accurate) and up-to-date.”

However, this is probably mainly of use to those whose
work can justify the expense. Dialog begins with an annual
sub of 30 UK pounds (or equivalent). However, to that you
must add on-line charges which depend on where you live
and which database you access. Some databases charge $12
per hour, while others go as high as $225. Then there are
charges for displaying documents (say 60c per document),
extra charges if you print stuff out, connection charges
if you’re outside the US…
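
To get a feel for how those charges add up, here is a rough Python sketch. The rates are the ones quoted above and will certainly have changed; the function name and the half-hour, ten-document session are my own illustrative assumptions, and the annual subscription, printing and connection charges are left out.

```python
# Rates quoted in the text above; treat them as historical examples only.
HOURLY_RATE_LOW = 12.00     # dollars per hour, cheapest databases
HOURLY_RATE_HIGH = 225.00   # dollars per hour, dearest databases
DISPLAY_CHARGE = 0.60       # dollars per document displayed ("say 60c")

def session_cost(hours_online, hourly_rate, documents_displayed):
    """Estimate the usage charge for one session, in dollars.

    Excludes the annual subscription, printing and connection charges.
    """
    return hours_online * hourly_rate + documents_displayed * DISPLAY_CHARGE

# Illustrative session: half an hour online, ten documents displayed.
cheap = session_cost(0.5, HOURLY_RATE_LOW, 10)
dear = session_cost(0.5, HOURLY_RATE_HIGH, 10)
print(f"Cheapest database: ${cheap:.2f}")
print(f"Dearest database:  ${dear:.2f}")
```

The same half hour costs roughly ten times as much on the dearest database, which is why it pays to know which database you need before you connect.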

Web: www.dialog.com/
or phone:

UK: 020 7930 5503 USA: 800-334-2564 or 415-254 7000

I have phone and fax numbers for other European countries
if wanted.

6.2 HIRING RESEARCHERS

There are other ways of paying too.
Alex: “As with traditional research, you may find it pays
to hire someone to find the information for you.”

Alex gave details of Mindsource, an
organisation that finds information for people. Costs
start as low as $50/Quarter, but you get more the more you
pay. At the moment Mindsource is probably not as useful as
it could be – but that may change. For details of
Mindsource: send blank email to mindsource@memo.net.

6.3 INTERNET SUBSCRIPTION SERVICES

There are subscription on-line services on the Internet
itself, such as Microsoft Encarta. Concise definitions are free,
but you pay for the full encyclopedia.

For variety and topicality, check out the Electric Library.
Electric Library allows you to use “plain English”
searches across a wide range of newspapers, magazines, TV
and radio transcripts, dictionaries, encyclopedias, maps,
photographs, and literary and artistic classic works. When
I’ve used it, it’s been fast and efficient at pulling out
relevant information. You can also download special
software (Windows and Mac) to search the library without
needing an Internet browser.

(Note: Electric Library goes offline each day from 04:30
to 06:30 EST for uploading new material. Depending on where
you are in the world, and what times you like to work,
this down-time may or may not be an inconvenience. For
example, in Britain this is 09:30-11:30.)
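
If you want to work out the down-time for your own part of the world, the conversion can be sketched in Python as below. This is a minimal sketch using fixed offsets (EST as five hours behind UTC, Britain on GMT), so it deliberately ignores daylight saving time; the function and variable names are my own.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets: EST is five hours behind UTC; GMT is the same as UTC.
# Daylight saving time on either side is deliberately ignored here.
EST = timezone(timedelta(hours=-5), "EST")
GMT = timezone(timedelta(hours=0), "GMT")

def downtime_in(zone, start_hm=(4, 30), end_hm=(6, 30)):
    """Convert the quoted EST down-time window into another time zone."""
    day = datetime(2000, 1, 1, tzinfo=EST)  # arbitrary date; only the times matter
    start = day.replace(hour=start_hm[0], minute=start_hm[1]).astimezone(zone)
    end = day.replace(hour=end_hm[0], minute=end_hm[1]).astimezone(zone)
    return start.strftime("%H:%M"), end.strftime("%H:%M")

print(downtime_in(GMT))
```

For another location, build a `timezone` with your own offset from UTC and pass it in place of `GMT`.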

Irritatingly, the site doesn’t display the current
subscription rate (it used to be $9.95 per month,
unlimited use – $59.95 p.a.) but there’s a seven-day free
trial.

Finally, you can download whole databases
to keep on your hard drive, for a fee. One example is
NameBase – an index of individuals, corporations and
groups compiled from 600 investigative books published
since 1962, and thousands of pages from periodicals
since 1973 – covering the international intelligence
community, political elites Right and Left,
assassinations, scandals, Latin America, big business,
and organized crime.

Download a 10-day free trial of NameBase’s
entire index from
http://www.namebase.org.


7. WHERE CAN I GET FURTHER HELP?

7.1 ALCS has a
dedicated writers’ server, put together by Jane Dorner and
Chris Barlas. (http://www.alcs.co.uk)

Chris writes: “One of the features is a writers’
information directory, a series of hyperlinks that writers
have found useful for research purposes.”

The Society of Authors have also developed a site (http://www.writers.org.uk/society/)
put together by Storm Dunlop. As has the Writers Guild of
America (http://www.wga.org/).

(Note: The Society of Authors has temporary server
problems, so some users may have trouble reaching the site for a few weeks.)

7.2 Writers’ Personal Resource Pages

There are a number of personal writers’ pages on the Net,
offering useful links. Some are listed here (9.8).

7.3 Scrnwrit Mailing List

Marty Norden tells of the screenwriters list called
SCRNWRIT: “There are plenty of folks there who might be
able to direct you to the right sources. If you’d like to
join, just send the one-line message ‘Subscribe SCRNWRIT’
to Listserv@tamvm1.tamu.edu.
Be aware, however, that SCRNWRIT is an unmoderated and very
active list. You’ll easily receive 50-100 messages per
day from it, sometimes more.”

7.4 Research by Real People

A growing number of free resources offer searching by
soft, warm-blooded sentient beings, rather than computers.

ProfNet is a collaborative of 3900 public information officers
(PIOs) linked by Internet to give journalists and authors
convenient access to expert sources. There are a number of
ways of submitting queries, not restricted to the Net
itself:

  • Phone (from US): 1-800-PROFNET (1-800-776-3638)
  • Phone (from outside US): 01-516-941-3736
  • Fax (from US and Canada): 1-516-689-1425
  • Fax (from beyond US and Canada): 1-516-689-1425
  • Email: profnet@profnet.com
  • CompuServe: 73163,1362

Ask An Expert offers similar expert facilities.

KnowPost is run
by people from all over the world with the aim of helping
“guide those who are lost on the Information
Superhighway”. The service is free, but you have to answer
a question for each question you ask.

A clever new variation on the human-based research resource, SourceNet is a
private research tool that lets journalists post anonymous
queries on any topic that will be distributed daily to
nearly 10,000 corporate and agency PR professionals.

Journalists worldwide use the service to round out stories,
find guests, test new story ideas, or find expert sources
for stories in progress. SourceNet say, “It’s like having an
army of research assistants helping you, for just a minute
or two of effort on your part! All queries are completely
anonymous; personal contact information is never available
to PR people through SourceNet queries.”

Free to all working members of the media.

7.5 Internet by Email

If you can’t get at all these resources in the normal
way, you can get just about anything you need by email.
Lloyd Colston writes:

By sending email to mail-server@rtfm.mit.edu
with the following text in the BODY:

send usenet/news.answers/internet-services/access-via-email

one will get a file on how to do archie, FTP, WWW, WAIS, etc.
by email only. In other words, you can FTP a file from an
email only account. This file is also useful in learning
how to use the Internet in general. The author also publishes
the e-zine The Internet Tourbus which is a free tourguide
of the Internet.
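
For anyone who prefers to script the request rather than type it, the message Lloyd describes could be put together along these lines. This is a sketch only: it builds the message but does not send it, the "From" address is a placeholder you would replace with your own, and it makes no attempt to check whether the mail server still answers.

```python
from email.message import EmailMessage

REQUEST_LINE = "send usenet/news.answers/internet-services/access-via-email"

def build_request():
    """Build (but do not send) the request described above."""
    msg = EmailMessage()
    msg["To"] = "mail-server@rtfm.mit.edu"
    msg["From"] = "you@example.com"  # placeholder: use your own address
    # The command goes in the body of the message, not the subject line.
    msg.set_content(REQUEST_LINE)
    return msg

print(build_request())
```

To send it you would hand the message to your usual mail program or to Python’s `smtplib`, using your own provider’s settings.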

In addition, Bob Appleton has updated his files which
show in specific detail how to get just about any type of
file by email. The files are free and everything mentioned
in them is also free. To get these files individually or
together in zipped format as well as in text format, send
a message to: agora@dna.affrc.go.jp
and in the body of the message put:

send ftp://ftp.crl.com/users/iv/iverham/XXX

where XXX stands for one of the files
listed below:

email4u.txt getit4u.txt fun4u.txt pix4u.txt
email4u.zip getit4u.zip fun4u.zip pix4u.zip
4useries.zip

Repeat the line for each additional file requested. In
addition the .txt (but not the .zip) files are available
here:

send http://members.aol.com/bombagirl/freeware/XXX

send http://www.wireworm.com/4useries/XXX

Another way is to send a blank message to: 4useries@wireworm.com.
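
Since the request needs one “send” line per file, the body of the message is easy to generate. Here is a small Python sketch; the base address and file names are the ones given above, while the function name is my own invention.

```python
# Base address and file names as given in the text above.
BASE_URL = "ftp://ftp.crl.com/users/iv/iverham/"
TEXT_FILES = ["email4u.txt", "getit4u.txt", "fun4u.txt", "pix4u.txt"]

def request_body(filenames, base_url=BASE_URL):
    """Return the message body: one 'send' line for each file requested."""
    return "\n".join(f"send {base_url}{name}" for name in filenames)

print(request_body(TEXT_FILES))
```

Paste the printed lines into the body of your message to the address given above.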

When you get the email information, please pay attention
to Email by news groups. Vigilant and InReference both
allow keyword searches of Usenet to be sent to your
mail box. This saves a LOT of time otherwise spent reading every
article on a subject. Both allow filtering so you can
specify exactly what you want.

7.6 Information Research FAQ

For those who want to go into more
academic detail about research – online and otherwise –
I can recommend the massive, if slightly sprawling,
Information Research FAQ (not to be
confused with our own production). It has articles on a wide
range of topics related to finding information fast.

7.7 Windweaver Web Resources

An excellent site well worth checking out for help in
research is Windweaver Web Resources, with over 100 pages
of useful Internet search guides and links.

Created by an Internet trainer who specialises in
research resources, it offers useful comparisons of the
strengths and weaknesses of the different search engines,
etc, along with guides, help pages, and many, many links.
Highly recommended.

7.8 Articles

Sal Towse, contributor to the FAQ, has a good article on
web searching in “Computer Bits”. Well worth a read:
www.computerbits.com/archive/1998/0700/web_searches.htm
as is the computerbits site in general.

7.9 Language Problems?

If the material you find is not in your own language,
don’t despair. There are even sites that will translate
individual words or, better, whole chunks of text or
web-pages between different languages. Don’t expect
perfection, or the most obscure dialects, though. An
early attempt at this was Babelfish from Alta Vista,
which covered French, German, Portuguese,
Italian, Spanish and English. Nowadays a Google search brings up
many serviceable translating sites – including Google itself
of course.


8. HOW CAN I VALIDATE WHAT I FIND?

8.1 How reliable is the Net?

Things are clearly changing all the time. The Internet is
growing bigger – and, as many have discovered, a web page or
Usenet posting can look the same whether created by a
world authority or a student. The key issues are:

Accuracy

Most traditional media have standards of fact-checking,
which need not be followed by the creator of a web page.
The same applies to discussion groups. In misc.writing,
for example, a writer accused 90% of the advice posted
about copyright law of being wrong.

However, we shouldn’t overstate the case. Mistakes also
occur in venerable legal textbooks. The problem is that we
grow up learning to judge the validity of traditional
media. Often this comes from the context in which it
appears – we value information in a medical journal, for
example, over a teen magazine. On the Net, that context is
often missing or severely limited.

The counter argument is that the Net is so large that few
inaccuracies will go unchallenged for long. This is the
philosophy behind Wikipedia which is designed to allow every user to
edit entries on the basis that the Internet community will
prove the best arbiter of truth. (Nonsense and vandalism are
removed speedily, in theory at least.)

Authenticity

How authentic is a website or posting? Many health sites,
for example, are created by drugs companies, but don’t
reveal that fact on the site itself. There are
important-sounding history sites which are run by
right-wing extremists.

Even the domain name is no proof of authenticity, as
constant legal wrangles continue to prove. Although the
Net is subject to laws of libel, misrepresentation and
advertising standards codes (despite suggestions to the
contrary) these laws are not always easy to enforce, and
take time.

Ageing and Fluidity

Books and magazines have the grace to look their age.
There’s no dust on a website that hasn’t been updated for
over two years. Many sites don’t even show a date.

Or you may find the reverse problem. The web is so fluid
that the solid site you relied on for information on a
regular basis may simply disappear overnight. And every
search engine is filled with out-of-date links to sites
that no longer exist.

Accessibility

The information you want may exist, but may be buried
under a load of dross. Top of my personal hate-list are
sites that offer to rig search engines so that your
personal URL appears “near the top of any lists.” Using
techniques which fool search engines into thinking a page
is more relevant than it is, they turn the usefulness of
the web on its head.

8.2 What can I do about it?

8.2.1 Double-check

Journalists always double-check information unless it
comes from a totally secure source. The same should apply
for any information you need to verify from the Net.
Either find another reliable Internet source, or use
traditional means – books, telephone, etc.

8.2.2 Look for “branded” sites

“Branded” sites from organisations you know and trust are
likely to be among the more reliable – although even they
should still be treated with care, and you should not take
the domain name as proof on its own. Many
authentic-sounding domain names have been bought up by
others.

Government sites generally provide reliable statistics,
or at least as reliable as government statistics ever are! Electronic
versions of publications from established broadcasters,
newspapers and specialised journals are likely to have
been prepared as carefully as their traditional
counterparts. Academic departments of universities can be
good, too, but check to see if you’re reading the work of
a professor or student…

8.2.3 Use “filtered” directories

Search engines use automatic web-crawling “spiders” to
trawl for pages, and generally make no attempt to judge
the value of the sites they find. However, the better
search engines have developed a range of strategies for
excluding sites that try unfairly to raise their profile,
with a variable degree of success.

By contrast, directories are selected by humans, giving a
greater reassurance that the sites will at least be
relevant to your query. Better still are value-added
guides such as Britannica (now mostly subscription)
or Dotdash – and
directories which give star ratings to valued sites.

8.2.4 Look at internal evidence

In all cases, you should check websites for internal
evidence of quality: writing style, language, range of
content, level of detail, clarity of design – all can give
important clues as to the expertise of the provider.

Writers with depth of knowledge and experience tend to be
precise, rather than vague in their use of language, and
will normally include detailed material and evidence to
back up what they say. And while slickness is no guarantee
of quality, a badly organised site suggests that the
content may be sloppy in other ways too.

In addition, look for references and citations, clear
identification of who “owns” the site, the date of the
last update, contact information and email links. Lack of
any or all of these should make you increasingly
suspicious of the validity of what you find.

8.2.5 Ask who’s paying

Good content takes time and effort, and while some people
are happy to provide this for nothing, there’s no
incentive quite like hard cash to ensure the website is
kept accurate and up-to-date.

In some cases, it may be that the site relies on
advertising, and therefore has a built-in incentive to
keep visitors happy and supplied with good content.
However, in others the paymaster may be a drugs company, a
political organisation or an individual with an axe to
grind.

In the long run, if you can afford it, you may feel it’s
safer to shell out money yourself, and use one of the
subscription-based services. However, even if you do pay,
all the above considerations still apply.


9. WHAT ABOUT THE FUTURE?

9.1 Plus Ça Change

Clearly, the Internet will change – commercially,
technically, philosophically. There’ll be a need for
services to pay for themselves, and that may mean more
subscriptions, as discussed above. On the other hand,
there are other ways of skinning a cat.

Many sites will pay for themselves by advertising, while
sites set up as corporate PR will find they need to offer
more than pretty pictures to attract the browsers.

There are also publication spin-offs, which need not
necessarily be financially damaging to the provider, even
if free. Times Educational Supplement found that their
free Internet site actually led to increased sales
of the printed publication rather than the decreased sales
that might have been expected.

When it comes to reliability of information, new
techniques of encoding “watermarks” and using encryption
programs and digital signatures could be an important step
towards ensuring that users trust the information they
find.

The energetic prosecution of legal safeguards will also
be increasingly necessary – whether over trademarks and
domain names or “passing off”. And search engines will
need both legal and technical ways to stop their search
results from being rigged.

It’s perhaps as difficult to predict commercial
developments as technical developments. The Net has an
anarchic way of confounding predictions all the time. It
isn’t so long since everyone was hyping “push” as the
latest transformation. So far, the push revolution has
been put on hold.

9.2 Technical Developments on the Net

9.2.1 Intelligent Agents

There’s new software developing all the
time. Intelligent Agents (IAs) are held to be the coming
thing, according to some (the developers of Intelligent
Agents, perhaps).

IAs are not writers’ agents with nous, but computer
programs that help with searching. There are a number of
different types of IA, but basically they aim to make
Internet searching easier by

(a) learning your preferences in an “intelligent” way
and/or
(b) going onto the net and searching while you are
offline, thus saving time and money and/or
(c) (theoretically) returning with a more targeted,
useful list of hits – by avoiding duplication, eliminating
dead links, and generally being more efficient than your
average bog-standard search engine.
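
To make point (c) a little more concrete, here is a toy Python sketch of the tidying-up step: merging hit lists from two search engines, dropping duplicates and skipping links already known to be dead. It is my own illustration, not how any particular agent works; the URLs are invented, and the “normalisation” rule (ignore case in the site name and any trailing slash) is a crude assumption.

```python
from urllib.parse import urlsplit, urlunsplit

def normalise(url):
    """Crude normalisation so trivially different URLs compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))

def merge_hits(result_lists, dead_links=()):
    """Merge hit lists in order, dropping duplicates and known dead links."""
    dead = {normalise(u) for u in dead_links}
    seen = set()
    merged = []
    for hits in result_lists:
        for url in hits:
            key = normalise(url)
            if key in seen or key in dead:
                continue
            seen.add(key)
            merged.append(url)
    return merged

# Invented example results from two hypothetical search engines.
engine_a = ["http://example.com/faq/", "http://example.org/links"]
engine_b = ["http://EXAMPLE.com/faq", "http://example.net/gone"]
print(merge_hits([engine_a, engine_b], dead_links=["http://example.net/gone"]))
```

A real agent would also have to discover which links are dead, which means fetching each one; that part is left out here.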

When I posted a query about IAs, at first I received a
grand total of one reply (from a reviewer) and zero
replies from researchers and writers, from which I
concluded that either IAs are not much use, or that
no-one who uses them saw my message or (most probable)
that researchers and writers are too busy researching and
writing to take valuable time wrestling with unknown bits
of software.

Since then I’ve heard from Vic Justice, who recommends a
search device called Web Ferret, which is a free download.
He says:

“It calls up 500 responses to the subject query and does
it faster than, say Yahoo or others. To my computer
illiterate mind, it searches the search engines.

“A major advantage of Web Ferret is the contents index
that pops up as the mouse cursor touches the item title,
so that you only need call up items specific to your
search. This saves time.” http://www.ferretsoft.com

Bruce Krulwich, who is professionally employed in bringing
agents to market, has also written a number of articles on
the subject: http://www.geocities.com/ResearchTriangle/9430/

Meanwhile, for Mac owners only, Mike Shields tells us
about Sherlock, ‘another method of research, for those of
us using a Mac with OS 8.5. Sherlock, performs the same
feats as Metacrawler. Allows fuzzy searches as well.

‘What makes this really powerful, is that you can create
a Sherlock plugin for your site, so that your site will be
included in the searches.

‘You only search the plugins that you yourself load. So,
for instance, if Charles Deemer creates a plugin for his
site, and you check it off, then you search his site, if
you’ve loaded his plugin before hand. You can also choose
which of the search engines to utilize.

‘It’s a very fuzzy search. I can ask a question like,
“Who is Charlie Harris?” And set it off and running, and
it should come back with a few interesting things.’

More information from http://www.apple.com/sherlock/

To find out more on all kinds of IAs, you can also try:

Software:

Autonomy
SSSpider
AgentSoft
Alexa Internet
BotSpot – try “Best of the Bots”

Reviews and Discussion:

UMBC Agent Web
IBM Intelligent Agents Home Page
Crawling towards Eternity (Web Techniques, May 1997)
Stroud’s Internet Agent Reviews

(Most of these links courtesy of Jen – thanks.)

The jury is still most decidedly out. Please tell me if
you’ve tried IAs and find them scintillatingly useful,
totally useless, or somewhere between the two.

9.2.2 Watching the Net

To get the best out of your Internet
searching, it’s worth spending a little time keeping track of
developments. Directories and search engines are
constantly looking for ways to improve their service.

One of the best ways of keeping track is the excellent Webmaster’s Guide
to Search Engines. This is really aimed at website
developers, but is no less useful for Internet
researchers, as a way of learning about search engines and
what makes them tick.

In particular, it has a twice-monthly update news page
“What’s New”. If you subscribe (for free) you can join the
mailing list and receive regular news of updates by
e-mail.


10. URLS FOR A RAINY DAY

This is our page of links. Essential Internet sites for
finding information – from Ancient History to CNN. Why not
download it, and/or make it your browser’s home page, for
easy reference?


11. END CREDITS

Complaints, criticisms, suggestions etc, to Charles Harris

Many thanks (in alphabetical order) to:

Alex: minder@galdr.demon.co.uk, AmyWriter, Bob Appleton,
Chris Barlas: chris.barlas@alcs.co.uk, Steven Blacher, David Brager,
Richard Broke, Jim Burgess, Khem Caigan,
Mike Casswell: mike-casswell@mail.u-net.com,
Lloyd Colston: lloyd@colston.com, Huw Colingbourne, Jane Dorner,
Marty Fouts, Marnie L. Froberg: ion@istar.ca,
Yvonne Hewett: yvonne@optiprod.demon.co.uk, Steve Hunt, TJ, Edo Jansen,
Floyd H. Johnson, Ellie Kuykendall, Trygve Lode,
Wayne Lutz: wlutz@home.com, Pat Marcello: patm7@prodigy.net,
Michelle McIntyre, Laurence A. Moore: larrymor@crl.com, Marty Norden,
Pat, Beth Porter: 100541.165@compuserve.com, Alvaro Ramirez,
Allen Schaaf, Robert Silverman: webmaster@freeality.com,
Sal Towse: towse@null.net

And to all those who sympathised with my/our plight and
lent support. If I’ve missed anyone out, please tell me.

© Charles Harris

All suggestions and comments are welcome. If sending information
on an existing link, please include the link’s title in
the FAQ as well as the address and any other information
that you feel will be useful. If suggesting a new link,
then I generally only include sites that have a wide
range, and usually only those which compile links to
other sites or will in some other way help Internet
searchers. Click here.

(This FAQ may be copied in whole or in
part for non-profit making purposes only, provided
you tell me you’re doing it, adequate credit is given
to those who helped towards it, and the home address
is given http://www.search-faq.com.)

