Library 102 First Class Session
Introduction to the Internet
What is the Internet? VERY BRIEFLY, the Internet is an international, decentralized network of computer networks running on the protocol (communications rules) called TCP/IP. A computer network consists of two or more computers connected in order to share information. When two or more networks are connected, it is called an internet. THE Internet is the largest internet in the world, connecting millions of computers from virtually every nation on earth. How does TCP/IP make the flow of information on the Internet possible?
What is the Web? VERY BRIEFLY, the World Wide Web is a collection of hypermedia documents accessed through the Internet. It is just a part of the Internet, but it has become the dominant part. For more information about the Web, click on the link About the Web on the Internet 101.org site.
Each computer connected to the Internet has what is called an IP address (Internet Protocol address). The IP address is called a "dotted quad" because it is written as a group of 4 numbers separated by periods (called dots). An IP address or dotted quad looks like this: 126.96.36.199. Because it would be difficult to remember these numerical IP addresses, the Domain Name System (DNS) was developed. The Domain Name System gives a descriptive name to the numerical address, making it easier to remember.
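As a minimal sketch of the "dotted quad" idea, the Python below checks that a string is a valid address of this kind and returns its four numbers. The function name dotted_quad_parts is made up for this illustration; the check itself comes from Python's standard ipaddress module.

```python
import ipaddress


def dotted_quad_parts(text):
    """Return the four numbers of a dotted quad as a list of integers.

    Hypothetical helper for illustration. Raises ValueError when the text
    is not a valid dotted quad (each number must be between 0 and 255).
    """
    address = ipaddress.IPv4Address(text)  # validates the format
    return [int(part) for part in str(address).split(".")]


# The example address used in this lesson.
print(dotted_quad_parts("126.96.36.199"))
```

A string with a number larger than 255, or with more or fewer than four numbers, is rejected with a ValueError.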
The parts of a domain name are also separated by periods. The most specific part of the domain name, which is the host computer, also known as the server, is on the left. In this example, www.cos.edu, the host computer is www. Most servers have a name, but not all Web servers are called "www". The most general part of the domain name is always on the right. This broadest part of the domain name is called the top-level domain. It usually specifies a geographical location, such as .us for the United States or .uk for the United Kingdom, or a type of organization, such as .edu for a college or university or .gov for a government site. The top-level domain helps to identify where the information is coming from. It is also a key factor in evaluating a web site. The primary top-level domain abbreviations currently in use are as follows:
.com   Commercial business
.net   Network; Internet service provider
.gov   U.S. government
.org   Non-profit organizations
More recently approved top-level domains are as follows:
.info  Any individual or company
.name  Any individual
.pro   Professionals such as doctors, lawyers, accountants
According to an article by David Sarno published in the Los Angeles Times on June 21, 2011, ICANN (Internet Corp. for Assigned Names and Numbers) announced what it called "one of the biggest changes ever" to the way the DNS works. This change allows companies and organizations to apply to create their own top-level domains for a price tag of $185,000. ICANN will make the final decisions on the new domains. An example could be .music, which would indicate legitimate professional musicians' sites.
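The reading order of a domain name described above (most specific part on the left, top-level domain on the right) can be sketched in a few lines of Python. The function name domain_parts and the small lookup table are assumptions made for this illustration; the table holds only the abbreviations named in this lesson, not a complete list.

```python
# Top-level domains mentioned in this lesson (not a complete list).
TOP_LEVEL_DOMAINS = {
    "com": "Commercial business",
    "net": "Network; Internet service provider",
    "gov": "U.S. government",
    "org": "Non-profit organizations",
    "edu": "College or university",
    "info": "Any individual or company",
    "name": "Any individual",
    "pro": "Professionals such as doctors, lawyers, accountants",
}


def domain_parts(domain_name):
    """Split a domain name on its dots (hypothetical helper).

    Following the lesson's simplification, the leftmost part is treated
    as the host computer and the rightmost part as the top-level domain.
    """
    labels = domain_name.lower().split(".")
    top_level = labels[-1]
    return {
        "host": labels[0],
        "top_level_domain": top_level,
        "meaning": TOP_LEVEL_DOMAINS.get(top_level, "not listed in this lesson"),
    }


print(domain_parts("www.cos.edu"))
```

For a two-part name with no separate host (for example, a name typed without "www"), the leftmost part is not really a host computer, so the "host" value should be read with that caution in mind.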
URL is an acronym for Uniform Resource Locator. The Uniform Resource Locator (URL) is a specific address on the web. URLs are written without spaces and the parts are separated by special punctuation. URLs show in the Address Box of the Web browser. For example:
http://www.cos.edu/library/Lib102/library102syllabus.htm
The URL consists of the following:
Type of transfer://servername.domain/directory/subdirectory/filename.filetype
Another way of saying it is:
The structure will always be the same, although there may not always be a subdirectory.
The Protocol * or type of transfer is given first, followed by a colon slash slash: http://
Then the Domain Name (the location of the page): www.cos.edu
Then the Directory (and/or subdirectory): library/Lib102/
Directories and subdirectories are followed by a slash.
Then the File: library102syllabus
The file name specifies the individual document you are accessing.
Then the File Type is last: .htm
Sometimes the file type will be .html
Both file types, .htm and .html, signify the file type for hypertext (web pages).
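The breakdown above can be tried out with Python's standard urllib.parse module. This is only a sketch: the function name url_parts is made up for the illustration, and the sample URL is an assumption assembled from the parts listed above (http://, www.cos.edu, library/Lib102/, library102syllabus, .htm).

```python
import posixpath
from urllib.parse import urlparse


def url_parts(url):
    """Break a URL into the parts named in this lesson (hypothetical helper)."""
    parsed = urlparse(url)
    # Separate the directory path from the file name.
    directory, filename = posixpath.split(parsed.path)
    # Separate the file name from the file type (extension).
    name, file_type = posixpath.splitext(filename)
    return {
        "protocol": parsed.scheme,
        "domain_name": parsed.netloc,
        "directory": directory.lstrip("/"),
        "file": name,
        "file_type": file_type,
    }


# Sample URL assembled from the parts listed above (an assumption).
print(url_parts("http://www.cos.edu/library/Lib102/library102syllabus.htm"))
```

When a URL has no subdirectory or no file name, the matching entries simply come back empty, which agrees with the note that the structure stays the same even when a part is missing.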
http is the protocol for the World Wide Web; other protocols are available through the Internet. The protocols are as follows:
ftp://     File Transfer Protocol: accesses FTP sites; an executable file may be sent to your computer, so use caution
gopher://  Gopher Protocol: accesses Gopher menus; text-based information only
telnet://  Telnet Protocol: opens a telnet window to run older, text-based programs
mailto:    Mail Protocol: opens a Send Mail dialog box to send email
news:      News Protocol: opens a Usenet newsgroup link to posted articles
ARPANET (Advanced Research Projects Agency Network) The precursor to the Internet. It was developed in the late 1960s by the U.S. Department of Defense as an experiment in wide-area networking that could survive nuclear war. The original host computers were at Stanford Research Institute, UCLA, UC Santa Barbara, and the University of Utah.
Berners-Lee, Tim Inventor in 1989 of the combination of software programs and networking protocols known as the World Wide Web.
Blog A web site that contains an online personal journal with reflections, comments and often hyperlinks.
Boolean From the name of an English mathematician, George Boole (1815-1864), this term refers to a search strategy which relies on the concepts of AND, OR, and NOT. For a more detailed explanation of the use of Boolean logic in Internet searching, click on this link from the University at Albany Libraries.
Breadcrumbs A set of links found on some webpages and some search results that shows the current page's position in the hierarchy of the site. Clicking on one of the links will take you to that level of the site.
Browser A software program used for displaying the HTML documents of the Web. Mosaic, the first widely used graphical browser, was introduced in 1993. In 1999 Internet Explorer became the most widely used browser.
Cache A storage area on the computer where the browser keeps copies of recently viewed pages and files. The computer uses data in the cache as a shortcut for retrieving information, which speeds up the performance of the browser. The cache should be cleared on a regular basis or it may slow down the browser instead of speeding it up.
CERN European Laboratory for Particle Physics, where the World Wide Web project originated in 1989.
Cookies Small files of information generated by a web server and stored on the user's computer. They are used for personalization of sites.
Copyright Very briefly, the protection by law of an author's or creator's rights to his or her intellectual property. You should assume that all information on the Web is copyright protected unless otherwise stated. U.S. Government works are not protected by copyright. For more information on copyright, see the U.S. Copyright Office web site at http://www.copyright.gov/
Cyberspace Term originated by the author William Gibson and popularized in his 1984 novel Neuromancer. The word is used to describe the whole range of information resources available through computer networks.
Database A collection of data organized for search and retrieval.
Domain Name The unique name that identifies an Internet site. Domain names always have two or more parts separated by dots. The part on the left is the most specific and the part on the right is the most general.
E-mail Electronic mail. Messages are sent from one person to another person electronically via computer. It is the most widely used service on the Internet. The Internet mail protocol is called SMTP (Simple Mail Transfer Protocol).
Encryption Any procedure used to convert plain text into cipher text to ensure network security.
Filter Bubble Describes a phenomenon, popularized by Eli Pariser in his book of the same name, in which algorithms selectively retrieve websites based on information about the user, such as location and search history. The danger is that the user may only see information that he or she already knows or agrees with, thereby creating a narrow informational bubble.
FAQ Acronym for Frequently Asked Questions. FAQs are documents that list and answer the most commonly asked questions on a particular subject.
FTP Acronym for File Transfer Protocol, which allows users to transfer text files, programs, software, etc. from one Internet site to another. This was one of the original Internet services. Sites whose files are open to the public are called "anonymous" FTP sites.
Folksonomy A term combining "folks" and "taxonomy" that refers to the non-controlled vocabulary that develops as users add tags or metadata.
Gutenberg Project Gutenberg's mission is to create a free library on the Internet of the world's greatest books. The Project is named for the 15th century printer, Johannes Gutenberg, who invented the printing press and made books available to the masses for the first time in history.
Gopher An Internet protocol which is text and menu-based and organized by subject. Gopher gets its name from the mascot of the University of Minnesota, where the protocol was developed. Gopher is also a pun on "gofer", one who fetches things.
Hit A result of a search query.
Home Page A document created to serve as the starting point for World Wide Web users; it is also the first page on a Web site.
Host Computer A computer connected to a network that a user can access for information or for running a computer software application.
HTML Acronym for hypertext markup language which is a coding language used to create documents on the World Wide Web.
HTTP Acronym for Hypertext Transfer Protocol, the protocol of the World Wide Web.
Hyperlink An embedded HTML code that appears on the computer screen as a highlighted word, image, or icon. It is the basis for the World Wide Web. It allows users to go to other documents with the click of a mouse.
Internet An international, decentralized network of computer networks running on the TCP/IP protocol.
Invisible Web Information on the web that search engines don't have access to, such as databases, intranets, password-protected sites, or sites intentionally blocked.
IP Address A unique Internet address. No two computers on the public Internet can have the same IP address at the same time. It is a code made up of a series of numbers separated by dots. An IP address would look like this: 126.96.36.199.
Meta Search Engines Allow searches to be sent to several individual search engines all at one time. The results are shown all at once. Metacrawler at www.metacrawler.com was one of the first. Dogpile at www.dogpile.com/ is a popular example.
Metatags A field in the HTML coding language for a Web page that allows the creator of the page to enter text describing the content of the page. The content of metatags is not shown on the page itself when the page is viewed in a browser.
Natural Language Search Some search tools allow a question to be typed in the search box. Ask.Com is a good example of this type of tool.
Netscape A graphical browser for the Web; a software program for viewing hypertext documents and all other protocols of the Internet. It was released in October 1994.
Nesting The use of parentheses to specify how the terms in a Boolean expression should be grouped, and therefore the order in which the terms should be searched.
Packet Switching A method used to move data on the Internet where all the data coming out of the computer is broken into chunks, sorted and directed to different routes and reassembled at the destination.
Password A code used to gain access to a locked system. Good passwords contain letters and non-letters.
Phishing An email is sent out that directs the user who takes the bait to a web site that, although it may look authentic, is actually bogus. The user is then asked for personal information, such as a social security number, bank account numbers, or other information that is used for identity theft.
Podcasting A subscription service that became available in 2000 and allows downloading of audio and video files.
Protocol A set of rules or standards defining how computers communicate and exchange information with each other.
Public Domain A work of authorship is in the “public domain” if it is no longer under copyright protection or if it failed to meet the requirements for copyright protection. Works in the public domain may be used freely without the permission of the former copyright owner.
Relevancy Ranking A method search engines use to sort the results list. The relevancy is determined by algorithms, but it is the searcher who ultimately determines which sites are relevant.
Results List The web pages that are retrieved by the search engine that match what was typed in the search box. The most helpful results include some text from the page, the date modified, and a choice of "similar pages".
Search Directories Subject (topic) indexes of Web sites organized by categories. They are usually compiled by people according to selection policies and include fewer sites than search engine databases. An example is the Internet Public Library at http://www.ipl.org/ .
Search Engines Searchable databases compiled by automated programs, which are called "crawlers," "robots," or "spiders." The databases are based on key words rather than on topics. They usually cover millions of sites, and the relevancy is mathematically determined. Some examples are Google at www.google.com and Hotbot at http://hotbot.com .
Search Strategy Components are as follows: 1) State what you want to find 2) Identify key words 3) Select synonyms and variant word forms 4) Combine synonyms, key words, and variant word forms 5) Check your spelling.
Status Bar At the bottom of the screen, the status bar displays the Web address of the document being transferred, the progress of the downloading, and the completed percentage of the document layout, as the page is loaded.
Stemming Some search engines allow a word to be reduced or truncated to its root form and will retrieve all forms of the same word in the pages. For example, in many search engines, using teach* will retrieve teaches, teacher, teaching.
TCP/IP Acronym for Transmission Control Protocol/Internet Protocol which is the suite of protocols that define the Internet; to be on the Internet the computer must run on TCP/IP.
Telnet An Internet protocol for remote terminal connection. One of the most common uses of Telnet is to access online library catalogs at colleges and universities world-wide. Login, passwords and terminal emulation for each site will be different.
Tilde The character ~, which is often used in a URL to identify a personal page.
Twitter A free social networking and micro-blogging service that enables its users to send and read messages known as tweets.
Urban Legend An apocryphal (of questionable authorship or authenticity; erroneous; fictitious) story involving incidents of the recent past, often including elements of humor and horror, that spreads quickly and is popularly believed to be true.
URL Acronym for Uniform Resource Locator which is the standard way to give the address of any resource on the World Wide Web. Every page on the Web has a unique URL. It consists of the protocol used, the domain name and usually a directory and filename and file extension.
Web 2.0 A phrase popularized by O'Reilly Media in 2004 referring to a second generation of web-based services such as social networking, wikis, blogging, podcasting, and folksonomies. The emphasis is on online sharing, networking, and collaboration.
Web Page A document on the World Wide Web. Each page is an individual HTML file with its own URL.
Web Site A group of related web pages collected around a main page.
Veronica Software which searches keywords in gopher menus.
Wiki A web site that lets visitors add, remove and edit content. The most famous example of a wiki is Wikipedia.
WWW Acronym for the World Wide Web.
World Wide Web An Internet service which contains text, graphics, video and sound. It is currently the most powerful, comprehensive and popular Internet service, using hyperlinks to link one piece of information to another in a web pattern.
*This glossary was compiled from the following sources:
Courtney, Nancy, ed. Library 2.0 and Beyond: Innovative Technologies and Tomorrow's User. Westport, Connecticut: Libraries
Flanagan, Debbie. "Preparing Your Search". Web Search Strategies. 1999, 2000 (http://home.sprintmail.com/ debflanagan/prepare.html). 13 June 2000.
Hanson, Jarice. 24/7: How Cell Phones and the Internet Change the Way We Live, Work, and Play. Westport, Connecticut: Praeger Publishers, 2007.
Harris, Robert. A Guidebook to the Web. Guilford, Connecticut: Dushkin/McGraw-Hill, 2000.
Henderson, John. "ICYouSee Glossary". ICYouSee: A Guide to the World Wide Web. June 1999 (http://www.ithaca.edu/library/Training/glossary/cache.html). 15 Oct. 2001.
Hillstrom, Kevin. Defining Moments: The Internet Revolution. Detroit: Omnigraphics, 2005.
Hock, Randolph. The Extreme Searcher's Internet Handbook: A Guide for the Serious Searcher. 4th ed. Medford, New Jersey: CyberAge Books, 2013.
Morley, Deborah. Getting Started: Web Page Design with Microsoft FrontPage 98. Fort Worth: Dryden Press, 1999.
Rappoport, Avi. "Search Terms Glossary". Search Terms for Web Sites and Intranets. 2004 (http://www.searchtools.com/info/glossary.html). 26 Aug. 2004.
"What is phishing?" Webopedia. 2004 (www.weobopedia.com/TERM/p/phishing.html). 27 Jan. 2005.
"Whatis?com". What Is Com Home Page. 1996-2000 (http://www.whatis.com/nfindex.htm). 9 June 2000.
Searching Three Ways
I. Site Search:
Of the three ways to search on the Web, the easiest and most direct way to get information is to know the URL (Web address). When you already know the URL, go to the Address Box, highlight the current URL, and press the backspace key. The Address Box should now be empty; all you need to do is type in the new URL and press the enter key. Parts of a URL can be case sensitive, and any typo can cause an error message, so type carefully.
If you do not know the exact URL, you might try guessing, which works well in certain circumstances. For a company name, trying: www.companyname.com often works; for example, www.microsoft.com . For companies with longer names, abbreviations might be used. The New York Times URL is an example: www.nytimes.com . Instead of a company name, the middle word in a basic URL might identify the subject or topic of the site; for example, www.weather.com.
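The guessing pattern described above (www.companyname.com) can be written out as a small Python sketch. The function name guess_company_url is made up for this illustration, and the result is only a guess: companies that use abbreviations, such as the New York Times, will not be found this way.

```python
def guess_company_url(company_name):
    """Build a www.companyname.com guess from a company name.

    Hypothetical helper: it lowercases the name and drops spaces and
    punctuation. The address it returns may not exist.
    """
    compact = "".join(ch for ch in company_name.lower() if ch.isalnum())
    return "www." + compact + ".com"


print(guess_company_url("Microsoft"))
print(guess_company_url("Weather"))
```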
WHEN YOU KNOW A URL, DO A SITE SEARCH
II. Directory Search:
Directories are composed of links to web sites arranged by topic (subject matter). They are compiled by people and have fewer links available than search engines. Directories have very broad subject headings; by making more and more choices, you eventually narrow the topics down to particular sites. You should also do a directory search when you want to see a list of sites on your topic with annotations. The Internet Public Library is a good annotated web directory at the URL: http://www.ipl.org/ . Using a directory may help you to avoid the sites with very brief or superficial content that are more likely to be found when using search engines.
The University of California at Berkeley Library sponsors an excellent tutorial and table of features for recommended subject directories.
WHEN YOU ARE LOOKING FOR NON-SPECIFIC INFORMATION, DO A DIRECTORY SEARCH.
According to Randolph Hock, the author of The Extreme Searcher's Internet Handbook, the strengths and weaknesses of general web directories are as follows:
Strengths:
Good for general questions
Weaknesses:
Relatively small database compared to web search engines
May not have sites addressing very specific topics
Typically less search functionality than most search engines
Paid inclusion may affect quality
There are three main points to remember about web directories. First, they are selective, that is, the sites are selected by persons according to some set standards of quality, although some directories allow paid inclusion. Secondly, they are categorized (classified) by broad topics and sub-topics. This hierarchical arrangement helps to narrow a broad topic. Thirdly, usually only the homepage or main page of the site is indexed, which contrasts with search engines, where every page of the web site is usually indexed.
For two concepts or more, it is generally better to use a search engine.
III. Search Engine Search:
Search engines are software programs that search the web and log words into a database. When a word is typed into a search box, the program scans the search engine's database for the word and returns sites that are mathematically determined to be relevant. Most search engines rank the sites by relevancy. Remember, this relevancy is determined by a software program, NOT by humans who have actually looked at the particular site. Search engines also compile databases based on popularity.
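To make the idea of "logging words into a database" concrete, here is a very small Python sketch. It is not how any real search engine works: the page addresses and text are invented, the function names build_index and search are hypothetical, and counting how often a word appears is only a stand-in for the much more elaborate relevancy formulas that real engines use.

```python
from collections import defaultdict


def build_index(pages):
    """Log every word of every page into a database (a dictionary).

    The result maps each word to the pages containing it, along with
    how many times the word appears on each page.
    """
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, word):
    """Return pages containing the word, highest word count first."""
    counts = index.get(word.lower(), {})
    return sorted(counts, key=lambda url: (-counts[url], url))


# Invented sample pages for illustration only.
pages = {
    "example.org/a": "weather forecast and weather maps",
    "example.org/b": "library catalog",
    "example.org/c": "local weather",
}
index = build_index(pages)
print(search(index, "weather"))
```

Notice that the ranking is produced entirely by arithmetic on word counts; no person has looked at the pages, which is the point made in the paragraph above.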
Because the databases compiled by search engines can have words from millions of web sites, it is important to plan a search strategy when using search engines. For help in preparing a search strategy, read Recommended Search Strategy: Analyze your topic & Search with peripheral vision, prepared by the UC Berkeley Library. A worksheet to help in the analysis, Analyze Your Topic, is also provided at this site.
Taking a few extra minutes to think about a search strategy will actually save time and provide better results.
Although many of the search engines have indexed millions of web sites, there is no one search engine or directory which has indexed everything on the web. Also, there is always a lapse in time between what is published on the Web and what is indexed in a search engine; there is no instant indexing. Because search engine software differs, each search engine will give a different view of the Web. It is usually a good idea to try the search strategy with more than one engine.
Because each search engine is unique, it is a good idea to always check the "Hints", "Help", "Tips" etc. links before performing a search. Also, like everything else about the Web, search engines change.
A search engine searches by key word rather than by subject. You should use a search engine when you are looking for an obscure topic or a specific site, when you need to retrieve many, many web sites, when you want to search by words in a domain or URL, or when you want to retrieve by popularity ranking.
The University of California at Berkeley Library sponsors an excellent tutorial and table of features for recommended search engines: http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/SearchEngines.html
If the search retrieves too few results, here are some troubleshooting ideas:
Double check your search strategy to see if it is correct for the particular search engine used.
Check your spelling.
Try more synonyms or variants or truncation.
Try the search with a different search engine.
If the search retrieves too many results, here are some troubleshooting ideas:
Be more specific in choice of words.
Add more concepts to the search.
If the search engine allows, limit the search by language or top-level domain.
Try reorganizing your words to do a phrase search.
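The troubleshooting ideas above can be illustrated with a short Python sketch. The page texts are invented, and the function names matches and search are hypothetical; real search engines have their own rules, so always check the engine's "Help" page. The sketch shows how adding a concept or using a phrase narrows a search, while offering synonyms (any of several words) broadens it.

```python
def matches(text, all_of=(), any_of=(), phrase=None):
    """Decide whether a page's text satisfies the search (hypothetical helper).

    all_of: every word must appear (adding concepts narrows the search)
    any_of: at least one word must appear (synonyms broaden the search)
    phrase: the exact wording must appear (a simple substring check)
    """
    lowered = text.lower()
    words = set(lowered.split())
    if any(term.lower() not in words for term in all_of):
        return False
    if any_of and not any(term.lower() in words for term in any_of):
        return False
    if phrase is not None and phrase.lower() not in lowered:
        return False
    return True


def search(pages, **criteria):
    """Return the pages whose text satisfies the criteria."""
    return [page for page in pages if matches(page, **criteria)]


# Invented sample page texts for illustration only.
pages = [
    "solar energy costs for home owners",
    "history of solar eclipses",
    "home energy audits and solar panels",
    "energy drinks and health",
]

print(len(search(pages, all_of=["solar"])))       # one word: many results
print(search(pages, all_of=["solar", "energy"]))  # added concept: fewer
print(search(pages, phrase="solar energy"))       # phrase search: fewest
```

With these sample pages, each step returns a shorter list than the one before it, which is the effect you want when a search retrieves too many results.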
WHEN YOU HAVE SOMETHING VERY SPECIFIC TO FIND, USE A SEARCH ENGINE AND PLAN A VERY PRECISE SEARCH STRATEGY.