Introduction to the Internet

—By John H. Krick, Senior Analyst

In This Report:

Datapro Summary

What Is the Internet?

From Dialup to Dedicated

Internet Providers

Types of Connection

Internet Applications

E-Mail

FTP--the File Transfer Protocol

Archie

Telnet

Gopher

Network News

World Wide Web

References


Datapro Summary

The Internet, the global network of networks, is the subject of this overview. A hot topic of discussion in all the media, the Internet has been transforming itself over the past several years from a government-funded research and education network into a general-purpose communications tool that will offer new opportunities for all kinds of businesses. The number of commercial users of the Internet has been growing by as much as 10% per month. Details on Internet providers, types of connections, and Internet applications are supplied in this report, as well as a short bibliography of valuable reference books about the Internet.

What Is the Internet?

There are probably very few technologically aware people around today who don't have some notion of what the Internet is--a network of networks, a communications tool that is making Marshall McLuhan's vision of a "Global Village" more of a reality than even McLuhan might have envisioned. It is the stuff of techno-myth, and everyone wants a piece of the action. The Internet could be the prototype of some larger network entity of the future, called the "Information Superhighway," that will deliver education and health care services, and shopping and entertainment too, into our living rooms. Not everyone views such an evolution of the network as a desirable thing, and the growth of the Information Superhighway is certain to be controversial. However long and difficult that growth proves to be, one thing is certain--business opportunities along the electronic freeway are available now and will grow in number, diversity, and profitability apace with the network itself.

From Dialup to Dedicated

In February of 1993, I began experimenting with dialup connection of a standalone PC to the Internet. What I found out about setting up such a connection, the applications available to search the Internet, and what I found on the net using those applications are the subjects of this report.

My standalone PC and its dialup SLIP connection to the Internet have subsequently grown into a full-time dedicated T1 connection that supports over 300 users, including Datapro's worldwide research and analysis staff, the technical personnel that maintain our production network and database, and our sales and marketing departments.

Internet Providers

There are several types of Internet providers--service companies that deliver Internet attachment to individuals and organizations. Some resemble the traditional dialup bulletin board, and many of the smallest providers have roots in that realm, having added Internet attachment as an afterthought. Other small-scale providers started in business for the express purpose of providing Internet connectivity. These sorts of providers may not be able to provide robust enough, or dependable enough, connectivity for any but the smallest of businesses.

On a somewhat larger scale, the regional providers whose high-speed backbone plant once made up the major portion of the entire Internet, connecting universities and research centers into giant subnets, are now willing and able to take on commercial customers. They have the depth of expertise and the physical infrastructure to do it.

Finally, at the high end of the spectrum are the giant commercial providers, which include long distance companies, Regional Bell Operating Companies (RBOCs), and computer industry giants. Not all of these entities have established clear Internet service offerings; some of the RBOCs, for example, have not. The largest of them, Advanced Network and Services (ANS), the result of a partnering of IBM and MCI, has quickly become the provider of choice for Business Week 1000 corporations. ANS now owns substantial portions of the Internet backbone, implemented over 45M bps T3 lines.

Types of Connection

Internet connections fall into two large categories--dialup connections and permanent, dedicated connections. The first usually means relatively slow speed and necessitates attachment to a single workstation, not a local area network. While a dialup account accessible by many users through a modem pool is possible, it does not represent a practical solution to providing Internet connectivity.

Dialup connections to the Internet themselves come in two forms--online dialup and SLIP or PPP connections. SLIP stands for Serial Line Internet Protocol; PPP for Point-to-Point Protocol.

Online dialup connections can be the cheapest way to establish Internet connectivity. Some providers charge as little as ten dollars a month for Internet e-mail and the ability to read USENET news. These kinds of accounts, however, restrict the user to the software that the provider makes available. In many cases, that may be somewhat less than user-friendly UNIX software like nn, the UNIX news reader program. On the other hand, newer online services like James Gleick's Pipeline service, based in Manhattan, provide a much more friendly and state-of-the-art interface to the net, enabling users to access all Internet services through a consistent interface.

SLIP or PPP connections, in contrast, allow the user extreme flexibility in the choice of software, but cost much more. They also require a degree of expertise in setting up and maintaining the applications that many users will not possess. Technical support offered by providers may not be sufficient to get the extreme novice over every hurdle involved here.

Permanent, high-speed lines, such as 56K bps switched or 1.5M bps dedicated T1 connections, are the ultimate form of attachment to the Internet; which of them is appropriate depends on the volume of traffic anticipated. While expensive, they offer the only reasonable method for businesses that require Internet access for more than a few users. They are a necessity for businesses that expect to establish an Internet point-of-presence for access by customers.
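To put the line speeds in perspective, the following is a rough sketch of the arithmetic, not a measurement. It uses the 56K bps, 1.5M bps, and 45M bps rates quoted in this report; the 14.4K bps dialup modem rate and the one-megabyte file size are assumptions added for comparison, and protocol overhead is ignored.

```python
# Line rates in bits per second. The last three are quoted in this report;
# the dialup modem rate is an assumption added for comparison.
LINE_RATES_BPS = {
    "14.4K bps dialup modem (assumed)": 14_400,
    "56K bps switched": 56_000,
    "1.5M bps T1": 1_500_000,
    "45M bps T3": 45_000_000,
}


def transfer_seconds(size_bytes, rate_bps):
    """Ideal time to move size_bytes over a line of rate_bps, ignoring overhead."""
    return size_bytes * 8 / rate_bps


if __name__ == "__main__":
    one_megabyte = 1_000_000  # assumed file size, in bytes
    for name, rate in LINE_RATES_BPS.items():
        print(f"{name}: {transfer_seconds(one_megabyte, rate):.1f} seconds")
```

Real throughput will be lower than these ideal figures, since the line is shared among users and every protocol layer adds overhead.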

Delivering Internet services over a LAN will also mean running a TCP/IP stack side by side with whatever protocol stack your network operating system supplies. For Novell NetWare, this will mean user workstations will need to load IPX and IP stacks, generally at boot up. If your network is already supplying some large-scale client/server applications such as a relational database, your choice of Internet applications software could be somewhat circumscribed by the TCP/IP stack already in place.

Internet Applications

The number and variety of Internet applications are growing in step with the network itself. Interestingly enough, some new applications seem to subsume older applications as well, as in the World Wide Web browsers, which allow users to perform all Internet functions from a single interface.

E-Mail

Certainly the most popular Internet application is electronic mail and it's easy to see why. Internet e-mail allows users to keep in touch with associates throughout the world. For business, this means an unparalleled degree of communication with staff in the field, with suppliers, and most important of all, with customers and potential customers.

Internet e-mail addressing is by now at least somewhat familiar to most people who stay abreast of industry developments. The ubiquitous "AT" sign, @, that separates the user name from the organization name, jumps out at the viewer, making the character string immediately recognizable as an Internet address. For many users, having an Internet e-mail address printed on their business card has become a kind of status symbol.

The format of an Internet e-mail address follows the form user@organization.domain, where the organization name could be the name of a company, of a government agency, or of an educational institution. Larger organizations often add names of particular servers or smaller subdivisions, or departmental names to this part of the address. Also, for networks outside the U.S., two-letter country designators are appended on the end of the address, so that, for example, ".uk" means the United Kingdom, while ".jp" indicates Japan.
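The address format can be illustrated with a short Python sketch. This is only an approximation of the user@organization.domain form described above, not a full address parser; the function name and the sample addresses are hypothetical, and treating any two-letter final label as a country designator is a simplifying assumption.

```python
def split_address(address):
    """Split a user@organization.domain address into its parts."""
    user, _, host = address.partition("@")
    if not user or not host or "." not in host:
        raise ValueError(f"not an Internet e-mail address: {address!r}")
    labels = host.lower().split(".")
    # Assumption: a two-letter final label is a country designator (uk, jp, ...).
    country = labels[-1] if len(labels[-1]) == 2 else None
    return {"user": user, "host_labels": labels, "country": country}


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    print(split_address("jsmith@research.example.com"))
    print(split_address("jsmith@example.co.uk"))
```

The list of host labels shows how larger organizations prepend server or departmental names to the organization name.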

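Domain types fall into a few different categories:

com--commercial organizations

edu--educational institutions

gov--government agencies

mil--military organizations

net--network providers

org--other organizations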

Perhaps the easiest way to connect LAN users to Internet e-mail is through a gateway to an existing LAN e-mail package. Most providers of LAN e-mail software are able to supply--at additional cost, of course--software that will allow users to address messages to Internet e-mail addresses. It should also be noted that users of any of the large commercial online services, such as CompuServe, America Online, Prodigy, AT&T EasyLink, and GEnie, are able to send and receive Internet e-mail.

FTP--the File Transfer Protocol

The Internet File Transfer Protocol (FTP) allows users to access files from servers located all over the Internet. There are over 8000 of these servers and most allow access to any Internet user who supplies the user name "anonymous." After that, when prompted for a password, users enter their Internet e-mail address and are granted access. Generally, only a "public" directory ("/pub") will be open to anonymous users.
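The anonymous login sequence can be sketched with Python's standard ftplib module. This is a minimal illustration under stated assumptions, not a complete client: the function names are hypothetical, the host and e-mail address are supplied by the caller, and the server is assumed to expose a "/pub" directory as described above.

```python
from ftplib import FTP


def list_public_files(ftp, email):
    """Log in anonymously on a connected FTP session and list /pub."""
    # User name "anonymous"; the e-mail address serves as the password.
    ftp.login(user="anonymous", passwd=email)
    ftp.cwd("/pub")
    return ftp.nlst()


def fetch_listing(host, email):
    """Connect to an anonymous FTP server and return its /pub listing."""
    with FTP(host) as ftp:
        return list_public_files(ftp, email)
```

Calling fetch_listing with the name of an anonymous FTP server and your own e-mail address would return the file names in that server's public directory.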

The files available in the "/pub" directory on FTP servers may be text files or executable binaries. Large files, that is over 50K or so, will very likely be in one or another compressed format. For UNIX, these will be .Z or .tar files. MS-DOS files will most often be compressed using the zip utility and will carry the extension .zip. These files can be decompressed using the PKUNZIP utility available from most Internet FTP servers. Compressed Macintosh files are identified by the .hqx suffix.
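As a rough aid, the suffixes mentioned above can be collected into a small lookup table. The Python sketch below covers only the four suffixes named in the text; the function name and the sample file names are hypothetical, and many other formats exist that it does not recognize.

```python
import os

# Suffixes named in the text, with the format each one signals.
SUFFIX_FORMATS = {
    ".Z": "UNIX compressed file",
    ".tar": "UNIX tar archive",
    ".zip": "MS-DOS zip archive (unpack with PKUNZIP)",
    ".hqx": "Macintosh file",
}


def describe_suffix(filename):
    """Return the format suggested by a filename's suffix, or None."""
    suffix = os.path.splitext(filename)[1]
    # Try the exact case first (.Z is conventionally upper case on UNIX),
    # then fall back to lower case for names such as GAME.ZIP.
    return SUFFIX_FORMATS.get(suffix) or SUFFIX_FORMATS.get(suffix.lower())


if __name__ == "__main__":
    for name in ("gopher.tar.Z", "GAME.ZIP", "README"):
        print(name, "->", describe_suffix(name))
```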

FTP has approximately twenty commands, but only a few of them are used very often. Users can change directories with the cd command, but finding your position within the directory tree must be accomplished by using the pwd command (print working directory). Unlike the dual-purpose MS-DOS cd command, cd without an argument on an FTP server, usually a UNIX machine, gives an error--not the current directory path. Detailed directory listings can be obtained using the dir or ls -l commands. The ls command without the -l switch will return just a list of filenames.

Once seated in the proper directory, users can initiate file transfers using the get command. The format of the command is get <filename>. Multiple files can be retrieved using the mget command in combination with the asterisk character as a "wildcard." For example, the command mget *.txt would initiate the transfer of all files in the current directory that carry the .txt extension. By default, mget prompts the user before each transfer to ask if it's okay to transfer the next file in the queue. Users can turn off this y/n prompting using the prompt command, which acts as a toggle. Issuing prompt once turns off prompting; issuing it again turns it back on.
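The behavior of mget and prompt can be modeled in a few lines of Python. This is a toy model for illustration, not an FTP client: the class name, the confirm callback, and the sample file names are all hypothetical, and no files are actually transferred.

```python
from fnmatch import fnmatchcase


class MgetSession:
    """Toy model of wildcard selection and the prompt toggle."""

    def __init__(self, filenames):
        self.filenames = list(filenames)
        self.prompting = True  # mget asks before each file by default

    def prompt(self):
        """Toggle prompting, as the FTP prompt command does."""
        self.prompting = not self.prompting
        return self.prompting

    def mget(self, pattern, confirm=lambda name: True):
        """Return the files that would be transferred for the pattern."""
        matches = [f for f in self.filenames if fnmatchcase(f, pattern)]
        if self.prompting:
            # Ask about each file in turn; keep only those confirmed.
            matches = [f for f in matches if confirm(f)]
        return matches


if __name__ == "__main__":
    session = MgetSession(["a.txt", "b.zip", "notes.txt"])
    print(session.mget("*.txt"))  # prompting on, every file confirmed
    session.prompt()              # prompting now off
    print(session.mget("*.txt", confirm=lambda name: False))
```

With prompting turned off, the confirm callback is never consulted, which mirrors an unattended transfer of every matching file.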

Archie

Suppose I have an idea there is an existing file, be it a text file or an executable program, somewhere on an FTP server that fits a certain requirement I have. I may not know the exact filename, and I certainly don't know which of the several thousand FTP servers in the world it may reside on. How do I find it? I can use Archie, the FTP search tool, to tell me the exact file name, its address on a server somewhere on the net, and the exact directory path to access that file.

The name Archie is a corruption of the word archive. Several Archie servers are located around the world. Each of these servers has exactly the same content--directory listings of all of the FTP servers on the net, updated monthly when each Archie server polls every FTP server. Users can access Archie servers and perform a search for any string they feel might make up a portion of the filename they are seeking.

Archie's command set is small. The most important command, find, does most of the work. Find was once called prog for esoteric, UNIX-derived reasons, and while some documentation for Archie may still reference the prog command, it is entirely synonymous with find. While the Archie command set is compact, there are also several variables, some of them simple on/off switches, that are controlled with the "set" and "unset" commands to provide exactly the sort of output you want from Archie. For example, "set maxhits" specifies the maximum number of search hits Archie will return. "Set pager" returns Archie's output a single screen at a time, and "unset pager" returns to the default scrolling output.
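The idea behind find and maxhits can be sketched in Python as a substring search over a table of directory listings. This is a simplified model, not Archie's actual implementation: the index layout, the server names, the case-insensitive matching, and the default limit of 100 hits are all assumptions made for illustration.

```python
def find(index, substring, maxhits=100):
    """Return up to maxhits (server, path) pairs whose filename contains substring."""
    hits = []
    needle = substring.lower()
    for server, paths in index.items():
        for path in paths:
            # Match against the filename only, not the directory portion.
            filename = path.rsplit("/", 1)[-1]
            if needle in filename.lower():
                hits.append((server, path))
                if len(hits) >= maxhits:
                    return hits
    return hits


if __name__ == "__main__":
    # Hypothetical listings gathered from two FTP servers.
    index = {
        "ftp.example.edu": ["/pub/tools/pkunzip.exe", "/pub/docs/readme.txt"],
        "ftp.example.org": ["/pub/msdos/unzip.zip"],
    }
    for server, path in find(index, "unzip", maxhits=10):
        print(server, path)
```

Each hit pairs a server address with the full directory path, which is the information needed to retrieve the file with FTP.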

Archie's functionality is severely limited; no logical operators like "AND" and "OR" are provided to enable the construction of complex search strings. Worse than that, Archie servers have been hard pressed to keep pace with the exploding growth of the Internet. Often, it is simply impossible to attach to an Archie server during U.S. business hours.

Telnet

Now that I want to run an Archie search, I can access and take control of an Archie server using Telnet. Telnet may be the oldest Internet application. It provides basic terminal emulation and allows users to take control of applications that reside on remote systems. These systems are generally UNIX systems, but are not so necessarily. The Telnet command is issued in the form telnet <destination address>, where the "destination address" argument is the name of some application and the server it resides on. For example, telnet archie.rutgers.edu attaches the user to the Archie server at Rutgers University in New Jersey.
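What the Telnet command does first can be sketched in Python: it resolves the destination address and opens a TCP connection, by convention to port 23. The sketch below is illustrative only; the function names are hypothetical, the optional port argument is an addition not described in the text, and a real Telnet client goes on to negotiate terminal options, which is omitted here.

```python
import socket

TELNET_PORT = 23  # the well-known TCP port for Telnet


def parse_telnet_command(command):
    """Turn 'telnet host [port]' into a (host, port) pair."""
    words = command.split()
    if len(words) not in (2, 3) or words[0] != "telnet":
        raise ValueError(f"expected 'telnet <destination address>': {command!r}")
    port = int(words[2]) if len(words) == 3 else TELNET_PORT
    return words[1], port


def open_session(command, timeout=10.0):
    """Open the TCP connection that a Telnet client would start from."""
    return socket.create_connection(parse_telnet_command(command), timeout=timeout)


if __name__ == "__main__":
    print(parse_telnet_command("telnet archie.rutgers.edu"))
```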

Gopher

I could also use Telnet to access a Gopher server. The Internet Gopher was developed at the University of Minnesota. Minnesota's school mascot and the namesake for its sports teams is the gopher, hence the name. The habits of the gopher have lent themselves to an interesting interpretation of the name as well. It is said that gopher menus allow the user to easily tunnel through the Internet. Be that as it may, the Gopher has become the de facto standard among menu-driven interfaces to the Internet.

In the late 1980s, several universities interested in providing their student bodies, faculties, and other staff with access to all of the computer resources on their campuses, as well as on the Internet, developed client/server menu systems for their networks. The Minnesota Gopher proved to be the most popular of these, and today there are nearly 1,800 Gopher servers listed on the Minnesota "All the Gopher Servers in the World" list.

The "All the Gopher Servers in the World" list, beyond cataloging most of the extant Gophers in the world, also provides a dramatic demonstration of the capabilities of this tool. Gopher is an example of true client/server computing. Clicking on a menu item in the "All the Gopher Servers in the World" list connects you immediately to that server, not just to a directory of what is available on that server, but to the server itself. For example, if I were to point to and click on a menu line for the Chinese University of Hong Kong, I would, within a split second, be browsing the contents of that server.
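A menu item can take the user straight to another server because, in the Gopher protocol as specified in RFC 1436, each menu line carries the host name and port of the server that holds the item. The Python sketch below parses one such line; it is a minimal illustration, and the class name, function name, and sample line are hypothetical.

```python
from typing import NamedTuple


class MenuItem(NamedTuple):
    item_type: str  # one character, e.g. "0" for a file, "1" for a menu
    title: str      # the text shown to the user
    selector: str   # the string sent to the server to fetch the item
    host: str       # the server that holds the item
    port: int       # the TCP port on that server (70 by convention)


def parse_menu_line(line):
    """Parse one tab-separated Gopher menu line into a MenuItem."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 4 or not fields[0]:
        raise ValueError(f"not a Gopher menu line: {line!r}")
    return MenuItem(fields[0][0], fields[0][1:], fields[1], fields[2], int(fields[3]))


if __name__ == "__main__":
    # Hypothetical menu line, for illustration only.
    print(parse_menu_line("1Campus Information\t/campus\tgopher.example.edu\t70\r\n"))
```

Selecting the item amounts to opening a connection to the listed host and port and sending the selector string.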

And what might those contents be? Since the largest percentage of Gopher servers are maintained by universities, there will very likely be a great deal of campus-specific information, including phone directories, course catalogs, and listings of special events. Beyond that, however, there is also likely to be a large amount of information about the Internet itself, as well as hooks to such things as "Popular FTP Sites" and "Popular Gopher Destinations."

Network News

Network News, often called USENET News, is a large collection of special interest bulletin boards--on the order of 5,500 or so. Over 2,000 of these boards are devoted to particular geographic areas. For example, several boards exist for the New York City area. Location-specific newsgroups exist by now for nearly all major metropolitan areas in the Internet-connected world. Not every provider carries every metropolitan newsgroup; most smaller providers carry only newsgroups specific to their own location and those from the largest cities.

Most USENET newsgroups contain information of a more topical nature. Newsgroups are divided into several broad categories, and a usable ordering of content is achieved by following a hierarchical pattern. The "comp" newsgroup category, short for "computer," contains over 500 groups devoted to specific areas of computer technology. For example, the hardware category includes specific types of machines and the software category includes specific application types, as well as specific product offerings from major manufacturers. Other newsgroup categories include "biz," which is devoted to business and is, surprisingly, one of the least populous categories. "K12" includes topics of interest to primary and secondary educators. "Sci," for the hard sciences like physics, chemistry, biology and others, provides communication between researchers working in those fields. "Soc" provides forums for the discussion of social issues including topics such as culture, religion, and politics. "Rec," for recreation, provides newsgroups for the discussion of mainstream hobbies and sports. "Alt," for alternative, is the largest and most unrestrained newsgroup category. Topics of discussion among the alt newsgroups range from the amusing to the ridiculous. While there are a few serious alt newsgroups, there are not very many.

For those with more mundane matters to attend to, an example of a serious USENET newsgroup is "comp.dcom.lans.ethernet." In this group, as the group name implies, technical issues surrounding Ethernet LANs are discussed. Implementors of Ethernet LANs can post technical questions and have them answered by other Ethernet users around the world, usually within a few hours. Comparable newsgroups exist for other LAN technologies like token-ring (comp.dcom.lans.token-ring) and FDDI (comp.dcom.lans.fddi). Similarly, groups covering many aspects of computer technology exist. Some examples are:

A short list like this only scratches the surface of the rich lode of technical information available in computer-related newsgroups. As previously noted, there are more than 500 of them. This small collection of examples illustrates the key point about the division of the newsgroup hierarchy into more and more specific areas as one proceeds downward. Self-explanatory groupings like "os" for operating systems or "dcom" for data communications give way to even more focused areas of interest like Ethernet or animation.
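The downward narrowing of the hierarchy can be shown with a short Python sketch that lists every level of a newsgroup name, from the broad category to the group itself. The function name is hypothetical; the group names are those cited above.

```python
def hierarchy_levels(newsgroup):
    """List each level of a newsgroup name, from category down to the group."""
    parts = newsgroup.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


if __name__ == "__main__":
    for group in ("comp.dcom.lans.ethernet", "comp.dcom.lans.fddi"):
        # Indent each level to show how the topic narrows.
        for depth, level in enumerate(hierarchy_levels(group)):
            print("  " * depth + level)
```

Groups that share a prefix, such as comp.dcom.lans, sit side by side in the hierarchy, which is how a news reader can present related groups together.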

World Wide Web

The World Wide Web is, at this moment anyway, where the Internet is headed. WWW, as it is called, is able to deliver all of the Internet functionality we've already discussed as well as graphics, audio, and even, given a fat enough pipe and a properly equipped client platform, full motion video over the net.

World Wide Web servers have, almost overnight, become the most popular form of Internet "point of presence" for businesses. The Web "Home Page," the first thing a user sees when logging on to a Web server, can feature an attractive graphic that is integrated into the functionality of the Web. For example, Novell's Home Page features a row of NetWare documentation volumes with titles that represent each of the major subdirectories of the server. By clicking on the spine of any of these "books," the user is taken to further information about that subject. Within each subject, blue highlighted items within the text are further hypertext links that allow the user to "drill down" deeper in that particular area of interest.

World Wide Web servers are accessed using software called a browser. The most popular browser thus far is Mosaic from the National Center for Supercomputing Applications (NCSA) at the University of Illinois Urbana-Champaign campus. Mosaic is available in versions for Microsoft Windows, Macintosh, and the UNIX X-Windows system and provides a consistent interface across each of these platforms. Other Web browsers include Cello for the MS-Windows environment from the Legal Information Institute at Cornell University, as well as Samba for the Macintosh platform.


References

An Internet Bibliography

There has been a veritable explosion of Internet-related books from all major publishers in the past several months, but strangely enough some of the best are among those that have been around for a while.

Zen and the Art of the Internet, Brendan Kehoe, Prentice Hall, 3rd edition, 1994.

The Whole Internet, A User's Guide and Catalog, Ed Krol, O'Reilly & Associates, 2nd edition, 1994.

Connecting To The Internet, Susan Estrada, O'Reilly & Associates, 1993.

The Internet Complete Reference, Harley Hahn & Rick Stout, Osborne McGraw-Hill, 1994.

Doing Business On The Internet, Mary Cronin, Van Nostrand Reinhold, 1994.

Internet Mailing Lists, Edward T. L. Hardie & Vivian Neou, Prentice Hall, 2nd edition, 1994.

The Internet Directory, Eric Braun, Fawcett, 1994.

!%@:: A Directory of Electronic Mail Addressing & Networks, Donnalyn Frey & Rick Adams, O'Reilly & Associates, 1994.

The Matrix, John S. Quarterman, Digital Press, 1990.