An Overview of TCP/IP Protocols and the Internet
Gary C. Kessler
Hill Associates, Inc.
kumquat@hill.com
23 April 1999
This paper was
originally submitted to the InterNIC, and posted on their Gopher
site, on 5 August 1994. This document is an updated version of
that paper.
An increasing number
of people are using the Internet and, many for the first time, are
using the tools and utilities that at one time were only available
on a limited number of computer systems (and only for really
intense users!). One sign of this growth in use has been the
significant number of TCP/IP and Internet books, articles,
courses, and even TV shows that have become available in the last
several years; there are so many such books that publishers are
reluctant to authorize more because bookstores have reached their
limit of shelf space! This memo provides a broad overview of the
Internet and TCP/IP, with an emphasis on history, terms, and
concepts. It is meant as a brief guide and starting point,
referring to many other sources for more detailed information.
2. What are TCP/IP and the Internet?
While the TCP/IP protocols and
the Internet are different, their histories are most
definitely intertwingled! This section will discuss some of the
history. For additional information and insight, readers are urged
to read two excellent histories of the Internet: Casting The
Net: From ARPANET to INTERNET and beyond... by Peter Salus
(Addison-Wesley, 1995) and Where Wizards Stay Up Late: The
Origins of the Internet by Katie Hafner and Matthew Lyon (Simon
& Schuster, 1997).
2.1. The Evolution of TCP/IP (and the Internet)
Prior to the 1960s, what little
computer communication existed comprised simple text and binary
data, carried by the most common telecommunications network
technology of the day; namely, circuit switching, the technology
of the telephone networks for nearly a hundred years. Because most
data traffic is bursty in nature (i.e., most of the transmissions
occur during a very short period of time), circuit switching
results in highly inefficient use of network resources. In 1962,
Paul Baran, of the Rand Corporation, described a robust,
efficient, store-and-forward data network in a report for the U.S.
Air Force; Donald Davies suggested a similar idea in independent
work for the Postal Service in the U.K., and coined the term packet
for the data units that would be carried. According to Baran and
Davies, packet switching networks could be designed so that all
components operated independently, eliminating single
point-of-failure problems. In addition, network communication
resources appear to be dedicated to individual users but, in fact,
statistical multiplexing and an upper limit on the size of a
transmitted entity result in fast, economical data networks.
The modern Internet began as a
U.S. Department of Defense (DoD) funded experiment to interconnect
DoD-funded research sites in the U.S. In December 1968, the
Advanced Research Projects Agency (ARPA) awarded a contract to
design and deploy a packet switching network to Bolt Beranek and
Newman (BBN). In September 1969, the first node of the ARPANET was
installed at UCLA. With four nodes by the end of 1969, the ARPANET
spanned the continental U.S. by 1971 and had connections to Europe
by 1973.
The original ARPANET gave life to
a number of protocols that were new to packet switching. One of
the most lasting results of the ARPANET was the development of a
user-network protocol that has become the standard interface
between users and packet switched networks; namely, ITU-T
(formerly CCITT) Recommendation X.25. This "standard"
interface encouraged BBN to start Telenet, a commercial
packet-switched data service, in 1974; after much renaming,
Telenet is now a part of Sprint's X.25 service.
The initial host-to-host
communications protocol introduced in the ARPANET was called the
Network Control Protocol (NCP). Over time, however, NCP proved to
be incapable of keeping up with the growing network traffic load.
In 1974, a new, more robust suite of communications protocols was
proposed and implemented throughout the ARPANET, based upon the
Transmission Control Protocol (TCP) and Internet Protocol (IP).
TCP and IP were originally envisioned functionally as a single
protocol, thus the protocol suite, which actually refers to a
large collection of protocols and applications, is usually
referred to simply as TCP/IP. The original versions of both
TCP and IP that are in common use today were written in September
1981, although both have had several modifications applied to them
(in addition, the IP version 6, or IPv6, specification was
released in December 1995). In 1983, the DoD mandated that all of
their computer systems would use the TCP/IP protocol suite for
long-haul communications, further enhancing the scope and
importance of the ARPANET.
In 1983, the ARPANET was split
into two components. One component, still called ARPANET, was used
to interconnect research/development and academic sites; the
other, called MILNET, was used to carry military traffic and
became part of the Defense Data Network. That year also saw a huge
boost in the popularity of TCP/IP with its inclusion in the
communications kernel for the University of California's UNIX
implementation, 4.2BSD (Berkeley Software Distribution) UNIX.
In 1986, the National Science
Foundation (NSF) built a backbone network to interconnect four
NSF-funded regional supercomputer centers and the National Center
for Atmospheric Research (NCAR). This network, dubbed the NSFNET,
was originally intended as a backbone for other networks, not as
an interconnection mechanism for individual systems. Furthermore,
the "Acceptable Use Policy" defined by the NSF limited
traffic to non-commercial use. The NSFNET continued to grow and
provide connectivity between both NSF-funded and non-NSF regional
networks, eventually becoming the backbone that we know today as
the Internet. Although early NSFNET applications were largely
multiprotocol in nature, TCP/IP was employed for interconnectivity
(with the ultimate goal of migration to Open Systems
Interconnection).
The NSFNET originally comprised
56-kbps links and was completely upgraded to T1 (1.544 Mbps) links
in 1989. Migration to a "professionally-managed" network
was supervised by a consortium comprising Merit (a Michigan state
regional network headquartered at the University of Michigan),
IBM, and MCI. Advanced Network & Services, Inc. (ANS), a
non-profit company formed by IBM and MCI, was responsible for
managing the NSFNET and supervising the transition of the NSFNET
backbone to T3 (44.736 Mbps) rates by the end of 1991. During this
period of time, the NSF also funded a number of regional Internet
service providers (ISPs) to provide local connection points for
educational institutions and NSF-funded sites.
In 1993, the NSF decided that it
did not want to be in the business of running and funding
networks, but wanted instead to go back to the funding of research
in the areas of supercomputing and high-speed communications. In
addition, there was increased pressure to commercialize the
Internet; in 1989, a trial gateway connected MCI, CompuServe, and
Internet mail services, and commercial users were now finding out
about all of the capabilities of the Internet that once belonged
exclusively to academic and hard-core users! In 1991, the Commercial
Internet Exchange (CIX) Association was formed by
General Atomics, Performance Systems International (PSI), and
UUNET Technologies to promote and provide a commercial Internet
backbone service. Nevertheless, there remained intense pressure
from non-NSF ISPs to open the network to all users.
In 1994, a plan was put in place
to reduce the NSF's role in the public Internet. The new structure
comprises three parts:
- Network Access Points (NAPs), where individual ISPs would
interconnect. Although the NSF is only funding four such NAPs
(Chicago, New York, San Francisco, and Washington, D.C.),
several non-NSF NAPs are also in operation.
- The very High Speed Backbone Network Service, a network
interconnecting the NAPs and NSF-funded centers, operated by
MCI. This network was installed in 1995 and operated at OC-3
(155.52 Mbps); it was completely upgraded to OC-12 (622.08
Mbps) in 1997.
- The Routing Arbiter, to ensure adequate routing protocols
for the Internet.
In addition, NSF-funded ISPs were
given five years of reduced funding to become commercially
self-sufficient. This funding ended by 1998.
In 1988, meanwhile, the DoD and
most of the U.S. Government chose to adopt OSI protocols. TCP/IP
was now viewed as an interim, proprietary solution since it ran
only on limited hardware platforms and OSI products were only a
couple of years away. The DoD mandated that all computer
communications products would have to use OSI protocols by August
1990 and use of TCP/IP would be phased out. Subsequently, the U.S.
Government OSI Profile (GOSIP) defined the set of protocols that
would have to be supported by products sold to the federal
government and TCP/IP was not included.
Despite this mandate, development
of TCP/IP continued during the late 1980s as the Internet grew.
TCP/IP development had always been carried out in an open
environment (although the size of this open community was small
due to the small number of ARPA/NSF sites), based upon the creed
"We reject kings, presidents, and voting. We believe in rough
consensus and running code" [Dave Clark, M.I.T.]. OSI
products were still a couple of years away while TCP/IP became, in
the minds of many, the real open systems interconnection protocol
suite.
It is not the purpose of this
memo to take a position in the OSI vs. TCP/IP debate.
Nevertheless, a number of observations are in order. First, the
ISO Development Environment (ISODE) was developed in 1990 to
provide an approach for OSI migration for the DoD. ISODE software
allows OSI applications to operate over TCP/IP. During this same
period, the Internet and OSI communities started to work together
to bring about the best of both worlds as many TCP and IP features
started to migrate into OSI protocols, particularly the OSI
Transport Protocol class 4 (TP4) and the Connectionless Network
Layer Protocol (CLNP), respectively. Finally, a report from the
National Institute of Standards and Technology (NIST) in 1994
suggested that GOSIP should incorporate TCP/IP and drop the "OSI-only"
requirement. [NOTE: Some industry observers have pointed
out that OSI represents the ultimate example of a sliding
window; OSI protocols have been "two years away"
since about 1986.]
2.2. Internet Growth
The ARPANET started with four
nodes in 1969 and grew to just under 600 nodes before it was split
in 1983. The NSFNET also started with a modest number of sites in
1986. Since then, the network has experienced
exponential growth. Internet growth between 1981 and 1991 is
documented in "Internet Growth (1981-1991)" (RFC
1296).
Network Wizards distributes a semi-annual Internet Domain
Survey. According to that survey, the Internet had
nearly 30 million reachable hosts by January 1998. The Internet is
growing at a rate of about a new network attachment every
half-hour, interconnecting more than 200,000 networks. It is
estimated that the Internet is doubling in size every ten to
twelve months, and has been for the last several years.
And what of the original ARPANET?
It grew smaller and smaller during the late 1980s as sites and
traffic moved to the Internet, and was decommissioned in July
1990. Cerf & Kahn ("Selected ARPANET Maps," Computer
Communications Review, October 1990) re-printed a number of
network maps documenting the growth (and demise) of the ARPANET.
2.3. Internet Administration
The Internet has no single owner,
yet everyone owns (a portion of) the Internet. The Internet has no
central operator, yet everyone operates (a portion of) the
Internet. The Internet has been compared to anarchy, but some
claim that it is not nearly that well organized!
Some central authority is
required for the Internet, however, to manage those things that
can only be managed centrally, such as addressing, naming,
protocol development, standardization, etc. Among the significant
Internet authorities are:
- The Internet Society (ISOC), chartered in 1992, is a
non-governmental international organization providing
coordination for the Internet, and its internetworking
technologies and applications. ISOC also provides oversight
and communications for the Internet Architecture Board.
- The Internet Architecture Board (IAB), formerly the Internet
Activities Board, governs administrative and technical
activities on the Internet.
- The Internet
Engineering Task Force (IETF) is one of the two
primary bodies of the IAB. The IETF's working groups have
primary responsibility for the technical activities of the
Internet, including writing specifications and protocols. The
impact of these specifications is significant enough that ISO
accredited the IETF as an international standards body at the
end of 1994. RFCs 2028 and 2031 describe the organizations
involved in the IETF standards process and the relationship
between the IETF and ISOC, respectively, while RFC 2418
describes the IETF working group guidelines
and procedures. The background and history of the IETF and the
Internet standards process can be found in "IETF—History,
Background, and Role in Today's Internet."
- The Internet
Engineering Steering Group (IESG) is the other body
of the IAB. The IESG provides direction to the IETF.
- The Internet Research Task Force (IRTF) comprises a number of
long-term research groups, promoting research of importance to
the evolution of the future Internet.
- The Internet
Engineering Planning Group (IEPG) coordinates
worldwide Internet operations. This group also assists
Internet Service Providers (ISPs) to interoperate within the
global Internet.
- The Forum of Incident Response and Security Teams (FIRST) is
the coordinator of a number of Computer Emergency Response
Teams (CERTs) representing many countries, governmental
agencies, and ISPs throughout the world. Internet network
security is greatly enhanced and facilitated by the FIRST
member organizations.
2.4. Domain Names (and Politics)
Although not directly related to
the administration of the Internet for operational purposes, the
assignment of Internet domain names is the subject of some
controversy and current activity. Internet hosts use a
hierarchical naming structure comprising a top-level domain (TLD),
domain and subdomain (optional), and host name. The IP address
space (and all TCP/IP-related numbers) has historically been
managed by the Internet Assigned Numbers Authority (IANA). Domain
names are assigned by the TLD naming authority; until April 1998,
the Internet Network Information Center (InterNIC) had overall
authority over these names, with NICs around the world handling
non-U.S. domains. The InterNIC was also responsible
for the overall coordination and management of the Domain Name
System (DNS), the distributed database that reconciles host names
and IP addresses on the Internet.
The InterNIC is an interesting
example of changes in the Internet. Starting in 1993, Network
Solutions, Inc. (NSI)
operated the InterNIC on behalf of the NSF and had exclusive
registration authority for the .com, .org, .net,
and .edu domains. NSI's contract ran out in April 1998 and
was extended several times while everyone tried to determine who
should pick up the registration for those domains. In October
1998, it was decided that NSI would remain the sole administrator
for those domains but that users could register names in those
domains with other firms. In addition, NSI's contract was extended
to September 2000, although the registration business has to be
opened to competition by June 1999.
Meanwhile, the newest body to
handle generic top-level domain (gTLD) registrations is the
Internet Corporation for Assigned Names and Numbers (ICANN).
Formed in October 1998, ICANN is the organization designated by
the U.S. National Telecommunications and Information
Administration (NTIA) to administer the DNS. Although still
surrounded by some controversy (which is well beyond the scope of
this paper!), ICANN has received wide industry support. ICANN
will form several
Support Organizations (SOs) to create policy for the
administration of its areas of responsibility, including domain
names (DNSO), IP addresses (ASO), and protocol parameter
assignments (PSO).
On April 21, 1999, ICANN
announced that five companies had been selected to be part of this
new competitive Shared Registry System for the .com, .net,
and .org domains.
Phase I of the competitive registrar
testbed program will run until June, 1999. At that time, the
Shared Registry System for the .com, .net, and .org domains will
be opened to all ICANN-accredited registrars. ICANN also announced
a list of 29 other applicant companies that had met its
accreditation standards and will be able to enter the market as
registrars after Phase I.
The domain name structure is best
understood if the name is read from right to left. Internet host
names end with a top-level domain name. World-wide generic
top-level domains include:
- .com: Commercial organizations (administered by the Shared
Registry)
- .edu: Educational institutions, although today usually limited
to 4-year colleges and universities (administered by the
InterNIC)
- .net: Network providers (administered by the InterNIC and the
Shared Registry)
- .org: Non-profit organizations (administered by the InterNIC
and the Shared Registry)
- .int: Organizations established by international treaty
- .gov: U.S. Federal government agencies (delegated to the U.S.
Federal Networking Council and administered by the InterNIC)
- .mil: U.S. military (managed by the U.S. Defense Data Network)
The host name entc.tamu.edu,
for example, is assigned to a computer in the Engineering
Technology and Industrial Distribution (ETID) Department at Texas
A&M University (tamu), within the educational top-level
domain (edu). The host name golem.hill.com refers to
a host (golem) in the Hill Associates domain (hill)
within the commercial top-level domain (com). Guidelines
for selecting host names are the subject of RFC 1178.
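The right-to-left reading described above can be sketched in a few lines of Python. This is only an illustration: the function name split_host_name is hypothetical, and the sketch assumes the simple dotted form used in the examples here rather than consulting the DNS.

```python
def split_host_name(name):
    """Return the labels of a host name, most significant (the TLD) first."""
    labels = name.lower().rstrip(".").split(".")
    labels.reverse()  # read right to left: TLD, domain, ..., host
    return labels


for name in ("entc.tamu.edu", "golem.hill.com"):
    tld, *rest = split_host_name(name)
    print(name, "-> TLD:", tld, "| remaining labels:", rest)
```

The first label returned is the top-level domain; the last one is the host itself.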
Other top-level domain names use
the two-letter country codes defined in ISO
standard 3166; munnari.oz.au, for example, is
the address of the Internet gateway to Australia and myo.inst.keio.ac.jp
is a host at the Science and Technology Department of Keio
University in Yokohama, Japan. Other ISO 3166-based domain country
codes are ca (Canada), de (Germany), es
(Spain), fr (France), gb (Great Britain) [NOTE:
For some historical reasons, the TLD .gb is rarely used;
the TLD .uk (United Kingdom) seems to be preferred although
UK is not an official ISO 3166 country code.], il (Israel),
ie (Ireland), jp (Japan), mx (Mexico), and us
(United States). It is important to note that there is not
necessarily any correlation between a country code and where a
host is actually physically located.
The Western Hemisphere, European,
and Asia-Pacific naming registries are managed by the American
Registry for Internet Numbers (ARIN), RIPE,
and Asia-Pacific
NIC (APNIC), respectively. These authorities, in turn,
delegate most of the country TLDs to national
registries (such as RNP in Brazil and NIC-Mexico),
which have ultimate authority to assign local domain names.
Different countries may organize
the country-based subdomains in any way that they want. Many
countries use a subdomain similar to the TLDs, so that .com.mx
and .edu.mx are the suffixes for commercial and educational
institutions in Mexico, and .co.uk and .ac.uk are
the suffixes for commercial and educational institutions in the
United Kingdom.
The us domain is largely
organized on the basis of geography or function. Geographical
names in the us name space use names of the form entity-name.city-telegraph-code.state-postal-code.us.
The domain name cnri.reston.va.us, for example, refers to
the Corporation for National Research Initiatives in Reston,
Virginia. Functional branches are also reserved within the name
space for schools (K12), community colleges (CC), technical
schools (TEC), state government agencies (STATE), councils of
governments (COG), libraries (LIB), museums (MUS), and several
other generic types of entities. Domain names in the state
government name space usually take the form department.state.state-postal-code.us
(e.g., the domain name dps.state.vt.us points to the
Vermont Department of Public Safety). The K12 name space can vary
widely, usually using the form school.school-district.k12.state-postal-code.us
(e.g., the domain ccs.cssd.k12.vt.us refers to the
Charlotte Central School in the Chittenden South School District
in Charlotte, Vermont.) More information about the us
domain may be found in RFC
1480.
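As a rough illustration of the us naming forms described above, the following Python sketch picks apart a name into its state code, its functional branch (if any), and the remaining labels. The function name describe_us_name and the lookup table are hypothetical, the table lists only the branches named in this section, and the sketch assumes that any other label in that position is a locality.

```python
# Functional branches named in the text (illustrative, not exhaustive).
US_FUNCTIONAL = {
    "k12": "school",
    "cc": "community college",
    "tec": "technical school",
    "state": "state government agency",
    "cog": "council of governments",
    "lib": "library",
    "mus": "museum",
}


def describe_us_name(name):
    """Break a us-domain name into state, branch or city, and entity labels."""
    labels = name.lower().rstrip(".").split(".")
    if len(labels) < 3 or labels[-1] != "us":
        raise ValueError("expected a name ending in <state-postal-code>.us")
    state, branch, entity = labels[-2], labels[-3], labels[:-3]
    if branch in US_FUNCTIONAL:
        return {"state": state, "kind": US_FUNCTIONAL[branch], "entity": entity}
    # Otherwise assume entity-name.city-telegraph-code.state-postal-code.us
    return {"state": state, "kind": "geographic", "city": branch, "entity": entity}


for name in ("cnri.reston.va.us", "dps.state.vt.us", "ccs.cssd.k12.vt.us"):
    print(name, "->", describe_us_name(name))
```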
The scheme of TLD assignment and
management has worked well for many years, but the pressures of
increased commercial activity, network size, and international use
have caused controversy about how names can be fairly assigned
without violating trademarks and conflicting claims to names. In
November 1996, an Internet International
Ad Hoc Committee (IAHC) was formed to resolve some of
these naming issues and to act as a focal point for the
international debate over a proposal to establish additional
global naming registries and generic Top Level Domains (gTLDs). In
February 1997, the IAHC proposed the creation of seven new gTLDs:
- .firm for businesses, or firms.
- .store for businesses offering goods to purchase.
- .web for entities emphasizing activities related to the WWW.
- .arts for entities emphasizing cultural and entertainment
activities.
- .rec for entities emphasizing recreation/entertainment activities.
- .info for entities providing information services.
- .nom for those wishing individual or personal nomenclature.
The IAHC also proposed that up to
28 new registrars be established to grant second-level domain
names under the new gTLDs, all of which would be shared among the
new registrars. Furthermore, the three existing gTLDs .com, .net,
and .org would also be shared upon conclusion of the NSF contract
in the U.S. in 1998.
The IAHC was dissolved in May
1997 with the publication of the Generic Top Level Domain
Memorandum of Understanding (gTLD-MoU) framework. The Council of
Registrars (CORE) is an operational body made up of all of the
Registrars established under the gTLD-MoU framework.
TCP/IP is most commonly
associated with the Unix operating system. While developed
separately, they have been historically tied, as mentioned above,
since 4.2BSD Unix started bundling TCP/IP protocols with the
operating system. Nevertheless, TCP/IP protocols are available for
all widely-used operating systems today and native TCP/IP support
is provided in OS/2, OS/400, and Windows 95/98/NT, as well as most
Unix variants.
Figure 1 shows the TCP/IP
protocol architecture; this diagram is by no means exhaustive, but
shows the major protocol and application components common to most
commercial TCP/IP software packages and their relationship.
             ---------------------------------------------------------  ------
APPLICATION  |Telnet|FTP|Gopher|SMTP|HTTP|BGP|Finger|POP|DNS|SNMP|RIP|  |Ping|
             |------+---+------+----+----+---+------+---+-+-+----+---|  |----+-----
TRANSPORT    |                    TCP                     |   UDP    |  |ICMP|OSPF|
             |--------------------------------------------+----------+--+----+----+----
INTERNET     |                                 IP                                 |ARP|
             |----------+-------+----+------+-------+------+------+-----+------+--+---|
NETWORK      | Ethernet | Token |FDDI| X.25 | Frame | SMDS | ISDN | ATM | SLIP | PPP  |
INTERFACE    |          | Ring  |    |      | Relay |      |      |     |      |      |
             --------------------------------------------------------------------------
FIGURE 1. Simplified TCP/IP protocol stack.
The sections below will provide a
brief overview of each of the layers in the TCP/IP suite and the
protocols that compose those layers. A large number of books and
papers have been written that describe all aspects of TCP/IP as a
protocol suite, including detailed information about use and
implementation of the protocols. Readers are referred to Internetworking
with TCP/IP, Vol. I: Principles, Protocols, and Architecture,
2/e, by D. Comer (Prentice-Hall, 1991), TCP/IP: Architecture,
Protocols, and Implementation with IPv6 and IP Security, 2nd.
ed. by S. Feit (McGraw-Hill, 1997), "TCP/IP Tutorial" by
T.J. Socolofsky and C.J. Kale (RFC
1180), and TCP/IP Illustrated, Volume I: The
Protocols by W.R. Stevens (Addison-Wesley, 1994).
3.1. The Network Interface Layer
The TCP/IP protocols have been
designed to operate over nearly any underlying local or wide area
network technology. Although certain accommodations may need to be
made, IP messages can be transported over all of the technologies
shown in the figure, as well as numerous others.
Two of the underlying interface
protocols are particularly relevant to TCP/IP. The Serial Line
Internet Protocol (SLIP, RFC 1055) and Point-to-Point Protocol
(PPP, RFC 1661) may be used to provide data link
layer protocol services where no other underlying data link
protocol may be in use, such as in leased line or dial-up
environments. Most commercial TCP/IP software packages for
PC-class systems include these two protocols. With SLIP or PPP, a
remote computer can attach directly to a host server and,
therefore, connect to the Internet using IP rather than being
limited to an asynchronous connection. PPP, in addition, provides
support for simultaneous multiple protocols over a single
connection (see the IANA
list of PPP protocols), security mechanisms, and
dynamic bandwidth allocation (e.g., when running over ISDN).
3.2. The Internet Layer
The Internet Protocol (RFC 791)
provides services that are roughly equivalent to
the OSI Network Layer. IP provides a datagram (connectionless)
transport service across the network. This service is sometimes
referred to as unreliable because the network neither
guarantees delivery nor notifies the end host system about packets
lost due to errors or network congestion. IP datagrams contain a
message, or one fragment of a message, that may be up to 65,535
bytes (octets) in length. IP does not provide a mechanism for flow
control.
                     1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |      TOS      |         Total Length          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Identification         |Flags|     Fragment Offset     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      TTL      |   Protocol    |        Header Checksum        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Destination Address                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options....        (Padding)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Data...
+-+-+-+-+-+-+-+-+-+-+-+-+-
FIGURE 2. IP packet (datagram) header format.
The basic IP packet header format
is shown in Figure 2. The format of the diagram is consistent with
the RFC; bits are numbered from left-to-right, starting at 0. Each
row represents a single 32-bit word; note that an IP header
will be at least 5 words (20 bytes) in length. The fields
contained in the header, and their functions, are:
- Version: Specifies the
IP version of the packet. The current version of IP is version
4, so this field will contain the binary value 0100.
[NOTE: Actually, many IP version numbers have been assigned
besides 4 and 6; see the IANA's
list of IP Version Numbers.]
- Internet Header Length (IHL):
Indicates the length of the datagram header in 32 bit (4
octet) words. A minimum-length header is 20 octets, so this
field always has a value of at least 5 (0101).
- Type of Service (TOS):
Allows an originating host to request different classes of
service for packets it transmits. Although not generally
supported today in IPv4, the TOS field can be set by the
originating host in response to service requests across the
Transport Layer/Internet Layer service interface, and can
specify a service priority (0-7) or can request that the route
be optimized for either cost, delay, throughput, or
reliability.
- Total Length: Indicates
the length (in bytes, or octets) of the entire packet,
including both header and data. Given the 16-bit size of this
field, the maximum size of an IP packet is 65,535 bytes (just
under 64 KB). In practice, packet sizes are limited to the maximum
transmission unit (MTU).
- Identification: Used
when a packet is fragmented into smaller pieces while
traversing the Internet, this identifier is assigned by the
transmitting host so that different fragments arriving at the
destination can be associated with each other for reassembly.
- Flags: Also used for
fragmentation and reassembly. The first bit is reserved (and
always set to 0). The second bit is the Don't Fragment (DF) bit,
which suppresses fragmentation. The third bit is the More
Fragments (MF) bit; it is set in every fragment except the last,
so that the receiver knows when the entire packet has arrived
and can be reassembled.
- Fragment Offset:
Indicates the position of this fragment in the original
packet. In the first fragment of a fragment stream, the offset
will be 0; in subsequent fragments, this field will indicate
the offset in increments of 8 bytes.
- Time-to-Live (TTL): A
value from 0 to 255, indicating the number of hops that this
packet is allowed to take before being discarded within the network.
Every router that sees this packet will decrement the TTL
value by one; if it gets to 0, the packet will be discarded.
- Protocol: Indicates the
higher layer protocol contents of the data carried in the
packet; options include ICMP (1), TCP (6), UDP (17), or OSPF
(89). A complete list of IP protocol numbers can be found at
the IANA's
list of Protocol Numbers.
- Header Checksum:
Carries information to ensure that the received IP header is
error-free. Remember that IP provides an unreliable
service and, therefore, this field only checks the header
rather than the entire packet.
- Source Address: IP
address of the host sending the packet.
- Destination Address: IP
address of the host intended to receive the packet.
- Options: A set of
options which may be applied to any given packet, such as
sender-specified source routing or security indication. The
option list may use up to 40 bytes (10 words), and will be
padded to a word boundary; IP options are taken from the IANA's
list of IP Option Numbers.
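The field layout listed above can be exercised with a short Python sketch that unpacks the fixed 20-byte portion of a header and verifies the header checksum. This is a minimal illustration, not a complete implementation: the function names and the sample packet are hypothetical, the flag bit positions follow RFC 791 (reserved bit, then DF, then MF), and the destination address 192.0.2.1 is simply a placeholder.

```python
import socket
import struct


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(packet):
    """Decode the header fields of an IPv4 packet into a dictionary."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    ihl = ver_ihl & 0x0F  # header length in 32-bit words
    header = packet[:ihl * 4]
    return {
        "version": ver_ihl >> 4,
        "ihl": ihl,
        "tos": tos,
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset_bytes": (flags_frag & 0x1FFF) * 8,
        "ttl": ttl,
        "protocol": protocol,
        "header_checksum": checksum,
        # A header that includes a correct checksum sums to zero.
        "checksum_ok": internet_checksum(header) == 0,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        "options": header[20:],
    }


# Hypothetical sample: no options, DF set, TTL 64, protocol 6 (TCP).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
                     socket.inet_aton("208.162.106.17"),
                     socket.inet_aton("192.0.2.1"))
sample = sample[:10] + struct.pack("!H", internet_checksum(sample)) + sample[12:]
fields = parse_ipv4_header(sample)
print(fields["version"], fields["ihl"], fields["ttl"], fields["checksum_ok"])
```

Note that the checksum covers only the header bytes, which is why the sketch slices the packet to IHL words before summing.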
3.2.1. IP Addresses
IP addresses are 32 bits in
length (Figure 3). They are typically written as a sequence of
four numbers, representing the decimal value of each of the
address bytes. Since the values are separated by periods, the
notation is referred to as dotted decimal. A sample IP
address is 208.162.106.17.
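The dotted decimal notation can be illustrated with a small Python sketch that converts between a 32-bit value and its four-byte textual form. The function names are hypothetical and chosen only for this example.

```python
def to_dotted_decimal(value):
    """Format a 32-bit integer as four decimal bytes separated by periods."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("IP addresses are 32 bits in length")
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def from_dotted_decimal(text):
    """Parse dotted decimal text back into a 32-bit integer."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected four values between 0 and 255")
    value = 0
    for p in parts:
        value = (value << 8) | p  # each byte occupies the next 8 bits
    return value


address = from_dotted_decimal("208.162.106.17")
print(hex(address), to_dotted_decimal(address))
```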
                             1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        --+-------------+------------------------------------------------
Class A |0|   NET_ID    |                    HOST_ID                    |
        |-+-+-----------+---------------+-------------------------------|
Class B |1|0|          NET_ID           |            HOST_ID            |
        |-+-+-+-------------------------+---------------+---------------|
Class C |1|1|0|                 NET_ID                  |    HOST_ID    |
        |-+-+-+-+---------------------------------------+---------------|
Class D |1|1|1|0|                     MULTICAST_ID                      |
        |-+-+-+-+-------------------------------------------------------|
Class E |1|1|1|1|                    EXPERIMENTAL_ID                    |
        --+-+-+-+--------------------------------------------------------
FIGURE 3. IP Address Format.
IP addresses are hierarchical for
routing purposes and are subdivided into two subfields. The
Network Identifier (NET_ID) subfield identifies the TCP/IP
subnetwork connected to the Internet. The NET_ID is used for
high-level routing between networks, much the same way as the
country code, city code, or area code is used in the telephone
network. The Host Identifier (HOST_ID) subfield indicates the
specific host within a subnetwork.
To accommodate different size
networks, IP defines several address classes. Classes A, B,
and C are used for host addressing and the only difference between
the classes is the length of the NET_ID subfield:
- A Class A address has a 7-bit
NET_ID and 24-bit HOST_ID. Class A addresses are intended for
very large networks and can address up to 16,777,214 (2^24 - 2)
hosts per network, since the all-zeros and all-ones HOST_IDs are
reserved. The first number of a Class A address will
be between 1 and 126. Relatively few Class A
addresses have been assigned; examples include 4.0.0.0 (BBN
Planet) and 9.0.0.0 (IBM).
- A Class B address has a 14-bit
NET_ID and 16-bit HOST_ID. Class B addresses are intended for
moderate-sized networks and can address up to 65,534 (2^16 - 2)
hosts per network. The first number of a Class B address will
be between 128 and 191. The Class B address space has
long been threatened with exhaustion and it has been
very difficult to get a new Class B address for some time.
Class B address assignment examples include 128.138.0.0 (WestNet)
and 152.163.0.0 (America Online).
- A Class C address has a 21-bit
NET_ID and 8-bit HOST_ID. These addresses are intended for
small networks and can address only up to 254 (2^8 - 2)
hosts per network. The first number of a Class C address will
be between 192 and 223. Most addresses assigned to
networks today are Class C (or sub-Class C!); examples include
208.162.102.0 (Hill Associates) and 209.198.87.0 (SoverNet,
Bellows Falls, VT).
The remaining two address classes
are used for special functions only and are not commonly assigned
to individual hosts. Class D addresses may begin with a value
between 224 and 239, and are used for IP multicasting (i.e.,
sending a single datagram to multiple hosts); the IANA maintains a
list of Internet
Multicast Addresses. Class E addresses begin with a
value between 240 and 255, and are reserved for experimental use.
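Because the class is determined entirely by the leading bits of the first address byte, the rules above can be checked mechanically. The following Python sketch is illustrative only; the function name classify is an assumption made for this example and is not part of any standard library.

```python
import ipaddress

def classify(dotted: str):
    """Return (class letter, NET_ID bits, HOST_ID bits) for a classful IPv4 address."""
    value = int(ipaddress.IPv4Address(dotted))   # the address as a 32-bit integer
    first = value >> 24                          # first byte of the address
    if first < 128:        # leading bit 0 (0 and 127 are reserved values in this range)
        return ("A", 7, 24)
    if first < 192:        # leading bits 10
        return ("B", 14, 16)
    if first < 224:        # leading bits 110
        return ("C", 21, 8)
    if first < 240:        # leading bits 1110
        return ("D", None, None)   # multicast: no NET_ID/HOST_ID split
    return ("E", None, None)       # leading bits 1111: experimental

print(classify("208.162.106.17"))   # ('C', 21, 8)
```

The sample address 208.162.106.17 falls in the Class C range because its first byte, 208, lies between 192 and 223.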
Several address values are
reserved and/or have special meaning. A HOST_ID of 0 (as used
above) is a dummy value reserved as a place holder when referring
to an entire subnetwork; the address 208.162.106.0, then, refers
to the Class C address with a NET_ID of 208.162.106. A HOST_ID of
all ones (usually written "255" when referring to an
all-ones byte, but also denoted as "-1") is a broadcast
address and refers to all hosts on a network. A NET_ID value of
127 is used for loopback testing and the specific host address
127.0.0.1 refers to the localhost.
Several NET_IDs have been
reserved in RFC
1918 for private network addresses and packets will not
be routed over the Internet to these networks. Reserved NET_IDs
are the Class A address 10.0.0.0 (formerly assigned to ARPANET),
the sixteen Class B addresses 172.16.0.0-172.31.0.0, and the 256
Class C addresses 192.168.0.0-192.168.255.0.
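A quick way to test whether an address falls in one of these private ranges is sketched below, using Python's standard ipaddress module. The three ranges are written in the "/n" prefix-length notation described with subnet masks later in this section; the helper name is_rfc1918 is an assumption for illustration.

```python
import ipaddress

# The three RFC 1918 blocks, written in prefix-length notation.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),       # one Class A network
    ipaddress.ip_network("172.16.0.0/12"),    # sixteen Class B networks
    ipaddress.ip_network("192.168.0.0/16"),   # 256 Class C networks
]

def is_rfc1918(dotted: str) -> bool:
    """True if the address lies inside any of the private blocks."""
    addr = ipaddress.ip_address(dotted)
    return any(addr in block for block in PRIVATE_BLOCKS)

print(is_rfc1918("172.31.255.254"))   # True
print(is_rfc1918("208.162.106.17"))   # False
```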
An additional addressing tool is
the subnet mask. Subnet masks are used to indicate the
portion of the address that identifies the network (and/or
subnetwork) for routing purposes. The subnet mask is written in
dotted decimal and the number of 1s indicates the significant
NET_ID bits. For "classful" IP addresses, the subnet
mask and number of significant address bits for the NET_ID are:
| Class | Subnet Mask   | Number of Bits |
| A     | 255.0.0.0     | 8              |
| B     | 255.255.0.0   | 16             |
| C     | 255.255.255.0 | 24             |
Depending upon the context and
literature, subnet masks may be written in dotted decimal form or
just as a number representing the number of significant address
bits for the NET_ID. Thus, 208.162.106.17 255.255.255.0
and 208.162.106.17/24 both refer to a Class C NET_ID of
208.162.106.
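The two notations are interchangeable, as the following Python sketch suggests: counting the 1 bits in the dotted decimal mask gives the prefix length, and a bitwise AND of address and mask leaves only the NET_ID bits. The function names are assumptions chosen for this example.

```python
import ipaddress

def mask_to_prefix(mask: str) -> int:
    """Count the 1 bits in a dotted-decimal subnet mask."""
    return bin(int(ipaddress.IPv4Address(mask))).count("1")

def net_id(address: str, mask: str) -> str:
    """AND the address with the mask to keep only the NET_ID bits."""
    masked = int(ipaddress.IPv4Address(address)) & int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(masked))

print(mask_to_prefix("255.255.255.0"))             # 24
print(net_id("208.162.106.17", "255.255.255.0"))   # 208.162.106.0
```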
Subnet masks can also be used to
subdivide a large address space or to combine multiple small
address spaces. In the first case, a network administrator may
subdivide an address space into multiple logical networks by
segmenting the
HOST_ID subfield into a Subnetwork Identifier (SUBNET_ID) and
(smaller) HOST_ID. For example, a user might be assigned the Class
B address space 172.16.0.0, which might be segmented into a 16-bit
NET_ID, 4-bit SUBNET_ID, and 12-bit HOST_ID. In this case, the
subnet mask for routing to the NET_ID on the Internet would be
255.255.0.0 (or "/16"), while the mask for routing to
individual subnets within the larger Class B address space would
be 255.255.240.0 (or "/20").
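The arithmetic of this example can be reproduced with Python's standard ipaddress module; this is only a sketch of the calculation, not a description of how any particular router is configured.

```python
import ipaddress

block = ipaddress.ip_network("172.16.0.0/16")    # the Class B space in the example
subnets = list(block.subnets(new_prefix=20))     # 4-bit SUBNET_ID -> 16 subnets

print(len(subnets))                   # 16
print(subnets[0], subnets[1])         # 172.16.0.0/20 172.16.16.0/20
print(subnets[0].netmask)             # 255.255.240.0
print(subnets[0].num_addresses - 2)   # 4094 HOST_IDs, excluding all-zeros and all-ones
```

A 4-bit SUBNET_ID yields 16 subnets, and the remaining 12-bit HOST_ID leaves room for 4,094 hosts on each.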
Alternatively, a single user
might be assigned the four Class C addresses 192.168.128.0,
192.168.129.0, 192.168.130.0, and 192.168.131.0, and use the
subnet mask 255.255.252.0 (or "/22") for routing to this
domain. This use of subnet masks in routing tables to consolidate
addresses uses a process called Classless Interdomain Routing (CIDR),
described in RFCs 1518
and 1519.
It should be obvious from this example that CIDR address
consolidation results in smaller router tables; in the example
here, routing information for four Class C addresses can be
specified in a single router table entry.
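The consolidation in this example can be verified with a short Python sketch using the standard ipaddress module, whose collapse_addresses function merges adjacent networks into the smallest covering set.

```python
import ipaddress

class_c_nets = [ipaddress.ip_network(n) for n in (
    "192.168.128.0/24", "192.168.129.0/24",
    "192.168.130.0/24", "192.168.131.0/24",
)]

# Four contiguous, properly aligned /24 networks collapse into one /22 entry.
summary = list(ipaddress.collapse_addresses(class_c_nets))
print(summary)              # [IPv4Network('192.168.128.0/22')]
print(summary[0].netmask)   # 255.255.252.0
```

The merge works here because the four networks are contiguous and the first one, 192.168.128.0, starts on a four-network boundary; an arbitrary set of four Class C addresses would not generally collapse into a single entry.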
As of January 1996, there were 95
Class A addresses, 5892 Class B addresses, and 128,378 Class C
addresses assigned; this number is undoubtedly larger today,
particularly in the Class C space. Because CIDR is becoming so
widely used, however, these numbers are not a true reflection of
the number of networks attached to the public Internet because
multiple addresses may be assigned to a single organizational
entity.
3.2.2. The Domain Name System
While IP addresses are 32 bits in
length, most users do not memorize the numeric addresses of the
hosts to which they attach; instead, people are more comfortable
with host names. Most IP hosts, then, have both a numeric IP
address and a name. While this is convenient for people, however,
the name must be translated back to a numeric address for routing
purposes.
Earlier discussion in this paper
described the domain naming structure of the Internet. In the
early ARPANET, every host maintained a file called HOSTS.TXT
that contained a list of all hosts, which included the IP address,
host name, and alias(es). This was an adequate measure while the
ARPANET was small and had a slow rate of growth, but was not a
scalable solution as the network grew.
[NOTE: A descendant of the HOSTS.TXT file
is still found on Unix systems (typically as /etc/hosts), although
it is usually used to resolve
names of hosts on the local network to cut down on local DNS
traffic. On Microsoft Windows systems, the file is called HOSTS
and can typically be found in the c:\windows folder.]
To handle the fast rate of new
names on the network, the Domain Name System (DNS) was created.
The DNS is a distributed database containing host name and IP
address information for all domains on the Internet. There is a
single authoritative name server for every domain that
contains all DNS-related information about the domain; each domain
also has at least one secondary name server that also contains a
copy of this information. Thirteen root
servers around the globe (most in the U.S.,
actually, with the remainder in Asia and Europe) maintain a list
of all of these authoritative name servers.
When a host on the Internet needs
to obtain a host's IP address based upon the host's name, a DNS
request is made by the initial host to a local name server.
The local name server may be able to respond to the request with
information that is either configured or cached at the name
server; if necessary information is not available, the local name
server forwards the request to one of the root servers. The root
server, then, will determine an appropriate name server for the
target host and the DNS request will be forwarded to the domain's
name server.
Name servers contain the
following types of information:
- A-record:
An address record maps a hostname to an IP address.
- PTR-record:
A pointer record maps an IP address to a hostname.
- NS-record:
A name server record lists the authoritative name server(s)
for a given domain.
- MX-record:
A mail exchange record lists the mail servers for a given
domain. As an example, consider the author's e-mail address, kumquat@hill.com.
Note that the "hill.com" portion of the address is a
domain name, not a host name, and mail has to be sent to a
specific host. The MX-records in the hill.com name
database specify that the host mail.hill.com is the mail
server for this domain.
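How these record types work together can be sketched with a toy, in-memory table in Python. All record data below is hypothetical: the address is simply the sample address used earlier in this paper, not a real DNS answer, and the names ZONE, lookup, and mail_host_addresses are assumptions for illustration. A real resolver would query name servers rather than a dictionary.

```python
import ipaddress

# Hypothetical record data for illustration only.
ZONE = {
    ("hill.com", "MX"): ["mail.hill.com"],
    ("mail.hill.com", "A"): ["208.162.106.17"],
    ("17.106.162.208.in-addr.arpa", "PTR"): ["mail.hill.com"],
}

def lookup(name: str, rtype: str):
    """Return the records of the given type for a name (empty list if none)."""
    return ZONE.get((name.lower(), rtype), [])

def mail_host_addresses(email: str):
    """Resolve an e-mail address to mail-server addresses: MX first, then A."""
    domain = email.rsplit("@", 1)[1]
    return [addr for host in lookup(domain, "MX") for addr in lookup(host, "A")]

print(mail_host_addresses("kumquat@hill.com"))   # ['208.162.106.17']

# PTR records are stored under the reversed address in the in-addr.arpa domain.
print(ipaddress.ip_address("208.162.106.17").reverse_pointer)
# 17.106.162.208.in-addr.arpa
```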
More information about the DNS can
be found from the World
Internetworking Alliance (WIA) Web site. Additional DNS
references include DNS and BIND by P. Albitz and C. Liu
(O'Reilly & Associates) and "Setting
up Your own DNS" by G. Kessler. The concepts,
structure, and delegation of the DNS are described in RFCs 1034
and 1591.
In addition, the IANA maintains a list of DNS
parameters.
3.2.3. ARP and Address Resolution
Early IP implementations ran on
hosts commonly interconnected by Ethernet local area networks
(LANs). Every transmission on the LAN contains the local network,
or medium access control (MAC), address of the source and
destination nodes. MAC addresses are 48 bits in length and are
non-hierarchical, so routing cannot be performed using the MAC
address. MAC addresses are never the same as IP addresses.
When a host needs to send a
datagram to another host on the same network, the sending
application must know both the IP and MAC addresses of the
intended receiver; this is because the destination IP address is
placed in the IP packet and the destination MAC address is placed
in the LAN MAC protocol frame. (If the destination host is on
another network, the sender will look instead for the MAC address
of the default gateway, or router.)
Unfortunately, the sender's IP
process may not know the MAC address of the intended receiver on
the same network. The Address Resolution Protocol (ARP), described
in RFC
826, provides a mechanism so that a host can learn a
receiver's MAC address when knowing only the IP address. The
process is actually relatively simple: the host sends an ARP
Request packet in a frame containing the MAC broadcast address;
the ARP request advertises the destination IP address and asks for
the associated MAC address. The station on the LAN that recognizes
its own IP address will send an ARP Response with its own MAC
address. As Figure 1 shows, ARP messages are carried directly in
the LAN frame and ARP is an independent protocol from IP. The IANA
maintains a list of all ARP
parameters.
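To make the ARP Request concrete, the Python sketch below assembles one for an Ethernet LAN using the field layout of RFC 826. The MAC address and the target IP address are made-up values, and the function name build_arp_request is an assumption; the sketch only builds the bytes and does not transmit anything.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6    # all-ones MAC address: every station on the LAN
ETHERTYPE_ARP = 0x0806         # frame type that marks the payload as ARP, not IP

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Return an Ethernet frame carrying an ARP Request (RFC 826 layout)."""
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                  # hardware type: Ethernet
        0x0800,             # protocol type: IP
        6, 4,               # MAC and IP address lengths in bytes
        1,                  # opcode 1 = request (2 = reply)
        sender_mac, socket.inet_aton(sender_ip),
        b"\x00" * 6,        # target MAC unknown: this is what is being asked for
        socket.inet_aton(target_ip),
    )
    ethernet_header = struct.pack("!6s6sH", BROADCAST_MAC, sender_mac, ETHERTYPE_ARP)
    return ethernet_header + arp

# Hypothetical sender MAC and target address, for illustration only.
frame = build_arp_request(bytes.fromhex("00a0c9112233"), "208.162.106.17", "208.162.106.1")
print(len(frame))   # 42 bytes: 14-byte Ethernet header + 28-byte ARP message
```

Note that the frame type field identifies ARP directly, which is why ARP is described above as independent of IP.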
Other address resolution
procedures have also been defined, including:
- Reverse ARP (RARP), which
allows a diskless processor to determine its IP address based
on knowing its own MAC address.
- Inverse ARP (InARP), which
provides a mapping between an IP address and a frame relay
virtual circuit identifier.
- ATMARP and ATMInARP, which provide a
mapping between an IP address and ATM virtual path/channel
identifiers.
- LAN Emulation ARP (LEARP),
which maps a recipient's ATM address to its LAN Emulation (LE)
address (which takes the form of an IEEE 802 MAC address).
[NOTE: IP hosts maintain a cache
storing recent ARP information. The ARP cache can be viewed from a
Unix or DOS (in Windows 95/98/NT) command line using the arp -a
command.]
3.2.4. IP Routing: OSPF, RIP, and BGP
As an OSI Network Layer protocol,
IP has the responsibility to route packets. It performs this
function by looking up a packet's destination IP NET_ID in a
routing table and forwarding based on the information in the
table. But it is routing protocols, and not IP, that
populate the routing tables with routing information. There are
three routing protocols commonly associated with IP and the
Internet, namely, RIP, OSPF, and BGP.
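The table lookup itself can be sketched in a few lines of Python. The routing table below is hypothetical, and the sketch uses the classless generalization of the NET_ID lookup: when several entries match, the one with the longest prefix (the most specific NET_ID) wins. The names ROUTES and next_hop are assumptions for this example.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
ROUTES = [
    (ipaddress.ip_network("208.162.106.0/24"), "directly connected"),
    (ipaddress.ip_network("172.16.0.0/16"), "172.16.0.1"),
    (ipaddress.ip_network("172.16.32.0/20"), "172.16.32.1"),
    (ipaddress.ip_network("0.0.0.0/0"), "208.162.106.1"),   # default route
]

def next_hop(destination: str) -> str:
    """Pick the matching route with the longest prefix (most specific NET_ID)."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    net, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop

print(next_hop("172.16.40.9"))   # 172.16.32.1
print(next_hop("4.0.0.1"))       # 208.162.106.1
```

Because the table includes a default route (0.0.0.0/0), every destination matches at least one entry.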
OSPF and RIP are primarily used
to provide routing within a particular domain, such as within a
corporate network or within an ISP's network. Since the routing is
inside of the domain, these protocols are generically
referred to as interior gateway protocols.
The Routing Information Protocol
version 2 (RIP-2), specified in RFC
2453, describes how routers exchange routing table
information using a distance-vector algorithm. With RIP,
neighboring routers periodically exchange their entire routing
tables. RIP uses hop count as the metric of a path's cost, and a
path is limited to 15 hops; a metric of 16 marks a destination as
unreachable. Unfortunately, RIP has become
increasingly inefficient on the Internet as the network continues
its fast rate of growth. Current routing protocols for many of
today's LANs are based upon RIP, including those associated with
NetWare, AppleTalk, VINES, and DECnet. The IANA maintains a list
of RIP
message types.
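The heart of a distance-vector protocol is the step in which a router merges a neighbor's advertised table into its own. The Python sketch below shows that step in simplified form; it omits RIP's timers, split horizon, and triggered updates, and the data layout and the name merge_update are assumptions for illustration. RFC 2453 treats a metric of 16 as infinity, so the longest usable path is 15 hops.

```python
INFINITY = 16   # in RIP, a metric of 16 means "unreachable"

def merge_update(table: dict, neighbor: str, advertised: dict) -> bool:
    """Merge a neighbor's advertised routes into our table.

    table maps destination -> (metric, next hop); advertised maps
    destination -> metric as seen by the neighbor. Returns True if the
    table changed.
    """
    changed = False
    for dest, metric in advertised.items():
        cost = min(metric + 1, INFINITY)       # one more hop to go via the neighbor
        current = table.get(dest)
        if current is None:
            if cost < INFINITY:                # never install an unreachable route
                table[dest] = (cost, neighbor)
                changed = True
        elif cost < current[0] or (current[1] == neighbor and cost != current[0]):
            table[dest] = (cost, neighbor)     # better path, or news from current next hop
            changed = True
    return changed

# Hypothetical destinations and router names.
table = {"net-a": (1, "direct")}
merge_update(table, "router-b", {"net-a": 3, "net-c": 2, "net-d": 15})
print(table)   # {'net-a': (1, 'direct'), 'net-c': (3, 'router-b')}
```

In the example, net-d is advertised at 15 hops and would cost 16 via router-b, so it is treated as unreachable and not installed.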
The Open Shortest Path First (OSPF)
protocol is a link state routing algorithm that is more robust
than RIP, converges faster, requires less network bandwidth, and
is better able to scale to larger networks. With OSPF, a router
broadcasts only changes in its links' status rather than entire
routing tables. OSPF Version 2, described in RFC
1583, is rapidly replacing RIP in the Internet.
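Once every router holds the same map of link states, each one computes its own shortest paths, conventionally with Dijkstra's algorithm (the "shortest path first" of the protocol's name). The Python sketch below runs that computation over a hypothetical four-router topology; the data layout is an assumption for illustration and ignores OSPF areas, link-state flooding, and equal-cost paths.

```python
import heapq

def shortest_paths(links: dict, source: str) -> dict:
    """Dijkstra's algorithm over a link-state database.

    links maps router -> {neighbor: link cost}. Returns router -> total
    cost of the cheapest path from source.
    """
    cost = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > cost.get(node, float("inf")):
            continue                      # stale queue entry; a cheaper path was found
        for neighbor, weight in links.get(node, {}).items():
            new_cost = dist + weight
            if new_cost < cost.get(neighbor, float("inf")):
                cost[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return cost

# Hypothetical four-router topology with symmetric link costs.
LINKS = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2, "D": 4},
    "C": {"A": 5, "B": 2, "D": 1},
    "D": {"B": 4, "C": 1},
}
print(shortest_paths(LINKS, "A"))   # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```

Unlike hop count, the link costs allow router A to prefer the three-hop path A-B-C-D (cost 4) over the two-hop path A-B-D (cost 5).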
The Border Gateway Protocol
version 4 (BGP-4) is an exterior gateway protocol because
it is used to provide routing information between Internet routing
domains. BGP is a distance vector protocol, like RIP, but unlike
almost all other distance vector protocols, BGP tables store the
actual route to the destination network. BGP-4 also supports
policy-based routing, which allows a network's administrator to
create routing policies based on political, security, legal, or
economic issues rather than technical ones. BGP-4 also supports
CIDR. BGP-4 is described in RFC
1771, while RFC
1268 describes use of BGP in the Internet. In addition,
the IANA maintains a list of BGP
parameters.
Figure 1 shows the protocol
relationship of RIP, OSPF, and BGP to IP. A RIP message is carried
in a UDP datagram which, in turn, is carried in an IP packet. An
OSPF message, on the other hand, is carried directly in an IP
datagram. BGP messages, in a total departure, are carried in TCP
segments over IP. Although all of the TCP/IP books mentioned above
discuss IP routing to some level of detail, Routing in the
Internet by Christian Huitema is one of the best available
references on this specific subject.
3.2.5. ICMP
The Internet Control Message
Protocol, described in RFC
792, is an adjunct to IP that notifies the sender of IP
datagrams about abnormal events. This collateral protocol is
particularly important in the connectionless environment of IP.
The commonly employed ICMP
message types include:
- Destination Unreachable:
Indicates that a packet cannot be delivered because the
destination host cannot be reached. The reason for the
non-delivery may be that the host or network is unreachable or
unknown, the protocol or port is unknown or unusable,
fragmentation is required but not allowed (DF-flag is set), or
the network or host is unreachable for this type of service.
- Echo
and Echo Reply: These two messages are used to check
whether hosts are reachable on the network. One host sends an
Echo message to the other, optionally containing some data,
and the receiving host responds with an Echo Reply containing
the same data. These messages are the basis for the Ping
command.
- Parameter Problem:
Indicates that a router or host encountered a problem with
some aspect of the packet's header.
- Redirect:
Used by a host or router to let the sending host know that
packets should be forwarded to another address. For
security reasons, Redirect messages should usually be blocked
at the firewall.
- Source Quench:
Sent by a router to indicate that it is experiencing
congestion (usually due to limited buffer space) and is
discarding datagrams.
- TTL Exceeded:
Indicates that a datagram has been discarded because the TTL
field reached 0 or because the entire packet was not received
before the fragmentation timer expired.
- Timestamp
and Timestamp Reply: These messages are similar to the
Echo messages, but place a timestamp (with millisecond
granularity) in the message, yielding a measure of how long
remote systems spend buffering and processing datagrams, and
providing a mechanism so that hosts can synchronize their
clocks.
ICMP messages are carried in IP
packets. The IANA maintains a complete list of ICMP
parameters.
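As an illustration of the message format, the Python sketch below builds an ICMP Echo message of the kind Ping sends. It assumes the standard Echo layout (type 8, code 0, checksum, identifier, sequence number) and the usual 16-bit one's-complement Internet checksum; the function names and the identifier and payload values are assumptions for this example, and nothing is actually sent.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"                            # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold carries back into the low 16 bits
    return ~total & 0xFFFF

def build_echo(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Build an ICMP Echo message (type 8, code 0)."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)   # checksum 0 for now
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

message = build_echo(identifier=0x1234, sequence=1, payload=b"ping")
print(len(message))                 # 12
print(internet_checksum(message))   # 0: a correctly checksummed message sums to zero
```

The receiver performs the same zero check; an Echo Reply differs only in using type 0 and a recomputed checksum.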
3.2.6. IP version 6
The official version of IP that
has been in use since the early 1980s is version 4. Due to
the tremendous growth of the Internet and new emerging
applications, it was recognized that a new version of IP was
becoming necessary. In late 1995, IP version 6 (IPv6) was entered
into the Internet Standards Track. The primary description of IPv6
is contained in RFC
1883 and a number of related specifications, including
ICMPv6.
IPv6 is designed as an evolution
from IPv4, rather than a radical change. Primary areas of change
relate to:
- Increasing the IP address size
to 128 bits
- Better support for traffic
types with different quality-of-service objectives
- Extensions to support
authentication, data integrity, and data confidentiality
3.3. The Transport Layer
Protocols
The TCP/IP protocol suite
comprises two protocols that correspond roughly to the OSI
Transport and Session Layers; these protocols are called the
Transmission Control Protocol (TCP) and the User Datagram Protocol
(UDP).
One can argue that it is a misnomer to refer to "TCP/IP
applications," as most such applications actually run over
TCP or UDP, as shown in Figure 1.
Higher-layer applications are
referred to by a port identifier in TCP/UDP messages. The port
identifier and IP address together form a socket, and the
end-to-end communication between two hosts is uniquely identified
on the Internet by the four-tuple (source port, source address,
destination port, destination address). Well-known port numbers
denote the server side of a connection and include:
| Port # | Protocol | Application       |
| 20     | TCP      | FTP data transfer |
| 21     | TCP      | FTP control       |
| 23     | TCP      | Telnet            |
| 25     | TCP      | SMTP              |
| 43     | TCP      | whois             |
| 53     | TCP/UDP  | DNS               |
| 70     | TCP      | Gopher            |
| 79     | TCP      | finger            |
| 80     | TCP      | HTTP              |
| 110    | TCP      | POPv3             |
| 161    | UDP      | SNMP              |
| 162    | UDP      | SNMP-trap         |
| 520    | UDP      | RIP               |
A complete list of port numbers
that have been assigned can be found in the IANA's
list of Port Numbers.
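The role of the four-tuple can be seen in a small Python sketch. The addresses and client port numbers are made up for illustration, and the Connection type is an assumption, not part of any networking library; the point is only that two conversations with the same server port remain distinct as long as any one element of the tuple differs.

```python
from typing import NamedTuple

class Connection(NamedTuple):
    src_port: int
    src_addr: str
    dst_port: int
    dst_addr: str

# Two hypothetical clients on the same host talking to the same Web server:
# only the source port differs, yet the four-tuples are distinct.
a = Connection(1025, "208.162.106.17", 80, "192.168.128.5")
b = Connection(1026, "208.162.106.17", 80, "192.168.128.5")

sessions = {a: "first browser window", b: "second browser window"}
print(len(sessions))   # 2
```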
3.3.1. TCP
TCP, described in RFC
793, provides a virtual circuit (connection-oriented)
communication service across the network. TCP includes rules for
formatting messages, establishing and terminating virtual
circuits, sequencing, flow control, and error correction. Most of
the applications in the TCP/IP suite operate over the reliable
transport service provided by TCP.
                     1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgement Number                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Offset |(reserved) |   Flags   |            Window             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |        Urgent Pointer         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Options.... (Padding)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data...
+-+-+-+-+-+-+-+-+-+-+-+-+-
FIGURE 4. TCP segment format.
The TCP data unit is called a segment;
the name is due to the fact that TCP does not recognize messages,
per se, but merely sends a block of bytes from the byte stream
between sender and receiver. The fields of the segment (Figure 4)
are:
- Source Port and Destination
Port: Identify the source and destination ports to
identify the end-to-end connection and higher-layer
application.
- Sequence Number:
Contains the sequence number of this segment's first data byte
in the overall connection byte stream; since the sequence
number refers to a byte count rather than a segment count,
sequence numbers in contiguous TCP segments are not numbered
sequentially.
- Acknowledgment Number:
Used by the sender of this segment to acknowledge receipt of
data; this field indicates the sequence number of the next byte
that it expects to receive from the other end of the connection.
- Data Offset: Points to
the first data byte in this segment; this field, then,
indicates the segment header length.
- Control Flags: A set of
flags that control certain aspects of the TCP virtual
connection. The flags include:
- Urgent Pointer Field
Significant (URG): When set, indicates that the
current segment contains urgent (or high-priority) data
and that the Urgent Pointer field value is valid.
- Acknowledgment Field
Significant (ACK): When set, indicates that the value
contained in the Acknowledgment Number field is valid.
This bit is usually set, except during the first message
during connection establishment.
- Push Function (PSH):
Used when the transmitting application wants to force TCP
to immediately transmit the data that is currently
buffered without waiting for the buffer to fill; useful
for transmitting small units of data.
- Reset Connection (RST):
When set, immediately terminates the end-to-end TCP
connection.
- Synchronize Sequence
Numbers (SYN): Set in the initial segments used to
establish a connection, indicating that the segments carry
the initial sequence number.
- Finish (FIN): Set
to request normal termination of the TCP connection in the
direction this segment is traveling; completely closing
the connection requires one FIN segment in each direction.
- Window: Used for flow
control, contains the value of the receive window size
which is the number of transmitted bytes that the sender of
this segment is willing to accept from the receiver.
- Checksum: Provides
rudimentary bit error detection for the segment (including the
header and data).
- Urgent Pointer: Urgent
data is information that has been marked as high-priority by a
higher layer application; this data, in turn, usually bypasses
normal TCP buffering and is placed in a segment between the
header and "normal" data. The Urgent Pointer, valid
when the URG flag is set, indicates the position of the first
octet of non-expedited data in the segment.
- Options: Used at
connection establishment to negotiate a variety of options;
maximum segment size (MSS) is the most commonly used option
and, if absent, defaults to an MSS of 536. The IANA maintains
a list of all TCP
Option Numbers.
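The field layout above maps directly onto a fixed 20-byte structure followed by options. The Python sketch below decodes it with the standard struct module; the function name, the dictionary keys, and the sample segment (client port 1025, window 8192, MSS 1460) are assumptions made for illustration.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # from the lowest flag bit upward

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header plus any options."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4          # Data Offset counts 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES) if offset_flags & (1 << bit)]
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "window": window,
        "checksum": checksum, "urgent": urgent,
        "options": segment[20:header_len], "data": segment[header_len:],
    }

# A hand-built SYN segment from hypothetical client port 1025 to port 80,
# carrying one option: MSS (kind 2, length 4) = 1460. The checksum is left
# as 0 here; this sketch does not compute it.
syn = struct.pack("!HHIIHHHH", 1025, 80, 1000, 0, (6 << 12) | 0x02, 8192, 0, 0)
syn += struct.pack("!BBH", 2, 4, 1460)

header = parse_tcp_header(syn)
print(header["flags"], header["header_len"])   # ['SYN'] 24
```

A Data Offset of 6 words gives a 24-byte header: the 20 fixed bytes plus the 4-byte MSS option.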
3.3.2. UDP
UDP, described in RFC
768, provides an end-to-end datagram (connectionless)
service. Some applications, such as those that involve a simple
query and response, are better suited to the datagram service of
UDP because there is no time lost to virtual circuit establishment
and termination. UDP's primary function is to add a port number to
the IP address to provide a socket for the application.
                     1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data...
+-+-+-+-+-+-+-+-+-+-+-+-+-
FIGURE 5. UDP datagram format.
The fields of a UDP datagram
(Figure 5) are:
- Source Port:
Identifies the UDP port at the source side of the connection;
use of this field is optional in UDP and may be set to 0.
- Destination Port:
Identifies the destination port of the end-to-end connection.
- Length:
Indicates the total length of the UDP datagram.
- Checksum:
Provides rudimentary bit error detection for the datagram
(including the header and data).
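Because the UDP header is only four 16-bit fields, building and parsing a datagram takes very little code. The Python sketch below is illustrative; the function names and the sample ports and payload are assumptions. The checksum is left at 0, which in UDP over IPv4 signals that no checksum was computed.

```python
import struct

def build_udp(src_port: int, dst_port: int, data: bytes, checksum: int = 0) -> bytes:
    """Prepend the 8-byte UDP header; Length covers header plus data."""
    return struct.pack("!HHHH", src_port, dst_port, 8 + len(data), checksum) + data

def parse_udp(datagram: bytes) -> dict:
    """Split a UDP datagram into its four header fields and its data."""
    src, dst, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return {"src_port": src, "dst_port": dst, "length": length,
            "checksum": checksum, "data": datagram[8:length]}

# Hypothetical query from client port 1025 to the DNS port (53).
datagram = build_udp(1025, 53, b"query")
print(parse_udp(datagram)["length"])   # 13: 8 header bytes + 5 data bytes
```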
3.4. Applications
The TCP/IP Application Layer
protocols support the applications and utilities that are the
Internet. Commonly used protocols include:
- Telnet:
Short for Telecommunication Network, a virtual terminal
protocol allowing a user logged on to one TCP/IP host to
access other hosts on the network (RFC
854).
- FTP:
The File Transfer Protocol allows a user to transfer files
between local and remote host computers (RFC
959).
- Archie:
A utility that allows a user to search all registered
anonymous FTP sites for files on a specified topic.
- Gopher:
A tool that allows users to search through data repositories
using a menu-driven, hierarchical interface, with links to
other sites (RFC
1436).
- SMTP:
The Simple Mail Transfer Protocol is the standard protocol for
the exchange of electronic mail over the Internet (RFC
821). SMTP is used between e-mail servers on the
Internet or to allow an e-mail client to send mail to a
server. RFC
822 specifically describes the mail message body
format, and RFCs 1521
and 1522
describe MIME (Multipurpose Internet Mail Extensions).
Reference books on electronic mail systems include !%@::
Addressing and Networks by D. Frey and R. Adams (O'Reilly
& Associates, 1993) and THE INTERNET MESSAGE: Closing
the Book With Electronic Mail by M. Rose (PTR Prentice
Hall, 1993).
- HTTP:
The Hypertext Transfer Protocol is the basis for exchange of
information over the World Wide Web (WWW). Various versions of
HTTP are in use over the Internet, with HTTP version 1.0 (RFC
1945) being the most current. WWW pages are written
in the Hypertext Markup Language (HTML), an ASCII-based,
platform-independent formatting language (RFC
1866).
- Finger:
Used to determine the status of other hosts and/or users (RFC
1288).
- POP:
The Post Office Protocol defines a simple interface between a
user's mail client software and an e-mail server; POP is used
to download mail from the server to the client and allows the
user to manage their mailboxes. The current version is POP3 (RFC
1460).
- DNS:
The Domain Name System (described in slightly more detail in Section
3.2.2 above) defines the structure of Internet
names and their association with IP addresses, as well as the
association of mail and name servers with domains.
- SNMP:
The Simple Network Management Protocol defines procedures and
management information databases for managing TCP/IP-based
network devices. SNMP (RFC
1157) is widely deployed in local and wide area
networks. SNMP Version 2 (SNMPv2, RFC
1441) adds security mechanisms that are missing in
SNMP, but is also very complex; widespread use of SNMPv2 has
yet to be seen. Additional information on SNMP and
TCP/IP-based network management can be found in SNMP by
S. Feit (McGraw-Hill, 1994) and THE SIMPLE BOOK: An
Introduction to Internet Management, 2/e, by M. Rose (PTR
Prentice Hall, 1994).
- Ping:
The Packet Internet Groper, a utility that allows a user at
one system to determine the status of other hosts and the
latency in getting a message to that host. Uses ICMP Echo
messages.
- Whois/NICNAME:
Utilities that search databases for information about Internet
domains and domain contact information (RFC 954).
- Traceroute:
A tool that displays the route that packets will take when
traveling to a remote host.
A guide to using most of these
applications can be found in "A Primer on Internet and TCP/IP
Tools and Utilities" (FYI 30/RFC
2151) by Gary Kessler & Steve Shepard (also
available in HTML,
Postscript,
and Word).
3.5. Summary
As this discussion has shown, TCP/IP
is not merely a pair of communication protocols but is a suite of
protocols, applications, and utilities. Increasingly, these
protocols are referred to as the Internet Protocol Suite,
but the older name will not disappear anytime soon.
----------------                                     ----------------
| Application  |<------ end-to-end connection ------>| Application  |
|--------------|                                     |--------------|
|     TCP      |<--------- virtual circuit --------->|     TCP      |
|--------------|          -----------------          |--------------|
|      IP      |<-- DG -->|      IP       |<-- DG -->|      IP      |
|--------------|          |-------+-------|          |--------------|
| Subnetwork 1 |<-------->|Subnet1|Subnet2|<-------->| Subnetwork 2 |
----------------          --------+--------          ----------------
      HOST                     GATEWAY                     HOST
FIGURE 6. TCP/IP protocol suite architecture.
Figure 6 shows the relationship
between the various protocol layers of TCP/IP. Applications and
utilities reside in host, or end-communicating, systems. TCP
provides a reliable, virtual circuit connection between the two
hosts. (UDP, not shown, provides an end-to-end datagram connection
at this layer.) IP provides a datagram (DG) transport service over
any intervening subnetworks, including local and wide area
networks. The underlying subnetwork may employ nearly any common
local or wide area network technology.
Note that the term gateway
is used for the device interconnecting the two subnets, a device
usually called a router in LAN environments or intermediate
system in OSI environments. In OSI terminology, a gateway
is used to provide protocol conversion between two networks and/or
applications.
This memo has only provided
background information about the TCP/IP protocols and the
Internet. There is a wide range of additional information that the
reader can access to further use and understand the tools and
scope of the Internet. The real fun begins now!
Internet specifications,
standards, reports, humor, and tutorials are distributed as
Request for Comments (RFC) documents. RFCs are all freely
available on-line, and most are available in ASCII text format.
Internet standards are documented
in a subset of the RFCs, identified with an "STD"
designation. RFC
2026 describes the Internet standards process and STD
1 always contains the official list of Internet
standards.
For Your Information (FYI)
documents are another RFC subset, specifically providing
background information for the Internet community. The FYI notes
are described in RFC
1150.
Frequently Asked Question (FAQ)
lists may be found for a number of topics, ranging from ISDN and
cryptography to the Internet and Gopher. Two such FAQs are of
particular interest to Internet users: "FYI on Questions and
Answers - Answers to Commonly asked 'New Internet User'
Questions" (RFC
1594) and "FYI on Questions and Answers: Answers
to Commonly Asked 'Experienced Internet User' Questions" (RFC
1207). All three of these documents point to even more
information sources.
| ARP     | Address Resolution Protocol |
| ARPANET | Advanced Research Projects Agency Network |
| ASCII   | American Standard Code for Information Interchange |
| ATM     | Asynchronous Transfer Mode |
| BGP     | Border Gateway Protocol |
| BSD     | Berkeley Software Distribution |
| CCITT   | International Telegraph and Telephone Consultative Committee |
| CIX     | Commercial Internet Exchange |
| DARPA   | Defense Advanced Research Projects Agency |
| DNS     | Domain Name System |
| DoD     | U.S. Department of Defense |
| FAQ     | Frequently Asked Questions lists |
| FDDI    | Fiber Distributed Data Interface |
| FTP     | File Transfer Protocol |
| FYI     | For Your Information series of RFCs |
| GOSIP |
U.S. Government Open
Systems Interconnection Pro | |