Internet and Web Programming

Uses of the Internet

History of the Internet

References: An Abridged Time Line

Overview of the Internet

The Internet is a collection of computers connected by network cables or through satellite links. Rather than connecting every computer on the Internet with every other computer, individual computers in an organization are normally connected in a local area network (LAN). One node on this local area network is physically connected to the Internet. So the Internet is a network of networks. Internet connectivity is provided by Internet Service Providers (ISPs). These corporations dedicate computers to act as servers; that is, they make information (such as Web pages or e-mail) available to users of the Internet.

Layers of the Internet

Physical Layer: Movement of bits through copper cable, fiber optic cable, satellites, and wireless transmitters.

Network Interface Layer: This is implemented in software and provides the protocols to interpret the physical bit stream. Each network card has a 6-byte hardware address (its MAC address). Data moves through the network in packets. Each packet has data and a header. If your computer is connected to a local ISP via telephone, the link uses the Point-to-Point Protocol (PPP).

Internetwork Layer: The protocol governing this layer is the Internet Protocol (IP). Under IPv4 each computer has a four-byte (32-bit) address. There is a move to IPv6, which uses sixteen-byte (128-bit) addresses. Each packet header carries the IP addresses of the sender and the destination.

This layer is concerned with sending the packets from the sender to the destination. Routers are used on the way. Routers are computers running IP routing software. Each router has a buffer to store packets on a temporary basis. Routers can assemble information about routes on the network and use it to select a route when forwarding packets.

IP does not guarantee delivery. Each packet has a Time To Live (TTL) field in its header. TTL is an integer that specifies the maximum number of hops (router-to-router jumps) that a packet can make before it is discarded. A packet can travel through various networks to reach its final destination.
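
The TTL rule can be illustrated with a small simulation. This is only a sketch of the idea, not real routing code: the function name deliver, the router names, and the simplification that each hop costs exactly one unit of TTL are assumptions made for illustration.

```python
def deliver(route, ttl):
    """Walk a packet along a list of routers, spending one TTL unit per hop.

    Simplified model (an assumption for illustration): the packet is
    discarded as soon as it needs another hop but its TTL is already 0.
    """
    hops = 0
    for router in route:
        if ttl == 0:
            # No hops left: the packet is dropped before reaching this router.
            return {"delivered": False, "hops": hops, "dropped_at": router}
        ttl -= 1
        hops += 1
    return {"delivered": True, "hops": hops, "dropped_at": None}


# Hypothetical three-router path between sender and destination.
route = ["router-a", "router-b", "router-c"]
print(deliver(route, ttl=8))  # enough hops available, packet arrives
print(deliver(route, ttl=2))  # runs out of hops before router-c
```

A packet with a generous TTL crosses all three routers, while the one with a TTL of 2 is discarded after two hops, which is how a packet caught in a routing loop eventually disappears.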

Transport Layer: The main protocol governing the transport layer is the Transmission Control Protocol (TCP). This layer controls the transmission of data (a file, say) between two hosts. At the client end TCP sends a request to a server for a connection. If the server responds affirmatively, a socket connection is established.

TCP segments the file into packets and stamps each packet with a sequence number and a checksum. The checksum is a 16-bit one's-complement sum computed over the contents of the packet; it lets the receiver detect corruption. The packets are then passed to IP for internetwork routing. At the receiving end, IP hands the arriving packets to TCP, which checks them and uses the sequence numbers to put them back in order. The receiver then sends an acknowledgment to the sender as to the result.
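
The segment, checksum, verify, and acknowledge cycle can be sketched in a few lines. This is a simplified illustration, not a TCP implementation: the function names are hypothetical, the checksum here covers only the data (real TCP also covers the header and a pseudo-header), and the sequence number is taken to be the byte offset of each chunk.

```python
def internet_checksum(data: bytes) -> int:
    """One's complement of the 16-bit one's-complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold any carry back in
    return ~total & 0xFFFF


def segment(payload: bytes, size: int):
    """Split payload into (sequence_number, checksum, chunk) tuples."""
    segments = []
    for offset in range(0, len(payload), size):
        chunk = payload[offset:offset + size]
        segments.append((offset, internet_checksum(chunk), chunk))
    return segments


def reassemble(segments):
    """Keep segments whose checksum matches, reorder them by sequence number.

    Returns the rebuilt data and the list of sequence numbers to acknowledge.
    """
    good = [s for s in segments if internet_checksum(s[2]) == s[1]]
    good.sort(key=lambda s: s[0])
    data = b"".join(chunk for _, _, chunk in good)
    return data, [seq for seq, _, _ in good]


message = b"hello, internet"
sent = segment(message, 4)
# Segments may arrive out of order; sequence numbers restore the order.
data, acked = reassemble(list(reversed(sent)))
print(data, acked)
```

A segment whose contents were altered in transit fails the checksum comparison and is left out of the acknowledged list, so the sender knows to transmit it again.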

Application Layer: Some examples are HTTP (Web pages), FTP (file transfer), Telnet (remote login), and e-mail.

Domain Names

The nodes on the Internet are identified by a unique 32-bit number, the IP address. Addresses are normally written as four 8-bit numbers separated by periods. Because people have difficulty remembering numbers, there are names associated with each machine. These names begin with the name of the host machine, followed by progressively larger enclosing collections of machines, called domains. The last domain name identifies the type of organization in which the machine resides. Examples of such domain names are .edu, .com, .net, .org, and .gov. The textual name of the machine must be converted to an IP address. This is done by a dedicated computer known as a Domain Name System (DNS) server, or name server.
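
The relationship between the 32-bit number and its dotted form can be made concrete with a short sketch. The helper names and the sample address 192.0.2.1 (taken from a range reserved for documentation) are assumptions for illustration; the resolve function uses the standard socket.gethostbyname call and is defined but not run here, since it needs a reachable name server.

```python
import ipaddress
import socket


def to_dotted_quad(number: int) -> str:
    """Write a 32-bit address as four 8-bit numbers separated by periods."""
    octets = [(number >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


def from_dotted_quad(text: str) -> int:
    """Pack four dotted 8-bit numbers back into one 32-bit integer."""
    value = 0
    for part in text.split("."):
        value = (value << 8) | int(part)
    return value


def resolve(hostname: str) -> str:
    """Ask the system's resolver (and so DNS) for a host's IPv4 address."""
    return socket.gethostbyname(hostname)


number = from_dotted_quad("192.0.2.1")
print(number)                  # 3221225985
print(to_dotted_quad(number))  # 192.0.2.1

# Cross-check against the standard library's own conversion.
print(ipaddress.IPv4Address(number))
```

Calling resolve with a machine name performs the name-to-address conversion that the name server provides.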

World Wide Web

In 1989, Tim Berners-Lee at CERN proposed a protocol to exchange documents with colleagues around the world. The idea was that users could search for and retrieve any document on the Internet. The form of the documents was hypertext. This meant any given document could have links to other documents on the Internet. In a strict sense, the World Wide Web was this interconnected system of documents.

Web Browsers

A browser is a software program, also known as a client, that requests a specific web document from a server and renders it on the user's screen. One of the oldest browsers is Lynx, which is a text-only browser. The first widely used graphical browser was Mosaic, developed at the National Center for Supercomputing Applications (NCSA) at the University of Illinois. The developers of Mosaic later formed a company that produced Netscape. Microsoft has its own browser called Internet Explorer that comes with the Windows operating system. Other browsers are available, such as Mozilla, Opera, and Safari. Browsers communicate with servers using the standard Hypertext Transfer Protocol (HTTP).

Web Servers

A web server is a software program that provides documents to browsers. At the time of writing, Apache is the most widely used web server with 68% of the market share. Second is Microsoft's Internet Information Server (IIS) with about 21% of the market share. The remainder is spread over a large number of other servers. A web browser initiates a request to a server by sending it the URL of a document. The server searches for, retrieves, and sends the document.

URL

Uniform (or universal) resource locators (URLs) are used to identify documents or resources on the Internet. All URLs have the same general format:

scheme:object-address

The scheme is the communication protocol. These protocols include http, ftp, telnet, and mailto. Different schemes use different object addresses. HTTP uses an object address of the following form:

//fully-qualified-domain-name/path-to-document
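
The scheme and object-address parts can be picked apart with Python's standard urllib.parse module. The sketch below is illustrative only; the function name describe and the example.com addresses are placeholders, not URLs from these notes.

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Split a URL into its scheme, domain name, and path."""
    parts = urlsplit(url)
    return {"scheme": parts.scheme, "domain": parts.netloc, "path": parts.path}


# An HTTP URL: scheme, then //fully-qualified-domain-name/path-to-document.
print(describe("http://www.example.com/docs/index.html"))
# {'scheme': 'http', 'domain': 'www.example.com', 'path': '/docs/index.html'}

# A mailto URL has a different object address: no // and no domain part.
print(describe("mailto:someone@example.com"))
# {'scheme': 'mailto', 'domain': '', 'path': 'someone@example.com'}
```

The second call shows why the object address is described per scheme: for mailto the whole address after the colon is treated as the path.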

Hypertext Transfer Protocol

Web communication is conducted using the Hypertext Transfer Protocol (HTTP). HTTP consists of two phases: a request and a response. The general form of the request is:

  1. HTTP method     Path part of the URL     HTTP version
  2. Header fields
  3. Blank line
  4. Message body

The general form of an HTTP response is:

  1. Status line
  2. Response header fields
  3. Blank line
  4. Response body
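
The two message layouts above can be sketched as plain strings. This is a minimal illustration, not a working HTTP client: the function names, the host www.example.com, and the sample response are all hypothetical. It assumes HTTP/1.1, where the request line carries the path of the document and the domain name travels in the Host header field.

```python
def build_request(method: str, path: str, host: str, body: str = "") -> str:
    """Assemble request line, header fields, blank line, and message body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body.encode())}")
    # Lines end with CRLF; an empty line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_response(raw: str) -> dict:
    """Split a response into status line, header fields, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return {"version": version, "status": int(code), "reason": reason,
            "headers": headers, "body": body}


print(build_request("GET", "/index.html", "www.example.com"))

# A made-up response of the kind a server might send back.
sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_response(sample))
```

In both directions the blank line is what tells the receiver where the header fields stop and the body starts.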

Building a Web Program

There are several technologies that you will have to master to build an effective web site. Here are some for starters:

XHTML is a markup language. An XHTML document consists of content and control tags. The control tags are used by the browser to render the document on the client's monitor. It is easy to write the control tags by hand, but you can also use XHTML editors like Microsoft FrontPage, Macromedia Dreamweaver, and Adobe PageMill. The World Wide Web Consortium (W3C) has provided the specification for this language.

Cascading Style Sheets (CSS) specify how the content is to be displayed. The trend in web design has been to separate the content from the rendering instructions. The specifications for style sheets are also determined by the W3C. Some amazing designs can be created using style sheets.

XML stands for eXtensible Markup Language. It allows users to create their own control tags. XHTML has a small set of predefined tags. With XML you can have any number of tags. These tags can be used to describe the content of the document not only for the human user but also for a machine that is capable of reading that document. The XML specification describes how to create and use control tags in XML.
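
As a small illustration of user-defined tags being read by a machine, the sketch below parses an XML fragment with Python's standard xml.etree.ElementTree module. The tag names course, title, and topic and the level attribute are invented for this example; they are not part of any standard vocabulary.

```python
import xml.etree.ElementTree as ET

# A document using tags we made up to describe its own content.
document = """<course>
  <title>Internet and Web Programming</title>
  <topic level="intro">XHTML</topic>
  <topic level="intro">Cascading Style Sheets</topic>
</course>"""

root = ET.fromstring(document)

# Because the tags name what the content is, a program can query it directly.
title = root.find("title").text
topics = [topic.text for topic in root.findall("topic")]

print(title)
print(topics)
```

The same text marked up in XHTML would tell a browser how to lay it out, but not that one string is a title and the others are topics.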

JavaScript is a scripting language used for writing scripts that run on the client side. These scripts normally do client-side verification of data that you enter in a form. They can also be used with cascading style sheets to produce dynamic effects. This combination is called Dynamic HTML or DHTML.

Java is an object-oriented language that is used for writing applets that can run on the client machine. Java can also be used on the server side. The client program can request that an application be run on the server side. Java classes called servlets can be used for these applications.

Perl is a systems programming language. This means that Perl can access operating system functions directly. Perl is used to process data that a user sends back in a form. It can also be used for accessing databases on the server side and returning that information to the client.

PHP was developed to be a web programming language. It is a scripting language like JavaScript but runs on the server side. The script is embedded in the XHTML document. With PHP you can process forms and access databases but the learning curve is less steep compared to Perl.

But this is not enough. To be an effective web programmer you will also have to master a graphics package like Photoshop to create your own graphics. You will also have to provide a database back end for your web page. There are several free database packages available, like MySQL or PostgreSQL.

Current Trends in Web Programming

Web services are the underlying web applications that allow a business to offer services over the Internet. Read the latest news on web services at the W3C site or at a vendor-neutral site.

The Semantic Web is a network of information linked in such a way that it can be accessed by machines in a meaningful way. You can think of it as a global database that you can query and get intelligent results.