Working with URLs
If you've been surfing the World Wide Web, you have undoubtedly heard the term URL and used URLs to access various HTML pages from the Web. So, what exactly is a URL? Well, the following is a fairly simple, but formal definition of URL:
Definition: URL is an acronym that stands for Uniform Resource Locator and is a reference (an address) to a resource on the Internet.
It's often easiest (though not entirely accurate) to think of a URL as the name of a file on the network because most URLs refer to a file on some machine on the network. However, you should remember that URLs can point to other resources on the network such as database queries and command output.
The following is an example of a URL:
http://java.sun.com/
This particular URL addresses the Java Web site hosted by Sun Microsystems. The URL shown above, like all other URLs, has two main components separated by a colon (:):
- the protocol identifier
- the resource name
In the example, http is the protocol identifier and //java.sun.com/ is the resource name.
The protocol identifier indicates the name of the protocol to be used to fetch the resource. The example uses the Hypertext Transfer Protocol (HTTP), which is typically used to serve hypertext documents. HTTP is just one of many different protocols used to access different types of resources on the net. Other protocols include File Transfer Protocol (ftp), Gopher (gopher), File (file), and News (news).
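As a rough illustration of the two-part structure described above, the following sketch splits a URL string at its first colon. The class and method names (UrlParts, splitUrl) are hypothetical and not part of any library; this is plain string handling, not how java.net parses URLs.

```java
class UrlParts {
    // Splits a URL at its first colon: the protocol identifier is on the left,
    // the resource name is everything on the right.
    static String[] splitUrl(String url) {
        int colon = url.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("no protocol identifier in: " + url);
        }
        return new String[] { url.substring(0, colon), url.substring(colon + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitUrl("http://java.sun.com/");
        System.out.println("protocol identifier: " + parts[0]); // http
        System.out.println("resource name: " + parts[1]);       // //java.sun.com/
    }
}
```

Splitting at the first colon matters because a resource name may itself contain a colon, for example in front of a port number.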
The resource name is the complete address to the resource. The format of the resource name depends entirely on the protocol used, but for many formats the resource name contains one or more of the following components:
- host name: the name of the machine the resource lives on
- filename: the pathname to the file on the machine
- port number: the port number to connect to (this is typically optional)
- reference: a reference to a named anchor within a resource; usually identifies a specific location within a file (this is typically optional)
For many protocols, the host name and the filename are required and the port number and reference are optional. For example, the resource name for an HTTP URL must specify a server on the network (host name) and the path to the document on that machine (filename), and can also specify a port number and a reference. In the URL shown previously, java.sun.com is the host name and the trailing slash (/) is shorthand for the file named /index.html.
When constructing any URL, put the protocol identifier first, followed by a colon (:), followed by the resource name, like this:
protocolID:resourceName
The java.net package contains a class named URL that Java programs use to represent a URL address. Your Java program can construct a URL object, open a connection to it, and read from and write to it. The remaining pages of this lesson show you how to work with URL objects in your Java programs.
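The components named earlier (host name, filename, port number, reference) correspond to accessor methods on java.net.URL. The sketch below is a minimal illustration under a few assumptions: the URL with a port, path, and anchor is a made-up example, the UrlComponents class and its parse helper are hypothetical, and the object is built through java.net.URI because recent JDK releases deprecate the URL constructors.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class UrlComponents {
    // Hypothetical helper: builds a URL object from an absolute URL string,
    // rethrowing a malformed URL as an unchecked exception.
    static URL parse(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up URL that uses all four resource-name components.
        URL full = parse("http://java.sun.com:80/docs/index.html#top");
        System.out.println("protocol  = " + full.getProtocol()); // http
        System.out.println("host name = " + full.getHost());     // java.sun.com
        System.out.println("port      = " + full.getPort());     // 80
        System.out.println("filename  = " + full.getFile());     // /docs/index.html
        System.out.println("reference = " + full.getRef());      // top

        // The URL from the text: no port and no reference are given.
        URL simple = parse("http://java.sun.com/");
        System.out.println("port      = " + simple.getPort());   // -1 (not specified)
        System.out.println("filename  = " + simple.getFile());   // /
        System.out.println("reference = " + simple.getRef());    // null
    }
}
```

Note that the URL object reports only what the string contains: an omitted port comes back as -1, and the trailing slash is returned as is. Mapping it to a file such as /index.html is done by the server, not by the URL class.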
See also
java.net.URL