URL: Difference between revisions
203.109.93.188 (talk) |
John Vandenberg (talk | contribs) m Reverted 1 edit by 203.109.93.188 to last revision by Academic Challenger. using TW |
Revision as of 09:39, 3 May 2007
Uniform Resource Locator (URL) is a technical, Web-related term used with two distinct meanings:
- In popular usage, it is a widespread synonym for Uniform Resource Identifier (URI); many popular and technical texts use the term "URL" when referring to a URI.
- Strictly, it is a URI that also provides a means of locating the resource by describing its primary access mechanism (see "URLs as locators" below).
The idea of a uniform syntax for global identifiers of network-retrievable documents was central to the World Wide Web. In the early days, these identifiers were variously called "document names", "Web addresses" and "Uniform Resource Locators". These names were misleading, however, because not all identifiers were locators, and even for those that were, this was not their defining characteristic. Nevertheless, by the time RFC 1630 formally defined "URI" as a generic term better suited to the concept, the term "URL" had gained widespread popularity, which has continued to this day.
URI/URL syntax in brief
Here is a typical URI dissected:
File:URL en.gif
Every URI (and therefore every URL) begins with the scheme name, which defines its namespace, its purpose, and the syntax of the remaining part of the URI. Most Web-enabled programs will try to dereference a URI according to the semantics of its scheme and the context. For example, a Web browser will usually dereference the URI http://example.org/ by performing an HTTP request to the host example.org at the default HTTP port (port 80). Dereferencing the URI mailto:bob@example.com will usually open a "Compose e-mail" window with the address bob@example.com in the "To" field.
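The dissection described above can be reproduced programmatically. The following is a minimal sketch using Python's standard `urllib.parse` module; the helper name `dissect` and the small table of default ports are assumptions made for illustration only.

```python
from urllib.parse import urlsplit

# Illustrative table of default ports (an assumption for this sketch;
# only the HTTP default of 80 is mentioned in the text above).
DEFAULT_PORTS = {"http": 80, "https": 443}


def dissect(uri):
    """Split a URI into its scheme and a few scheme-specific components."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        # hostname is None for URIs without an authority part, e.g. mailto:
        "host": parts.hostname,
        # Fall back to the scheme's default port when none is written out.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
    }


print(dissect("http://example.org/"))
print(dissect("mailto:bob@example.com"))
```

For `http://example.org/` the sketch reports the host `example.org` and the default port 80. For `mailto:bob@example.com` there is no host at all, and the address ends up in the path component, which shows how the scheme decides how the rest of the URI is read.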
"example.com" is a domain name; an IP address or other network address might be used instead.
URLs as locators
In its current strict technical meaning, a URL is a URI that, “in addition to identifying a resource, [provides] a means of locating the resource by describing its primary access mechanism (e.g., its network ‘location’).”[1]
Clean URLs
"Clean" and "cruft-free" describe URLs which are:
- Not tied to technical details, such as the software used or whether the resource comes from a file or a database, so that a change in technology will not break existing links to the resource. E.g. /cars/audi/ is preferable to /cars/audi/index.php or /myprog.jsp?page=cars/audi/.
- Not tied to internal organisational structure, such as the current editor or the department that created the document, so that an internal reorganisation will not break existing links to the document. E.g. /recommendations/2007/xyz/ is better than /~users/jane/current-work/xyz/ or /xyz-team/recommendations/.
- Consistent with other URLs in the same site in terms of hierarchy. This is desirable so that a user can see where they are in the structure of the site and can predict where to find what they are looking for. E.g. /cars/audi/ and /cars/ford/, instead of /cars/audi/ but /ford-cars/.
- Consistent with other URLs in the same site in terms of action. This is desirable so that a user can predict other, similar URLs on that site. E.g. if /blogs/andrea/rss/ shows an RSS feed of Andrea's blog, then appending /rss/ to the URL of any other blog on the same site should show an RSS feed for that blog.
- A single location for a single resource. The same resource should not be available from multiple URLs, as this results in both confusion (Are they the same resource, or is one a copy of the other? Which is the 'right' one? Is one new and the other due to be removed?) and technical difficulties, e.g. counting links to a particular resource, or caching content to speed up access but being unable to serve the cached content when the resource is accessed through a different URL.
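The "single location" point can be enforced by mapping every variant of a URL path to one canonical form and redirecting to it. Below is a minimal sketch of such a mapping in Python. It assumes a hypothetical site in which every resource is addressed by a directory-style path ending in a slash; the function name `canonical_path` and the `INDEX_FILES` set are illustrative and not taken from any particular server.

```python
# File names treated as technology-specific directory indexes
# (an assumption for this sketch).
INDEX_FILES = {"index.php", "index.html", "index.jsp"}


def canonical_path(path):
    """Return the one canonical form for paths that name the same resource."""
    # Drop empty segments so that "/cars//audi" and "/cars/audi/" agree.
    segments = [s for s in path.split("/") if s]
    # Hide the technology: a trailing index file names its directory.
    if segments and segments[-1] in INDEX_FILES:
        segments.pop()
    # Always end in a slash, so there is exactly one spelling per resource.
    return "/" + "".join(s + "/" for s in segments)


print(canonical_path("/cars/audi/index.php"))  # -> /cars/audi/
print(canonical_path("/cars/audi"))            # -> /cars/audi/
```

A server built on this idea would answer a request for any non-canonical spelling with a redirect to the canonical one, so that link counts and caches all refer to the same URL.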
The difference between a "standard" and a "clean" URL can be illustrated as follows:
Standard:
http://example.com/index.php?section=articles&subsection=recent
Clean:
http://example.com/articles/recent/ or http://example.com/articles/2007/
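The correspondence between the two forms above can be sketched in a few lines of Python. This is only an illustration: the function name `to_clean_url` and the assumption that the `section` and `subsection` parameters form the site's hierarchy are hypothetical. In practice a rewrite engine on the server usually performs the mapping in the opposite direction, from the clean URL to the internal one.

```python
from urllib.parse import parse_qs, quote, urlsplit, urlunsplit


def to_clean_url(url, keys=("section", "subsection")):
    """Map ?section=a&subsection=b onto the path /a/b/ (hypothetical scheme)."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    segments = []
    for key in keys:
        # Stop at the first missing level so the hierarchy has no gaps.
        if key not in query:
            break
        # Percent-encode the value so it is safe as a single path segment.
        segments.append(quote(query[key][0], safe=""))
    path = "/" + "".join(segment + "/" for segment in segments)
    # The script name and the query string are dropped from the result.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(to_clean_url("http://example.com/index.php?section=articles&subsection=recent"))
# -> http://example.com/articles/recent/
```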
Clean URLs with web services
Web services have been created that let users create short URLs, which are easier to write down, remember or pass around. Short URLs are also better suited to places where space is limited, for example an IRC conversation, an email signature, an online forum or a fixed-width document (e.g. email). A sample of current web services is provided below:
- TinyURL.com - probably the most widely used due to its memorable name. Example: http://www.tinyurl.com/2unsh
- doiop.com - one of the early services which offers keywords as opposed to random URLs. Example: http://doiop.com/keyword.
- SnipURL.com
- shorl.com
- URLStrip.com
Criticisms of third-party clean URLs
These services hide the final destination from the web user. This can be exploited to send unsuspecting people to sites that offend their sensibilities, or that crash or compromise their computer through browser vulnerabilities. To help combat such abuse, TinyURL lets a user set a cookie-based preference so that clicking a TinyURL stops at the TinyURL website and shows a preview of the final link. Substituting http://preview.tinyurl.com for http://tinyurl.com in the URL is another way of previewing the final link before clicking through to it. Spammers also exploit this opaqueness[2], using such links in spam (mostly blog spam) to bypass URL blacklists.
Furthermore, this approach creates dependency on a third-party service that may change, go away, or maintain privacy-compromising logs of user activity indefinitely.
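The host substitution described above for previewing a TinyURL can be automated. The following Python sketch rewrites a link before it is followed; the function name `tinyurl_preview` is hypothetical, and treating `www.tinyurl.com` the same way as `tinyurl.com` is an assumption of this sketch rather than something stated above.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts rewritten to the preview host; including the "www." variant
# is an assumption of this sketch.
TINYURL_HOSTS = {"tinyurl.com", "www.tinyurl.com"}


def tinyurl_preview(url):
    """Point a TinyURL link at preview.tinyurl.com; leave other URLs alone."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in TINYURL_HOSTS:
        return url
    # Keep scheme, path, query and fragment; swap only the host.
    return urlunsplit(
        (parts.scheme, "preview.tinyurl.com", parts.path, parts.query, parts.fragment)
    )


print(tinyurl_preview("http://tinyurl.com/2unsh"))
# -> http://preview.tinyurl.com/2unsh
```

This only helps with a service that offers such a preview host. For other shortening services the destination stays hidden until the link is followed.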
Address Bar
A URL is typed into the address bar (also called the location bar) of a web browser. The illustration shows a standard Microsoft Internet Explorer address bar; it may look different in other web browsers and operating systems.
See also
- CURIE (Compact URI)
- Extensible Resource Identifier (XRI)
- History of the Internet
- Internationalized Resource Identifier (IRI)
- Percent-encoding
- Rewrite engine
- Uniform Resource Identifier (URI)
- URL normalization
- Website
References
- ^ Tim Berners-Lee, Roy T. Fielding, Larry Masinter. (January 2005). “Uniform Resource Identifier (URI): Generic Syntax”. Internet Society. RFC 3986; STD 66.
- ^ Spam Spotted Using TinyURL