I thought it might be a nice idea to go over the different identifiers used on the web. Tim Berners-Lee recently showed some exasperation at us all calling everything a “URL”. With the Semantic Web gaining momentum, it’s even more important to start calling things by their correct names; otherwise, the confusion could cause serious problems in time. I already wrote about the difference between URL, URI and URN in a post called “WP Plugins to get semantic”:
URI vs URL:
On the web, people manipulate documents, but on the Semantic Web they can manipulate far more kinds of resources than that. We refer to them with URIs (Uniform Resource Identifiers). URLs (Uniform Resource Locators) can be resolved on the web, but a URI won’t necessarily be. In fact, all URLs are URIs, but not all URIs are URLs. A URI may designate a topic, an author, a publication or a website, for example. Every URI starts with a scheme, such as “http:”, followed by the identifier for the resource within that scheme. It identifies the resource by location, by name, or both.
URN (Uniform Resource Name) is the other kind of URI (the first being URL). A URN names the resource within a namespace but doesn’t say how it can be accessed.
- URI is an identifier for a resource
- URL gives information on how to get to that resource (its location)
- URN provides the name of that resource
A URL is also a URI, so the name URL is actually redundant when talking about applications. Switch to URI.
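To make the distinction concrete, here is a minimal Python sketch that sorts a few identifiers by their scheme. The `classify` helper, the set of "locator" schemes and the example identifiers are all assumptions made for illustration; real schemes don't divide this neatly, so treat it as a rough picture rather than a rule.

```python
from urllib.parse import urlsplit

# Illustrative assumption: schemes we treat as "locators" here.
# This set is not exhaustive and is not defined by any standard.
LOCATOR_SCHEMES = {"http", "https", "ftp"}

def classify(uri: str) -> str:
    """Roughly label an identifier as URL, URN or some other URI."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme == "urn":
        return "URN"        # a name inside a namespace, no access method
    if scheme in LOCATOR_SCHEMES:
        return "URL"        # says where and how to fetch the resource
    return "other URI"      # identifies something, may not be resolvable

# Hypothetical example identifiers.
for uri in [
    "http://example.org/posts/42",
    "urn:isbn:0451450523",
    "tag:example.org,2024:post-42",
]:
    print(classify(uri), uri)
```

All three strings are URIs; only the first one tells you where to go and fetch something.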
PURL:
I didn’t cover PURL though. PURL stands for “Persistent Uniform Resource Locator”. PURLs are web addresses that act as permanent identifiers in the ever-changing, dynamic web that we have now. They don’t resolve directly to web resources, but allow for a level of indirection, which means that the addresses of pages can change over time without impairing other systems that depend on them. “This capability provides continuity of references to network resources that may migrate from machine to machine for business, social or technical reasons”. (Purl.org)
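The indirection can be pictured as a lookup table that maps a persistent address to wherever the resource currently lives. The sketch below is only a toy model under that assumption: the table, the `resolve` and `move` helpers and the example addresses are hypothetical, and a real PURL service would typically answer with an HTTP redirect rather than a dictionary lookup.

```python
from typing import Optional

# Hypothetical table: persistent path -> current location of the resource.
purl_table = {
    "/net/example/report": "http://old-host.example.org/report.html",
}

def resolve(purl_path: str) -> Optional[str]:
    """Return the current location for a persistent path, or None if unknown."""
    return purl_table.get(purl_path)

def move(purl_path: str, new_location: str) -> None:
    """Record that the resource migrated; the persistent path stays the same."""
    purl_table[purl_path] = new_location

print(resolve("/net/example/report"))
move("/net/example/report", "http://new-host.example.org/docs/report.html")
print(resolve("/net/example/report"))
```

Anyone who stored the persistent path keeps a working reference after the move; only the table entry changed.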
IRI:
IRI stands for “Internationalized Resource Identifier”. IRIs behave exactly like URIs except that they can use the whole range of Unicode characters, not just ASCII. “Every IRI has a corresponding encoding as a URI, in case an IRI needs to be used in a protocol (such as HTTP) that accepts only URIs”. (IBM)
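As a rough illustration of that IRI-to-URI encoding, the Python sketch below converts the host name to its ASCII (Punycode) form and percent-encodes the UTF-8 bytes of everything else. The `iri_to_uri` helper and the example IRI are assumptions made for this post, and the function is simplified: it ignores user info in the authority and relies on Python's built-in `idna` codec, which follows the older IDNA 2003 rules.

```python
from urllib.parse import urlsplit, urlunsplit, quote

def iri_to_uri(iri: str) -> str:
    """Simplified IRI -> URI mapping (no user-info handling)."""
    parts = urlsplit(iri)
    # Host names are converted to their ASCII-compatible (Punycode) form.
    netloc = (parts.hostname or "").encode("idna").decode("ascii")
    if parts.port is not None:
        netloc += f":{parts.port}"
    # Other components: percent-encode the UTF-8 bytes of non-ASCII characters.
    # "%" is kept as safe so existing escapes are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%/?")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))

print(iri_to_uri("http://bücher.example/café?q=naïve"))
# http://xn--bcher-kva.example/caf%C3%A9?q=na%C3%AFve
```

An identifier that is already plain ASCII comes out unchanged, which matches the idea that every URI is also a valid IRI.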
What Tim says:
In his original Linked Data note, Tim Berners-Lee said that we should all follow these simple steps:
1 – Use URIs as names for things
2 – Use HTTP URIs so that people can look up those names.
3 – When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
4 – Include links to other URIs, so that they can discover more things.
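Steps 2 and 3 can be sketched as an HTTP request that asks for RDF instead of HTML. This is one common way of doing it (content negotiation through the `Accept` header), not something the steps themselves prescribe; the `rdf_lookup_request` helper and the example URI are hypothetical, and the request is only built here, never sent.

```python
from urllib.request import Request

def rdf_lookup_request(uri: str) -> Request:
    """Build (but do not send) a GET request that prefers RDF representations."""
    return Request(
        uri,
        headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"},
    )

# Hypothetical HTTP URI used as the name of a thing (steps 1 and 2).
req = rdf_lookup_request("http://example.org/id/alice")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; a server following step 3 would answer with RDF describing the thing, and following step 4 that description would link to other URIs.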
Resources:
Report from the Joint W3C/IETF URI Planning Interest Group
Functional Requirements for Uniform Resource Names
Internationalized Resource Identifiers (IRIs)



