Every Web page on the Internet has a uniform resource locator, or URL, such as “http://www.example.com/products.html”. A URL cannot contain a space, which presents a problem if an HTML file is named “products and services.html.” Spaces and other characters that aren’t allowed in a URL must be encoded using a percent sign followed by the two-digit hexadecimal value assigned to the character in the ISO-Latin character set. A space is assigned number 32, which is 20 in hexadecimal. When you see “%20” in an encoded URL, it represents a space, as in “http://www.example.com/products%20and%20services.html”.
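The arithmetic above can be checked with a short Python sketch. Python is used here only for illustration; the article does not prescribe a language, and the variable names are made up for this example. The standard library function `urllib.parse.quote` performs the percent-encoding.

```python
from urllib.parse import quote

# A space is character number 32, which is 20 in hexadecimal.
space_code = ord(" ")
encoded_space = "%{:02X}".format(space_code)

# quote() percent-encodes characters that are not allowed in a URL path.
file_name = "products and services.html"
encoded_url = "http://www.example.com/" + quote(file_name)

print(space_code, encoded_space)  # 32 %20
print(encoded_url)  # http://www.example.com/products%20and%20services.html
```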
Valid URL Characters
There are relatively few valid characters in a URL: uppercase and lowercase letters of the alphabet; digits “0” through “9”; special characters: dash, underscore, period, tilde, exclamation point, asterisk, apostrophe and opening and closing parentheses; and reserved characters: percent sign, dollar sign, ampersand, plus sign, comma, forward slash, colon, semicolon, equal sign, question mark and “@” symbol. All other characters must be encoded using a percent sign and their assigned hexadecimal numbers. Reserved characters may appear unencoded only when they carry their special meaning; when they are used as ordinary data, they must be encoded. The percent sign, for example, always introduces an encoded character, so to include a literal percent sign in a URL, you must encode it as “%25.”
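As a rough illustration of the rule, the following Python sketch encodes every character that is not a letter, digit or one of the special characters listed above. The helper name `percent_encode` is hypothetical, and the sketch assumes UTF-8 bytes, which match ISO-Latin values only for the basic ASCII range. It treats reserved characters as ordinary data, so it always encodes them.

```python
# Characters treated here as valid without encoding
# (letters, digits and the "special" characters).
ALWAYS_VALID = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-_.!*'()"
)


def percent_encode(text: str) -> str:
    """Encode every character outside ALWAYS_VALID as %XX (hypothetical helper)."""
    out = []
    for ch in text:
        if ch in ALWAYS_VALID:
            out.append(ch)
        else:
            # Assumption: encode each UTF-8 byte as "%" plus two hex digits.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


print(percent_encode("100% off"))  # 100%25%20off
print(percent_encode("a&b=c"))     # a%26b%3Dc
```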
Another Option to Encode a Space
The space character has two common URL-encoded representations. Besides “%20,” a plus sign can represent a space, but only in the query string, the part of a URL that follows a question mark and carries form data. For example, the URL “http://www.example.com/songsearch.php?song=moonlight%20sonata” can also be encoded as “http://www.example.com/songsearch.php?song=moonlight+sonata”. In the path portion of a URL, such as a file name, a plus sign is read as a literal plus sign, so use “%20” there.
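The two representations can be compared in Python, whose standard library happens to offer a function for each convention. This is only a sketch; `quote_plus` is intended for query-string values, and the variable names are made up for the example.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = "moonlight sonata"

percent_form = quote(value)    # space becomes %20
plus_form = quote_plus(value)  # space becomes +

print(percent_form)  # moonlight%20sonata
print(plus_form)     # moonlight+sonata

# unquote_plus() reads both forms as a space; unquote() leaves "+" untouched.
print(unquote_plus(plus_form))     # moonlight sonata
print(unquote_plus(percent_form))  # moonlight sonata
print(unquote(plus_form))          # moonlight+sonata
```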
Passing Data With a URL
Each Web page in a website is a stand-alone collection of text, images, links and other Web content. Web pages don’t automatically share information among themselves. However, you can use the URL to pass data from one Web page to another by appending URL-encoded data using the special functionality of certain reserved characters. Append a question mark, the name of a variable, an equal sign and the value of the variable to the URL. Separate subsequent variables with an ampersand. For example, the following URL passes two variables, “composer” and “song,” with their corresponding values to the “songsearch.php” Web page: “http://www.example.com/songsearch.php?composer=beethoven&song=moonlight%20sonata”.
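A URL like the one above can be assembled programmatically. The sketch below uses Python's `urllib.parse.urlencode`; passing `quote_via=quote` makes it write spaces as “%20” rather than as plus signs, which is its default. The variable names are illustrative only.

```python
from urllib.parse import quote, urlencode

base_url = "http://www.example.com/songsearch.php"
variables = {"composer": "beethoven", "song": "moonlight sonata"}

# urlencode() writes each pair as name=value and joins the pairs with "&".
query = urlencode(variables, quote_via=quote)

# The question mark separates the page address from the data.
url = base_url + "?" + query

print(url)
# http://www.example.com/songsearch.php?composer=beethoven&song=moonlight%20sonata
```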
URL-Encoding With Scripting Languages
To receive data passed to a Web page, you typically use a server-side scripting language such as PHP or ASP.NET. Scripting languages enable a developer to create dynamic HTML pages with content that varies based on the information passed to the Web page. Scripting languages include functions that perform URL encoding for you, so you don’t have to devise a method to encode data on your own. For example, PHP provides a function called “urlencode” that takes a string as an argument and returns the URL-encoded result. It also provides a function called “urldecode” that reverses the encoding and returns the original string. PHP’s “urlencode” encodes spaces as plus signs instead of “%20”; its companion function “rawurlencode” encodes spaces as “%20.”
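The receiving side might look something like the following, sketched in Python rather than PHP. The function `render_search_page` is a hypothetical stand-in for a server-side script, not part of any framework. It decodes the variables from the URL and builds content that varies with their values.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit


def render_search_page(request_url: str) -> str:
    """Build a small HTML fragment from the URL's variables (hypothetical helper)."""
    query = urlsplit(request_url).query
    # parse_qs() decodes %XX sequences and plus signs; each value is a list.
    params = parse_qs(query)
    composer = params.get("composer", ["unknown"])[0]
    song = params.get("song", ["unknown"])[0]
    # Escape the values so that visitor-supplied text cannot inject markup.
    return "<p>Searching for {} by {}</p>".format(escape(song), escape(composer))


print(render_search_page(
    "http://www.example.com/songsearch.php?composer=beethoven&song=moonlight%20sonata"
))
# <p>Searching for moonlight sonata by beethoven</p>
```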
HTML Abbreviations
Many common symbols have HTML abbreviations, formally called character entity references. For example, the Euro symbol is represented as “&euro;”. If you include a URL in an HTML document that passes more than one variable of data, you must substitute the ampersand character with “&amp;”, the HTML abbreviation for an ampersand, for the Web page to pass HTML validity tests. For example, you can embed the following URL in an HTML document and the URL will pass HTML validity checks: “http://www.example.com?language=USen&amp;currency=USD”.
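The substitution can be automated. The Python sketch below uses the standard library's `html.escape` to prepare a URL for embedding in an HTML attribute, and `html.unescape` to show what a browser recovers from it. The link text and variable names are invented for the example.

```python
from html import escape, unescape

url = "http://www.example.com?language=USen&currency=USD"

# escape() replaces "&" with "&amp;" so the URL is safe inside HTML.
html_safe_url = escape(url)
link = '<a href="{}">Prices in USD</a>'.format(html_safe_url)

print(html_safe_url)  # http://www.example.com?language=USen&amp;currency=USD
print(link)

# Decoding the abbreviation yields the original URL again.
print(unescape(html_safe_url) == url)  # True

# Abbreviations for symbols decode the same way.
print(unescape("&euro;"))  # €
```

Without the substitution, a lenient HTML parser may read the start of “&currency” as the abbreviation for the generic currency sign, which can corrupt the variable name.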