HTML

December 1, 2007

URL Encoding

Background

URL encoding is the process of converting a string into a valid URL format. A valid URL contains only what are termed "alpha | digit | safe | extra | escape" characters. You can read more about the what and the why of these terms on the World Wide Web Consortium site: http://www.w3.org/Addressing/URL/url-spec.html and http://www.w3.org/International/francois.yergeau.html.

URL encoding is normally performed on data passed via HTML forms, because such data may contain special characters, such as "/", ".", "#", and so on, which could: a) have a special meaning; b) not be valid in a URL; or c) be altered during transfer. For instance, the "#" character needs to be encoded because it has a special meaning, that of an HTML anchor. The <space> character also needs to be encoded because it is not allowed in a valid URL. Also, some characters, such as "~", might not transport properly across the internet.

Example

One of the most common encounters with URL encoding is when dealing with HTML forms. Form methods (GET and POST) perform URL encoding implicitly. As an example, consider a form with a single text field named var that contains the text "This is a simple & short test".
  

  

This sample form sends the data in the text field using the GET method, which means that the data will be appended to the URL as a query string. If you submit the form and look at the resulting URL in the browser address bar, you should see something like this (the query string portion, after the "?", is URL encoded automatically by the browser):

http://www.permadi.com/tutorial/urlEncoding/example.html?var=This+is+a+simple+%26+short+test

Here, you can see that:
* The <space> character has been URL encoded as "+".
* The & character has been URL encoded as "%26".

The <space> character and the & character are just some of the special characters that need to be encoded. Submitting the characters $ & < > ? ; # : = , " ' ~ + % (separated by spaces) produces the following query string portion, which (as before) has been encoded by the browser automatically:

var=%24+%26+%3C+%3E+%3F+%3B+%23+%3A+%3D+%2C+%22+%27+%7E+%2B+%25

As you can see, when a character is URL encoded, it is converted to %XY, where X and Y are hexadecimal digits. You will see later where these numbers come from.

What Should be URL Encoded?

As a rule of thumb, any non-alphanumeric character should be URL encoded. This of course applies to characters that are to be interpreted as is (i.e., not intended to have a special meaning). In such cases, there is no harm in URL encoding a character even if it does not strictly need to be URL encoded.

Some Common Special Characters

Here is a table of some often used characters and their URL encodings.
Character    URL Encoded
;            %3B
?            %3F
/            %2F
:            %3A
#            %23
&            %26
=            %3D
+            %2B
$            %24
,            %2C
<space>      %20 or +
%            %25
<            %3C
>            %3E
~            %7E
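
As a rough way to reproduce the browser's form encoding outside a browser, the sketch below uses the standard URLSearchParams class (available in modern browsers and Node.js) to build the same query strings as the two form examples earlier on this page. The helper name formQuery is made up for illustration; the field name var is taken from the example URL.

```javascript
// Hypothetical helper: build the query string a GET form would produce
// for a single field, using the standard URLSearchParams serializer.
function formQuery(fieldName, value) {
  const params = new URLSearchParams();
  params.append(fieldName, value);
  return params.toString();
}

// The two sample inputs from the form examples.
console.log(formQuery("var", "This is a simple & short test"));
console.log(formQuery("var", "$ & < > ? ; # : = , \" ' ~ + %"));
```

Both lines should match the query string portions shown earlier, with <space> written as "+" and the other special characters written as %XY.
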
Note that because the <space> character is very commonly used, a special code (the "+" sign) has been reserved as its URL encoding. Thus the string "A B" can be URL encoded as either "A%20B" or "A+B".

Where Do the Numbers Come From?

The number following the % sign is the two-digit hexadecimal ASCII code of the character being encoded. For example, & has ASCII code 38, which is 26 in hexadecimal, hence %26. The tables further down this page list the codes.

Language Support

Most web programming languages already provide built-in methods to perform URL encoding and URL decoding. Here are the common ones.
Language             URL Encoding                                         URL Decoding
JavaScript           escape(String) (note: does not encode '/' and '+')   unescape(String)
PHP                  urlencode(string)                                    urldecode(string)
ASP                  Server.URLEncode(string)                             ?
Perl                 uri_escape (from the URI::Escape module)             uri_unescape
Java                 java.net.URLEncoder.encode(String, String)           java.net.URLDecoder.decode(String, String)
Flash (MX or later)  escape(expression)                                   unescape(expression)
VBScript             escape(string)                                       unescape(string)
.NET                 HttpUtility.UrlEncode                                HttpUtility.UrlDecode
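
To illustrate the decoding direction, here is a minimal sketch that decodes a form-encoded value by hand. It assumes the value uses "+" for <space>, as described above, and relies on the standard decodeURIComponent function, which is not in the table but is built into current JavaScript engines. The name decodeFormValue is hypothetical.

```javascript
// Hypothetical helper: decode one form-encoded value.
// "+" means <space> in form data, so convert it before percent-decoding;
// a literal plus sign arrives as %2B and is restored by decodeURIComponent.
function decodeFormValue(encoded) {
  return decodeURIComponent(encoded.replace(/\+/g, " "));
}

console.log(decodeFormValue("This+is+a+simple+%26+short+test"));
// Both spellings of an encoded <space> decode to the same string.
console.log(decodeFormValue("A%20B") === decodeFormValue("A+B"));
```

The order matters: replacing "+" after percent-decoding would also turn a decoded %2B into a <space>.
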
A JavaScript URL Encoder

The example below uses the escape and unescape functions. Because the escape function leaves the '+' and '/' characters unencoded (I've no idea why it's programmed that way), these characters need to be converted manually. This is done using the String.replace function with a global regular expression, so that every occurrence is replaced and not only the first one.

function TestEncoding()
{
  // Read the raw text from the "Not encoded" field.
  var inputString=document.forms["TestEncodingForm"]["inputString"].value;
  var encodedInputString=escape(inputString);
  // escape() leaves '+' and '/' as is, so encode every occurrence manually.
  encodedInputString=encodedInputString.replace(/\+/g, "%2B");
  encodedInputString=encodedInputString.replace(/\//g, "%2F");
  // Show the result in the "Encoded" field.
  document.forms["TestEncodingForm"]["encodedInputString"].value=encodedInputString;
}

The function is wired to a form with two text fields, Not encoded and Encoded. Type anything in the Not encoded field and press the URL Encode button, and the encoded string is displayed in the Encoded field. Alternatively, type anything in the Encoded field and press the URL Decode button, and the decoded string is displayed in the Not encoded field.
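
The same idea can be tried without a form. The sketch below is a DOM-free variant of the function above, under the assumption that escape and unescape are still available (they are legacy functions, but current browsers and Node.js provide them). The names urlEncode and urlDecode are hypothetical.

```javascript
// Hypothetical DOM-free variant of TestEncoding.
function urlEncode(text) {
  // escape() leaves '+' and '/' untouched, so encode every occurrence by hand.
  return escape(text).replace(/\+/g, "%2B").replace(/\//g, "%2F");
}

function urlDecode(encoded) {
  // unescape() reverses the %XY sequences, including %2B and %2F.
  return unescape(encoded);
}

const sample = "1+1/2 = 1.5";
console.log(urlEncode(sample));
console.log(urlDecode(urlEncode(sample)) === sample);
```

Because the replacement uses the g flag, a string with several plus signs such as "a+b+c" is encoded completely.
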

URL-encoding from %00 to %8f
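Note: codes above %7f are not ASCII. The characters shown for them are those of the Windows-1252 code page, in which %81, %8d and %8f are unused.

Character        URL-encode  Character  URL-encode  Character  URL-encode
NUL              %00         0          %30         `          %60
(control)        %01         1          %31         a          %61
(control)        %02         2          %32         b          %62
(control)        %03         3          %33         c          %63
(control)        %04         4          %34         d          %64
(control)        %05         5          %35         e          %65
(control)        %06         6          %36         f          %66
(control)        %07         7          %37         g          %67
backspace        %08         8          %38         h          %68
tab              %09         9          %39         i          %69
linefeed         %0a         :          %3a         j          %6a
(control)        %0b         ;          %3b         k          %6b
(control)        %0c         <          %3c         l          %6c
carriage return  %0d         =          %3d         m          %6d
(control)        %0e         >          %3e         n          %6e
(control)        %0f         ?          %3f         o          %6f
(control)        %10         @          %40         p          %70
(control)        %11         A          %41         q          %71
(control)        %12         B          %42         r          %72
(control)        %13         C          %43         s          %73
(control)        %14         D          %44         t          %74
(control)        %15         E          %45         u          %75
(control)        %16         F          %46         v          %76
(control)        %17         G          %47         w          %77
(control)        %18         H          %48         x          %78
(control)        %19         I          %49         y          %79
(control)        %1a         J          %4a         z          %7a
(control)        %1b         K          %4b         {          %7b
(control)        %1c         L          %4c         |          %7c
(control)        %1d         M          %4d         }          %7d
(control)        %1e         N          %4e         ~          %7e
(control)        %1f         O          %4f         (control)  %7f
space            %20         P          %50         €          %80
!                %21         Q          %51         (unused)   %81
"                %22         R          %52         ‚          %82
#                %23         S          %53         ƒ          %83
$                %24         T          %54         „          %84
%                %25         U          %55         …          %85
&                %26         V          %56         †          %86
'                %27         W          %57         ‡          %87
(                %28         X          %58         ˆ          %88
)                %29         Y          %59         ‰          %89
*                %2a         Z          %5a         Š          %8a
+                %2b         [          %5b         ‹          %8b
,                %2c         \          %5c         Œ          %8c
-                %2d         ]          %5d         (unused)   %8d
.                %2e         ^          %5e         Ž          %8e
/                %2f         _          %5f         (unused)   %8f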
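
Each ASCII row of the table can be derived from the character's numeric code. As a small illustration (the function name percentCode is made up, and the sketch only covers the ASCII range %00 to %7f):

```javascript
// Hypothetical helper: derive the %XY form of an ASCII character
// from its numeric code, as listed in the table.
function percentCode(ch) {
  const code = ch.charCodeAt(0);
  if (code > 0x7f) {
    throw new RangeError("not an ASCII character");
  }
  // Two lowercase hexadecimal digits, zero-padded (tab is 9, so "%09").
  return "%" + code.toString(16).padStart(2, "0");
}

console.log(percentCode("&"));  // %26
console.log(percentCode("\t")); // %09
console.log(percentCode("~"));  // %7e
```
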

URL-encoding from %90 to %ff

Character        URL-encode  Character  URL-encode  Character  URL-encode
(unused)         %90         À          %c0         ð          %f0
‘                %91         Á          %c1         ñ          %f1
’                %92         Â          %c2         ò          %f2
“                %93         Ã          %c3         ó          %f3
”                %94         Ä          %c4         ô          %f4
•                %95         Å          %c5         õ          %f5
(en dash)        %96         Æ          %c6         ö          %f6
(em dash)        %97         Ç          %c7         ÷          %f7
˜                %98         È          %c8         ø          %f8
™                %99         É          %c9         ù          %f9
š                %9a         Ê          %ca         ú          %fa
›                %9b         Ë          %cb         û          %fb
œ                %9c         Ì          %cc         ü          %fc
(unused)         %9d         Í          %cd         ý          %fd
ž                %9e         Î          %ce         þ          %fe
Ÿ                %9f         Ï          %cf         ÿ          %ff
(nbsp)           %a0         Ð          %d0
¡                %a1         Ñ          %d1
¢                %a2         Ò          %d2
£                %a3         Ó          %d3
¤                %a4         Ô          %d4
¥                %a5         Õ          %d5
¦                %a6         Ö          %d6
§                %a7         ×          %d7
¨                %a8         Ø          %d8
©                %a9         Ù          %d9
ª                %aa         Ú          %da
«                %ab         Û          %db
¬                %ac         Ü          %dc
(soft hyphen)    %ad         Ý          %dd
®                %ae         Þ          %de
¯                %af         ß          %df
°                %b0         à          %e0
±                %b1         á          %e1
²                %b2         â          %e2
³                %b3         ã          %e3
´                %b4         ä          %e4
µ                %b5         å          %e5
¶                %b6         æ          %e6
·                %b7         ç          %e7
¸                %b8         è          %e8
¹                %b9         é          %e9
º                %ba         ê          %ea
»                %bb         ë          %eb
¼                %bc         ì          %ec
½                %bd         í          %ed
¾                %be         î          %ee
¿                %bf         ï          %ef
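
The codes in this table assume one byte per character. Many current tools instead encode the UTF-8 bytes of a character, so the same letter can produce a different result. A brief comparison, assuming an environment where both the legacy escape function and the standard encodeURIComponent function exist (the variable name eAcute is only for illustration):

```javascript
// "\u00e9" is the letter é, written as an escape sequence so the
// example does not depend on the encoding of the source file.
const eAcute = "\u00e9";

// escape() emits the single-byte code for characters up to U+00FF,
// which is the form listed in the table (in uppercase hex).
console.log(escape(eAcute));             // %E9

// encodeURIComponent() encodes the two UTF-8 bytes of the character.
console.log(encodeURIComponent(eAcute)); // %C3%A9
```

When decoding, the receiving side has to assume the same character encoding that was used for encoding.
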
URL Encoding (VB.net)
<%@ Page Language="VB" %>
<html>
   <head>
      <title>URLEncoding</title>
   <script runat="server">
      ' On postback, write the URL-encoded value of the "name" field to the response.
      Sub Page_Load()
         If IsPostBack Then
            Response.Write(Server.UrlEncode(Request.Form("name")))
         End If
      End Sub
   </script>
   </head>
<body>
<%--    <form id="form1" action="UrlEncode.aspx"  method="POST" runat="server">
 --%>
   <form id="form1" method="POST" runat="server">
      <h3>Name:</h3>
      <input type="text" id="name" runat="server">
      <input type="submit" runat="server">
   </form>
</body>
</html>