Characters allowed in GET parameter


Parameters Problem Overview


Which characters are allowed in GET parameters without encoding or escaping them? I mean something like this:

http://www.example.org/page.php?name=XYZ

What can you have there instead of XYZ? I think only the following characters:

  • a-z (A-Z)
  • 0-9
  • _

Is this the full list or are there additional characters allowed?

I hope you can help me. Thanks in advance!

Parameters Solutions


Solution 1 - Parameters

Some characters are reserved because they carry special meaning: the general delimiters :/?#[]@ and the sub-delimiters !$&'()*+,;=

There is also a set of unreserved characters (alphanumerics and -._~), which never need to be encoded.

That means anything outside the unreserved set is supposed to be %-encoded when it is not being used for its special meaning (e.g. when passed as part of a GET parameter value).
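As a rough illustration of that rule, here is a minimal Python sketch. It assumes Python 3.7 or later, where `urllib.parse.quote` leaves `~` untouched. The helper name `encode_value` is made up for this example.

```python
from urllib.parse import quote


def encode_value(value: str) -> str:
    """Percent-encode everything except unreserved characters (hypothetical helper)."""
    # safe="" overrides quote()'s default safe="/" so that "/" is encoded too.
    return quote(value, safe="")


print(encode_value("Tom & Jerry/1+1=2"))  # Tom%20%26%20Jerry%2F1%2B1%3D2
print(encode_value("a-b.c_d~e"))          # a-b.c_d~e (unreserved, left as-is)
```

This is the strictest reading of the rule: it encodes every reserved character in the value, which is always safe even where it is not strictly required.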

See also RFC3986: Uniform Resource Identifier (URI): Generic Syntax

Solution 2 - Parameters

The question asks which characters are allowed in GET parameters without encoding or escaping them.

According to RFC3986 (general URL syntax) and RFC7230, section 2.7.1 (HTTP/S URL syntax), the only characters you need to percent-encode are those outside of the query set (see the definition below).

However, there are additional specifications such as HTML5, Web forms, and the obsolete W3C recommendation on indexed search. Those documents give a special meaning to some characters, notably = & + ;.

Other answers here suggest that most of the reserved characters should be encoded, including "/" and "?". That's not correct. In fact, RFC3986, section 3.4 advises against percent-encoding the "/" and "?" characters:

> it is sometimes better for usability to avoid percent-encoding those characters.

RFC3986 defines query component as:

query       = *( pchar / "/" / "?" )
pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~" 

> A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component.
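The grammar above can be turned into a quick syntax check. The following is only a sketch: the names `QUERY_RE` and `is_valid_query` are invented for this example, and the regular expression is a direct transcription of the `query` rule, not an official validator.

```python
import re

UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="

# query = *( pchar / "/" / "?" )
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
QUERY_RE = re.compile(
    r"(?:[" + UNRESERVED + SUB_DELIMS + r":@/?]|%[0-9A-Fa-f]{2})*"
)


def is_valid_query(query: str) -> bool:
    """Return True if the string matches the RFC3986 query rule (hypothetical helper)."""
    return QUERY_RE.fullmatch(query) is not None


print(is_valid_query("name=XYZ"))     # True
print(is_valid_query("path=/a/b?c"))  # True: "/" and "?" are allowed in a query
print(is_valid_query("name=a b"))     # False: a raw space is outside the set
print(is_valid_query("a[b]=1"))       # False: "[" and "]" are outside the set
```

Note that this only checks syntax. It says nothing about whether "&" or "=" in the string is meant as a separator or as data.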

The conclusion is that the XYZ part should encode:

special: # % = & ;
Space
sub-delims
out of query set: [ ]
characters that cannot be represented in ASCII

The exception is when the special symbols = & ; are being used as key=value separators.

Encoding other characters is allowed but not necessary.
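To see the difference between "=" and "&" as separators and as data, here is a small Python sketch using the standard library's `urlencode` and `parse_qs`. The parameter names and values are made up for illustration. Be aware that `urlencode` follows the web-forms convention and writes a space as "+" instead of "%20".

```python
from urllib.parse import urlencode, parse_qs

# "&", "=", " " and "#" inside the values are data, not separators.
params = {"name": "a&b=c d", "lang": "C#"}

query = urlencode(params)
print(query)            # name=a%26b%3Dc+d&lang=C%23

# The only raw "&" and "=" left in the string are the real separators,
# so the original values can be recovered unambiguously.
print(parse_qs(query))  # {'name': ['a&b=c d'], 'lang': ['C#']}
```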

Solution 3 - Parameters

I did a test using the Chrome address bar and a $QUERY_STRING in bash, and observed the following:

~!@$%^&*()-_=+[{]}\|;:',./? and grave (backtick) are passed through as plaintext.

Space, ", < and > are converted to %20, %22, %3C and %3E respectively.

# and everything after it is ignored, since it is used by ye olde anchor (the URL fragment, which is not part of the query string).
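The "#" behaviour can be reproduced without a browser. This is a small sketch using Python's `urlsplit`; the URLs are variations on the example from the question.

```python
from urllib.parse import urlsplit

raw = urlsplit("http://www.example.org/page.php?name=X#YZ")
escaped = urlsplit("http://www.example.org/page.php?name=X%23YZ")

# A raw "#" ends the query and starts the fragment.
print(repr(raw.query), repr(raw.fragment))          # 'name=X' 'YZ'

# Encoded as %23, it stays inside the query.
print(repr(escaped.query), repr(escaped.fragment))  # 'name=X%23YZ' ''
```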

Personally, I'd say bite the bullet and encode with base64 :)
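If you do go the base64 route, keep in mind that the standard alphabet contains "+", "/" and "=", which have their own meaning in URLs. A possible sketch, assuming the URL-safe alphabet from Python's `base64` module and stripping the "=" padding, is shown below. The helper names `to_token` and `from_token` are invented for this example.

```python
import base64


def to_token(text: str) -> str:
    """Encode text as URL-safe base64 without padding (hypothetical helper)."""
    raw = base64.urlsafe_b64encode(text.encode("utf-8"))
    return raw.rstrip(b"=").decode("ascii")


def from_token(token: str) -> str:
    """Reverse to_token(), restoring the stripped '=' padding first."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")


token = to_token("a?b&c=d #e")
print(token)
print(from_token(token))  # a?b&c=d #e
```

The resulting token uses only letters, digits, "-" and "_", so it needs no further percent-encoding.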

Solution 4 - Parameters

All of the rules concerning the encoding of URIs (which include URNs and URLs) are specified in RFC1738 and RFC3986. Here's a TL;DR of these long and boring documents:

Percent-encoding, also known as URL encoding, is a mechanism for encoding information in a URI under certain circumstances. The characters allowed in a URI are either reserved or unreserved. Reserved characters are those that sometimes have special meaning, but they are not the only characters that need encoding.

There are 66 unreserved characters that don't need any encoding: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.~

There are 18 reserved characters which need to be encoded: !*'();:@&=+$,/?#[], and all the other characters must be encoded as well.

To percent-encode a character, simply concatenate "%" and its ASCII value as two hexadecimal digits. The PHP functions "urlencode" and "rawurlencode" do this job for you.
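The mechanism itself is small enough to write by hand. Below is a sketch in Python, not a replacement for the library functions. The name `percent_encode` is made up, and encoding non-ASCII characters via their UTF-8 bytes is an assumption of this example (it is the common practice, but the answer above only talks about ASCII values).

```python
# The 66 unreserved characters listed above.
UNRESERVED = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789-_.~"
)


def percent_encode(text: str) -> str:
    """Percent-encode every byte that is not an unreserved character."""
    out = []
    # Assumption: non-ASCII characters are encoded as their UTF-8 bytes.
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in UNRESERVED:
            out.append(char)
        else:
            # "%" followed by the byte value as two uppercase hex digits.
            out.append(f"%{byte:02X}")
    return "".join(out)


print(percent_encode("a b"))    # a%20b
print(percent_encode("50%&=?")) # 50%25%26%3D%3F
print(percent_encode("ü"))      # %C3%BC
```

In PHP, "rawurlencode" writes a space as %20 like this sketch does, while "urlencode" writes it as "+".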

Solution 5 - Parameters

From RFC 1738 on which characters are allowed in URLs:

> Only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

The reserved characters are ";", "/", "?", ":", "@", "=" and "&", which means you would need to URL-encode them if you wish to use them for anything other than their reserved purpose.

Solution 6 - Parameters

Alphanumeric characters and all of

~ - _ . ! * ' ( ) ,

are valid within a URL.

All other characters must be encoded.

Solution 7 - Parameters

"." | "!" | "~" | "*" | "'" | "(" | ")" are also acceptable [RFC2396]. Really, anything can be in a GET parameter if it is properly encoded.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | caw | View Question on Stackoverflow
Solution 1 - Parameters | Michael Krelin - hacker | View Answer on Stackoverflow
Solution 2 - Parameters | dmitri | View Answer on Stackoverflow
Solution 3 - Parameters | jimmetry | View Answer on Stackoverflow
Solution 4 - Parameters | Nino Filiu | View Answer on Stackoverflow
Solution 5 - Parameters | ctford | View Answer on Stackoverflow
Solution 6 - Parameters | womp | View Answer on Stackoverflow
Solution 7 - Parameters | geowa4 | View Answer on Stackoverflow