Why Do URLs Have Weird Characters Like %20? (Secrets of URL Encoding)
Have you ever copied a URL from your browser and pasted it into a messenger, only to find it turned into a mess of %EA%B0%80... characters?
Spaces, for instance, turn into %20. Why can't URLs just use characters as they are instead of converting them into this complex format?
The official name for this phenomenon is URL Encoding or Percent-encoding.
1. The Early Rules of the Internet: ASCII
When the internet was first built, the standard for computer communication was ASCII. ASCII is a 7-bit character set that includes only English letters, numbers, and a few special characters.
That legacy carried over into the current URL standard (RFC 3986), which strictly limits the characters allowed to appear literally in a URL.
- Unreserved Characters: A-Z, a-z, 0-9, -, _, ., ~
- Reserved Characters: :, /, ?, #, &, etc. (Used for URL structure)
Any character not in this list (like spaces, non-English characters, or other symbols) had to be converted into an ASCII-compatible format for the system to understand it.
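As a rough illustration of the list above, the following Python sketch flags the characters in a string that are not unreserved. The names `UNRESERVED` and `needs_encoding` are made up for this example, and the check is deliberately simplified: it also flags reserved characters, which is only appropriate when they appear as data rather than as URL structure.

```python
import string

# Unreserved characters from the list above: letters, digits, and - _ . ~
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")


def needs_encoding(ch: str) -> bool:
    """Return True if ch is not an unreserved character (hypothetical helper)."""
    return ch not in UNRESERVED


text = "hello world~"
flagged = [ch for ch in text if needs_encoding(ch)]
print(flagged)  # [' ']
```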
2. How Percent-Encoding Works
The conversion rule is simple.
- Take each byte of the character and write its value as two hexadecimal digits.
- Prefix each byte with a % sign.
Example: Space
In ASCII, a space has the decimal value 32, which is 20 in hexadecimal.
Therefore, a space becomes %20.
Example: Non-ASCII Characters
For characters like emojis or letters from other languages, the UTF-8 byte sequence is encoded. If a character takes 3 bytes in UTF-8, it becomes three percent-encoded groups (e.g., %E2%9C%A8 for ✨).
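The two examples above can be reproduced with a few lines of Python. This is a minimal sketch, not a full implementation of the standard: `percent_encode` is a hypothetical helper that encodes every character outside the unreserved set, and the last line compares it with `urllib.parse.quote` from the standard library.

```python
import string
from urllib.parse import quote

# Characters that may appear in a URL without encoding (unreserved set).
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")


def percent_encode(text: str) -> str:
    """Percent-encode every character that is not unreserved (illustrative helper)."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # One %XX group per UTF-8 byte of the character.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


print(percent_encode(" "))   # %20
print(percent_encode("✨"))  # %E2%9C%A8
print(percent_encode("가"))  # %EA%B0%80

# The standard library produces the same result when no extra
# characters are marked as safe.
print(quote("a b✨", safe=""))  # a%20b%E2%9C%A8
```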
3. Security and Stability
URL encoding is important not just for representation, but also for security and stability.
For example, what if a URL parameter value contains & or =?
search?query=A&B
The system might misinterpret this as query=A and a new parameter B.
Encoding & as %26 and sending search?query=A%26B instead lets the system correctly interpret it as the literal character '&'.
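The difference is easy to observe with Python's standard `urllib.parse` module. The sketch below parses the raw and the encoded query strings from the example; the variable names are chosen only for illustration.

```python
from urllib.parse import parse_qs, urlencode

# Unencoded: the parser treats & as a separator, so B becomes its own parameter.
raw = "query=A&B"
raw_params = parse_qs(raw, keep_blank_values=True)
print(raw_params)  # {'query': ['A'], 'B': ['']}

# Encoded: urlencode turns & into %26, so the value survives intact.
encoded = urlencode({"query": "A&B"})
print(encoded)  # query=A%26B

encoded_params = parse_qs(encoded)
print(encoded_params)  # {'query': ['A&B']}
```

Building query strings with a function such as `urlencode`, rather than by string concatenation, is one way to avoid this class of bug.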
Conclusion
The alien-looking strings like %20 are a product of the internet's ASCII history and of the effort to stay compatible with it.
While browser address bars show decoded characters for user convenience, remember that behind the scenes, percent signs are still busily traveling across the network.