Decode escaped characters in URL

Python, Escaping

Python Problem Overview


I have a list of URLs that contain escaped characters. Those characters were escaped by urllib2.urlopen when it retrieved the HTML page:

http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&action=edit
http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&action=history
http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&variant=zh 

Is there a way to transform them back to their unescaped form in Python?

P.S.: The URLs are encoded in UTF-8.

Python Solutions


Solution 1 - Python

Using the urllib package (import urllib):

Python 2.7

From the official documentation:

> urllib.unquote(string)
>
> Replace %xx escapes by their single-character equivalent.
>
> Example: unquote('/%7Econnolly/') yields '/~connolly/'.

Python 3

From the official documentation:

> urllib.parse.unquote(string, encoding='utf-8', errors='replace')
>
> […]
>
> Example: unquote('/El%20Ni%C3%B1o/') yields '/El Niño/'.
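To make the encoding and errors parameters in that signature more concrete, here is a minimal Python 3 sketch. The Latin-1 sample string ('/El%20Ni%F1o/') is made up for illustration and does not come from the documentation.

```python
from urllib.parse import unquote

# Default behaviour: the escaped bytes are decoded as UTF-8.
utf8_result = unquote('/El%20Ni%C3%B1o/')

# Hypothetical input where the same character was escaped as Latin-1 (%F1);
# it only decodes correctly when the encoding is given explicitly.
latin1_result = unquote('/El%20Ni%F1o/', encoding='latin-1')

# With the default errors='replace', a byte that is not valid UTF-8
# becomes the replacement character U+FFFD instead of raising an error.
replaced_result = unquote('/El%20Ni%F1o/')

print(utf8_result)
print(latin1_result)
print(repr(replaced_result))
```

Since the URLs in the question are UTF-8 encoded, the defaults should be enough for them.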

Solution 2 - Python

If you are using Python 3, you can use:

import urllib.parse
urllib.parse.unquote(url)
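Applied to the URLs from the question, this might look like the sketch below (Python 3). The variable names urls and decoded are chosen here for illustration.

```python
from urllib.parse import unquote

# The three sample URLs from the question.
urls = [
    'http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&action=edit',
    'http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&action=history',
    'http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&variant=zh',
]

# unquote decodes the %XX escapes as UTF-8 by default.
decoded = [unquote(u) for u in urls]

for u in decoded:
    print(u)
```

Each title parameter should come back as the two-character string U+9996 U+9875 instead of its percent-escaped bytes.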

Solution 3 - Python

Alternatively, use urllib.unquote_plus, which also replaces plus signs with spaces:

>>> import urllib
>>> urllib.unquote('erythrocyte+membrane+protein+1%2C+PfEMP1+%28VAR%29')
'erythrocyte+membrane+protein+1,+PfEMP1+(VAR)'
>>> urllib.unquote_plus('erythrocyte+membrane+protein+1%2C+PfEMP1+%28VAR%29')
'erythrocyte membrane protein 1, PfEMP1 (VAR)'
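The session above is Python 2. In Python 3 both functions live in urllib.parse; a rough equivalent, reusing the same sample string, could be:

```python
from urllib.parse import unquote, unquote_plus

escaped = 'erythrocyte+membrane+protein+1%2C+PfEMP1+%28VAR%29'

# unquote only decodes %XX escapes and leaves '+' untouched.
plain = unquote(escaped)

# unquote_plus additionally turns '+' into a space, as used in
# HTML form values and query strings.
with_spaces = unquote_plus(escaped)

print(plain)
print(with_spaces)
```

Which one fits depends on where the string came from: '+' means a space in form-encoded query values, but is a literal plus sign elsewhere in a URL.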

Solution 4 - Python

You can use urllib.unquote (in Python 2).
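Because the function moved between versions (urllib.unquote in Python 2, urllib.parse.unquote in Python 3), code that has to run on both can try one import and fall back to the other. This is a common pattern, sketched here as an assumption about how you might structure it rather than something from the original answer.

```python
try:
    # Python 3 location
    from urllib.parse import unquote
except ImportError:
    # Python 2 location
    from urllib import unquote

print(unquote('/%7Econnolly/'))
```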

Solution 5 - Python

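import re

def unquote(url):
    # Replace each %XX escape with the character whose code is hex XX.
    # Each escape is converted on its own, so multi-byte UTF-8 sequences
    # are only reassembled when working with Python 2 byte strings.
    return re.sub(r'%([0-9a-fA-F]{2})', lambda m: chr(int(m.group(1), 16)), url)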
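The regex approach above maps every escape to one character on its own. Under Python 3, where chr returns a Unicode code point, that would not reassemble multi-byte UTF-8 sequences such as %E9%A6%96. A possible byte-aware variant is sketched below; the name unquote_utf8 is hypothetical, and in practice urllib.parse.unquote is the safer choice.

```python
import re

# Matches one or more consecutive %XX escapes so that a whole
# multi-byte sequence is decoded together.
_ESCAPE_RUN = re.compile(r'(?:%[0-9a-fA-F]{2})+')

def unquote_utf8(url):
    """Decode percent-escapes in url, treating the bytes as UTF-8."""
    def decode_run(match):
        # '%E9%A6%96' -> 'E9A696' -> b'\xe9\xa6\x96' -> decoded text
        raw = bytes.fromhex(match.group(0).replace('%', ''))
        return raw.decode('utf-8', errors='replace')
    return _ESCAPE_RUN.sub(decode_run, url)

print(unquote_utf8('index.php?title=%E9%A6%96%E9%A1%B5&action=edit'))
```

A lone '%' that is not followed by two hex digits is left as it is.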

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author
Question | Tony
Solution 1 - Python | Ignacio Vazquez-Abrams
Solution 2 - Python | Vladir Parrado Cruz
Solution 3 - Python | dli
Solution 4 - Python | Klaus Byskov Pedersen
Solution 5 - Python | mistercx