How to replace whitespaces with underscore?

PythonString

Python Problem Overview


I want to replace whitespace with underscore in a string to create nice URLs. So that for example:

"This should be connected" 

Should become

"This_should_be_connected" 

I am using Python with Django. Can this be solved using regular expressions?

Python Solutions


Solution 1 - Python

You don't need regular expressions. Python has a built-in string method that does what you need:

mystring.replace(" ", "_")

Solution 2 - Python

Replacing spaces is fine, but I might suggest going a little further to handle other URL-hostile characters like question marks, apostrophes, exclamation points, etc.

Also note that the general consensus among SEO experts is that dashes are preferred to underscores in URLs.

import re

def urlify(s):

    # Remove all non-word characters (everything except numbers and letters)
    s = re.sub(r"[^\w\s]", '', s)

    # Replace all runs of whitespace with a single dash
    s = re.sub(r"\s+", '-', s)

    return s

# Prints: I-cant-get-no-satisfaction"
print(urlify("I can't get no satisfaction!"))

Solution 3 - Python

This takes into account blank characters other than space and I think it's faster than using re module:

url = "_".join( title.split() )

Solution 4 - Python

Django has a 'slugify' function which does this, as well as other URL-friendly optimisations. It's hidden away in the defaultfilters module.

>>> from django.template.defaultfilters import slugify
>>> slugify("This should be connected")

this-should-be-connected

This isn't exactly the output you asked for, but IMO it's better for use in URLs.

Solution 5 - Python

Using the re module:

import re
re.sub('\s+', '_', "This should be connected") # This_should_be_connected
re.sub('\s+', '_', 'And     so\tshould this')  # And_so_should_this

Unless you have multiple spaces or other whitespace possibilities as above, you may just wish to use string.replace as others have suggested.

Solution 6 - Python

use string's replace method:

"this should be connected".replace(" ", "_")

"this_should_be_disconnected".replace("_", " ")

Solution 7 - Python

Python has a built in method on strings called replace which is used as so:

string.replace(old, new)

So you would use:

string.replace(" ", "_")

I had this problem a while ago and I wrote code to replace characters in a string. I have to start remembering to check the python documentation because they've got built in functions for everything.

Solution 8 - Python

Surprisingly this library not mentioned yet

python package named python-slugify, which does a pretty good job of slugifying:

pip install python-slugify

Works like this:

from slugify import slugify

txt = "This is a test ---"
r = slugify(txt)
self.assertEquals(r, "this-is-a-test")

txt = "This -- is a ## test ---"
r = slugify(txt)
self.assertEquals(r, "this-is-a-test")

txt = 'C\'est déjà l\'été.'
r = slugify(txt)
self.assertEquals(r, "cest-deja-lete")

txt = 'Nín hǎo. Wǒ shì zhōng guó rén'
r = slugify(txt)
self.assertEquals(r, "nin-hao-wo-shi-zhong-guo-ren")

txt = 'Компьютер'
r = slugify(txt)
self.assertEquals(r, "kompiuter")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt)
self.assertEquals(r, "jaja-lol-mememeoo-a") 

Solution 9 - Python

I'm using the following piece of code for my friendly urls:

from unicodedata import normalize
from re import sub

def slugify(title):
    name = normalize('NFKD', title).encode('ascii', 'ignore').replace(' ', '-').lower()
    #remove `other` characters
    name = sub('[^a-zA-Z0-9_-]', '', name)
    #nomalize dashes
    name = sub('-+', '-', name)

    return name

It works fine with unicode characters as well.

Solution 10 - Python

You can try this instead:

mystring.replace(r' ','-')

Solution 11 - Python

mystring.replace (" ", "_")

if you assign this value to any variable, it will work

s = mystring.replace (" ", "_")

by default mystring wont have this

Solution 12 - Python

OP is using python, but in javascript (something to be careful of since the syntaxes are similar.

// only replaces the first instance of ' ' with '_'
"one two three".replace(' ', '_'); 
=> "one_two three"

// replaces all instances of ' ' with '_'
"one two three".replace(/\s/g, '_');
=> "one_two_three"

Solution 13 - Python

x = re.sub("\s", "_", txt)

Solution 14 - Python

perl -e 'map { $on=$_; s/ /_/; rename($on, $_) or warn $!; } <*>;'

Match et replace space > underscore of all files in current directory

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionLucasView Question on Stackoverflow
Solution 1 - PythonrogeriopvlView Answer on Stackoverflow
Solution 2 - PythonKenan BanksView Answer on Stackoverflow
Solution 3 - PythonxOnecaView Answer on Stackoverflow
Solution 4 - PythonDaniel RosemanView Answer on Stackoverflow
Solution 5 - PythonJarret HardieView Answer on Stackoverflow
Solution 6 - PythonmdirolfView Answer on Stackoverflow
Solution 7 - PythonIonisView Answer on Stackoverflow
Solution 8 - PythonYashView Answer on Stackoverflow
Solution 9 - PythonArmandasView Answer on Stackoverflow
Solution 10 - PythonMeghaa YadavView Answer on Stackoverflow
Solution 11 - PythonRajeshView Answer on Stackoverflow
Solution 12 - PythonskilleoView Answer on Stackoverflow
Solution 13 - PythonAllen JoseView Answer on Stackoverflow
Solution 14 - Pythonfo0View Answer on Stackoverflow