List of all unique characters in a string?

Python, Performance, Data Structures

Python Problem Overview


I want to append characters to a string, but want to make sure all the letters in the final list are unique.

Example: "aaabcabccd" → "abcd"

Of course, I have two solutions in mind. One uses a list indexed by each character's ASCII code: whenever I encounter a letter, I set the entry at that index to True. Afterwards I scan the list and append every character whose entry was set. This has a time complexity of O(n).
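
A minimal sketch of this first idea, assuming the input is pure ASCII. The function name `unique_chars_ascii` and the 128-entry table are illustrative choices, not part of the question:

```python
def unique_chars_ascii(text):
    """Return the distinct ASCII characters of text, ordered by code point."""
    seen = [False] * 128  # one flag per ASCII code point (assumed range)
    for ch in text:
        code = ord(ch)
        if code >= 128:
            raise ValueError(f"non-ASCII character: {ch!r}")
        seen[code] = True  # mark the character as encountered
    # Scan the table once and collect every code point that was flagged.
    return [chr(code) for code, flag in enumerate(seen) if flag]


print(unique_chars_ascii("aaabcabccd"))  # ['a', 'b', 'c', 'd']
```

The table has a fixed size regardless of the input, so the extra space is constant, but the approach only works for a known, small character range.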

Another solution would be to use a dict and follow the same procedure. After mapping every character, I would iterate over the keys of the dictionary. This also runs in linear time.
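
The dict variant might look roughly like this. It is a sketch only, and the name `unique_chars_dict` is made up for illustration:

```python
def unique_chars_dict(text):
    """Return the distinct characters of text using a dict as the marker table."""
    seen = {}
    for ch in text:
        seen[ch] = True  # re-assigning an existing key does not add a new entry
    # The keys of the dict are exactly the distinct characters.
    return list(seen)


print(unique_chars_dict("aaabcabccd"))
```

Unlike the fixed-size list, the dict grows only with the number of distinct characters and is not limited to ASCII input.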

Since I am a Python newbie, I was wondering which would be more space efficient. Which one could be implemented more efficiently?

PS: Order is not important while creating the list.

Python Solutions


Solution 1 - Python

The simplest solution is probably:

In [10]: ''.join(set('aaabcabccd'))
Out[10]: 'acbd'

Note that this doesn't guarantee the order in which the letters appear in the output, even though the example might suggest otherwise.
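
If a reproducible result is wanted, one option is to sort the set before joining. A small sketch (the variable names are illustrative):

```python
letters = "aaabcabccd"

# set() removes duplicates; sorted() then fixes the order by code point.
unique_sorted = ''.join(sorted(set(letters)))

print(unique_sorted)  # abcd
```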

You refer to the output as a "list". If a list is what you really want, replace ''.join with list:

In [1]: list(set('aaabcabccd'))
Out[1]: ['a', 'c', 'b', 'd']

As far as performance goes, worrying about it at this stage sounds like premature optimization.

Solution 2 - Python

Use an OrderedDict. This will ensure that the order is preserved:

>>> from collections import OrderedDict
>>> ''.join(OrderedDict.fromkeys("aaabcabccd").keys())
'abcd'
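
On Python 3.7 and later, the built-in dict also preserves insertion order, so the same idea should work without an import. A minimal sketch, assuming a Python version of at least 3.7:

```python
# dict.fromkeys keeps the first occurrence of each key, in insertion order
# (guaranteed by the language from Python 3.7 on).
ordered_unique = ''.join(dict.fromkeys("aaabcabccd"))

print(ordered_unique)  # abcd
```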

PS: I just timed both the OrderedDict and set solutions, and the latter is faster. If order does not matter, set is the natural solution; if order matters, OrderedDict is the way to go.

>>> from timeit import Timer
>>> # data, stmt1 (the OrderedDict statement) and stmt2 (the set statement)
>>> # are assumed to be defined in __main__ already; their definitions are not shown.
>>> t1 = Timer(stmt=stmt1, setup="from __main__ import data, OrderedDict")
>>> t2 = Timer(stmt=stmt2, setup="from __main__ import data")
>>> t1.timeit(number=1000)
1.2893918431815337
>>> t2.timeit(number=1000)
0.0632140599081196
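
The timing session above does not show how its inputs were defined. A self-contained way to repeat the comparison might look like the following sketch. The test string and the repeat count are assumptions, so the absolute numbers will differ from those above:

```python
from collections import OrderedDict
from timeit import timeit

# Assumed input: the original test data is not shown.
data = "aaabcabccd" * 1000


def with_ordered_dict():
    """Order-preserving deduplication."""
    return ''.join(OrderedDict.fromkeys(data))


def with_set():
    """Deduplication without any ordering guarantee."""
    return ''.join(set(data))


# timeit accepts a zero-argument callable and returns total seconds.
t_ordered = timeit(with_ordered_dict, number=100)
t_set = timeit(with_set, number=100)

print(f"OrderedDict: {t_ordered:.4f}s  set: {t_set:.4f}s")
```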

Solution 3 - Python

For completeness' sake, here's another recipe that sorts the letters as a byproduct of the way it works:

>>> from itertools import groupby
>>> ''.join(k for k, g in groupby(sorted("aaabcabccd")))
'abcd'
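
The sort is what makes this recipe work: groupby only collapses runs of adjacent equal items. A short sketch comparing the two cases:

```python
from itertools import groupby

text = "aaabcabccd"

# Without sorting, only consecutive repeats are merged.
without_sort = ''.join(key for key, _ in groupby(text))

# After sorting, all equal characters are adjacent, so each appears once.
with_sort = ''.join(key for key, _ in groupby(sorted(text)))

print(without_sort)  # abcabcd
print(with_sort)     # abcd
```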

Solution 4 - Python

string = 'aaabcabccd'  # example input
char_seen = []
for char in string:
    if char not in char_seen:
        char_seen.append(char)
print(''.join(char_seen))

This preserves the order in which the characters first appear.

The output will be:

abcd
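
One caveat: a membership test on a list scans the list, so the loop above does more work as the number of distinct characters grows. A possible variant, sketched here with a hypothetical helper name `unique_in_order`, keeps a set alongside the list so that each lookup takes constant time on average:

```python
def unique_in_order(text):
    """Return the distinct characters of text in first-seen order."""
    seen = set()   # fast membership checks
    result = []    # remembers the order of first appearance
    for ch in text:
        if ch not in seen:
            seen.add(ch)
            result.append(ch)
    return ''.join(result)


print(unique_in_order("aaabcabccd"))  # abcd
```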

Solution 5 - Python

If the result does not need to be order-preserving, you can simply use a set:

>>> ''.join(set( "aaabcabccd"))
'acbd'

Solution 6 - Python

Store the unique characters in a list.

Method 1: using a set

unique_chars = list(set('aaabcabccd'))
# e.g. ['a', 'b', 'c', 'd'] (set order is not guaranteed)

Method 2: using a loop (more verbose, but preserves first-seen order)

unique_chars = []
for c in 'aaabcabccd':
    if c not in unique_chars:
        unique_chars.append(c)
print(unique_chars)
# ['a', 'b', 'c', 'd']

Solution 7 - Python

I have an idea. Why not use the ascii_lowercase constant?

For example, running the following code:

# string module contains the constant ascii_lowercase which is all the lowercase
# letters of the English alphabet
import string
# Example value of s, a string
s = 'aaabcabccd'
# Result variable to store the resulting string
result = ''
# Goes through each lowercase letter of the alphabet and checks whether it
# appears in s. If it appears at least once, it is added to the result variable.
for letter in string.ascii_lowercase:
    if s.count(letter) >= 1:
        result += letter

# Optional three lines to convert the result to a list for sorting and then
# back to a string. Because the alphabet is scanned in order, the result is
# already sorted, so this step does not change it.
result = list(result)
result.sort()
result = ''.join(result)

print(result)

This prints abcd.

There you go: all duplicates removed and, optionally, sorted.
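
One thing to keep in mind with an alphabet scan like this is that any character outside the scanned alphabet is silently dropped. The sketch below wraps the idea in a hypothetical function, `letters_present`, with the alphabet as a parameter to make that visible:

```python
import string


def letters_present(s, alphabet=string.ascii_lowercase):
    """Return the characters of alphabet that occur in s, in alphabet order."""
    return ''.join(letter for letter in alphabet if letter in s)


print(letters_present("aaabcabccd"))                             # abcd
print(letters_present("Hello, World 42"))                        # delor
print(letters_present("Hello, World 42", string.ascii_letters))  # delorHW
```

Digits, punctuation, and (with the default alphabet) uppercase letters never reach the output, which may or may not be what is wanted.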

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type          Original Author   Original Content on Stack Overflow
Question              Ali               View Question on Stack Overflow
Solution 1 - Python   NPE               View Answer on Stack Overflow
Solution 2 - Python   Abhijit           View Answer on Stack Overflow
Solution 3 - Python   martineau         View Answer on Stack Overflow
Solution 4 - Python   Amit Gupta        View Answer on Stack Overflow
Solution 5 - Python   gefei             View Answer on Stack Overflow
Solution 6 - Python   dipenparmar12     View Answer on Stack Overflow
Solution 7 - Python   Brent Pappas      View Answer on Stack Overflow