Remove Punctuation Marks from a String in Python

By James L.

When working with strings in Python, it is often necessary to get rid of punctuation marks. This guide will show you different ways to do that easily!

We will discuss the following methods:

  1. Using string.punctuation and translate() method
  2. Using string.punctuation and generator expression
  3. Using filter() and lambda function
  4. Using regular expressions

# Using string.punctuation and translate() method

You can remove punctuation marks from a string in Python using the string.punctuation constant and the translate() method.

Here’s an example:

import string

my_string = "Python is awesome!?."

# Create a translation table using str.maketrans()
translator = str.maketrans('', '', string.punctuation)

# Use translate() method to remove punctuation
new_string = my_string.translate(translator)

print(new_string)  # Output: 'Python is awesome'

Here’s a brief explanation of the code:

import string: Imports the string module, which contains the punctuation constant, a string containing all ASCII punctuation characters.

translator = str.maketrans('', '', string.punctuation): Creates a translation table using the str.maketrans() method. The translate() method needs this table to know which characters to replace or remove. The empty strings ('') passed as the first and second arguments mean that no characters are replaced; only the characters in the third argument (string.punctuation) are removed.

new_string = my_string.translate(translator): Uses the translate() method to apply the translation table (translator) to the original string (my_string). The result is a new string (new_string) without the punctuation marks.
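To make the translation table less abstract, the short sketch below inspects what str.maketrans() returns for a small, hand-picked set of characters, and shows what the first two arguments do when they are not empty. The variable names and sample strings are illustrative only.

```python
# Inspect the translation table that str.maketrans() builds.
# The sample characters '!?' are chosen only for illustration.
table = str.maketrans('', '', '!?')

# The table is a dict that maps Unicode code points to None,
# which tells translate() to delete those characters.
print(table == {ord('!'): None, ord('?'): None})  # Output: True

# The first two arguments define replacements: here '-' becomes a space,
# while '!' and '?' (the third argument) are still deleted.
replace_and_delete = str.maketrans('-', ' ', '!?')
cleaned = "well-known fact!?".translate(replace_and_delete)

print(cleaned)  # Output: well known fact
```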

The above code removes all punctuation marks (!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~) from a string. To remove only specific punctuation marks, pass just those characters as the third argument to str.maketrans().

Here’s an example:

my_string = "Python is awesome!?."

# Specify the punctuation marks you want to remove
translator = str.maketrans('', '', '.?')

# Use translate() method to remove the specified punctuation marks
new_string = my_string.translate(translator)

print(new_string)  # Output: 'Python is awesome!'
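The same idea works in reverse: instead of listing the characters to remove, you can start from string.punctuation and take out the ones you want to keep. The sketch below keeps apostrophes so that contractions survive; the sample sentence and the to_remove name are made up for illustration.

```python
import string

# Remove every ASCII punctuation mark except the apostrophe.
# The sample sentence and variable names are illustrative only.
to_remove = string.punctuation.replace("'", "")
translator = str.maketrans('', '', to_remove)

my_string = "Don't stop, it's fun!"
new_string = my_string.translate(translator)

print(new_string)  # Output: Don't stop it's fun
```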

# Using string.punctuation and generator expression

Here’s how you can use the string.punctuation constant and a generator expression to remove punctuation marks from a string in Python:

import string

my_string = "Python is awesome!?."

new_string = ''.join(
    char for char in my_string if char not in string.punctuation)

print(new_string)  # Output: 'Python is awesome'

char for char in my_string if char not in string.punctuation: This is a generator expression. It iterates through each character in my_string and keeps only the characters that do not appear in string.punctuation. The join() method then concatenates the kept characters into a new string.

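Don’t confuse generator expressions with list comprehensions. Generator expressions use parentheses, whereas list comprehensions use square brackets. When a generator expression is the only argument to a function, as in the join() call above, the call's own parentheses are enough. Generator expressions are generally more memory efficient because they produce items one at a time instead of building a whole list first, although for a short string like this one the difference is negligible.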
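As a rough side-by-side sketch, the snippet below writes the same filter once as a generator expression and once as a list comprehension. It also converts string.punctuation to a set for membership checks, which is an optional tweak and not part of the original example.

```python
import string

my_string = "Python is awesome!?."

# A set gives constant-time membership checks; for the 32 characters in
# string.punctuation the difference is small, so treat this as optional.
punctuation = set(string.punctuation)

# Generator expression: no square brackets, items are produced one at a time.
from_generator = ''.join(char for char in my_string if char not in punctuation)

# List comprehension: square brackets, builds the whole list first.
from_list = ''.join([char for char in my_string if char not in punctuation])

print(from_generator)               # Output: Python is awesome
print(from_generator == from_list)  # Output: True
```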

# Using filter() and lambda function

You can use the join() method together with filter() and a lambda function to remove punctuation marks from a string in Python.

Here’s an example:

import string
my_string = "Python is awesome!?."

# Use filter() with a lambda function to remove punctuation marks from a string
new_string = ''.join(
    filter(lambda char: char not in string.punctuation, my_string))

print(new_string)  # Output: 'Python is awesome'

In this example, we use filter() with the lambda function to keep only characters not in string.punctuation, and then use the join() method to concatenate the remaining characters in a new string. The result is a new string without any punctuation marks.

You can also use the isalnum() and isspace() methods to keep only the characters that are alphanumeric or whitespace, which removes punctuation marks without relying on string.punctuation.

Here’s an example:

my_string = "Python is awesome!?."

# Use filter() with a lambda function to keep only letters, digits, and whitespace
new_string = ''.join(
    filter(lambda char: char.isalnum() or char.isspace(), my_string))

print(new_string)  # Output: 'Python is awesome'

The isalnum() method returns True if a character is a letter or a digit. This covers a-z, A-Z, and 0-9 as well as non-ASCII letters and digits such as é.

The isspace() method returns True if a character is whitespace (such as a space, tab, or newline).
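The two filter() variants do not always give the same result. The sketch below runs both on a made-up string that contains curly quotes and an accented letter (written as escape sequences so the code stays ASCII-only) to show where they differ.

```python
import string

# Sample text: snake_case “quoted” café!
# \u201c and \u201d are curly quotes, \u00e9 is the letter é.
my_string = "snake_case \u201cquoted\u201d caf\u00e9!"

# string.punctuation only lists ASCII punctuation, so the curly quotes stay.
ascii_only = ''.join(
    filter(lambda char: char not in string.punctuation, my_string))

# isalnum()/isspace() drop anything that is not a letter, digit, or whitespace.
alnum_or_space = ''.join(
    filter(lambda char: char.isalnum() or char.isspace(), my_string))

print(ascii_only)      # Output: snakecase “quoted” café
print(alnum_or_space)  # Output: snakecase quoted café
```

If your text may contain non-ASCII punctuation, the isalnum()/isspace() variant is usually the safer of the two.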

# Using regular expressions

You can use the re.sub() function of the re module to remove punctuation marks from a string in Python.

Here’s an example:

import re

my_string = "Python is awesome!?."

# Compile a regex pattern to match any character
# that is not a word character or a whitespace character
pattern = re.compile(r'[^\w\s]')

# Use the re.sub() to replace any matched character with an empty string
new_string = re.sub(pattern, '', my_string)

print(new_string)  # Output: 'Python is awesome'

In this example, the re.compile() function compiles the regular expression pattern [^\w\s] into a regular expression object. It is then used with the re.sub() function to replace any character matching the compiled pattern with an empty string, effectively removing punctuation marks.

Here’s a breakdown of the regular expression:

[ ]: Square brackets denote a character class, which means “match any one of the characters inside the brackets”.

^: Inside the character class, the caret symbol negates the pattern, meaning “match any character that is not in the specified set”.

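\w: This represents a word character. For ASCII text, it is equivalent to the character set [a-zA-Z0-9_], which includes alphanumeric characters (letters and numbers) and underscores. With Python 3 string patterns, \w also matches non-ASCII letters and digits unless the re.ASCII flag is set.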

\s: This represents a whitespace character. It includes spaces, tabs, and newline characters.

Together, the regular expression pattern [^\w\s] matches any character that is not a word character (a letter, digit, or underscore) or a whitespace character.

Note: Inside square brackets, the caret (^) negates the character class. Outside square brackets, at the beginning of a pattern, the caret anchors the pattern at the beginning of the string.
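One consequence of the breakdown above is that the underscore is left alone, because \w matches it. The sketch below shows this on a made-up string, along with one possible way to remove underscores as well by adding "|_" to the pattern.

```python
import re

my_string = "snake_case, done!"

# [^\w\s] leaves the underscore in place because \w matches it.
keeps_underscore = re.sub(r'[^\w\s]', '', my_string)

# Adding "|_" as an alternative also removes underscores.
drops_underscore = re.sub(r'[^\w\s]|_', '', my_string)

print(keeps_underscore)  # Output: snake_case done
print(drops_underscore)  # Output: snakecase done
```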

The above code removes every character that is not a word character or whitespace. This covers the ASCII punctuation marks (!"#$%&'()*+,-./:;<=>?@[\]^`{|}~) except the underscore (_), which \w matches. If you wish to remove only specific punctuation marks, modify the regular expression so that the character class lists only the punctuation marks you want to remove.

Here’s an example:

import re

my_string = "Python is awesome!?."

# Compile a regex pattern to match the question mark and period
pattern = re.compile(r'[?.]')

# Use the re.sub() to replace any matched character with an empty string
new_string = re.sub(pattern, '', my_string)

print(new_string)  # Output: 'Python is awesome!'
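If you want a regular expression that targets exactly the characters in string.punctuation (including the underscore), one option is to build the character class from the constant with re.escape(). This is a sketch of that idea rather than part of the original examples, and the sample string is made up.

```python
import re
import string

my_string = "Python_is awesome!?."

# re.escape() backslash-escapes characters such as ], ^, - and \ so the
# whole constant can be placed safely inside a character class.
pattern = re.compile('[' + re.escape(string.punctuation) + ']')

# Replace every matched punctuation character with an empty string.
new_string = pattern.sub('', my_string)

print(new_string)  # Output: Pythonis awesome
```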

Conclusion

In conclusion, this guide has shown several ways to remove punctuation marks from a string in Python.

For quick removal of punctuation marks, string.punctuation with the translate() method is an efficient choice.

For more complex patterns or customized removal, regular expressions offer flexibility.

Generator expressions and filter() can also be used, but might be less readable for this specific task.
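If you want to check the efficiency claim on your own machine, a rough timing sketch with the timeit module could look like the one below. The function names, sample text, and repeat count are arbitrary choices for illustration, and the timings will vary between machines and Python versions.

```python
import re
import string
import timeit

# Arbitrary ASCII-only sample text, repeated to make timings measurable.
text = "Python is awesome!?. " * 100

translator = str.maketrans('', '', string.punctuation)
pattern = re.compile(r'[^\w\s]')

def with_translate():
    return text.translate(translator)

def with_generator():
    return ''.join(char for char in text if char not in string.punctuation)

def with_filter():
    return ''.join(filter(lambda char: char not in string.punctuation, text))

def with_regex():
    return pattern.sub('', text)

functions = (with_translate, with_generator, with_filter, with_regex)

# All four approaches agree on this sample (it contains no underscores).
results = {func() for func in functions}
print(len(results) == 1)  # Output: True

# Time each approach; lower is faster.
for func in functions:
    seconds = timeit.timeit(func, number=200)
    print(f"{func.__name__}: {seconds:.4f} s")
```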

Feel free to experiment with these methods and choose the one that best fits your requirements.

Happy coding!