Python String encode() Method

The encode() method is used to convert a string into bytes using a specified encoding. It is particularly useful for storing strings in binary format or transmitting data over a network.

Syntax

string.encode(encoding="utf-8", errors="strict")

Parameters

encoding (Optional): The encoding to use. Default is "utf-8". Other common encodings are "ascii", "latin-1", and "utf-16".

errors (Optional): Specifies how to handle encoding errors.

'strict' (Default): Raises a UnicodeEncodeError on failure.
'ignore': Ignores characters that cannot be encoded.
'replace': Replaces each unencodable character with a question mark (?). The Unicode replacement character (U+FFFD) is used only when decoding.
'xmlcharrefreplace': Replaces unencodable characters with an appropriate XML character reference.
'backslashreplace': Replaces unencodable characters with a backslash escape sequence.
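
As a minimal sketch of the last two handlers in the list above, the snippet below encodes "café" to ASCII with each of them. The variable names are illustrative only; the handler names are the ones listed above.

```python
text = "café"

# "é" is U+00E9 (decimal 233), which ASCII cannot represent.
xml_bytes = text.encode("ascii", errors="xmlcharrefreplace")
escaped_bytes = text.encode("ascii", errors="backslashreplace")

print(xml_bytes)      # Output: b'caf&#233;'
print(escaped_bytes)  # Output: b'caf\\xe9'
```

The doubled backslash in the second output is how Python displays a literal backslash inside a bytes object; the encoded data contains the four characters \xe9.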

Return Value

Returns an encoded version of the string as a bytes object.
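
The following sketch illustrates the return type and shows that the number of bytes can differ from the number of characters, depending on the encoding. The variable names are assumptions made for this example.

```python
text = "café"

utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

print(type(utf8_bytes))   # Output: <class 'bytes'>
print(len(text))          # Output: 4 (characters)
print(len(utf8_bytes))    # Output: 5 ("é" takes two bytes in UTF-8)
print(len(latin1_bytes))  # Output: 4 ("é" takes one byte in Latin-1)
```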

Examples

Default Encoding (UTF-8)

text = "café"
encoded_text = text.encode()
print(encoded_text) # Output: b'caf\xc3\xa9'

Notice the b prefix in the output, which marks the value as a bytes object. The character "é" is not ASCII, so UTF-8 encodes it as the two bytes \xc3\xa9.

You can convert the bytes back to a string using the decode() method.
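
A short round-trip sketch, assuming the same encoding is used in both directions (decoding with a different encoding may raise an error or produce the wrong characters):

```python
text = "café"

encoded_text = text.encode("utf-8")
decoded_text = encoded_text.decode("utf-8")  # must match the encoding used above

print(decoded_text)          # Output: café
print(decoded_text == text)  # Output: True
```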

Specifying Encoding

text = "café"
encoded_text = text.encode("ascii") # Raises UnicodeEncodeError
print(encoded_text)

The above code raises a UnicodeEncodeError because "é" is not an ASCII character. The print() call is never reached.
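
If the program should keep running instead of stopping, one possible approach is to catch the exception. The sketch below is only an illustration: the fallback of setting encoded_text to None and the name bad_chars are assumptions, not part of the original example.

```python
text = "café"

try:
    encoded_text = text.encode("ascii")
except UnicodeEncodeError as exc:
    # exc.start and exc.end mark the slice of the string that could not be encoded.
    bad_chars = text[exc.start:exc.end]
    print(f"Cannot encode {bad_chars!r} as {exc.encoding}: {exc.reason}")
    encoded_text = None  # hypothetical fallback value for this sketch
```

The exception object also exposes the original string as exc.object, which can help when logging the failure.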

Handling Errors with `ignore` and `replace`

Using `ignore`

text = "café"
encoded_text = text.encode("ascii", errors="ignore") 
print(encoded_text) # Output: b'caf'

The errors="ignore" argument tells Python to skip any characters that cannot be encoded in the specified encoding (in this case, ASCII). As a result, the unencodable character "é" is silently dropped from the output, so data is lost.

Using `replace`

text = "café"
encoded_text = text.encode("ascii", errors="replace") 
print(encoded_text) # Output: b'caf?'

The errors="replace" argument tells Python to substitute a question mark for each character that cannot be encoded in the specified encoding (in this case, ASCII). The unencodable character "é" therefore becomes "?", giving b'caf?'.