Python String encode() Method
The encode() method is used to convert a string into bytes using a specified encoding. It is particularly useful for storing strings in binary format or transmitting data over a network.
Syntax
string.encode(encoding="utf-8", errors="strict")
Parameters
encoding (Optional): The encoding to use. Default is "utf-8". Other common encodings are "ascii", "latin-1", and "utf-16".
errors (Optional): Specifies how to handle encoding errors.
'strict' (Default): Raises a UnicodeEncodeError on failure.
'ignore': Drops characters that cannot be encoded.
'replace': Replaces each unencodable character with a question mark (?).
'xmlcharrefreplace': Replaces each unencodable character with the corresponding XML character reference, such as &#233;.
'backslashreplace': Replaces each unencodable character with a backslash escape sequence, such as \xe9.
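As a minimal sketch of the two less common handlers listed above, the following encodes the same sample string to ASCII with each of them. The variable names are illustrative only.

```python
text = "café"

# "é" is code point U+00E9 (decimal 233), so the XML reference is &#233;
xml_encoded = text.encode("ascii", errors="xmlcharrefreplace")
print(xml_encoded)  # Output: b'caf&#233;'

# The same character written as a backslash escape sequence
escaped = text.encode("ascii", errors="backslashreplace")
print(escaped)  # Output: b'caf\\xe9'
```

Both results contain only ASCII bytes. Unlike 'ignore' and 'replace', they keep enough information to work out what the original character was.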
Return Value
Returns an encoded version of the string as a bytes object.
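A short sketch of what that return value looks like in practice, assuming UTF-8 and the same sample string used in the examples. It shows that the result is a bytes object and that its length is counted in bytes, not characters.

```python
text = "café"
encoded = text.encode("utf-8")

print(type(encoded))  # Output: <class 'bytes'>
print(len(text))      # Output: 4 (characters)
print(len(encoded))   # Output: 5 (bytes, because "é" takes two bytes in UTF-8)
print(encoded[0])     # Output: 99 (indexing bytes yields an integer, here the byte for "c")
```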
Examples
Default Encoding (UTF-8)
text = "café"
encoded_text = text.encode()
print(encoded_text) # Output: b'caf\xc3\xa9'
Notice the b prefix in the output, which signifies a bytes literal.
You can convert the bytes back to a string using the decode() method.
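The round trip can be sketched as follows. The last line is an added illustration of why the codec passed to decode() should match the one used for encoding.

```python
text = "café"
encoded_text = text.encode("utf-8")

# Decoding with the same codec restores the original string
decoded_text = encoded_text.decode("utf-8")
print(decoded_text)          # Output: café
print(decoded_text == text)  # Output: True

# Decoding with a different codec does not fail here, but yields different text
mismatched = encoded_text.decode("latin-1")
print(mismatched)            # Output: cafÃ©
```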
Specifying Encoding
text = "café"
encoded_text = text.encode("ascii") # Raises UnicodeEncodeError
print(encoded_text)
The above code raises a UnicodeEncodeError because "é" is not an ASCII character. The print() call is never reached.
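If the program should keep running, one option is to catch the exception. The sketch below inspects the error and then falls back to UTF-8. The fallback is only an illustrative choice, not the one correct way to recover.

```python
text = "café"

try:
    encoded_text = text.encode("ascii")
except UnicodeEncodeError as exc:
    print(exc.encoding)                   # Output: ascii
    print(exc.start)                      # Output: 3 (index of the first bad character)
    print(exc.object[exc.start:exc.end])  # Output: é
    # Fall back to an encoding that can represent every character
    encoded_text = text.encode("utf-8")

print(encoded_text)  # Output: b'caf\xc3\xa9'
```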
Handling Errors with ignore and replace
Using ignore
text = "café"
encoded_text = text.encode("ascii", errors="ignore")
print(encoded_text) # Output: b'caf'
The errors="ignore" parameter tells Python to skip any characters that cannot be encoded using the specified encoding (in this case, ASCII). As a result, the unencodable character "é" is simply removed from the output.
Using replace
text = "café"
encoded_text = text.encode("ascii", errors="replace")
print(encoded_text) # Output: b'caf?'
The errors="replace" parameter tells Python to substitute a question mark for each character that cannot be encoded using the specified encoding (in this case, ASCII). Therefore, the unencodable character "é" is replaced with "?", resulting in the output b'caf?'.