Python: Reading and Writing CSV Files
CSV (Comma-Separated Values) files are plain text files that use commas to separate values, organizing data in rows and columns. They are commonly used to export data from databases or spreadsheets and to exchange data between different applications.
Python’s Built-in CSV Module
Python’s csv module provides built-in functions to read from and write to CSV files.
Before working with CSV files in Python, you need to import the csv module at the top of your file:
import csv
The csv module provides four commonly used functions:
csv.reader(): Reads a CSV file and returns each row as a listcsv.writer(): Writes data to a CSV file one row at a timecsv.DictReader(): Reads a CSV file and returns each row as a dictionarycsv.DictWriter(): Writes dictionaries to a CSV file
Reading CSV Files in Python
There are several ways to read CSV files in Python, depending on how you want to use your data. Let’s explore each approach with practical examples.
(1) Using csv.reader()
The csv.reader() function allows you to read data from a CSV file one row at a time. It returns each row as a list of strings, making it easy to access, manipulate, and process individual values in a structured and reliable way.
Let’s say we have the following CSV file called employees.csv:
Name,Department,Salary
James,Computer Science,140000
Bruce,Marketing,110000
Riley,Finance,120000
Here is how our project structure looks like:
project_folder/
├── app.py
└── employees.csv
Now let’s read it using csv.reader:
import csv
with open('employees.csv', 'r', newline='', encoding='utf-8') as file:
reader = csv.reader(file)
# Skip the header row if it exists
next(reader, None)
# Process each row in the CSV file
for row in reader:
name = row[0]
department = row[1]
salary = row[2]
print(f'Name: {name}, Department: {department}, Salary: {salary}')
Output:
Name: James, Department: Computer Science, Salary: 140000
Name: Bruce, Department: Marketing, Salary: 110000
Name: Riley, Department: Finance, Salary: 120000
Note: Always use newline='' when opening CSV files. This allows the csv module to handle line endings correctly and prevents issues like extra blank lines or incorrect row parcing, specially across different operaing systems.
(2) Using csv.DictReader()
The csv.DictReader() function allows you to read data from a CSV file into a dictionary format, where the keys corresponds to the column names (from the header row), and the values correspond to the data in each row. This makes accessing data more intuitive because you can refer to the columns by name rather than by index.
import csv
with open('employees.csv', 'r', newline='', encoding='utf-8') as file:
reader = csv.DictReader(file)
for row in reader:
name = row['Name']
department = row['Department']
salary = row['Salary']
print(f"Name: {name}, Department: {department}, Salary: {salary}")
Output:
Name: James, Department: Computer Science, Salary: 140000
Name: Bruce, Department: Marketing, Salary: 110000
Name: Riley, Department: Finance, Salary: 120000
(3) Reading CSV Files With Custom Delimiters
Not all CSV files use commas as delimiters; some may use tabs, simicolons, or pipes instead. Python’s csv module handles this effortlessly through the delimiter parameter.
Let’s update our employees.csv file to use | as delimiter:
Name|Department|Salary
James|Computer Science|140000
Bruce|Marketing|110000
Riley|Finance|120000
Now let’s read this pipe-separated file by specifying the delimiter parameter:
import csv
with open('employees.csv', 'r', newline='', encoding='utf-8') as file:
reader = csv.DictReader(file, delimiter='|')
for row in reader:
name = row['Name']
department = row['Department']
salary = row['Salary']
print(f"Name: {name}, Department: {department}, Salary: {salary}")
Writing CSV Files in Python
There are several ways to write CSV files in Python, depending on how you want to structure and store your data. Let’s explore each approach with practical examples:
(1) Using csv.writer()
The csv.writer() function writes data to a CSV file row by row and is best suited for data stored as lists or tuples where column order matters. It gives you control over how each row is written and is ideal for simple tabular data.
To write data, open the file in write mode ('w') with newline='' to prevent extra blank lines on some operating systems, create a writer object, and then use writerow() for a single row or writerows() for multiple rows.
import csv
# Data to write
data = [
['Name', 'Age', 'City'],
['James', '35', 'London'],
['Hannah', '28', 'Los Angeles'],
['Logan', '20', 'Houston']
]
# Writing to csv file
with open('output.csv', 'w', newline='') as file:
writer = csv.writer(file)
# Write all rows at once
writer.writerows(data)
print("CSV file has been created successfully.")
After running the above code, you should see a file named output.csv created in the same directory where your Python script is saved.
If you open the file, its contents will look like this:
Name,Age,City
James,35,London
Hannah,28,Los Angeles
Logan,20,Houston
csv.writer() is best suited for structured data stored as sequences (lists or tuples) where column order matters.
(2) Using csv.DictWriter()
The csv.DictWriter() function writes data to a CSV file using dictionaries instead of lists, where each dictionary represents a row. The keys of the dictionary act as a column headers, making this method ideal when working with structured data that includes named fields.
import csv
data = [
{"name": "James", "age": 35, "city": "London"},
{"name": "Hannah", "age": 28, "city": "Los Angeles"},
{"name": "Logan", "age": 20, "city": "Houston"}
]
with open("data.csv", "w", newline="") as file:
fieldnames = ["name", "age", "city"]
writer = csv.DictWriter(file, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(data)
print("data.csv file has been created successfully")
After running this code, you should see an data.csv file created in your project directory. Its content will look like this:
name,age,city
James,35,London
Hannah,28,Los Angeles
Logan,20,Houston
(3) Appending to an Existing CSV File
To add new rows to an existing CSV file without overwriting it, open the file in the append mode ('a') instead of write mode ('w').
import csv
new_data = [
{"name": "Willow", "age": 18, "city": "Oslo"},
{"name": "Rose", "age": 22, "city": "Paris"}
]
with open("data.csv", "a", newline="") as file:
fieldnames = ["name", "age", "city"]
writer = csv.DictWriter(file, fieldnames=fieldnames)
# Do not call writeheader() when appending to avoid writing the header again
writer.writerows(new_data)
print("New rows added successfully.")
Now your data.csv file should contain the updated rows, with the new entries appended to the existing ones, like this:
name,age,city
James,35,London
Hannah,28,Los Angeles
Logan,20,Houston
Willow,18,Oslo
Rose,22,Paris
(4) Writing CSV With Custom Delimiter and Quoting
Just like reading, you can customize the delimiter, quoting character, and quoting behavior when writing CSV files.
import csv
data = [
{"name": "James", "age": 35, "city": "London"},
{"name": "Bob", "age": 50, "city": "Edinburgh"},
{"name": "Charlie", "age": 30, "city": "Manchester"}
]
with open('custom_output.csv', 'w', newline='') as file:
fieldnames = ['name', 'age', 'city']
writer = csv.DictWriter(
file,
fieldnames=fieldnames, # Column names
delimiter="|", # Custom delimiter
quotechar='"', # Character to quote fields containing special characters
quoting=csv.QUOTE_ALL # Quote all fields
)
writer.writeheader() # Write the column names to the CSV file
writer.writerows(data) # Write the data rows to the CSV file
print("Data has been written to custom_output.csv with custom formatting.")
After running this code, you should see a custom_output.csv file created in your project directory. Its content should look like this:
"name"|"age"|"city"
"James"|"35"|"London"
"Bob"|"50"|"Edinburgh"
"Charlie"|"30"|"Manchester"
Reading and Writing CSV Files With Pandas
While Python’s built-in csv module is great for basic reading and writing tasks, Pandas provides a far more powerful and flexible solution for working with CSV files.
With Pandas, you can efficiently handle large datasets, automatically manage headers and missing values, and easily customize delimiters, quoting rules, and data types. It also offers built-in data cleaning and transformation capabilities, allowing you to filter, sort, aggregate, and analyze data with minimal code.
Because of its performance, flexibility, and seamless integration with data analysis workflows, Pandas is the preferred choice for real-world CSV processing and data-driven applications.
Reading CSV Files With Pandas
You can read CSV files in Python using pandas.read_csv(). Before doing so, however, you need to install the pandas library (if it is not already installed) and import it into your script.
To install pandas using pip in VS Code:
- Open VS Code
- Go to Terminal > New Terminal
- In the terminal window, run:
pip install pandas
This will install pandas in the currenlty selected Python environment.
If you are using Python 3 and the command above does not work, you can try:
python -m pip install pandas
or
python3 -m pip install pandas
Make sure the correct Python interpreter is selected in VS Code before installing:
- Press Ctrl + Shift + P
- Select Python: Select Interpreter
- Choose your desired environment
Here is a complete blog post I have written to set up Python in VS Code.
After installing pandas, you can import it and read a CSV file as shown below:
import pandas as pd
# Read the CSV file
df = pd.read_csv('data.csv')
# Display the first three rows
print(df.head(3))
Output:
name age city
0 James 35 London
1 Hannah 28 Los Angeles
2 Logan 20 Houston
In this example, the data from the CSV file is stored in a DataFrame object called df, and df.head(3) displays its first three rows.
Writing CSV Files With Pandas
You can write a DataFrame to a CSV files using the DataFrame.to_csv(). You can either have an existing DataFrame or create one from scratch.
import pandas as pd
data = {
"Name": ["James", "Sarah", "Logan"],
"Age": [35, 18, 20],
"city": ["London", "New York", "Dubai"]
}
df = pd.DataFrame(data)
# Write the DataFrame to a CSV file
df.to_csv("pandas_output.csv", index=False)
After running this code, you should see a pandas_output.csv file created in your project directory. It contents should look like this:
Name,Age,city
James,35,London
Sarah,18,New York
Logan,20,Dubai