Python: Reading and Writing JSON Files
JSON (JavaScript Object Notation) is a lightweight, text-based data format used to store and exchange structured data, organized as key-value pairs and arrays. Despite its name, you don’t need to know JavaScript to learn or use JSON.
Whether you are working with REST APIs, managing configuration settings, or exchanging data between the frontend and backend, you will encounter JSON.
Python’s Built-in JSON Module
Python provides a built-in json module that allows you to work with JSON files easily. This module enables you to convert Python objects (such as dictionaries and lists) into JSON format, and parse JSON data back into Python objects.
The json module provides four commonly used functions:
- json.dump(): Converts a Python object to JSON and writes it directly to a file
- json.dumps(): Converts a Python object to a JSON-formatted string
- json.load(): Reads JSON data from a file and converts it into a Python object
- json.loads(): Parses a JSON-formatted string and converts it into a Python object
Note: The 's' in dumps() and loads() stands for string. Use these functions when working with JSON data in memory, rather than with actual files.
Before working with a JSON file, you need to import the json module at the top of your file using the code below:
import json
JSON vs Python Data Types
When you parse JSON in Python, each JSON value is automatically converted to its corresponding Python data type.
| JSON Type | Python Type |
| --- | --- |
| object | dict |
| array | list |
| string | str |
| number (int) | int |
| number (real) | float |
| true | True |
| false | False |
| null | None |
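You can see this mapping in action with a short round trip: parse a JSON string that uses every JSON type and inspect the resulting Python values (the sample string here is just an illustration):

```python
import json

# A JSON string exercising every JSON type
data = json.loads('{"a": {"b": [1, 2.5, "x", true, false, null]}}')

print(type(data))            # <class 'dict'>  (object -> dict)
print(type(data["a"]["b"]))  # <class 'list'>  (array -> list)
print(data["a"]["b"])        # [1, 2.5, 'x', True, False, None]
```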
Reading JSON Files in Python
There are different ways to read JSON files in Python, each suited to different data formats and use cases. Let’s explore these methods with hands-on examples to help you efficiently handle JSON data in your projects:
Reading JSON From a File Using json.load()
The json.load() method reads a JSON file and converts it into a Python object.
First, let’s create a data.json file using VS Code or any other text editor (Notepad on Windows, TextEdit on macOS, Nano/Vim on Linux) and paste in the following data:
{
    "name": "James",
    "age": 35,
    "is_employed": true,
    "skills": ["Python", "Data Analysis", "Machine Learning"],
    "education": {
        "degree": "Master's in Computer Science",
        "university": "New York Institute of Technology",
        "graduation_year": 2014
    }
}
Here is how my project structure looks:
project_folder/
├── app.py
└── data.json
Now let’s read it using json.load():
import json

# Open and read the JSON file
with open('data.json', 'r') as file:
    data = json.load(file)

# Print the data
print(data)
Output:
{'name': 'James', 'age': 35, 'is_employed': True, 'skills': ['Python', 'Data Analysis', 'Machine Learning'], 'education': {'degree': "Master's in Computer Science", 'university': 'New York Institute of Technology', 'graduation_year': 2014}}
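In practice it is worth guarding the read against a missing file or malformed JSON, both of which raise exceptions. Here is a minimal sketch; the helper name read_json and the filename are illustrative, not part of the json module:

```python
import json

def read_json(path):
    """Return the parsed contents of a JSON file, or None on failure."""
    try:
        with open(path, "r", encoding="utf-8") as file:
            return json.load(file)
    except FileNotFoundError:
        print(f"File not found: {path}")
    except json.JSONDecodeError as error:
        print(f"Invalid JSON in {path}: {error}")
    return None

print(read_json("missing.json"))  # prints "File not found: missing.json", then None
```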
Handling JSON Strings With json.loads()
Sometimes, you may need to parse a JSON string directly, such as when receiving data from a web API, rather than reading from a file. In these cases, you can use json.loads() to convert the JSON string into a Python object.
import json

# JSON string, as you might receive from an API response
json_string = '{"name": "James", "age": 35, "city": "Barcelona"}'

# Parse the JSON string into a Python dictionary
data = json.loads(json_string)
print(data)
Output:
{'name': 'James', 'age': 35, 'city': 'Barcelona'}
Writing JSON Files in Python
Writing data to JSON files is just as straightforward as reading them. Let’s explore the best ways to save Python data as JSON.
Writing JSON Data to a File Using json.dump()
The json.dump() function saves Python data to a file in JSON format.
import json

user_data = {
    "name": "Michael",
    "age": 50,
    "is_employed": True,
    "skills": ["singing", "dancing", "acting"]
}

# Writing JSON data to a file
with open("user_data.json", "w", encoding="utf-8") as file:
    json.dump(user_data, file)
After running the code above, you will see a user_data.json file created in your project directory containing the following compact data:
{"name": "Michael", "age": 50, "is_employed": true, "skills": ["singing", "dancing", "acting"]}
Making JSON Files Readable in Python
By default, json.dump() writes JSON data in a compact form, which can be hard to read. To make your JSON file more readable, you can use the indent parameter to add indentation and the sort_keys parameter to organize the keys alphabetically.
import json

user_data = {
    "name": "Michael",
    "age": 50,
    "is_employed": True,
    "skills": ["singing", "dancing", "acting"]
}

# Writing JSON data to a file
with open("user_data.json", "w", encoding="utf-8") as file:
    json.dump(user_data, file, indent=4, sort_keys=True)
After running the above code, the user_data.json file will be updated with neatly formatted data like this:
{
    "age": 50,
    "is_employed": true,
    "name": "Michael",
    "skills": [
        "singing",
        "dancing",
        "acting"
    ]
}
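A related parameter is ensure_ascii: by default, json.dump() and json.dumps() escape any non-ASCII characters, which keeps the output plain ASCII but hard to read. Setting ensure_ascii=False writes the characters as-is:

```python
import json

city = {"name": "Málaga"}

print(json.dumps(city))                      # {"name": "M\u00e1laga"}
print(json.dumps(city, ensure_ascii=False))  # {"name": "Málaga"}
```

When writing to a file with ensure_ascii=False, open the file with a suitable encoding such as encoding="utf-8".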
Converting Python Objects to JSON Strings With json.dumps()
Sometimes, you need a JSON-formatted string rather than writing the data to a file. For this purpose, you can use json.dumps().
import json

user_data = {
    "name": "Michael",
    "age": 50,
    "is_employed": True,
    "skills": ["singing", "dancing", "acting"]
}

# Convert the Python dictionary to a JSON string
json_string = json.dumps(user_data)
print(json_string)
Output:
{"name": "Michael", "age": 50, "is_employed": true, "skills": ["singing", "dancing", "acting"]}
You can use the indent parameter to add indentation.
import json

user_data = {
    "name": "Michael",
    "age": 50,
    "is_employed": True,
    "skills": ["singing", "dancing", "acting"]
}

# Convert the Python dictionary to a JSON string
json_string = json.dumps(user_data, indent=4)
print(json_string)
Output:
{
    "name": "Michael",
    "age": 50,
    "is_employed": true,
    "skills": [
        "singing",
        "dancing",
        "acting"
    ]
}
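Since dumps() and loads() are inverses, you can confirm nothing is lost in the conversion with a quick round trip:

```python
import json

user_data = {"name": "Michael", "age": 50, "is_employed": True}

# Serialize to a string, then parse the string back
restored = json.loads(json.dumps(user_data))

print(restored == user_data)  # True
```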
Using The separators Parameter for Compact JSON
Use the separators parameter when you want more compact JSON output by removing unnecessary whitespace.
By default, json.dumps() uses (", ", ": ") as its separators, meaning it adds a space after each comma and colon for readability.
When you specify:
json.dumps(data, separators=(",", ":"))
you remove those extra spaces, producing a minified JSON string.
Let’s see a working code example:
import json

user_data = {
    "name": "Michael",
    "age": 50,
    "is_employed": True,
    "skills": ["singing", "dancing", "acting"]
}

# Convert the Python dictionary to a JSON string
json_string = json.dumps(user_data, separators=(',', ':'))
print(json_string)
Output:
{"name":"Michael","age":50,"is_employed":true,"skills":["singing","dancing","acting"]}
Creating a Custom JSON Encoder
Many native Python types, such as datetime objects, Decimal values, sets, bytes, and instances of custom classes, cannot be serialized directly by Python’s built-in json module.
When you try to serialize these unsupported types with json.dumps(), Python raises a TypeError. To handle this and produce valid JSON output, you can create a custom JSON encoder. This allows you to define exactly how special or complex Python objects should be converted into JSON-serializable representations, giving you full control over the encoding process.
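For example, trying to serialize a datetime with no custom encoder fails immediately:

```python
import json
from datetime import datetime

try:
    json.dumps({"joined": datetime(2026, 2, 14)})
except TypeError as error:
    print(error)  # Object of type datetime is not JSON serializable
```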
You can create a custom JSON encoder by either providing a custom function to the default parameter of json.dumps() (or json.dump()) or by creating a subclass of json.JSONEncoder class and overriding its default() method.
Using the default parameter (function-based)
Use this approach for one-off cases, when you only need to handle a few specific types.
In this approach, you define a function that takes an object and either returns a JSON-serializable version of it or raises a TypeError if it cannot handle the object, then pass that function to the default parameter of json.dumps() (or json.dump()).
Example: Custom JSON encoder for Sets and Tuples
import json

def custom_encoder(obj):
    if isinstance(obj, (set, tuple)):
        return list(obj)
    raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")

data = {
    'unique_id': {1, 2, 3},
    'coordinate': (35.67, 139.65)
}

json_data = json.dumps(data, default=custom_encoder, indent=4)
print(json_data)
Output:
{
"unique_id": [
1,
2,
3
],
"coordinate": [
35.67,
139.65
]
}
In this example, the custom encoder converts the set into a list, since standard JSON has no set type. The tuple branch is included for completeness: json.dumps() already serializes tuples as JSON arrays on its own, so default() is never actually called for them.
Subclassing json.JSONEncoder (Class-based)
You use this approach when you need a reusable, structured solution for handling multiple custom types.
In this approach, you create a custom class that inherits from json.JSONEncoder, override the default() method to define how unsupported objects should be converted into a JSON-serialized format, and then pass this encoder class to the cls parameter of json.dumps() (or json.dump()).
First, let’s look at a simple example of serializing a datetime object to JSON:
Example: Serializing datetime objects using a custom JSON encoder
import json
from datetime import datetime

# Custom JSON encoder to handle datetime objects
class CustomJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()  # Convert datetime to ISO format string
        return super().default(obj)

# Example data containing a datetime object
data = {
    "name": "James",
    "age": 35,
    "joined": datetime.now()
}

with open('registration_data.json', 'w') as file:
    json.dump(data, file, cls=CustomJSONEncoder, indent=4)
After running this code, you will see a registration_data.json file created in your project directory containing the following data:
{
    "name": "James",
    "age": 35,
    "joined": "2026-02-14T20:26:49.438815"
}
Now let’s see how to serialize complex Python objects like datetime, Decimal, and even custom classes into JSON by using a custom JSONEncoder.
Example: Serializing complex Python objects with a custom JSON encoder
import json
from datetime import datetime
from decimal import Decimal

# Custom class
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# Custom JSON encoder handling multiple types
class CustomJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()  # Convert datetime to ISO format string
        if isinstance(obj, Decimal):
            return str(obj)  # Convert Decimal to string to preserve precision
        if isinstance(obj, User):
            return {
                'name': obj.name,
                'age': obj.age
            }  # Convert User object to a dictionary
        return super().default(obj)

# Sample data containing multiple types
data = {
    "User": User("James", 35),
    "created_at": datetime.now(),
    "account_balance": Decimal("1000.567")
}

# Serialize the data to JSON using the custom encoder
json_data = json.dumps(data, cls=CustomJSONEncoder, indent=4)
print(json_data)
Output:
{
    "User": {
        "name": "James",
        "age": 35
    },
    "created_at": "2026-02-15T19:23:39.183500",
    "account_balance": "1000.567"
}
Creating a Custom JSON Decoder
You can also customize how JSON is converted back into Python objects. Python’s built-in json module allows us to define a custom decoder by using the object_hook parameter in json.loads().
import json

# Custom class
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"User(name={self.name}, age={self.age})"

# Custom JSON decoder
def custom_decoder(obj):
    if "name" in obj and "age" in obj:
        return User(obj["name"], obj["age"])
    return obj

# Sample JSON data
json_data = '''{
    "User": {
        "name": "James",
        "age": 35
    }
}'''

# Deserialize JSON into a User object
data = json.loads(json_data, object_hook=custom_decoder)
print(data)
Output:
{'User': User(name=James, age=35)}
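The same hook can rebuild richer types. As a sketch (the "joined" key is just an assumed field name), here is how an ISO date string like the one produced by the earlier datetime encoder can be turned back into a datetime object:

```python
import json
from datetime import datetime

def datetime_decoder(obj):
    # Convert an ISO-formatted "joined" value back into a datetime
    if "joined" in obj:
        obj["joined"] = datetime.fromisoformat(obj["joined"])
    return obj

json_data = '{"name": "James", "joined": "2026-02-14T20:26:49"}'
data = json.loads(json_data, object_hook=datetime_decoder)

print(data["joined"].year)  # 2026
```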
Handling Large JSON Files
Loading large JSON files entirely into memory can lead to performance issues. Here are some strategies for handling big files efficiently.
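One simple strategy, when you control the file format, is JSON Lines (one JSON object per line), which the standard library handles with no extra dependencies because each line can be parsed independently. A sketch, using a hypothetical users.jsonl file:

```python
import json

# Write three records in JSON Lines format (one object per line)
with open("users.jsonl", "w", encoding="utf-8") as file:
    for user in [{"id": 1, "name": "James"},
                 {"id": 2, "name": "Bruce"},
                 {"id": 3, "name": "Diana"}]:
        file.write(json.dumps(user) + "\n")

# Read it back one record at a time -- only one line is in memory at once
with open("users.jsonl", "r", encoding="utf-8") as file:
    for line in file:
        record = json.loads(line)
        print(record["name"])
```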
Streaming JSON Data
For very large JSON files, consider using the ijson library, which allows you to parse JSON incrementally.
But first install ijson using the command below:
pip install ijson
Let’s create a large_file.json file in your project directory using VS Code or any other text editor and paste this JSON data:
{
    "users": [
        {
            "id": 1,
            "name": "James",
            "email": "james@example.com",
            "address": {
                "street": "Fifth Avenue",
                "city": "New York",
                "state": "NY",
                "zip": "10022"
            }
        },
        {
            "id": 2,
            "name": "Bruce",
            "email": "bruce@email.com",
            "address": {
                "street": "Sunset Boulevard",
                "city": "Los Angeles",
                "state": "CA",
                "zip": "90026"
            }
        }
    ]
}
Here is a quick overview of the structure of the JSON data above. The root element contains a key called "users", which is an array. Each element of this array represents a user object with several properties:
- id: A unique identifier for the user.
- name: The name of the user.
- email: The email address of the user.
- address: An object containing the user’s address details, including:
  - street: The street name.
  - city: The city name.
  - state: The state abbreviation.
  - zip: The ZIP code.
Now let’s stream and process each user item from large_file.json using ijson.
import ijson

# Stream the large JSON file
with open('large_file.json', 'r') as file:
    users = ijson.items(file, 'users.item')  # Stream each user object in the 'users' array
    for user in users:
        # Process each user object as it is read
        print(f"Name: {user['name']}")
        print(f"Email: {user['email']}")

        # Access and print the address details
        address = user['address']
        print(f"Street: {address['street']}")
        print(f"City: {address['city']}")
        print(f"State: {address['state']}")
        print(f"Zip: {address['zip']}")
Output:
Name: James
Email: james@example.com
Street: Fifth Avenue
City: New York
State: NY
Zip: 10022
Name: Bruce
Email: bruce@email.com
Street: Sunset Boulevard
City: Los Angeles
State: CA
Zip: 90026
In the code above, ijson.items(file, 'users.item') streams one complete user dictionary at a time, including the nested address object, making it memory-efficient even for JSON files with thousands of users.
JSON Schema Validation
JSON schema allows us to validate the structure of JSON data, ensuring it adheres to a predefined format. It is essentially a blueprint for the data, specifying what properties are required, what data types are allowed, and any other constraints on the data.
To perform JSON schema validation, you can use the jsonschema library, which provides an easy way to validate your JSON objects against a schema.
First, install jsonschema by running the following command in your terminal:
pip install jsonschema
Example:
Let’s assume we want to validate this JSON structure.
Sample JSON Data:
Copy and paste this JSON data to replace the contents of the large_file.json in your project directory.
{
    "users": [
        {
            "id": 1,
            "name": "James",
            "email": "james@example.com"
        },
        {
            "id": 2,
            "name": "Bruce",
            "email": "bruce@email.com"
        }
    ]
}
JSON Schema:
You will define the expected structure for this data, including the properties and their types.
{
    "type": "object",
    "properties": {
        "users": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                    "email": {"type": "string", "format": "email"}
                },
                "required": ["id", "name", "email"]
            }
        }
    },
    "required": ["users"]
}
Validation Code:
import json
from jsonschema import validate, exceptions

# Load the JSON data from a file
with open('large_file.json', 'r') as file:
    json_data = json.load(file)

schema = {
    "type": "object",
    "properties": {
        "users": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                    "email": {"type": "string", "format": "email"}
                },
                "required": ["id", "name", "email"]
            }
        }
    },
    "required": ["users"]
}

# Validate the JSON data against the schema
try:
    validate(instance=json_data, schema=schema)
    print("JSON data is valid.")
except exceptions.ValidationError as e:
    print("JSON data is invalid:", e.message)
Output:
JSON data is valid.
Note that by default jsonschema treats "format": "email" as an annotation only and does not enforce it; to actually check formats, pass format_checker=jsonschema.FormatChecker() to validate().
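To see validation fail, you can pass data that breaks a constraint, for example a user whose id is a string instead of an integer (reusing the same schema, with the format annotation omitted for brevity):

```python
from jsonschema import validate, exceptions

schema = {
    "type": "object",
    "properties": {
        "users": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                    "email": {"type": "string"}
                },
                "required": ["id", "name", "email"]
            }
        }
    },
    "required": ["users"]
}

# "id" should be an integer, so this record is invalid
bad_data = {"users": [{"id": "one", "name": "James", "email": "james@example.com"}]}

try:
    validate(instance=bad_data, schema=schema)
    print("JSON data is valid.")
except exceptions.ValidationError as e:
    print("JSON data is invalid:", e.message)  # 'one' is not of type 'integer'
```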