Introduction to Working with files and JSON
In real-world programs, you often need to save and load data so that a program can pick up where it left off the last time it ran. This is called persistence. There are various ways of storing data, from simple text files or spreadsheets to highly structured databases in which the data in each column must adhere to strict formatting constraints.
Reading and writing text files
A text file is a simple file that stores data as plain text. You can use Python to read from and write to text files using the built-in open() function.
Reading from a Text File
Read in the text in one block
with open("a_text_file.txt", "r") as file:
    content = file.read()
    print(content)
- with open("a_text_file.txt", "r") as file: opens the file for reading.
  - "a_text_file.txt" is the file name.
  - "r" means “read mode”.
  - with automatically closes the file when done, using a file context manager.
Read in the text line by line
with open("a_text_file.txt", "r") as file:
    for line in file:
        print(line.strip())
- with open("a_text_file.txt", "r") as file: opens the file for reading.
  - "a_text_file.txt" is the file name.
  - "r" means “read mode”.
  - with automatically closes the file when done, using a file context manager.
- .strip() removes extra spaces and newlines.
Handling spreadsheet-style data files with a separator, e.g. CSV files
data_dictionary = {}
with open("data/comma_separated_file.txt", "r") as file:
    for line in file:
        # Remove whitespace and split by comma
        line = line.strip()
        if line:  # Skip empty lines
            # Assume you know there are three columns
            key_column, value_column, other_column = line.split(",")
            data_dictionary[key_column] = value_column
- with open("data/comma_separated_file.txt", "r") as file: opens the file for reading.
  - "data/comma_separated_file.txt" is the file path.
  - "r" means “read mode”.
  - with automatically closes the file when done, using a file context manager.
- for line in file: reads each line from the file one by one.
- line.strip() removes whitespace (spaces, newlines) from both ends of the line.
- line.split(",") splits the line into columns at the separator you have chosen (in this case a comma).
  - "dog,mammal,animal" becomes ["dog", "mammal", "animal"].
- key_column, value_column, other_column = line.split(",") unpacks the three parts into variables.
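As a self-contained sketch of the steps above, the snippet below wraps the same parsing loop in a function and runs it on a small temporary file. The function name parse_key_value_file and the sample rows are hypothetical, chosen only for illustration, and the code assumes every non-empty line has exactly three comma-separated columns.

```python
import os
import tempfile


def parse_key_value_file(path):
    """Map the first column to the second for each non-empty line."""
    result = {}
    with open(path, "r") as file:
        for line in file:
            line = line.strip()
            if not line:  # Skip empty lines
                continue
            # Assumes exactly three columns; any other count raises ValueError
            key_column, value_column, other_column = line.split(",")
            result[key_column] = value_column
    return result


# Write hypothetical sample data to a temporary file, then parse it
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "comma_separated_file.txt")
    with open(sample_path, "w") as file:
        file.write("dog,mammal,animal\n\ncat,mammal,animal\n")
    data_dictionary = parse_key_value_file(sample_path)

print(data_dictionary)  # {'dog': 'mammal', 'cat': 'mammal'}
```

For files with quoted fields or commas inside values, Python's built-in csv module is usually a safer choice than line.split(",").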
Writing to a Text File
append mode:
Write an entire block of text (a_string).
with open("a_text_file.txt", "a") as file:
    file.write(a_string)
- "a" means “append mode”
  - Adds to the end of the file without overwriting the existing content
  - If the file doesn’t exist, it is created
- file.write(a_string) writes a_string to the file
Write a line of text:
with open("a_text_file.txt", "a") as file:
    file.write(f"{a_string}\n")
- "a" means “append mode”
  - Adds to the end of the file without overwriting the existing content
  - If the file doesn’t exist, it is created
- file.write(f"{a_string}\n") writes a_string to the file
  - \n adds a new line
write mode:
with open("flashcards.txt", "w") as file:
    file.write(a_string)
- "w" means “write mode”
  - Overwrites the file
  - If the file doesn’t exist, it is created
- file.write(a_string) writes a_string to the file
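The difference between the two modes is easiest to see side by side. The following is a minimal sketch that writes to a temporary file so it does not touch your own data; the strings "first", "second" and "third" are arbitrary placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "a_text_file.txt")

    # "w" creates the file (or empties it if it already exists)
    with open(path, "w") as file:
        file.write("first\n")

    # "a" keeps the existing content and adds to the end
    with open(path, "a") as file:
        file.write("second\n")

    with open(path, "r") as file:
        after_append = file.read()

    # Opening with "w" again discards everything written so far
    with open(path, "w") as file:
        file.write("third\n")

    with open(path, "r") as file:
        after_write = file.read()

print(repr(after_append))  # 'first\nsecond\n'
print(repr(after_write))   # 'third\n'
```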
File context managers and the with statement
It is generally considered best practice to use a context manager when working with files in Python, since it reduces the likelihood that you will leave files open when the program terminates.
The Problem with Manual File Handling
You could open and close files manually like this:
# ❌ Not recommended - easy to forget to close!
file = open("flashcards.txt", "r")
data = file.read()
file.close() # What if an error happens before this line?
Problems:
- If an error occurs before file.close(), the file stays open
- You might forget to call .close()
- Open files consume system resources
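To make the first problem concrete, here is a sketch of what careful manual handling would have to look like: the close call goes in a finally clause so that it runs even if reading fails. The temporary file and its contents are hypothetical and exist only so that the example runs on its own.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create a small hypothetical file to read from
    path = os.path.join(folder, "flashcards.txt")
    with open(path, "w") as setup_file:
        setup_file.write("Dog,Hund\n")

    # Manual handling that still closes the file if reading fails
    file = open(path, "r")
    try:
        data = file.read()
    finally:
        file.close()  # Runs whether or not an error occurred above

print(file.closed)  # True
print(data)
```

This works, but it is more code to write and easier to get wrong than the approach described next.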
The Solution: Context Managers with with
The with statement automatically handles opening and closing files, e.g.
# With a context manager, the file is automatically closed
with open("a_file.txt", "r") as file:
    # 1. Opens the file
    # 2. Assigns it to 'file'
    # 3. Runs your code inside the block
    content = file.read()
    # File is open and available here
    # Content is read into the program's working memory
    print(content)  # ✅ Works - file is open

# File is automatically closed here, even if errors occurred
# Even though the file is now closed, the content is still available in the program's memory
print(content)  # ✅ Still works! The data was copied to the variable
The pattern:
with open(filename, mode) as variable_name:
    # Do something with the file
    # File is open and available here

# File is automatically closed here
# Any variables into which the data from the file was read
# will still continue to hold the data
Why use with?
✅ Automatic cleanup - File closes even if errors occur
✅ Cleaner code - No need to remember .close()
✅ Best practice - Used by professional Python developers
✅ Resource efficient - Prevents file handle leaks
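The first point can be checked directly. The sketch below deliberately raises an error inside a with block and then inspects the file object's closed attribute; the ValueError is simulated and the temporary file is a stand-in for your own data.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "a_file.txt")
    with open(path, "w") as file:
        file.write("Dog\n")

    try:
        with open(path, "r") as file:
            content = file.read()
            # Simulate something going wrong while the file is open
            raise ValueError("simulated problem while processing")
    except ValueError as error:
        print(f"Caught: {error}")

    # The with statement closed the file before the error propagated
    closed_after_error = file.closed

print(closed_after_error)  # True
print(content)             # Data read before the error is still available
```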
Example: Reading with Context Manager
# Read entire file content
with open("a_file.txt", "r") as file:
    content = file.read()
    print(content)
# File is already closed here - safe!
Example: Writing with Context Manager
# Write to a file
with open("a_file.txt", "w") as file:
    file.write("Dog\n")
    file.write("Cat\n")
# File is saved and closed automatically
Multiple Files at Once
You can even open multiple files in one with statement:
with open("input.txt", "r") as input_file, open("output.txt", "w") as output_file:
    for line in input_file:
        output_file.write(line)
# Both files automatically closed
Context Managers Beyond Files
The with statement isn’t just for files. It is used for any resource that needs cleanup, for example when connecting to a database or a website.
# Database connection ("database" is a placeholder for a database library)
with database.connect() as connection:
    connection.execute("SELECT * FROM users")

# Network connection to a URL (uses the third-party requests library)
with requests.get(url) as response:
    data = response.json()
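To see the setup-and-cleanup pattern without needing a database or a network connection, you can write a small context manager of your own with the standard library's contextlib module. This is only an illustrative sketch: managed_resource and the events list are hypothetical names, and a real resource would open and release something more substantial than a string.

```python
from contextlib import contextmanager

events = []  # Records the order in which things happen


@contextmanager
def managed_resource(name):
    """Hypothetical resource that must be released after use."""
    events.append(f"open {name}")
    try:
        yield name  # The value bound by "as" in the with statement
    finally:
        # Runs when the block ends, even if it raised an error
        events.append(f"close {name}")


with managed_resource("connection") as resource:
    events.append(f"use {resource}")

print(events)  # ['open connection', 'use connection', 'close connection']
```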
What is JSON?
JSON (JavaScript Object Notation) is a standard format that is used for data exchange between apps and websites. It is used to store structured data in a dictionary-like structure so that it can be read reliably by computers.
E.g.
{
  "Dog": "Hund",
  "Cat": "Katze"
}
For many purposes, this makes it easier to work with than text files, since you can immediately get dictionary-like structures out without having to add lots of code to process the special characters (e.g. {, }) and separators (e.g. ,).
Reading and writing JSON Files
Python has a built-in json module for working with JSON files.
dump() and load() JSON file operations
The json module provides two main functions for file operations:
json.dump(data, file)
- Writes Python data (dictionaries, lists) to a JSON file
- Converts Python objects → JSON format → saves to file
json.load(file)
- Reads JSON data from a file and converts it to Python data
- Reads from file → converts JSON format → returns Python objects
Writing Data to a JSON File
import json

flashcards = {"Dog": "Hund", "Cat": "Katze"}
with open("flashcards.json", "w") as file:
    json.dump(flashcards, file)
    # Takes the flashcards dictionary and writes it to the file as JSON
Reading Data from a JSON File
import json

with open("flashcards.json", "r") as file:
    flashcards = json.load(file)
    print(flashcards)
Adding to a JSON File
import json

# 1. Load existing data
try:
    with open("data.json", "r") as file:
        data_list = json.load(file)
except FileNotFoundError:
    data_list = []

# 2. Modify in memory (new_item is whatever value you want to add)
data_list.append(new_item)

# 3. Save everything back
with open("data.json", "w") as file:
    json.dump(data_list, file, indent=2)
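The same load, modify and save steps can be wrapped in a function so that they are easy to reuse. The sketch below is one possible arrangement, not the only one: append_item is a hypothetical name, the items are example flashcards, and a temporary folder stands in for wherever you keep data.json.

```python
import json
import os
import tempfile


def append_item(path, new_item):
    """Load the list stored at path, add new_item, and save it back."""
    try:
        with open(path, "r") as file:
            data_list = json.load(file)
    except FileNotFoundError:
        data_list = []  # First run: start with an empty list
    data_list.append(new_item)
    with open(path, "w") as file:
        json.dump(data_list, file, indent=2)
    return data_list


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.json")
    append_item(path, {"Dog": "Hund"})           # File does not exist yet
    saved = append_item(path, {"Cat": "Katze"})  # File now holds one item

print(saved)  # [{'Dog': 'Hund'}, {'Cat': 'Katze'}]
```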
dumps() and loads() JSON string operations (not for files)
As well as the dump() and load() file operations, the json module provides two functions, dumps() and loads() (note the s), for converting between JSON strings and Python data:
json.dumps(data)
- Converts Python data to a JSON string (the ‘s’ stands for ‘string’)
json.loads(string)
- Converts a JSON string to Python data
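A short round trip shows both functions in action. The flashcards dictionary is the same example data used earlier; no file is involved at any point.

```python
import json

flashcards = {"Dog": "Hund", "Cat": "Katze"}

# Python dictionary -> JSON string
json_string = json.dumps(flashcards)
print(json_string)        # {"Dog": "Hund", "Cat": "Katze"}
print(type(json_string))  # <class 'str'>

# JSON string -> Python dictionary
restored = json.loads(json_string)
print(restored == flashcards)  # True
print(restored["Dog"])         # Hund
```

String operations like these are useful when the JSON arrives from somewhere other than a file, for example as text you have already read into a variable.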
Nested JSON Structures
JSON can contain nested structures, such as dictionaries within dictionaries, lists within dictionaries, or any combination of these. Nested structures enable you to organize complex data hierarchically.
Simple vs. Nested JSON
Simple (flat) structure:
{
  "Dog": "Hund",
  "Cat": "Katze"
}
In this simple dictionary structure, the keys are the English animal names ("Dog", "Cat").
Nested structure:
{
  "Dog": {
    "translation": "Hund",
    "category": "Animals"
  },
  "Cat": {
    "translation": "Katze",
    "category": "Animals"
  }
}
This nested structure consists of a dictionary within a dictionary. The outer dictionary has the English animal name ("Dog", "Cat") as the keys, while the inner dictionaries have the kind of information you are storing about the animal ("translation", "category").
Mixing Single Values and Lists
A common pattern is to have some fields as single values and others as lists:
{
  "Dog": {
    "translations": ["Hund", "Köter"],
    "category": "Animals",
    "examples": [
      "Der Hund bellt",
      "Ich habe einen Hund"
    ]
  },
  "Cat": {
    "translations": ["Katze", "Mieze"],
    "category": "Animals",
    "examples": [
      "Die Katze miaut",
      "Ich habe eine Katze"
    ]
  }
}
- "category" is a single value (string)
- "translations" and "examples" are lists (multiple values)
Accessing Nested Data
import json
with open("a_file.json", "r") as file:
    data = json.load(file)

# Access single value
category = data["Dog"]["category"]
print(category)  # Animals

# Access list
translations = data["Dog"]["translations"]
print(translations)  # ['Hund', 'Köter']

# Access first item in list
first_translation = data["Dog"]["translations"][0]
print(first_translation)  # Hund

# Loop through list
for example in data["Dog"]["examples"]:
    print(example)
# Output:
# Der Hund bellt
# Ich habe einen Hund
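Square-bracket access raises a KeyError when a key is missing, which is easy to run into with nested data where not every entry has every field. As a hedged sketch, the dictionary method .get() can supply a fallback instead; the "Bird" entry and the fallback values below are hypothetical.

```python
data = {
    "Dog": {"translations": ["Hund", "Köter"], "category": "Animals"},
}

# A missing key accessed with [] raises KeyError
try:
    data["Bird"]["category"]
except KeyError as error:
    missing_key = error.args[0]

# .get() returns a default value instead of raising an error
bird = data.get("Bird", {})
category = bird.get("category", "unknown")
examples = data["Dog"].get("examples", [])

print(missing_key)  # Bird
print(category)     # unknown
print(examples)     # []
```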
Creating Nested Structures
import json

# Build nested structure
data = {}
data["Dog"] = {
    "translations": ["Hund", "Köter"],
    "category": "Animals",
    "examples": [
        "Der Hund bellt",
        "Ich habe einen Hund"
    ]
}
data["Cat"] = {
    "translations": ["Katze", "Mieze"],
    "category": "Animals",
    "examples": [
        "Die Katze miaut",
        "Ich habe eine Katze"
    ]
}

# Save to file
with open("a_file.json", "w") as file:
    json.dump(data, file, indent=2)
Result in file (with indent=2 each list item goes on its own line, and non-ASCII characters such as "ö" are escaped by default):
{
  "Dog": {
    "translations": [
      "Hund",
      "K\u00f6ter"
    ],
    "category": "Animals",
    "examples": [
      "Der Hund bellt",
      "Ich habe einen Hund"
    ]
  },
  "Cat": {
    "translations": [
      "Katze",
      "Mieze"
    ],
    "category": "Animals",
    "examples": [
      "Die Katze miaut",
      "Ich habe eine Katze"
    ]
  }
}
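One detail worth knowing when your data contains characters such as "ö": by default the json module escapes non-ASCII characters when it writes JSON. The data loads back unchanged either way, but if you want the output to stay human-readable you can pass ensure_ascii=False. The sketch below uses json.dumps so that the difference is visible without opening a file.

```python
import json

data = {"Dog": {"translations": ["Hund", "Köter"]}}

escaped = json.dumps(data)                       # Default behaviour
readable = json.dumps(data, ensure_ascii=False)  # Keep "ö" as it is

print(escaped)   # {"Dog": {"translations": ["Hund", "K\u00f6ter"]}}
print(readable)  # {"Dog": {"translations": ["Hund", "Köter"]}}

# Both strings load back to the same Python data
print(json.loads(escaped) == json.loads(readable))  # True
```

If you use ensure_ascii=False with json.dump, it is a good idea to open the file with encoding="utf-8" for both writing and reading, so that the characters are stored the same way on every system.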
List of Dictionaries
Another common pattern is a list where each item is a dictionary:
[
  {
    "event_id": 1,
    "date": "2024-10-31",
    "severity": 8,
    "staff_involved": ["Alexa", "Han", "Zoe"]
  },
  {
    "event_id": 2,
    "date": "2024-11-01",
    "severity": 9,
    "staff_involved": ["Sandra", "Ashish"]
  }
]
Accessing data:
import json

with open("a_file.json", "r") as file:
    data = json.load(file)

# Access first event
first_event = data[0]
print(first_event["severity"])  # 8

# Loop through all events
for event in data:
    print(f"Event {event['event_id']}: Severity {event['severity']}")
# Output:
# Event 1: Severity 8
# Event 2: Severity 9

# Access list within dictionary
staff = data[0]["staff_involved"]
print(staff)  # ['Alexa', 'Han', 'Zoe']
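Once the data is loaded, a list of dictionaries can be filtered and summarised with ordinary Python. The sketch below defines the same two events inline so that it runs without a file; the severity threshold of 9 is an arbitrary value chosen for illustration.

```python
events = [
    {
        "event_id": 1,
        "date": "2024-10-31",
        "severity": 8,
        "staff_involved": ["Alexa", "Han", "Zoe"],
    },
    {
        "event_id": 2,
        "date": "2024-11-01",
        "severity": 9,
        "staff_involved": ["Sandra", "Ashish"],
    },
]

# Keep only events at or above a hypothetical severity threshold
threshold = 9
severe_events = [event for event in events if event["severity"] >= threshold]

# Collect every staff member mentioned, without duplicates, in sorted order
all_staff = sorted({name for event in events for name in event["staff_involved"]})

print([event["event_id"] for event in severe_events])  # [2]
print(all_staff)  # ['Alexa', 'Ashish', 'Han', 'Sandra', 'Zoe']
```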
Error Handling with Files
Reading and writing files is a risky business. Sometimes files might not exist yet, or the data might be corrupted.
Use try and except to handle these situations gracefully:
import json

filename = "a_file.json"
try:
    with open(filename, "r") as file:
        content = json.load(file)
except FileNotFoundError:
    print(f"No file called {filename} found.")
except json.JSONDecodeError:
    print(f"File {filename} corrupted")
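In the example above, content is never assigned if either error occurs, so any later code that uses it would fail. One way to handle this, sketched below, is to wrap the logic in a function that returns a fallback value. The name load_json and the default parameter are hypothetical, and the two temporary files exist only to trigger each error.

```python
import json
import os
import tempfile


def load_json(filename, default):
    """Return the parsed JSON in filename, or default if it cannot be read."""
    try:
        with open(filename, "r") as file:
            return json.load(file)
    except FileNotFoundError:
        print(f"No file called {filename} found.")
    except json.JSONDecodeError:
        print(f"File {filename} corrupted")
    return default


with tempfile.TemporaryDirectory() as folder:
    missing_path = os.path.join(folder, "missing.json")  # Never created
    broken_path = os.path.join(folder, "broken.json")
    with open(broken_path, "w") as file:
        file.write("{not valid json")  # Deliberately malformed

    from_missing = load_json(missing_path, {})
    from_broken = load_json(broken_path, [])

print(from_missing)  # {}
print(from_broken)   # []
```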
