Introduction to Working with Files and JSON

Why Work with Files?

In real-world programs, you often need to save and load data so that your app remembers things between runs. This is called persistence.
Files let you store information like flashcards, user progress, or settings on your computer.


Reading and writing text files

A text file is a simple file that stores data as plain text. You can use Python to read from and write to text files using the built-in open() function.

Reading from a Text File

Read in the text in one block

with open("a_text_file.txt", "r") as file:
    content = file.read()
    print(content)
  • with open("a_text_file.txt", "r") as file: - Opens the file for reading.
    • "a_text_file.txt" is the file name
    • "r" means “read mode”
    • with is a context manager: it automatically closes the file when the block ends
  • file.read() returns the entire file content as one string.

Read in the text line by line

with open("a_text_file.txt", "r") as file:
    for line in file:
        print(line.strip())
  • with open("a_text_file.txt", "r") as file: - Opens the file for reading.
    • "a_text_file.txt" is the file name
    • "r" means “read mode”
    • with is a context manager: it automatically closes the file when the block ends
  • for line in file: - Reads the file one line at a time.
  • .strip() removes leading and trailing whitespace, including the newline at the end of each line.

Handling spreadsheet-style data files with a separator, e.g. CSV files

data_dictionary = {}
with open("data/comma_separated_file.txt", "r") as file:
    for line in file:
        # Remove leading and trailing whitespace (including the newline)
        line = line.strip()
        if line:  # Skip empty lines
            # Assume you know there are exactly three columns
            key_column, value_column, other_column = line.split(",")
            data_dictionary[key_column] = value_column
  • with open("data/comma_separated_file.txt", "r") as file: - Opens the file for reading.
    • "data/comma_separated_file.txt" is the file path
    • "r" means “read mode”
    • with is a context manager: it automatically closes the file when the block ends
  • for line in file: - Reads each line from the file one by one
  • line.strip() - Removes leading and trailing whitespace (spaces, newlines) from the line
  • line.split(",") - Splits the line into columns at the separator you have chosen (in this case a comma)
    • "dog,mammal,animal" becomes ["dog", "mammal", "animal"]
    • key_column, value_column, other_column = line.split(",") - Unpacks the three parts into variables; this raises a ValueError if a line does not have exactly three columns
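The manual split above assumes that no field contains a comma itself. As a hedged alternative, Python's built-in csv module also handles quoted fields. The sketch below first writes a small sample file to a temporary directory so that it can run on its own; the sample contents and the three-column layout are assumptions made for illustration.

```python
import csv
import os
import tempfile

# Create a small sample file so the example is self-contained.
# The contents are made up for illustration.
sample_dir = tempfile.mkdtemp()
sample_path = os.path.join(sample_dir, "comma_separated_file.txt")
with open(sample_path, "w", newline="", encoding="utf-8") as file:
    file.write("dog,mammal,animal\n")
    file.write('"bird, small",avian,animal\n')  # quoted field containing a comma

data_dictionary = {}
with open(sample_path, "r", newline="", encoding="utf-8") as file:
    for row in csv.reader(file):
        if len(row) != 3:  # skip blank or malformed lines
            continue
        key_column, value_column, other_column = row
        data_dictionary[key_column] = value_column

print(data_dictionary)
```

With csv.reader, the quoted field "bird, small" stays in one piece, whereas line.split(",") would cut it in two.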

Writing to a Text File

append mode: Write an entire block of text (a_string).

with open("a_text_file.txt", "a") as file:
    file.write(a_string)
  • "a" means “append mode”
    • Adds to the end of the file without writing over the existing content
    • If the file doesn’t exist, it is created
  • file.write(a_string)
    • Writes a_string

Write a line of text:

with open("a_text_file.txt", "a") as file:
    file.write(f"{a_string}\n")
  • "a" means “append mode”
    • Adds to the end of the file without writing over the existing content
    • If the file doesn’t exist, it is created
  • file.write(f"{a_string}\n")
    • Writes a_string
    • \n adds a new line.

write mode:

with open("flashcards.txt", "w") as file:
    file.write(a_string)
  • "w" means “write mode”
    • Replaces the file’s existing content (the old content is lost)
    • If the file doesn’t exist, it is created
  • file.write(a_string)
    • Writes a_string
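To make the difference between "a" and "w" concrete, here is a minimal sketch that writes to a temporary file (the file name modes_demo.txt is made up for this example) and reads the content back after each step.

```python
import os
import tempfile

# Temporary file path so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as file:   # "w" creates the file
    file.write("first\n")

with open(path, "a") as file:   # "a" adds to the end
    file.write("second\n")

with open(path, "r") as file:
    after_append = file.read()

with open(path, "w") as file:   # "w" again discards the old content
    file.write("third\n")

with open(path, "r") as file:
    after_write = file.read()

print(repr(after_append))  # 'first\nsecond\n'
print(repr(after_write))   # 'third\n'
```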



File context managers and the with Statement

The examples above all open files with the with statement. This section explains why that is the best practice for working with files in Python.

The Problem with Manual File Handling

You could open and close files manually like this:

# ❌ Not recommended - easy to forget to close!
file = open("flashcards.txt", "r")
data = file.read()
file.close()  # What if an error happens before this line?

Problems:

  • If an error occurs before file.close(), the file stays open
  • You might forget to call .close()
  • Open files consume system resources

The Solution: Context Managers with with

The with statement automatically handles opening and closing files:

# ✅ Recommended - automatically closes file!
with open("flashcards.txt", "r") as file:
    data = file.read()
    # File is available here

# File is automatically closed here, even if errors occurred

How with Works

Think of with like a smart assistant:

with open("flashcards.txt", "r") as file:
    # 1. Opens the file
    # 2. Assigns it to 'file'
    # 3. Runs your code inside the block
    content = file.read()
    
# 4. Automatically closes the file when done
#    (even if there was an error!)

The pattern:

with open(filename, mode) as variable_name:
    # Do something with the file
    # File is open and available here
    
# File is automatically closed here

Why Use with?

  • Automatic cleanup - File closes even if errors occur
  • Cleaner code - No need to remember .close()
  • Best practice - Used by professional Python developers
  • Resource efficient - Prevents file handle leaks
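The automatic cleanup can be checked directly. The sketch below, using a temporary file and a deliberately raised error (both are assumptions made for illustration), keeps a reference to the file object and inspects its closed attribute after the with block has ended.

```python
import os
import tempfile

# Create a small file to read from.
path = os.path.join(tempfile.mkdtemp(), "closed_demo.txt")
with open(path, "w") as file:
    file.write("Dog,Hund\n")

saved_handle = None
try:
    with open(path, "r") as file:
        saved_handle = file      # keep a reference to inspect later
        content = file.read()
        raise ValueError("simulated problem while processing")
except ValueError:
    pass                         # the error is expected in this demo

print(saved_handle.closed)  # True
```

Even though the block ended with an error, the file was closed before the except clause ran.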

Example: Reading with Context Manager

# Read entire file content
with open("flashcards.txt", "r") as file:
    content = file.read()
    print(content)

# File is already closed here - safe!

Example: Writing with Context Manager

# Write to a file
with open("flashcards.txt", "w") as file:
    file.write("Dog,Hund\n")
    file.write("Cat,Katze\n")

# File is saved and closed automatically

Multiple Files at Once

You can even open multiple files in one with statement:

with open("input.txt", "r") as input_file, open("output.txt", "w") as output_file:
    for line in input_file:
        output_file.write(line.upper())

# Both files automatically closed

Context Managers Beyond Files

The with statement isn’t just for files. It is used for any resource that needs cleanup:

# Database connections (illustrative: "database" stands in for a real database library)
with database.connect() as connection:
    connection.execute("SELECT * FROM users")

# Network connections
with requests.get(url) as response:
    data = response.json()
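To see what a context manager does behind the scenes, here is a minimal sketch built with contextlib.contextmanager from the standard library. The names managed_resource and events are hypothetical and exist only for this illustration.

```python
from contextlib import contextmanager

events = []  # records what happened, for illustration only


@contextmanager
def managed_resource(name):
    events.append(f"open {name}")       # setup: runs when the with block starts
    try:
        yield name                      # this value is bound by "as"
    finally:
        events.append(f"close {name}")  # cleanup: runs even if an error occurred


with managed_resource("demo") as resource:
    events.append(f"use {resource}")

print(events)  # ['open demo', 'use demo', 'close demo']
```

open() follows the same idea: setup when the block starts, guaranteed cleanup when it ends.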

💡 Remember: Always use with when working with files. It’s the Python way!



What is JSON?

JSON (JavaScript Object Notation) is a standard text format for exchanging data between apps and websites. It stores structured data as key-value pairs and lists, much like Python dictionaries and lists, so that programs can read it reliably.

E.g.

{
  "Dog": "Hund",
  "Cat": "Katze"
}

For many purposes this makes JSON easier to work with than plain text files: loading JSON gives you dictionaries and lists directly, without writing your own code to split lines at separators (e.g. ,) or to handle special characters (e.g. {}).
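As a small illustration, the sketch below converts JSON text into a Python dictionary and back using json.loads and json.dumps, the string-based counterparts of the file-based functions in the json module. No file is needed for this demo.

```python
import json

text = '{"Dog": "Hund", "Cat": "Katze"}'   # JSON as plain text

flashcards = json.loads(text)              # JSON text -> Python dict
print(flashcards["Dog"])                   # Hund

back_to_text = json.dumps(flashcards)      # Python dict -> JSON text
print(back_to_text)
```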


Reading and writing JSON Files

Python has a built-in json module for working with JSON files.

Writing Data to a JSON File

import json

flashcards = {"Dog": "Hund", "Cat": "Katze"}
with open("flashcards.json", "w") as file:
    json.dump(flashcards, file)

Reading Data from a JSON File

import json

with open("flashcards.json", "r") as file:
    flashcards = json.load(file)
print(flashcards)

Adding to a JSON File

import json

# 1. Load existing data
try:
    with open("data.json", "r") as file:
        data_list = json.load(file)
except FileNotFoundError:
    data_list = []

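# 2. Modify in memory (new_item is whatever you want to add; an example value is shown)
new_item = {"front": "Dog", "back": "Hund"}
data_list.append(new_item)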

# 3. Save everything back
with open("data.json", "w") as file:
    json.dump(data_list, file, indent=2)
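The three steps above can be wrapped in a reusable function. This is a sketch under assumptions: the function name append_item, the temporary file location, and the example items are all made up for illustration.

```python
import json
import os
import tempfile


def append_item(path, new_item):
    """Load the list stored at path, append new_item, and save it back."""
    try:
        with open(path, "r") as file:
            data_list = json.load(file)
    except FileNotFoundError:
        data_list = []          # first run: start with an empty list

    data_list.append(new_item)

    with open(path, "w") as file:
        json.dump(data_list, file, indent=2)
    return data_list


demo_path = os.path.join(tempfile.mkdtemp(), "data.json")
append_item(demo_path, {"front": "Dog", "back": "Hund"})
result = append_item(demo_path, {"front": "Cat", "back": "Katze"})
print(len(result))  # 2
```

The whole file is rewritten on every call, which is fine for small data sets like a flashcard collection.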




Nested JSON Structures

JSON can contain nested structures - dictionaries within dictionaries, lists within dictionaries, or any combination. This lets you organize complex data hierarchically.

Simple vs. Nested JSON

Simple (flat) structure:

{
  "Dog": "Hund",
  "Cat": "Katze",
  "House": "Haus"
}

Nested structure:

{
  "Dog": {
    "translation": "Hund",
    "category": "Animals",
    "difficulty": "easy"
  },
  "Cat": {
    "translation": "Katze",
    "category": "Animals",
    "difficulty": "easy"
  }
}

Mixing Single Values and Lists

A common pattern is to have some fields as single values and others as lists:

Example: Flashcard with multiple translations and examples

{
  "Dog": {
    "translations": ["Hund", "Köter"],
    "category": "Animals",
    "difficulty": "easy",
    "examples": [
      "Der Hund bellt",
      "Ich habe einen Hund"
    ]
  },
  "Run": {
    "translations": ["laufen", "rennen"],
    "category": "Verbs",
    "difficulty": "medium",
    "examples": [
      "Ich laufe jeden Tag",
      "Er rennt schnell"
    ]
  }
}

Notice:

  • "category" and "difficulty" are single values (strings)
  • "translations" and "examples" are lists (multiple values)

Accessing Nested Data

import json

with open("flashcards.json", "r") as file:
    flashcards = json.load(file)

# Access single value
category = flashcards["Dog"]["category"]
print(category)  # Animals

# Access list
translations = flashcards["Dog"]["translations"]
print(translations)  # ['Hund', 'Köter']

# Access first item in list
first_translation = flashcards["Dog"]["translations"][0]
print(first_translation)  # Hund

# Loop through list
for example in flashcards["Dog"]["examples"]:
    print(example)
# Output:
# Der Hund bellt
# Ich habe einen Hund

Creating Nested Structures

import json

# Build nested structure
flashcards = {}

flashcards["Dog"] = {
    "translations": ["Hund"],
    "category": "Animals",
    "difficulty": "easy",
    "examples": ["Der Hund bellt"]
}

flashcards["Cat"] = {
    "translations": ["Katze"],
    "category": "Animals", 
    "difficulty": "easy",
    "examples": ["Die Katze miaut"]
}

# Save to file
with open("flashcards.json", "w") as file:
    json.dump(flashcards, file, indent=2)

Result in file (shown compactly; with indent=2, json.dump actually writes each list item on its own line):

{
  "Dog": {
    "translations": ["Hund"],
    "category": "Animals",
    "difficulty": "easy",
    "examples": ["Der Hund bellt"]
  },
  "Cat": {
    "translations": ["Katze"],
    "category": "Animals",
    "difficulty": "easy",
    "examples": ["Die Katze miaut"]
  }
}
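One detail matters for words such as "Köter": by default, json.dump writes non-ASCII characters as escape sequences (for example \u00f6 for ö). The file still loads correctly, but it is harder to read by hand. A minimal sketch, assuming you want the characters stored as written, passes ensure_ascii=False and opens the file with encoding="utf-8". The temporary path is made up for the demo.

```python
import json
import os
import tempfile

flashcards = {"Dog": {"translations": ["Hund", "Köter"]}}
path = os.path.join(tempfile.mkdtemp(), "flashcards.json")

# Write the characters as they are instead of as \uXXXX escapes.
with open(path, "w", encoding="utf-8") as file:
    json.dump(flashcards, file, indent=2, ensure_ascii=False)

with open(path, "r", encoding="utf-8") as file:
    raw_text = file.read()

print("Köter" in raw_text)                 # True
print(json.loads(raw_text) == flashcards)  # True
```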

List of Dictionaries

Another common pattern is a list where each item is a dictionary:

[
  {
    "session": 1,
    "date": "2024-10-31",
    "score": 8,
    "total": 10,
    "cards_practiced": ["Dog", "Cat", "House"]
  },
  {
    "session": 2,
    "date": "2024-11-01",
    "score": 9,
    "total": 10,
    "cards_practiced": ["Car", "Book", "Tree"]
  }
]

Accessing data:

import json

with open("progress.json", "r") as file:
    sessions = json.load(file)

# Access first session
first_session = sessions[0]
print(first_session["score"])  # 8

# Loop through all sessions
for session in sessions:
    print(f"Session {session['session']}: {session['score']}/{session['total']}")
# Output:
# Session 1: 8/10
# Session 2: 9/10

# Access list within dictionary
cards = sessions[0]["cards_practiced"]
print(cards)  # ['Dog', 'Cat', 'House']
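Once the sessions are loaded, ordinary Python tools work on them. The sketch below defines the same two sessions inline (instead of reading progress.json) so that it runs on its own, then totals the scores and collects every practiced card.

```python
sessions = [
    {"session": 1, "date": "2024-10-31", "score": 8, "total": 10,
     "cards_practiced": ["Dog", "Cat", "House"]},
    {"session": 2, "date": "2024-11-01", "score": 9, "total": 10,
     "cards_practiced": ["Car", "Book", "Tree"]},
]

# Add up one field across all dictionaries in the list.
total_score = sum(session["score"] for session in sessions)
total_possible = sum(session["total"] for session in sessions)

# Flatten the inner lists into one list of cards.
all_cards = [card for session in sessions for card in session["cards_practiced"]]

print(f"Overall: {total_score}/{total_possible}")  # Overall: 17/20
print(all_cards)
```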

Complex Example: Progress Tracking

Structure with multiple levels:

{
  "user": "Alice",
  "total_sessions": 3,
  "settings": {
    "cards_per_session": 10,
    "show_hints": true
  },
  "history": [
    {
      "session": 1,
      "date": "2024-10-31",
      "score": 8,
      "total": 10
    },
    {
      "session": 2,
      "date": "2024-11-01",
      "score": 9,
      "total": 10
    }
  ]
}

Working with this structure:

import json

with open("user_data.json", "r") as file:
    data = json.load(file)

# Single values at top level
username = data["user"]
total = data["total_sessions"]

# Nested dictionary
cards_per_session = data["settings"]["cards_per_session"]
show_hints = data["settings"]["show_hints"]

# List of dictionaries
history = data["history"]
latest_session = history[-1]  # Last session
latest_score = latest_session["score"]

print(f"{username} has completed {total} sessions")
print(f"Latest score: {latest_score}/{latest_session['total']}")

Common mistakes:

# ❌ Wrong - forgetting nested structure
flashcards = json.load(file)
translation = flashcards["Dog"]  # This is now a dict, not a string!

# ✅ Correct - access the nested value
translation = flashcards["Dog"]["translations"][0]

Checking structure before accessing:

# Safe access with error handling
try:
    translation = flashcards["Dog"]["translations"][0]
except (KeyError, IndexError):
    print("Flashcard or translation not found")
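An alternative to try and except is dict.get, which returns a default value when a key is missing. The helper name first_translation and the sample data below are assumptions made for this sketch.

```python
flashcards = {
    "Dog": {"translations": ["Hund", "Köter"], "category": "Animals"},
    "Cat": {"translations": [], "category": "Animals"},
}


def first_translation(cards, word):
    """Return the first translation of word, or None if it is missing."""
    translations = cards.get(word, {}).get("translations", [])
    return translations[0] if translations else None


print(first_translation(flashcards, "Dog"))    # Hund
print(first_translation(flashcards, "Cat"))    # None (empty list)
print(first_translation(flashcards, "Horse"))  # None (no such card)
```

Which style to use is a matter of taste: try and except keeps the normal path short, while .get makes the fallback values explicit.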



Error Handling with Files

Sometimes files might not exist yet, or the data might be corrupted.
Use try and except to handle these situations gracefully:

import json

try:
    with open("flashcards.json", "r") as file:
        flashcards = json.load(file)
except FileNotFoundError:
    print("No flashcard data found. Starting fresh!")
    flashcards = {}
except json.JSONDecodeError:
    print("Flashcard data is corrupted. Starting fresh!")
    flashcards = {}
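The same pattern can live in a small helper so that every part of the app loads data the same way. This is a sketch, not a definitive implementation: the function name load_flashcards and the two demo files are made up for illustration.

```python
import json
import os
import tempfile


def load_flashcards(path):
    """Return the flashcards stored at path, or an empty dict on failure."""
    try:
        with open(path, "r") as file:
            return json.load(file)
    except FileNotFoundError:
        print("No flashcard data found. Starting fresh!")
    except json.JSONDecodeError:
        print("Flashcard data is corrupted. Starting fresh!")
    return {}


demo_dir = tempfile.mkdtemp()

# Case 1: the file does not exist.
missing = load_flashcards(os.path.join(demo_dir, "missing.json"))

# Case 2: the file exists but is not valid JSON.
broken_path = os.path.join(demo_dir, "broken.json")
with open(broken_path, "w") as file:
    file.write('{"Dog": "Hund"')      # closing brace missing on purpose
broken = load_flashcards(broken_path)

print(missing, broken)  # {} {}
```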

Summary

  • Text files are good for simple, line-by-line data.
  • JSON files are better for structured, complex data.
  • Nested JSON lets you organize complex data with dictionaries, lists, and mixed structures.
  • Mix single values and lists to store both simple properties and collections.
  • Use proper error handling when accessing nested data.
  • Use Python’s open() for text files and the json module for JSON files.
  • Always handle errors when working with files to make your app robust.

