06.13.2025: Organizing the Archive for Smarter Answers

June 13 was about defining and scaffolding. It was the day I laid the groundwork for the next wave of functionality in AutonomousMatt—the day I stopped thinking of this project as a clever one-off and started treating it like a living system. The goal wasn’t just to make it work. It was to make it scale.

Here's everything that happened the day before yesterday in the evolution of the autonomous archive engine.

🧱 Foundation Work

1. Metadata Injection into Archive Files

I established a new metadata convention for every .txt file in the archive. Each file now begins with a structured header block:

```txt
title: The Conscious Cruelty of I, Daniel Blake
author: Matthew Shadbolt
date: Nov 30
summary: Ken Loach’s portrait of state indifference and bureaucratic despair in austerity-era Britain.
key insight: Bureaucracy itself becomes a form of cruelty when deliberately weaponized against the vulnerable.
source: https://www.archivalmatt.com/i-daniel-blake
```

This unlocked several future-facing features:

  • Generating blog-style previews for each file.

  • Indexing entries in a visual archive grid.

  • Allowing source link display below each AI response (coming soon).

🔧 It also meant I had to update dozens of files by hand to get the formatting right. But the payoff was instant clarity and future automation.
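The header convention above lends itself to a tiny parser. A minimal sketch, assuming the fields stay in `key: value` form and the header ends at the first line that doesn't match; `parseHeader` and the sample are illustrative names, not the project's actual code:

```javascript
// Hypothetical sketch: read the "key: value" header block at the top of
// an archive .txt file into a metadata object. Parsing stops at the
// first line that isn't a "key: value" pair.
function parseHeader(text) {
  const meta = {};
  for (const line of text.split("\n")) {
    const match = line.match(/^([a-z ]+):\s*(.+)$/);
    if (!match) break; // header ends here; the rest is body text
    meta[match[1].trim()] = match[2].trim();
  }
  return meta;
}

const sample = [
  "title: The Conscious Cruelty of I, Daniel Blake",
  "author: Matthew Shadbolt",
  "source: https://www.archivalmatt.com/i-daniel-blake",
  "",
  "Body text begins here...",
].join("\n");

console.log(parseHeader(sample).title);
```

A parser like this is what would power the blog-style previews and the archive grid: one pass over each file's first lines yields everything the index needs.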

2. Rebuilding the .txt File Naming Convention

I began systematically renaming all archive text files for consistency. Prior to this, file names like daniel_blake.txt or product_manager_notes.txt lacked style or structure.

The new format:

```txt
/film_i-daniel-blake.txt
/story_how-to-be-a-great-product-manager.txt
/talk_destiny-habituation-tactics.txt
```

This change improves:

  • Visual organization in the repo.

  • Keyword routing clarity in keywordMap.

  • File-type scoping (/film_, /story_, /talk_) which will power filters and styling soon.
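The type-scoping point above can be sketched in a few lines. This is a hypothetical helper, assuming the three prefixes listed; the function name is my own:

```javascript
// Hypothetical sketch: derive the content type ("film", "story", "talk")
// from the filename prefix established by the new naming convention.
function fileType(path) {
  const match = path.match(/^\/?(film|story|talk)_/);
  return match ? match[1] : "unknown";
}

console.log(fileType("/film_i-daniel-blake.txt")); // "film"
console.log(fileType("/talk_destiny-habituation-tactics.txt")); // "talk"
```

Because the type lives in the filename itself, future filters and per-type styling need no extra metadata lookup.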

🔁 System Behavior Improvements

3. Introduced GPT-Contextual Source Rendering Design

I mocked up how GPT answers would eventually:

  • Show a user’s prompt as the header

  • Display the AI response below it

  • Cite the file at the bottom clearly, e.g.:

```txt
source: /film_i-daniel-blake.txt
link: https://www.archivalmatt.com/i-daniel-blake
```

This structure felt closer to a research assistant than a chatbot. More archival. More transparent. More me.
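The mockup above can be approximated as a plain-text layout function. A sketch under stated assumptions: `renderAnswer` and the field names are illustrative, and the real design is visual rather than plain text:

```javascript
// Hypothetical sketch of the mocked-up response layout: the user's
// prompt as a header, the AI answer below it, and the source citation
// at the bottom.
function renderAnswer({ prompt, answer, sourceFile, link }) {
  return [
    `> ${prompt}`,
    "",
    answer,
    "",
    `source: ${sourceFile}`,
    `link: ${link}`,
  ].join("\n");
}

console.log(
  renderAnswer({
    prompt: "What is I, Daniel Blake about?",
    answer: "Ken Loach's portrait of bureaucratic cruelty.",
    sourceFile: "/film_i-daniel-blake.txt",
    link: "https://www.archivalmatt.com/i-daniel-blake",
  })
);
```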

4. Laid the First Draft of the Semantic Keyword Expansion Plan

June 13 was the day I started sketching the conceptual clusters that would later fuel the keyword routing improvements built on June 14.

Here’s what it looked like at the time:

  • Film: Blue → “AIDS,” “voiceover,” “Jarman,” “illness,” “monologue,” “queer”

  • Talk: Destiny → “loot,” “grind,” “dopamine,” “MMO,” “habit,” “tactics,” “sunk cost”

  • Story: Migration → “airport,” “emigration,” “identity,” “starting over,” “nostalgia,” “America”

It was still a mess. Redundant keywords. Overlapping ideas. But it was a start—and it gave shape to the problem of recall vs. precision in the user query → file matching flow.
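In code, those clusters map naturally onto a `keywordMap` object keyed by filename. A minimal sketch: the `/film_blue.txt` and `/story_migration.txt` filenames are my own guesses at the convention, not confirmed paths:

```javascript
// Hypothetical sketch of the early keywordMap clusters. Filenames for
// Blue and the migration story are assumed for illustration.
const keywordMap = {
  "/film_blue.txt": ["AIDS", "voiceover", "Jarman", "illness", "monologue", "queer"],
  "/talk_destiny-habituation-tactics.txt": ["loot", "grind", "dopamine", "MMO", "habit", "tactics", "sunk cost"],
  "/story_migration.txt": ["airport", "emigration", "identity", "starting over", "nostalgia", "America"],
};

console.log(Object.keys(keywordMap).length); // 3
```

Keeping the map as a flat, hand-edited object is what keeps it human-readable, which becomes the explicit design choice later in this entry.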

🧩 Friction Points and Breakdowns

1. Keyword Queries Weren’t Surfacing Files Reliably

I discovered that queries like “gaming compulsion” returned nothing, despite the existence of /talk_destiny-habituation-tactics.txt. Same with “queer cinema” or “HIV film”—no file loaded.

This led to a full investigation into how keywords were parsed, and whether to treat them as:

  • Exact matches?

  • Token matches?

  • Synonym matches?

Conclusion (for now): Use manually expanded keyword maps and keep them human-readable.
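The token-matching approach that conclusion points toward can be sketched as a simple substring scan over a hand-expanded map. `matchFiles` and the sample map are illustrative assumptions:

```javascript
// Hypothetical sketch: a file matches when any of its keywords appears
// in the lowercased query. Works only as well as the hand-expanded map.
function matchFiles(query, keywordMap) {
  const q = query.toLowerCase();
  return Object.keys(keywordMap).filter((file) =>
    keywordMap[file].some((kw) => q.includes(kw.toLowerCase()))
  );
}

const map = {
  "/talk_destiny-habituation-tactics.txt": ["loot", "grind", "dopamine", "habit"],
};
console.log(matchFiles("why is the loot grind so addictive?", map));
// "gaming compulsion" finds nothing until the map is expanded by hand:
map["/talk_destiny-habituation-tactics.txt"].push("gaming", "compulsion");
console.log(matchFiles("gaming compulsion", map));
```

This makes the failure mode concrete: the query "gaming compulsion" shares no token with the original keywords, so nothing surfaces until a human adds the synonyms.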

2. Mismatch Between GPT Prompts and File Relevance

June 13 revealed an uncomfortable truth: just because a file exists doesn’t mean GPT will pull from it well. The AI needs more than a file—it needs relevance, specificity, and focused chunking.

At the time, everything was one big .txt. Long form. No sections. No headers. That was beginning to hurt retrieval quality.

Plan established: start breaking larger files into smaller, thematic, linkable segments.
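The chunking plan can be sketched as a splitter that breaks a long file at section headers. A hypothetical illustration, assuming a `## ` header convention that the archive files did not yet have:

```javascript
// Hypothetical sketch of the chunking plan: split one long .txt into
// smaller thematic segments, breaking before each "## " section header.
function chunkFile(text) {
  return text
    .split(/\n(?=## )/) // lookahead keeps the header with its segment
    .map((chunk) => chunk.trim())
    .filter(Boolean);
}

const doc = "Intro paragraph.\n## Plot\nDetails...\n## Themes\nMore...";
console.log(chunkFile(doc).length); // 3
```

Each resulting segment could then carry its own keywords and link anchor, which is what makes retrieval more precise than one monolithic file.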

✍️ UX and Presentation Ideas Jotted Down

While no visual changes shipped on the 13th, I brainstormed:

  • A Markdown-to-HTML renderer to beautify raw .txt archive entries.

  • A potential archive browser UI with tags, years, and formats.

  • Adding user prompt headers to every GPT response for clarity and scannability.

🧠 Reflection

June 13 wasn’t glamorous. Nothing shipped to the frontend. No buttons changed color. But it was a day of infrastructure decisions—the kind that make future progress cleaner, faster, and easier to maintain.

It was also a moment to reorient the project:

AutonomousMatt isn’t a chatbot. It’s a personal research assistant. A curated intelligence. A live archive with memory and voice.

On June 13, I started building toward that.

Foundation first. Then function. Then flair.
And then… autonomy.

—Matt

Explore the build at autonomousmatt.com or join the experiment.

