The Ollama Automation That Keeps Me Accountable Even In Nature
Many of your best ideas will never see the light of day. You’re walking the dog, in the grocery aisle, taking a shower, or dropping off your child at school, obviously not all at the same time, and a task, a meeting, or a video idea pops up. Then you unlock the phone, hunt for the right app, and that little spark fades. I wanted one move that always works. So now I triple‑tap the back of my iPhone, I talk, and then I stop. That’s it. It figures out if it’s a task, a calendar event, a grocery item, or a note for Obsidian, and routes itself. No app juggling. No cleanup.
Here’s the pain. Fleeting notes end up in the wrong app, or the wording is off, or they just sit in an inbox you’ll never process. You lose time copying the same thing into different places. You miss tasks because they weren’t tagged or dated. You promise to buy organic milk and end up bringing home three kinds of yogurt, which can be tasty, but isn’t helpful for anyone. The root cause is capture friction and manual sorting. There’s no single entry point, so you hesitate, and the moment passes.

I wanted one capture flow with two entry points: a triple‑tap on the phone, or a quick voice memo on the watch or in CarPlay when the phone isn’t nearby or I’m driving. One microphone per device, instant routing, and a reliable fallback: if it can’t classify a note with confidence, it still gets stored somewhere safe. The fix needed to be fast, private, and hard to break, with local processing where it makes sense, graceful offline behavior, and an audit trail so I can see what happened later. The goal is simple: reduce taps, reduce decisions, and keep every thought.
So I committed to two primary voice‑first capture methods. No more choosing an app. I’d speak once, and the system would decide where it goes without making me wait. The criteria were clear: it has to work offline where it makes sense, keep data private by default, add almost no latency, be reversible if it guesses wrong, and give me visibility into each step. For tooling, I used Apple Shortcuts for the initial capture, the Just Press Record app to capture from my watch and CarPlay, n8n for routing and the fallback, Ollama (via Ollama Cloud) for intent and entity extraction, Akiflow for tasks and calendar, Apple Notes for the shared grocery list, and Obsidian for everything else. I started with a pilot: send every capture to a holding queue while I configured the classifier. Nothing went straight to a destination until I saw consistent behavior. Then I mapped intents, defined a simple schema, wired the n8n webhook, built two Shortcuts to capture audio, and a final Shortcut for the note integration. From there, it was iterate fast and measure what breaks. It may look complicated now, but it grew slowly and isn’t all that bad.
Let’s take a quick look at the build. There are two paths to text. On the phone, a capture dialog records your voice and converts it to text immediately. On the watch and in the car, it saves a voice memo that syncs to iCloud; a Shortcut then pulls that audio and transcribes it. After that step, the payload is unified so every capture looks the same: it’s just text. The first Shortcut is tied to Back Tap on the iPhone, a feature you’ll find under Accessibility. Triple‑tap opens a voice recorder, you speak, it stops, and the Shortcut sends a POST to an n8n webhook with an auth header. For the watch and car, a Shortcut watches the folder used by Just Press Record and sends the text to that same webhook. On the n8n side, the first node is a Webhook that accepts JSON. I write every raw payload to a data table, with fields to fill in later for the destination and the formatted text. That’s the write‑ahead log, so nothing gets lost. Right away, I report back to the Shortcut that the data was captured and is being processed.
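The key move in that step is normalizing every capture into one shape before it hits the webhook. Here’s a minimal sketch of what that normalized payload might look like; the field names, the `backtap` source label, and the webhook URL in the comment are my own illustration, not the exact schema my Shortcuts use.

```python
import json
from datetime import datetime, timezone

def build_capture_payload(text: str, source: str) -> dict:
    """Normalize every capture into the same shape before it hits the webhook.

    Field names here are illustrative -- the real schema is whatever your
    n8n webhook and data table expect.
    """
    return {
        "text": text.strip(),
        "source": source,  # e.g. "backtap", "watch", "carplay"
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }

payload = build_capture_payload("buy oat milk", "backtap")
body = json.dumps(payload)

# A Shortcut (or any HTTP client) would then POST `body` to the webhook
# with an auth header, roughly:
#   POST https://your-n8n-host/webhook/capture
#   Authorization: Bearer <secret>
```

Because every source produces this same shape, everything downstream of the webhook only ever has to deal with one kind of input.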
Next, I classify the text. The n8n Text Classifier node does an amazing job with minimal configuration. I specify the input text, then describe the categories so that the AI can pick the right one. Each category goes to a separate branch of nodes.
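To give a feel for what those category descriptions look like, here’s a sketch of the kind of labels-plus-descriptions setup the classifier works from, and the sort of prompt it effectively builds. The category names, descriptions, and prompt wording are my own illustration, not the n8n node’s internals.

```python
# Category descriptions, written the way you'd describe them to a classifier.
CATEGORIES = {
    "task": "an action item to do later, e.g. 'call the plumber'",
    "event": "anything with a date or time, e.g. 'dentist tomorrow at 3'",
    "grocery": "a product to buy, e.g. 'oat milk'",
    "note": "an idea or reference to keep, e.g. a video idea",
}

def classification_prompt(text: str) -> str:
    """Roughly the kind of prompt a classifier builds from those descriptions."""
    lines = "\n".join(f"- {name}: {desc}" for name, desc in CATEGORIES.items())
    return (
        "Classify the capture into exactly one category.\n"
        f"Categories:\n{lines}\n\n"
        f"Capture: {text}\n"
        "Answer with the category name only."
    )
```

The better your one-line descriptions (with a concrete example each), the less the model has to guess, which is most of the configuration work in practice.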
The next step depends on the destination. Some things, like video ideas or grocery items, need to be condensed down to the core idea or product, so I use the Information Extractor node to pull that out. Then every branch updates the data table with the processed text and the destination type. Finally, the last stage triggers a write to Obsidian, Apple Notes, or my task manager.
The first week, the wins were clear. One take, minimal taps, and items landed where they belonged. I said, “I have a dentist appointment tomorrow at 3,” and it created a calendar event in Akiflow with the right time. I said, “buy oat milk,” and it appended to the shared grocery note in Apple Notes, deduped against what was already there. I said, “Make a video that shows how to route with n8n text classifier node,” and it added to my video ideas note in Obsidian. No app hunting. No manual tagging. The fastest path from speech to organized data.
The insight is that a central webhook paired with Ollama removes the decision tax of choosing an app. You unload your brain once, and the system does the sorting. Even when it wasn’t confident, I could still trust the safety net; everything was in the data table and everything that it couldn’t figure out a destination for went to the Other note. That trust is the unlock. With the basics working, the next step was handling real‑world edge cases. That’s how this moved from a neat demo to a tool I actually use every day.
Matt from the future here, with a shorter beard. Automation is really at the heart of all of this. There are a lot of options that could go here, but the real winner in this space is n8n. I have been using it for six or seven years at least and have tried others along the way, but I always come back to n8n, especially since working with Ollama. You can self‑host for free or let them host it for you for a reasonable subscription fee. I really do think it is awesome and just getting better all the time. So it’s super exciting that n8n is the sponsor for this video and many others on this channel. You can check them out at n8n.io. Now, let’s get back to the past.
It wasn’t all smooth. Ambiguous notes tripped the model. “Dinner with Sam next week” needs a date, a time, and a place, not a task. So I shifted from label-only to extraction.
Duplicates annoyed me. That’s why one of the first steps is to check whether the item already exists in the data table, and only if it doesn’t, add it and move forward.
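The existence check is cheap if you key each capture on a normalized hash of its text. A sketch of that idea, with an in-memory set standing in for the data table lookup; the normalization rules are my own assumption.

```python
import hashlib

seen = set()  # stands in for a keyed lookup against the n8n data table

def capture_key(text: str) -> str:
    # Normalize case and whitespace so near-identical captures collide.
    return hashlib.sha256(text.strip().lower().encode()).hexdigest()

def insert_if_new(text: str) -> bool:
    key = capture_key(text)
    if key in seen:
        return False  # duplicate: skip routing
    seen.add(key)
    return True  # new: store and move forward
```

The same key also makes retries safe: replaying a webhook delivery hits the duplicate branch instead of double-posting.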
Reliability mattered most. I think most folks would want to just set up an AI agent to do all the work for them, but when building any automation, an AI agent is almost always the wrong way to do it. Yeah, that goes against most advice you will find on YouTube, but most of those creators have no idea what they are doing, and AI is still sexy. Most processes are well defined, or they can be. Having the AI figure out that process each time it runs is a waste of time, and it will get it wrong some percentage of the time. You should use AI where it makes sense and offers the most value. An AI agent may be easier to develop, saving you ten minutes one time, but at the cost of lower reliability and a 5 to 45 second delay every single time you add a note. Those delays add up quickly, and you’ll regret the agent choice.
And the first version left me hanging for 30 seconds waiting for the n8n process to finish. It turns out what really matters is getting a quick confirmation that n8n has recorded the input and is working on it. That now happens at the fourth step, before AI has a hand in the process, rather than at the end after two or three prompts.
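That acknowledge-first pattern is easy to sketch: persist the payload, respond immediately, and let the slow AI steps run off the request path. Here’s a toy version using a queue and a worker thread; the handler name and response body are my own illustration, not what n8n returns.

```python
import queue
import threading

jobs = queue.Queue()
routed = []  # stands in for the destinations written by the slow steps

def handle_webhook(payload: dict) -> dict:
    """Log the raw capture and acknowledge immediately.

    The slow classification and extraction happen later, on a worker,
    so the caller never waits on the AI."""
    jobs.put(payload)  # write-ahead: nothing is lost even if routing fails
    return {"status": "captured"}  # the Shortcut sees this in milliseconds

def worker():
    while True:
        payload = jobs.get()
        if payload is None:
            break
        # ... classify, extract, and route here (seconds, not the caller's problem)
        routed.append(payload["text"])
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()
ack = handle_webhook({"text": "buy oat milk"})
jobs.join()  # only the demo waits; a real caller would not
```

The caller’s latency is now just the write plus the response, regardless of how many prompts run afterward.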
With everything in place, I wired the destinations. Akiflow, while I love it enough to have paid for five years in advance, can be super frustrating. There is no public API. There have been feature requests for years, and they just get ignored in favor of less important gimmicks, even though a public API would solve so many of those requests. So we are stuck emailing their AI agent. It gets it right most of the time, so there is that, but a public API would let us do it far more reliably and quickly. That’s a huge win for a tool like Todoist over Akiflow.
For Apple Notes, groceries live in one shared note. Sure, there are dedicated grocery apps, but none of them are good for anyone other than the geekiest of geeks, and Apple Notes just works without another subscription or a single‑use server. The flow for Notes is actually almost identical to Obsidian: n8n writes a JSON file to Dropbox, and when my Mac sees that file show up, it triggers a Shortcut. Shortcuts are great for automating the Mac; they already know about Apple Notes and how to manipulate each note. And with an app called Actions for Obsidian, you get the same benefits in Obsidian.
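The file-drop handoff is worth a sketch, because it is the whole contract between n8n and the Mac. One JSON file per routed item lands in the synced folder; the folder-watch picks it up and hands it to the right Shortcut. The filename pattern and field names here are my own convention, not anything n8n or Shortcuts mandates.

```python
import json
import tempfile
from pathlib import Path

def write_handoff(folder: Path, note: dict) -> Path:
    """n8n side: drop one JSON file per routed item into the synced folder.

    The destination in the filename lets the Mac's folder-watch dispatch
    to the right Shortcut without opening the file first."""
    path = folder / f"{note['destination']}-{note['id']}.json"
    path.write_text(json.dumps(note))
    return path

folder = Path(tempfile.mkdtemp())  # stands in for the Dropbox folder
handoff = write_handoff(folder, {
    "id": "0001",
    "destination": "notes",  # which Shortcut should handle it
    "text": "oat milk",
})
loaded = json.loads(handoff.read_text())
```

Because the file itself is the queue, a missed trigger is recoverable: any file still sitting in the folder simply hasn’t been processed yet.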
Now, capture is just a triple tap and talk. My brain unload happens on command. Tasks, events, groceries, and notes land where they belong without me choosing an app. Follow‑through is higher because the system takes the next step for me. I spend time doing, not sorting.
It’s faster in two ways: fewer steps, and less thinking. I don’t manage inboxes; I check a review lane when I want, I even get an email at the end of the day with a summary of what I captured just in case. Privacy holds because the LLM runs with Ollama. If I’m offline, it queues; when I’m back, it replays. The audit trail means I can always answer, “Where did that go?”
If you want this, clone the idea, not just my tools. Build a single capture path. Normalize the payload. Classify. Write ahead. Route with idempotency. Mirror to a safety net. Then swap in your destinations—Akiflow, Things, Apple Notes, Obsidian, Notion—whatever fits your stack.
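Those steps can be sketched as one small pipeline. The keyword classifier below is a deliberate stand-in for the LLM call, and the route map and field names are my own illustration, so treat this as the shape of the thing, not an implementation.

```python
# Destination map: classifier label -> where the item gets written.
ROUTES = {"task": "akiflow", "event": "akiflow",
          "grocery": "apple_notes", "note": "obsidian"}

log = []  # stands in for the write-ahead table

def classify_stub(text: str) -> str:
    # Stand-in for the LLM classifier; the real version calls a model.
    if text.lower().startswith("buy "):
        return "grocery"
    return "unknown"

def pipeline(text: str) -> str:
    payload = {"text": text.strip()}
    log.append(payload)  # write ahead before anything can fail
    label = classify_stub(payload["text"])
    destination = ROUTES.get(label, "other")  # safety net for unknowns
    payload["destination"] = destination
    return destination
```

Swap the stub for your model and the strings for your tools, and the skeleton holds: normalize, write ahead, classify, route with a fallback.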
What I have covered here may be enough to get you going, but if you need more detail on some of my decisions and how it works, I’m doing a six‑part deep dive next. We’ll walk through architecture and principles, setup and first run, integration patterns, a better voice memo, n8n text classification and extraction, and troubleshooting and debugging.
In Episode 1, we draw the map: trigger to transport to brain to router to destinations to safety net. We’ll cover local versus cloud models, webhook auth, and why schema‑first and idempotency make this reliable. You’ll see latency targets and retry windows, and exactly what lives on phone versus server. We’ll lock down least privilege on connectors, and define boundaries so one failure can’t cascade. Clear principles, simple moving parts, strong enough guardrails.
Episode 2 is the hands‑on build. I’ve already installed n8n in other videos, but I’ll create the webhook, set secrets, and bring up Ollama. Then we’ll build the Shortcut, map Back Tap, and test with a few phrases. We’ll validate payloads, inspect logs, and tune the prompt for clean intents. Finally, we’ll route a single destination end to end and verify the round trip with the journal.
Episode 3 goes deep on connectors. We’ll handle Akiflow first. We’ll append to Apple Notes with dedupe. We’ll write to Obsidian. Along the way, we’ll use idempotency and replay‑safe operations so retries don’t double‑post. Clean patterns you can reuse anywhere.
In Episode 4, I look at the app that enables the off‑phone part of the flow. I tried a number of things without much luck but eventually found a cheap tool that is perfect in all the different places I need it and has replaced Voice Memos for me entirely.
In Episode 5, we’ll set up the two core nodes that make this work: Text Classifier and Information Extractor in n8n. We’ll walk through inputs and labels for the classifier so it cleanly splits tasks, events, groceries, and notes. Then we’ll wire the extractor to pull fields like title, date/time, and item name using simple instructions.
You’ll see the exact node config, prompt templates, and how to pass data between them. We’ll validate outputs with a few sample phrases, and show how to write the results back to the capture table before routing.
Episode 6 is the fix-it kit. We’ll break things on purpose, then walk through how to spot and repair them fast.
We’ll cover the common failures: webhook timeouts, duplicate posts from retries, folder‑watch hiccups on the watch flow, and shortcuts that stop mid‑run. You’ll see how to read n8n execution logs, and trace a payload from the webhook through classifier, extractor, and out to the destination. We’ll add clear status fields to the capture table so you can see stuck items at a glance.
For misclassifications, we’ll tune labels and thresholds, add a “send to Other” fallback, and figure out a one‑tap replay to fix and resend without re‑recording. Finally, we’ll set up alerting for hard failures and a daily review lane so nothing slips through.
If these topics look interesting, hit subscribe and the bell so you don’t miss Episode 1. We’re starting with the architecture map and the first working capture—triple tap to text to router—so you can see the whole path end to end.
Until then, watch this video. It’ll give you more of what you need to make Episode 1 go smoothly.
Thanks so much for watching. Links and starter files are in the description, with most of the details to get you going today. I’ll see you soon in Episode 1.