Every CrossFit workout I do lands in Strava as "Afternoon HIIT" with an empty description. My Garmin watch records it as a generic high-intensity session and uploads it automatically, but it has no idea what the workout actually was. It shows up as a HIIT block, not as the workout of the day I really did. I wanted the activity to say what I did: "CrossFit WOD (Workout of the Day)," plus the movements and weights off the gym whiteboard.

The first version was almost too simple to feel like a build. After a workout I'd open the gym app, screenshot the whiteboard, and drop the image into a chat with my agent: "put this on today's Strava." Thirty seconds later the activity Garmin had filed as "Afternoon HIIT" read "CrossFit WOD INDEPENDENCE," with all five rounds spelled out underneath. It worked the first time. A photo of a whiteboard had turned into a clean workout log, and I hadn't touched a keyboard.

But I wanted more. I wanted to cut the screenshot out.

Integrating with Strava was the easy part

Strava has a real, documented REST API. One request updates an activity's name and description. The agent reads the WOD, builds the description, sends the call, and the change shows up in the Strava app within thirty seconds. I used it that day to fix a HIIT activity that was actually CrossFit.
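For a sense of how small that write is, here is a minimal sketch using only the standard library. It assumes you already hold a Strava access token with write scope and know the activity's id. The function names and the two-field allowlist are illustrative, not the exact code behind this post.

```python
import json
import urllib.request

STRAVA_API = "https://www.strava.com/api/v3"
# Assumption: this sketch only ever touches these two fields.
ALLOWED_FIELDS = {"name", "description"}


def build_update_request(activity_id, token, **fields):
    """Build (but do not send) a PUT /activities/{id} request."""
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"fields not in allowlist: {sorted(unknown)}")
    return urllib.request.Request(
        f"{STRAVA_API}/activities/{activity_id}",
        data=json.dumps(fields).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def update_activity(activity_id, token, **fields):
    """Send the update and return Strava's JSON response."""
    req = build_update_request(activity_id, token, **fields)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Building the request separately from sending it means the allowlist and the payload can be checked without touching the network.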

The write side was never the problem. The problem was the screenshot. As long as I had to open the gym app, screenshot the whiteboard, and send it over, I hadn't really automated anything. I'd just moved the chore into a chat window. What I wanted was a cron that fires after class, pulls the WOD on its own, and writes it to Strava with me out of the loop entirely.

So the build is really two halves: a writer that pushes to Strava, which I'd just finished, and a reader that pulls the WOD from the gym, which didn't exist yet. The writer was easy because Strava hands you a manual. The reader is where all the work turned out to be, because the gym's app doesn't have an API.

The catch: there's nothing to connect to

No developer portal. No docs. No OAuth app, no tokens page, no "API access" buried in settings. Just a web app and a phone app and a login. From the outside there's nothing to connect to.

Except there is. The web app talks to something every time it loads a workout. That traffic is a private API. It's just not one anybody published. So the job changed from "read the docs" to "figure out the API the app is already using, by watching it use it."

Reverse-engineering an app from its own browser tab

I logged into the web app, opened DevTools, and watched the Network tab while I clicked around. Every action fired requests at a backend host. There was the API, right there in front of me, fully functional and completely undocumented.

The auth is where it got weird. I assumed I'd see an Authorization: Bearer ... header like every other modern app. There isn't one. The auth rides on four custom headers:

access-token: <copied-from-devtools>
client:       <copied-from-devtools>
uid:          <copied-from-devtools>
expiry:       <unix-epoch>   # decode it to a real date

The actual bearer hides under access-token, a header name I'd never have guessed to look for. I confirmed the shape by reading the app's JavaScript bundle directly, all 9.7 MB of it, and matching it against the headers on a live request.

Then I tried to log in programmatically, so the agent could refresh its own session, and hit the second half of the wall. The login endpoint is captcha-protected and rejects scripted calls. The "email me a code" reset flow returns 200 OK, but the step that actually consumes the code returns 401 no matter how I shape the request. There's no clean way in from code.

So I did the thing that actually works: logged in with my own browser, copied the four headers straight out of the DevTools Network tab, and pasted them into .env. The agent isn't logging in. It's borrowing a session I already have, on my own account, for my own workout data. The tokens last about 84 days, there's no refresh, and when they die I grab four new ones by hand. Ugly, but it's the only door that opens.
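A rough sketch of the borrowed-session step, under a few assumptions: the four values sit in .env under hypothetical names (GYM_ACCESS_TOKEN, GYM_CLIENT, GYM_UID, GYM_EXPIRY), and expiry is a Unix timestamp in seconds. It parses the file contents, rebuilds the four headers, and decodes the expiry into days remaining.

```python
from datetime import datetime, timezone

HEADER_NAMES = ("access-token", "client", "uid", "expiry")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def session_headers(env):
    """Map the hypothetical GYM_* variables onto the four header names."""
    headers = {}
    for name in HEADER_NAMES:
        key = "GYM_" + name.upper().replace("-", "_")
        if not env.get(key):
            raise KeyError(f"missing {key} in .env")
        headers[name] = env[key]
    return headers


def days_until_expiry(headers, now=None):
    """Decode the epoch in the expiry header into whole days remaining."""
    now = now or datetime.now(timezone.utc)
    expires = datetime.fromtimestamp(int(headers["expiry"]), tz=timezone.utc)
    return (expires - now).days
```

In use, something like `session_headers(parse_env(open(".env").read()))` yields the dict to attach to every request, and `days_until_expiry` gives an early warning well before the session dies.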

I want to be clear about the line here, because it matters: this is my account, my data, one request after my own class. I'm not scraping other members, not hammering anyone's server, not redistributing a thing. That's the only version of this I'd write up, and it's the only version you should build.

Once you're in, the endpoints still lie

With the session working, I built the reader: it pulls the workouts I'm on the roster for, lifts each tier (RX / Intermediate / Beginner) out of the HTML, and strips it to the clean text that goes to Strava.
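The "strip it to clean text" step can be sketched with the standard library's html.parser. This is one plausible approach, not necessarily the one used here: the list of tags treated as line breaks is a guess, since the real markup isn't shown.

```python
from html.parser import HTMLParser

# Assumption: these tags mark line boundaries in the workout markup.
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "tr"}


class _TextExtractor(HTMLParser):
    """Collect text nodes, inserting newlines around block-level tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the visible text, one non-empty line per block element."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    raw = "".join(parser.parts)
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)
```

Entities such as `&amp;` are decoded by the parser, and runs of whitespace collapse to single spaces, so the result can go straight into a Strava description.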

Then it lied to me. I asked the agent for last Wednesday's workout and it came back with "not on the roster," for a class I knew I'd done. I almost believed it. The only reason I caught it is that I went and looked myself, saw I was on the roster, and realized the endpoint, not the agent, was wrong: it ignores the date you pass for anything in the past and just returns the next seven days from today. The agent had reported exactly what the API told it.

That's the part I want to underline. The agent is only as right as the question it knows to ask, and it won't warn you when it's been handed garbage. I had to stay in the loop, check its answer against what I already knew, and nudge it toward the real one. Probe an endpoint with curl | jq before you trust your own code, and don't hand your judgment to the agent just because it answered without flinching.
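One way to keep the reader from passing that kind of answer along is to compare the dates the endpoint actually returned against the date that was asked for. The sketch below assumes each workout is a dict with a "title" field containing a YYYY/MM/DD date; the field name and the exception are hypothetical.

```python
import re
from datetime import date

DATE_IN_TITLE = re.compile(r"(\d{4})/(\d{2})/(\d{2})")


class DateNotCovered(Exception):
    """The response window does not include the requested date."""


def title_date(title):
    """Extract a YYYY/MM/DD date from a workout title, or None."""
    m = DATE_IN_TITLE.search(title)
    return date(*map(int, m.groups())) if m else None


def pick_workout(workouts, wanted):
    """Return the workout dated `wanted`, or None if that day has none.

    Raises DateNotCovered when the response never covered `wanted`, so
    "the endpoint ignored my date" is not reported as "not on the roster".
    """
    dated = [(title_date(w["title"]), w) for w in workouts]
    dated = [(d, w) for d, w in dated if d is not None]
    for d, w in dated:
        if d == wanted:
            return w
    days = [d for d, _ in dated]
    if not days or not (min(days) <= wanted <= max(days)):
        raise DateNotCovered(f"asked for {wanted}, response covered {days}")
    return None
```

The point of the exception is to separate two answers that look identical in the raw response: "there was no workout that day" and "the API never looked at that day".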

Why "it works" still isn't "it's right"

"It works" and "it's right" are different sentences, and the space between them is where this gets dangerous. The pieces work, but "work" only means the plumbing works. The reader pulls whatever the gym posted. The writer puts whatever string I hand it into Strava. Neither one knows whether I actually did the Intermediate tier or just told it I did, or whether the roster reflects who really showed up. When I say "I did the Intermediate one," the agent believes me and writes it down. A correct API call and a true workout log are two different things, and only one of them is the agent's job. (Same trap as the fantasy football agent that lied to me four times, just in different clothes.)

Now it runs on its own

When I first drafted this, the cron was the part I hadn't built. Now it's the part that ties everything together. On a schedule after my class, it fires, runs the reader to pull the day's WOD, hands the text to the writer, and updates the Strava activity. No screenshot, no chat, no me. It took a couple of iterations to get the timing and the matching right, but it now does exactly what I wanted from the start.

That chain is the agent. Not a chatbot I talk to, but a schedule plus two skills plus nobody watching: it wakes up, reads the gym, writes to Strava, and goes back to sleep.

Which is also where the honesty comes in. The one thing that will eventually break it is the tokens. They expire in about three months with no refresh, and when they do the whole thing fails silently, writing nothing, until I notice my workouts have gone generic again. So the first guardrail it needs is a loud one: text me the second it sees a 401, instead of dying quietly for weeks. "It runs on its own" is true today. It has a shelf life, and I built it knowing that.
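A minimal sketch of that guardrail, assuming a `notify` callable that sends a text or an email (a hypothetical hook, as are the exception name and the injectable `opener`, which exists only so the function can be exercised without a network):

```python
import urllib.error
import urllib.request


class SessionExpired(Exception):
    """The gym backend rejected the borrowed session."""


def fetch(url, headers, notify, opener=urllib.request.urlopen):
    """GET `url` with the session headers; alert on a 401."""
    req = urllib.request.Request(url, headers=headers)
    try:
        with opener(req, timeout=30) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 401:
            # Tell a human first, then stop the run with a clear error.
            notify("Gym session expired (401). Copy four fresh headers from DevTools.")
            raise SessionExpired(url) from err
        raise
```

Raising after the alert matters as much as the alert: the cron run stops with a clear error instead of writing an empty description to Strava.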

Build this yourself

The transferable part here isn't Strava. It's the recipe for connecting an agent to an app that never meant to give you one.

One caveat before you copy this: reverse-engineering a private API can run against an app's terms of service. Read them first, and only point this at accounts and data that are your own. What worked for me is shared as-is, with no warranty. What you do with it is on you.

Time: an afternoon, most of it in DevTools, not in code.

Cost: $0.

Stack:

  • Python 3, standard library only: urllib, json, argparse. No third-party requests library.

  • The target app open in a browser, plus its DevTools Network tab.

  • For the write side, whatever has a real API. Strava's is free and documented.

The method:

  1. Open the app in a browser, log in, open DevTools, and watch the Network tab while you do the thing you want to automate. Those requests are the private API.

  2. Copy a real request and reproduce it with curl. If your data comes back, you have an API.

  3. Log in through the browser, copy the auth headers into .env, and treat them as a session you'll have to refresh by hand.

  4. Build a small reader that calls the endpoints and cleans up the response, then write the result wherever there's a real API. Mine writes to Strava.

Pitfalls, each of which cost me real time:

  • Auth won't look the way you expect. Don't assume Authorization: Bearer. The real token may hide under a header name you'd never guess.

  • There's usually no scripted login, and the session expires (mine after about 84 days) with no refresh. Make the failure loud, a text or an email, or your cron dies in silence.

  • Endpoints lie. Probe each one with curl | jq before you trust it. Mine ignored the date I asked for and returned a different window.

  • Don't outsource your judgment. The agent reports what the API says, confidently, even when the API is wrong. Stay in the loop and check its work against what you actually know.

The prompt I used

Point this at your own no-API app. The constraints at the bottom are the lesson, so keep them:

I want to connect an agent to an app that has no public API. Help me
reverse-engineer its private API from the browser and build a reader.

- I'll log into the app, open DevTools -> Network, and paste the real
  requests and their headers. Map the endpoints I need (for me: today's
  workout for a given user) from that traffic.
- Figure out the auth from the live requests. Do NOT assume
  Authorization: Bearer. Here it was four custom headers (access-token,
  client, uid, expiry) copied from a logged-in session, with no refresh
  flow. The real bearer was under access-token.
- Build a stdlib-only Python reader (urllib, json, argparse) that calls
  those endpoints, strips HTML/markup out of the response, and prints
  clean text plus a --json mode.
- Write the result to a documented API (Strava): validate fields against
  an allowlist and PUT /activities/{id}.

Constraints that bit me, keep them exact:
- Probe every endpoint with curl | jq before trusting it. Mine ignored
  ?date= for past dates and returned a forward 7-day window from today.
- Match the response's date format exactly (titles were YYYY/MM/DD).
- Tokens expire with no refresh. Fail loudly when they do.

It started as a screenshot and a guess. Now a schedule does the whole thing while I'm still cooling down from the workout, right up until the day the tokens expire and I have to go grab four headers out of my browser again. If you've wired an agent into an app that never gave you an API, hit reply and tell me where it fought back.

— Ben
