Biohacking With Whoop Data And AI: How I Actually Fixed My Circadian Rhythm

I stopped chasing Whoop recovery scores and started using AI to interrogate my sleep data. That is when my circadian rhythm finally started to make sense.
Biohacking With Whoop Data And AI: How I Actually Fixed My Circadian Rhythm
Photo by Egor Komarov / Unsplash

Why I stopped trusting “green recovery” as my sleep metric

I like gadgets. I also coach baseball, build web stuff, and experiment on my own biology. So of course I ended up with a Whoop strap glued to my wrist.

For a while I did what everyone does. Chase the green recovery score. Feel good when it is high. Feel guilty when it is low.

Then I started noticing something that annoyed me. I could get a high recovery score after 7.5 hours in bed but still feel like a zombie at 11:00. Other nights I got less sleep, a mediocre score, but felt sharp all morning.

The usual explanation is “sleep quality,” which is a nice way of saying “we do not really know, but vibes.” I wanted more than vibes. I wanted to see if my circadian rhythm was actually aligned or if I was just stacking sleep debt in a slightly slower way.

So I pulled my Whoop data out of the nice dashboard, fed it to an LLM, and started asking it questions I could not easily answer in the app. This post is exactly what I did, what actually changed my behavior, and where the AI genuinely helped instead of just dressing up some line charts.

Getting the data out of Whoop without losing your mind

Whoop gives you a slick interface but hides most of the raw structure. That is fine for normal people. I am not normal. I wanted timestamps, not gradients.

Here is what I actually used:

  • Whoop export (CSV) of 6 months of data.
  • Sleep, recovery, and activities exports, not just the summary.
  • Local scripts in Node + Python because I am not religious, just pragmatic.

The raw exports are boring but they have what matters: sleep onset, wake time, time in bed, latency, wake after sleep onset, light/deep/REM breakdown, and recovery scores. Enough to reconstruct my nights.

I cleaned the CSVs a bit. Mostly removing naps, fixing time zones, and normalizing everything to local time. This part is not glamorous. I spent more time here than in the AI prompts. If your data is messy, the model will happily hallucinate patterns out of noise.

Framing the real question: not “how do I sleep more?”

The obvious goal is “sleep better.” That is too vague. I know how to sleep more. Stop scrolling and go to bed. My actual question was tighter:

Is my sleep timing aligned with my natural circadian rhythm, or am I constantly fighting it and paying the price in the second half of the day?

That is a different beast than just optimizing “hours slept.” You can hit 8 hours and still be completely off in terms of phase.

So I wrote down the specific things I wanted the AI to look for:

  • Patterns between sleep onset time and next-day subjective alertness.
  • How consistent my bedtimes actually were, not just what I remembered.
  • Whether late training sessions corrupted deep sleep for more than one night.
  • Whether weekend social jet lag was actually trashing my Tuesday performance.

I made a simple log in Obsidian where I rated my morning alertness from 1 to 5 and noted two things: “felt wired at night?” and “crash in afternoon?” Nothing elaborate.

That gave the model both objective data from Whoop and subjective labels from my brain. The combination is where it started to get useful.

My minimal pipeline: CSV to AI without turning it into a research project

You can absolutely overbuild this. I forced myself not to.

I ended up with three simple steps.

1. Pre-process with a script, not in the prompt

I wrote a small Node script to normalize the CSVs and spit out a single JSON file per night with only the fields I cared about:

  • date, sleep_onset, wake_time, time_in_bed_minutes
  • sleep_efficiency, deep_minutes, rem_minutes, latency_minutes
  • recovery_score, hrv, resting_hr
  • training_load_evening from the activities file
  • subjective_alertness, evening_wired, afternoon_crash from my notes

The script also added some helpful derived fields:

  • Sleep midpoint relative to 03:00 (rough proxy for circadian phase).
  • Bedtime consistency score over the last 7 days.
  • Binary flags for “late training” and “late screen use” based on my journal tags.

I do not want the LLM to waste context figuring out what 02:17 means. I want to hand it: “midpoint 1.3 hours later than baseline.”

2. Feed batches into an LLM and ask for patterns, not magic

I fed 60–90 day windows of those JSON objects into an LLM with a system message that framed the task like this:

  • You are analyzing one human’s sleep over time.
  • Your goal is to find stable relationships, not coincidences.
  • Ignore single-night outliers unless they repeat.
  • Focus on links between timing consistency and next-day subjective alertness.

Then I asked very blunt questions:

  • "Which sleep onset window correlates with the best next-day alertness for me? Give a range in 30-minute blocks."
  • "How many consecutive nights outside that window does it usually take before my afternoon crash score worsens?"
  • "Is there a difference between going to bed late but consistent, vs. early but inconsistent?"
  • "What is the minimum bedtime consistency (same 1-hour window) for 7 days that precedes my best HRV and alertness combo?"

I asked the model to reply in two parts: plain language summary, and a small table of rules with confidence scores. Not statistical confidence. Just “how often did this pattern hold in the sample you saw.”

3. Convert insights into rules the app cannot give you

Whoop already tells you how long to sleep. What it did not tell me was how brutal my circadian drift was.

The LLM surfaced four rules that were boringly specific and surprisingly useful for me:

  • Bedtime between 22:45 and 23:30, at least 5 nights in a row, followed by wake between 06:45 and 07:30. Those weeks had the highest pairing of HRV and subjective alertness.
  • Two or more nights with sleep onset after 00:30 shifted my sleep midpoint by ~1 hour and were followed by afternoon crashes 70% of the time.
  • Late training (after 20:30) only hurt me if combined with a late bedtime. If I still went to bed in my target window, HRV dropped slightly but alertness did not.
  • Weekend bedtimes more than 90 minutes later than my weekday baseline wrecked Tuesday, not Monday. That lag mattered psychologically because I usually blamed Monday.

None of this is some universal law. This is just my body, my schedule, my caffeine abuse level. The point is that the AI could hold 90 days of context and reason about patterns in a way that my brain could not and the app interface does not try to.

Where the AI actually helped my circadian rhythm

I care about two things: how I feel when I wake up and how stable my focus is around 15:00. Not just what color Whoop paints the circle.

Here is what noticeably changed after I started using AI to interrogate the data.

1. I stopped chasing “more sleep,” started chasing “consistent window”

I used to treat any extra hour of sleep as a win. The model convinced me that for my specific pattern, a stable 7.5 hours in the same window was better than 8.5 hours in a drifting one.

The data backed it up. When my midpoint stayed roughly centered around 03:15 for a week, my afternoon crash notes almost disappeared. If I let the midpoint drift later than 04:00, I could still get 8–9 hours and feel like I was walking through fog.

So I created a stupidly simple rule for myself: protect the midpoint, not just the duration. If I stay up later, I do not “sleep in” to compensate. I drag myself out of bed at the usual time and pay the pain tax early to avoid shifting the whole rhythm.

2. I treated weekend social jet lag as a design constraint

The LLM made the weekend damage painfully obvious. It highlighted a pattern I kind of knew but ignored.

Two late nights in a row after 01:00 meant that Tuesday was statistically my worst focus day over six months. Not Monday. Tuesday.

Once I saw that written out, with specific dates and percentage of “afternoon crash” notes, it was harder to excuse. Not impossible. Just harder.

I did not stop going out. I shifted the frame. If I choose a late Saturday, Sunday becomes a hard center of gravity reset. No double late nights. I pick one.

That single rule cut a lot of the drift without feeling like a lifestyle death sentence.

3. I stopped blaming evening workouts for everything

I used to assume training after 20:00 was evil. Recovery scores often dropped, so I blamed the timing.

The AI analysis showed something more nuanced.

  • Late training plus late bedtime was bad. Deep sleep shrank. HRV dropped. Brain fog followed.
  • Late training plus normal bedtime usually left deep sleep intact. I woke a bit sore but fine mentally.

So the main variable was not the training hour. It was whether I let the training session drag my sleep midpoint later. The solution was boring. Stop scrolling after training, shower, eat, blackout the room, sleep.

I still prefer earlier sessions. But when coaching schedules push me late, I no longer catastrophize. I just protect the bedtime window harder.

How I would set this up if I was starting from zero tomorrow

If you have a Whoop or similar tracker and you want to run a similar experiment without building an entire analytics stack, here is the lightweight version I would do again.

1. 60 days of data, minimally organized

Export 60–90 days of sleep and recovery data. Clean only what you must: time zones, duplicate entries, obvious outliers.

For each night, keep five simple things:

  • Sleep onset and wake time.
  • Total sleep duration.
  • HRV and resting HR.
  • Recovery score.
  • Any heavy evening training load flag.

2. Add ridiculously simple subjective notes

For the same days, log three fields in a note app or spreadsheet:

  • Morning alertness from 1 to 5.
  • Afternoon crash yes/no.
  • Went to bed wired yes/no.

Do not overthink the scale. You are not publishing a paper. You just need relative patterns.

3. Use the LLM as a pattern explainer, not a scientist

When you feed the data to an LLM, treat it like a very fast junior analyst, not a magical oracle. Your job is to ask grounded questions.

Examples:

  • "Which 1-hour bedtime window leads to my highest average morning alertness?"
  • "How does bedtime consistency over 7 days relate to my afternoon crash rate?"
  • "After how many late nights in a row does my pattern usually break down?"
  • "If I keep sleep duration constant, does bedtime timing still matter for me?"

Ask it to ignore single-night events and focus on patterns that show up in at least 60% of the relevant cases. That heuristic worked fine for me.

Where AI falls short and where it fits perfectly

I do not think AI replaces a proper sleep lab or a good doctor. If your sleep is completely wrecked, do not start with CSV exports. Start with a human.

For me, AI shines in the middle zone. Not emergencies. Just subtle misalignment.

It is very good at:

  • Holding months of data in context and noticing boring patterns.
  • Translating them into human language rules that I can remember.
  • Letting me run “what if” questions on my past behavior.

It is not good at:

  • Making causal claims when there is not enough data.
  • Understanding your life context outside the numbers unless you feed it.
  • Deciding what tradeoffs you should actually make.

The real win for me was not some magical optimization. It was the shift from reacting emotionally to a green or red score to designing my week around a circadian window that clearly works better for me.

What actually stuck, months later

Some experiments die after the blog post is written. This one mostly stuck.

Here are the things I still do, months after the initial analysis:

  • I aim for a 22:45–23:30 window on weeknights and wake around 07:00.
  • I let myself have one truly late night on the weekend, not two.
  • I ignore recovery guilt if my midpoint stayed stable and my brain feels good.
  • I occasionally re-run the analysis every few months to see if my patterns drift.

That last point matters. Your circadian rhythm is not a constant. Training volume, season, light exposure, coaching schedule, and stress shift it.

Having an AI “sleep analyst” that I can pull in every quarter to re-check the patterns feels more useful than yet another wearable. The strap collects the data. The model helps me actually respect the rhythm hiding inside it.

Subscribe to my newsletter

Subscribe to my newsletter to get the latest updates and news

Member discussion