Biohacking My Diet With AI: What Actually Worked

I wired my diet into a feedback loop with bloodwork, wearables, and AI models. This is what broke, what worked, and how the system looks now.

Why I Stopped Trusting Generic Nutrition Advice

Nutrition advice on the internet feels like religion. Keto vs high-carb. Carnivore vs plant-based. Everyone is right, as long as you ignore the people they broke along the way.

I coach baseball, write code, and experiment on myself. My schedule is messy. Double training days, late games, deep work blocks. A fixed “2,000 calories and 150 grams of protein” template never really matched my actual life.

So I decided to wire my diet into a feedback loop. Bloodwork. Wearables. Training logs. Then feed it all into a simple machine learning stack and ask a better question:

Given my body and my schedule, what should I eat today to hit tomorrow feeling dangerous, not wrecked?

This post is not theoretical. I ran this for months. Parts of it sucked. Parts of it were surprisingly effective. I will walk through the exact system, warts included.

The Stack: What I Actually Used

I did not build a fancy neural network from scratch. This is more glue engineering than cutting edge research.

Here is the real stack:

  • Data in: Apple Watch, Oura, Cronometer, occasional bloodwork, a Google Sheet for training logs.
  • Storage: Postgres for structured data, a folder of CSV exports for sanity checks.
  • ML layer: Python with scikit-learn and a bit of XGBoost. Nothing magical.
  • AI layer: An LLM that generates actual meal ideas and shopping lists given my macro and micronutrient targets.
  • Frontend: A very boring internal web app. Next.js, Tailwind, minimal styling.

I think most biohackers overcomplicate the model and undercomplicate the data. Fancy architecture does not fix trash inputs. I spent more time on logging than on hyperparameters.

The Inputs: Turning My Body Into a Dataset

First, I had to answer a basic question. What signals are strong enough that it is worth feeding them into a model?

I settled on four categories:

  • Daily metrics: weight, resting heart rate, HRV, sleep duration and stages, perceived energy.
  • Training: type (strength, conditioning, skills), duration, RPE, any soreness notes.
  • Food: full macro breakdown plus key micros that I actually care about: omega 3, magnesium, vitamin D, fiber, sodium, potassium.
  • Periodic labs: fasting glucose, HbA1c, lipids, CRP, vitamin D, ferritin, B12.

Most of this came from existing tools. Nothing exotic.

  • Apple Watch + Oura synced into Apple Health, then exported.
  • Cronometer handled nutrition tracking and micronutrients.
  • Bloodwork every 3 to 4 months through a standard lab.

The smallest but most annoying part was subjective logging. I built a one-click mobile view that asked three questions each morning:

  • “How is your energy?” (1 to 5)
  • “Any brain fog?” (yes, neutral, sharp)
  • “How do you feel about yesterday’s recovery?” (1 to 5)

Those three fields ended up being more useful than half the fancy metrics. Of course.
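
A minimal sketch of what one of those check-in records could look like, with the three answers validated at write time. The class name `MorningCheckIn` and its field names are illustrative assumptions, not the app's actual schema.

```python
from dataclasses import dataclass
from datetime import date

BRAIN_FOG_LEVELS = ("yes", "neutral", "sharp")


@dataclass(frozen=True)
class MorningCheckIn:
    day: date
    energy: int      # 1 to 5
    brain_fog: str   # one of BRAIN_FOG_LEVELS
    recovery: int    # 1 to 5, how yesterday's recovery felt

    def __post_init__(self) -> None:
        # Reject out-of-range answers before they ever reach the dataset.
        for name in ("energy", "recovery"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")
        if self.brain_fog not in BRAIN_FOG_LEVELS:
            raise ValueError(f"brain_fog must be one of {BRAIN_FOG_LEVELS}")


entry = MorningCheckIn(day=date(2024, 1, 15), energy=4, brain_fog="sharp", recovery=3)
```

Rejecting bad values at the door is cheaper than cleaning them out of a training set later.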

The Core Idea: Predict Tomorrow, Eat Today

Most nutrition planning works forward. Set macros, then eat.

I flipped it. I tried to work backward from tomorrow.

Question: Given yesterday’s food and training, how did I feel and perform today?
Goal: Nudge macros and key micronutrients so that future days land in the “high energy, good training, solid sleep” bucket more often.

In practical terms that turned into a simple supervised learning task:

  • Features: yesterday’s intake (macros and micros), training load, sleep, current bodyweight, current average HRV.
  • Targets: today’s energy score, today’s training performance (scaled 1 to 5), and whether I hit sleep efficiency above a threshold.
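
The "yesterday in, today out" framing comes down to shifting the rows by one day. Here is a rough sketch of that pairing step in plain Python. The function name and the dictionary keys are made up for illustration; a real pipeline would more likely do this with a dataframe shift.

```python
from typing import Dict, List, Tuple

DayLog = Dict[str, float]


def build_training_pairs(
    days: List[DayLog],
    feature_keys: List[str],
    target_keys: List[str],
) -> Tuple[List[List[float]], List[List[float]]]:
    """Pair each day's features with the NEXT day's targets.

    `days` must be sorted by date with no gaps. A missing day would
    silently pair the wrong "yesterday" with "today".
    """
    X, y = [], []
    for yesterday, today in zip(days, days[1:]):
        X.append([yesterday[k] for k in feature_keys])
        y.append([today[k] for k in target_keys])
    return X, y


# Hypothetical three-day log, just to show the shape of the output.
logs = [
    {"carbs_g": 250, "sleep_h": 7.5, "energy": 3},
    {"carbs_g": 300, "sleep_h": 8.0, "energy": 4},
    {"carbs_g": 200, "sleep_h": 6.5, "energy": 5},
]
X, y = build_training_pairs(logs, ["carbs_g", "sleep_h"], ["energy"])
```

Three days of logs give two training rows, because the last day has no "tomorrow" yet.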

I started basic. A gradient boosted model predicting those targets from the features. No deep learning. I wanted decent feature importances and a model that trains in seconds, not hours.

Once the model was semi-reasonable, I asked it a slightly different question through a small optimization loop:

“Given tomorrow is a heavy training day, what macro and micro ranges give me the highest probability of high energy + strong training + decent sleep?”

The output was not a meal plan. It was just numbers:

  • Calories: 2,550 to 2,750
  • Protein: 165 to 190 grams
  • Carbs: 230 to 280 grams
  • Fat: 70 to 90 grams
  • Fiber: 28 to 36 grams
  • Sodium: 3 to 4.5 grams
  • Key micronutrient nudges, like “push magnesium up” or “watch saturated fat”

Good, but not usable yet. I still had to decide what to actually eat.
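
For the curious, one simple way to get ranges like these out of a trained model is random search: sample many candidate intakes, score each one, and report the span covered by the best few percent. The sketch below assumes that approach. `toy_score` is a stand-in I wrote for illustration and has nothing to do with the real model; in practice you would pass in a function that returns the model's predicted probability of a good day.

```python
import random
from typing import Callable, Dict, List, Tuple

Macros = Dict[str, float]
Bounds = Dict[str, Tuple[float, float]]


def suggest_ranges(
    score: Callable[[Macros], float],
    bounds: Bounds,
    n_samples: int = 2000,
    keep_top: float = 0.05,
    seed: int = 0,
) -> Bounds:
    """Random-search the macro space and report the span of the best candidates."""
    rng = random.Random(seed)
    candidates: List[Macros] = [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(n_samples)
    ]
    candidates.sort(key=score, reverse=True)  # highest score first
    best = candidates[: max(1, int(n_samples * keep_top))]
    return {
        name: (min(c[name] for c in best), max(c[name] for c in best))
        for name in bounds
    }


def toy_score(m: Macros) -> float:
    # Stand-in scorer, NOT a real model: prefers about 250 g carbs and 80 g fat.
    return -abs(m["carbs_g"] - 250) / 250 - abs(m["fat_g"] - 80) / 80


ranges = suggest_ranges(toy_score, {"carbs_g": (100, 400), "fat_g": (40, 140)})
```

Reporting a span instead of a single best point is deliberate. A range is easier to hit with real food than one exact number.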

Where AI Actually Helped: From Numbers To Food

Macros are easy. Meals are hard.

I had this gap between a macro sheet and my fridge. That is where I put the LLM.

The flow looked like this:

  1. Model produces target macro and micro ranges for the day.
  2. I send that, plus my dietary constraints and what food I have in stock, to the LLM.
  3. The LLM generates 3 possible daily meal structures that roughly hit those numbers.

The prompt was very explicit. Stuff like:

Given:
- Target calories: 2600 (+/- 100)
- Protein: 180g (+/- 10g)
- Carbs: 250g (+/- 20g)
- Fat: 80g (+/- 10g)
- Fiber: 30g (+/- 5g)
- Emphasize: magnesium, omega 3, high satiety
- Constraints: no gluten at breakfast, keep prep under 20 minutes on weekdays
- Foods available: chicken breast, eggs, oats, frozen spinach, Greek yogurt, berries, rice, olive oil, potatoes, salmon, etc.

Return:
- 3 daily meal plans
- Each with 3–4 meals and 1 snack
- Include macro estimates for each meal
- Keep instructions short
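
A prompt like that is easier to keep consistent if it is rendered from the model's targets instead of typed by hand. A minimal sketch, assuming each target is stored as a (center, tolerance, unit) tuple; the function name and argument layout are illustrative.

```python
from typing import Dict, List, Tuple


def build_meal_prompt(
    targets: Dict[str, Tuple[float, float, str]],
    emphasize: List[str],
    constraints: List[str],
    foods: List[str],
) -> str:
    """Render (center, tolerance, unit) targets into the prompt layout above."""
    lines = ["Given:"]
    for name, (center, tol, unit) in targets.items():
        lines.append(f"- {name}: {center:g}{unit} (+/- {tol:g}{unit})")
    lines.append(f"- Emphasize: {', '.join(emphasize)}")
    lines.append(f"- Constraints: {', '.join(constraints)}")
    lines.append(f"- Foods available: {', '.join(foods)}")
    lines += [
        "",
        "Return:",
        "- 3 daily meal plans",
        "- Each with 3–4 meals and 1 snack",
        "- Include macro estimates for each meal",
        "- Keep instructions short",
    ]
    return "\n".join(lines)


prompt = build_meal_prompt(
    targets={"Target calories": (2600, 100, ""), "Protein": (180, 10, "g")},
    emphasize=["magnesium", "omega 3"],
    constraints=["no gluten at breakfast"],
    foods=["chicken breast", "eggs", "oats"],
)
```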

The first attempts were chaos. The model ignored fiber. It broke calorie budgets. It invented foods not in my pantry. Classic.

After a lot of prompt tightening and some post-processing checks, it got surprisingly useful. I added a validator that recalculated macros using my own food database and flagged any meal plans that were too far off. Those got auto-rejected, and the model had to retry with feedback.
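
The validator idea can be sketched in a few lines: recompute totals from a trusted food table, compare them to the target ranges, and return human-readable problems that can be sent back to the LLM as retry feedback. The table entries, names, and plan format below are assumptions for illustration; the per-100 g numbers are placeholders, not values from my database.

```python
from typing import Dict, List, Tuple

# Hypothetical per-100 g entries. A real table would come from your own food database.
FOOD_DB: Dict[str, Dict[str, float]] = {
    "chicken breast": {"kcal": 165, "protein_g": 31.0, "carbs_g": 0.0, "fat_g": 3.6},
    "rice, cooked": {"kcal": 130, "protein_g": 2.7, "carbs_g": 28.0, "fat_g": 0.3},
    "olive oil": {"kcal": 884, "protein_g": 0.0, "carbs_g": 0.0, "fat_g": 100.0},
}

Plan = List[Tuple[str, float]]  # (food name, grams)


def plan_totals(plan: Plan) -> Dict[str, float]:
    """Recompute totals from the trusted table, ignoring the LLM's own estimates."""
    totals = {"kcal": 0.0, "protein_g": 0.0, "carbs_g": 0.0, "fat_g": 0.0}
    for food, grams in plan:
        if food not in FOOD_DB:
            raise KeyError(f"'{food}' is not in the food database")
        for key, per_100g in FOOD_DB[food].items():
            totals[key] += per_100g * grams / 100
    return totals


def validate_plan(plan: Plan, targets: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return a list of problems. An empty list means the plan passes."""
    try:
        totals = plan_totals(plan)
    except KeyError as exc:
        return [str(exc.args[0])]  # invented food: reject the whole plan
    problems = []
    for key, (lo, hi) in targets.items():
        if not lo <= totals[key] <= hi:
            problems.append(f"{key} is {totals[key]:.0f}, expected {lo:g} to {hi:g}")
    return problems


plan = [("chicken breast", 200), ("rice, cooked", 300)]
problems = validate_plan(plan, {"kcal": (700, 750), "protein_g": (100, 120)})
```

Here the calories pass and the protein fails, so `problems` holds one message that can go straight into the retry prompt.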

Result: I opened the app in the morning and saw something like:

  • Plan A: high carb for evening training.
  • Plan B: slightly higher fat and more stable if I had a coding marathon.
  • Plan C: lower calories if recovery markers looked rough.

I picked one and hit “send to Cronometer”, which pushed the foods into my tracker through their API. Then I just executed.

Machine Learning That Actually Influenced My Diet

So did the model discover anything interesting? A few patterns kept showing up.

1. My carbs were too random

On heavy training days my carb intake varied wildly. Sometimes 150 grams, sometimes 320. No wonder my sessions felt inconsistent.

The model kept nudging carbs up toward a tighter band on high-load days. Not insanely high, just more predictable. Performance scores improved. RPE for similar loads went down.

2. Magnesium and sleep

I know, everyone talks about magnesium. I thought it was overhyped. Then the feature importances and partial dependence plots basically yelled at me.

Days where magnesium intake tanked correlated with worse sleep efficiency and lower HRV the next morning. I shifted toward more food-based sources instead of relying on supplements alone. Pumpkin seeds, spinach, some dark chocolate. My sleep data looked less noisy after a few weeks.

3. Late fat bombs wrecked my evenings

High-fat dinners plus late training meant my sleep latency exploded. I knew this subjectively, but the model reinforced it. The probability of hitting my “good sleep” threshold dropped sharply when calories and fat were both stacked late.

I changed one rule. After 7 pm my plates became lighter on fat and heavier on low-fiber carbs and protein. Sleep improved more than any gadget upgrade I tried in the last year.
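
"Stacked late" is easy to turn into a feature: the share of the day's calories and fat eaten at or after a cutoff time. A small sketch, assuming meals are logged as (time, kcal, fat) tuples. The function name and the meal format are illustrative.

```python
from datetime import time
from typing import List, Tuple

Meal = Tuple[time, float, float]  # (time eaten, kcal, fat in grams)


def late_share(meals: List[Meal], cutoff: time = time(19, 0)) -> Tuple[float, float]:
    """Fraction of the day's calories and fat eaten at or after `cutoff`."""
    total_kcal = sum(kcal for _, kcal, _ in meals)
    total_fat = sum(fat for _, _, fat in meals)
    late_kcal = sum(kcal for t, kcal, _ in meals if t >= cutoff)
    late_fat = sum(fat for t, _, fat in meals if t >= cutoff)
    return (
        late_kcal / total_kcal if total_kcal else 0.0,
        late_fat / total_fat if total_fat else 0.0,
    )


# Hypothetical day: breakfast, lunch, and a big late dinner.
day = [(time(8, 0), 600, 20), (time(13, 0), 800, 30), (time(20, 30), 1000, 50)]
kcal_share, fat_share = late_share(day)
```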

Where The Whole Thing Broke

Not everything worked. Some parts were annoying enough that I almost trashed the project.

Data quality

Nutrition tracking is rough. Even if you weigh your food there is noise everywhere. Restaurant meals. Wrong database entries. “One large egg” that somehow becomes 90 calories instead of 70.

The model learned some garbage correlations from this. For example, it decided “sushi night” predicted poor sleep. The real problem was that sushi often came with late social evenings and extra alcohol, not the rice or fish.

I had to manually label some “social nights” and add that as a feature before the pattern made sense. Messy human life always leaks into data like that.

Overfitting to short-term signals

My early runs tried to react to every small bump in HRV or energy. That created a weird pendulum effect. One bad night and the macros would swing too hard the next day.

I fixed this with more smoothing. Rolling averages, and fewer sharp tweaks. Instead of “change your macros by 20% today”, the system pushed small adjustments over several days unless things looked really off.
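
In code, the fix is two small pieces: read recovery signals through a rolling average, and cap how far any target can move in a single day. A minimal sketch; the 7-day window and the 5% daily cap are assumed values for illustration, not tuned numbers.

```python
from typing import List


def rolling_mean(values: List[float], window: int = 7) -> float:
    """Mean of the last `window` values (or of all of them, if fewer exist).

    `values` must be non-empty.
    """
    recent = values[-window:]
    return sum(recent) / len(recent)


def damped_target(current: float, proposed: float, max_step: float = 0.05) -> float:
    """Move from `current` toward `proposed`, by at most `max_step` (a fraction) per day."""
    limit = abs(current) * max_step
    change = max(-limit, min(limit, proposed - current))
    return current + change


# The model wants carbs at 300 g, today's target is 250 g: move 12.5 g, not 50 g.
next_carbs = damped_target(current=250, proposed=300)
```

With the cap in place, one bad night can only pull the plan a little. A real trend still gets there after a few days.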

LLM creativity vs accuracy

LLMs like to be creative. Creativity in recipes is fine. Creativity in nutrient math is not.

Without a tight validator, the room-temperature IQ of the meal plans was impressive. Wild calorie gaps. Miscounted protein. I had to treat the LLM as a UI for meal structure, not a source of truth for numbers. All macro math happened in my own code.

How I Use It Now (And What I Stopped Doing)

I no longer run daily model training. The full closed-loop experiment taught me enough that I simplified the system.

Here is the current setup:

  • Weekly batch training on recent data.
  • The model suggests macro ranges for three categories: heavy training day, light training day, rest day.
  • I lock in those targets for the week, then let the LLM handle daily meal variation.

The system still looks at sleep and recovery, but mostly to flag outliers. If HRV and sleep both crash, it suggests a lower-calorie, higher-carb, easy-to-digest day with less fat. Nothing extreme.
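
One way to express "both crash" is a z-score against a recent baseline, flagged only when HRV and sleep are low on the same day. This is a sketch under that assumption. The threshold of -1.5 and the function names are illustrative, and each history needs at least two values.

```python
from statistics import mean, stdev
from typing import List


def z_score(today: float, history: List[float]) -> float:
    """How many standard deviations `today` sits from the recent baseline."""
    spread = stdev(history)  # needs at least two data points
    return 0.0 if spread == 0 else (today - mean(history)) / spread


def is_rough_day(
    hrv_today: float,
    hrv_history: List[float],
    sleep_today: float,
    sleep_history: List[float],
    threshold: float = -1.5,
) -> bool:
    """True only when HRV AND sleep are both well below baseline."""
    return (
        z_score(hrv_today, hrv_history) <= threshold
        and z_score(sleep_today, sleep_history) <= threshold
    )


# Hypothetical baseline week, then one bad morning.
hrv_week = [60, 62, 58, 61, 59]
sleep_week = [7.5, 7.0, 8.0, 7.5, 7.5]
flag = is_rough_day(50, hrv_week, 5.5, sleep_week)
```

Requiring both signals to drop keeps a single noisy reading from triggering a recovery day.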

Things I stopped doing:

  • Daily macro micromanagement from the model.
  • Monstrous feature sets. I cut half the features that never moved the needle.
  • Obsessing about perfect logging on weekends. The model already treats weekend data with suspicion.

The net result feels calmer. The AI is now more like a very data-obsessed nutrition assistant that sets the rails. I still drive.

The Stuff I Would Keep If I Had To Start Over

If I wiped the codebase tomorrow and rebuilt from scratch, I would keep these pieces.

1. One simple question for the model

Trying to predict everything made the model noisy and my life annoying.

Now I use a single core question: “Given the training plan and recent recovery, what macro and micro ranges increase my chances of high energy and solid performance tomorrow?”

That is it. Everything else is secondary.

2. Narrow micronutrient focus

I do not try to optimize every vitamin. That feels like fake precision.

I focus on a few that my data actually supports as levers for me: magnesium, omega 3, fiber, and sodium/potassium balance. The model cares about those. The rest are “nice to have” and mostly handled by eating real food.

3. AI as a UX layer, not an oracle

The LLM is best used as glue. It translates constraints and targets into human-friendly meal ideas. It is not the authority on nutrition science. It is a fast interface.

Macros, feature engineering, and evaluation live in code I can audit. The AI just helps me stick to the plan without wanting to throw my phone at the wall.

Should You Do This? Probably Only If You Enjoy This Kind of Pain

Would I recommend this to everyone? No.

If you hate logging food, do not own a wearable, and do not care about your data, then this system will die in a week. You would get more value from lifting three times a week and walking outside.

But if you already track, and you like building weird internal tools, using machine learning and AI for nutrition can push you out of default mode. It forces you to confront how messy your habits are compared to your theory of your habits.

For me the main win was not “AI discovered a magical macro formula”. It was this:

  • My heavy training days feel more predictable.
  • My sleep is less random.
  • I removed a ton of daily food decisions from my brain.

The code is not special. The model is not special. The feedback loop is.

Hook your body data into something that can learn, add a layer that speaks human (the LLM), and then see what kind of diet falls out the other side. Just keep a hand on the wheel.

You are still the experiment. Not the model.
