Why I care about a lean repo
I hate slow Git operations. A bloated repo feels like running a marathon in a wet hoodie.
My ideebv-2026 repo runs a bunch of experiments for a long-running client project at Idee. Designs, spikes, one-off migrations, random export dumps. If I let it grow unchecked, cloning the repo would feel like downloading a game on Steam.
So I treat Git hygiene as part of the build. Not something you fix once a year. The goal is simple. Keep the main history tight. Keep backups and trash out of Git. Be able to rewrite history safely when I screw up.
This is the exact strategy and the actual commands I use to keep ideebv-2026 lean.
First line of defense: saying “no” with .gitignore
The best Git cleanup is the one you never need to run. So I start with a noisy, opinionated .gitignore.
Everyone ignores node_modules and build artifacts. That is table stakes. I go harder on anything that smells like a backup or a local dump.
# Standard stuff
node_modules/
dist/
build/
.cache/
# Design & exports
*.sketch
*.fig
*.xd
# Mac and editor noise
.DS_Store
*.swp
# My local junk
backups/
*.backup
*.bak
*.tmp
*.log
# Tool-specific trash
.vscode/
.idea/
# Framework-specific
.next/
.svelte-kit/
.nuxt/
# Storybook, tests, etc
storybook-static/
coverage/
In ideebv-2026 I have a backups/ folder where I sometimes throw quick zips from clients or CMS exports. I want that folder on disk. I never want it in Git.
So it gets its own line in .gitignore. Non-negotiable.
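One catch: an ignore rule only affects untracked files. Anything that was committed before the rule existed stays tracked. A minimal sketch of how one could check both sides, assuming two helper functions whose names (is_ignored, tracked_but_ignored) are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of Git.

# Succeeds if an ignore rule matches the given (untracked) path.
is_ignored() {
    git check-ignore -q -- "$1"
}

# Lists files that are tracked even though an ignore rule matches them.
# These need "git rm --cached <file>" before .gitignore has any effect on them.
tracked_but_ignored() {
    git ls-files --cached --ignored --exclude-standard
}
```

If tracked_but_ignored prints anything, those files slipped in before the rule was written and are still part of every new commit.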
The problem: backups already leaked into history
Of course I did not start this way. Early on, I committed a bunch of things that had no right living in version control.
- Raw CMS exports with personal data.
- Zip backups straight from production.
- Big binary design assets.
The repo got chunky. Clones took too long. Fetches felt sluggish on hotel Wi-Fi. Worse, I had sensitive stuff buried in old commits.
So I stopped pretending this was fine and decided to rewrite history.
Why I use git filter-repo instead of filter-branch
I used git filter-branch in the past. I think it is a trap for most people now. Slow, awkward, easy to misconfigure.
git filter-repo is faster, more explicit, and does not feel like you are casting a spell. It is not built into Git, but installing it once is worth it.
# On macOS with Homebrew
brew install git-filter-repo
# Or via pip
pip install git-filter-repo
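Important detail. By default git filter-repo refuses to run on anything that does not look like a fresh clone. You either work in a fresh clone or pass --force. Either way I want a clean working tree first. No staged changes, no dirty files. I always stash or commit first.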
Back up before you rewrite
History rewriting is destructive. You change commit hashes. That breaks any old clone that tries to push later.
So for ideebv-2026 I always create a throwaway backup clone before I touch anything.
cd ~/dev
cp -R ideebv-2026 ideebv-2026-pre-filter-backup
Sometimes I zip that folder and label it with a date.
tar -czf ideebv-2026-pre-filter-2026-04-10.tar.gz ideebv-2026-pre-filter-backup
I almost never need that backup. I still sleep better knowing it exists.
Step 1: identify the junk
Before I rewrite anything, I want a list of the worst offenders. The big binary files. The weird folders I forgot about.
I use git rev-list plus git ls-tree to get a sense of what lives in history, but honestly, for ideebv-2026 I took a simpler approach. I knew the main culprits.
- backups/ folder with imports and zips.
- Some old .zip and .sql dumps in random subfolders.
- Old screenshot folders with massive PNGs.
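If you do not already know the culprits, history can tell you. A rough sketch, assuming a helper called largest_blobs (a made-up name) that pipes git rev-list into git cat-file and sorts by size:

```shell
#!/bin/sh
# Hypothetical helper: print the N largest blobs anywhere in history
# as "<size-in-bytes> <path>", biggest first. Default N is 20.
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob" { sub(/^blob /, ""); print }' |
        sort -n -r |
        head -n "${1:-20}"
}
```

Running largest_blobs 10 inside the repo gives a starting point for the hit list. The same path can show up more than once, because every stored version of a file is its own blob.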
So I built a hit list.
backups/
legacy-assets/
**/*.zip
**/*.sql
**/*.psd
I do not try to be clever. I would rather over-delete junk and re-add the one file I still need than drag a full backups directory around forever.
Step 2: dry-run with git filter-repo using path specs
git filter-repo does not really have a “dry run” flag that shows a diff of what would change. The way I fake it is by running the command on a clone of the repo first, not the real origin.
For ideebv-2026 I did:
cd ~/dev
cp -R ideebv-2026 ideebv-2026-filter-test
cd ideebv-2026-filter-test
Then I nuked the junk from that test clone.
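# --force: a cp -R copy does not look like a fresh clone, so filter-repo would refuse to run
git filter-repo \
--force \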
--path backups/ \
--path legacy-assets/ \
--path-glob "*.zip" \
--path-glob "*.sql" \
--path-glob "*.psd" \
--invert-paths
The important detail here is --invert-paths. With it, git filter-repo keeps everything except the paths you list. Without it, you keep only what you list and delete the rest. I do not like that kind of risk.
After it finishes, I check what changed.
git count-objects -vH
This shows the new repo size and loose objects. I compare it with the original. If it dropped by a few hundred megabytes and my worktree still looks normal, I consider that a win.
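To put a number on the difference, the size-pack line of git count-objects -v can be compared between the original and the filtered copy. A small sketch, assuming two made-up helpers, pack_kib and saved_kib:

```shell
#!/bin/sh
# Hypothetical helper: print the packed size of a repo in KiB,
# taken from the "size-pack" line of "git count-objects -v".
pack_kib() {
    git -C "$1" count-objects -v | awk '/^size-pack:/ { print $2 }'
}

# Print how many KiB the second repo is smaller than the first.
# Running "git gc" in both beforehand makes the comparison fairer.
saved_kib() {
    echo $(( $(pack_kib "$1") - $(pack_kib "$2") ))
}
```

In this setup the call would be saved_kib ~/dev/ideebv-2026 ~/dev/ideebv-2026-filter-test.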
Step 3: commit and tag cleanup plan
History rewriting affects branches and tags. For ideebv-2026 I had a bunch of old tags from internal milestones. Most of them were dead weight.
Before running git filter-repo on the actual repo, I made two decisions.
- Only keep tags that map to production deploys.
- Delete any branch that has not been touched in 6 months and is fully merged.
So I listed tags and branches.
git tag
git branch -r
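The branch rule (fully merged, untouched for 6 months) can be approximated with git for-each-ref. A sketch under assumptions: the helper name stale_merged_branches is made up, "6 months" is rounded to 180 days, and it only looks at local branches:

```shell
#!/bin/sh
# Hypothetical helper: list local branches that are fully merged into a base
# branch (default: main) and whose tip commit is older than N days (default: 180).
stale_merged_branches() (
    base="${1:-main}"
    days="${2:-180}"
    cutoff=$(( $(date +%s) - days * 86400 ))
    git for-each-ref --merged="$base" \
        --format='%(committerdate:unix) %(refname:short)' refs/heads |
        awk -v cutoff="$cutoff" -v base="$base" \
            '$1 < cutoff && $2 != base { print $2 }'
)
```

It only prints candidates. Deleting them is still a manual git branch -d, which refuses to drop anything unmerged.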
Then I removed the noise locally.
# Delete a few pointless tags
git tag -d spike-cms-test
git tag -d wip-migration-2024
# Delete merged branches locally
git branch -d feature/old-landing
I also pushed tag and branch deletions to origin before rewriting, so the remote state was clean.
git push origin :refs/tags/spike-cms-test
git push origin :refs/tags/wip-migration-2024
git push origin --delete feature/old-landing
Step 4: run git filter-repo for real
Once I liked the result in the test clone, I did the same thing on the real repo.
cd ~/dev/ideebv-2026
# Sanity check: clean working tree
git status
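# A safety tag would not help here: filter-repo rewrites tags too,
# so the backup copy from earlier is the real safety net
# Run the actual rewrite (--force because this is not a fresh clone)
git filter-repo \
--force \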
--path backups/ \
--path legacy-assets/ \
--path-glob "*.zip" \
--path-glob "*.sql" \
--path-glob "*.psd" \
--invert-paths
This walks through every commit and strips those files and directories from history. They vanish as if they never existed.
Then I sanity check again.
git log --oneline | head
git count-objects -vH
# Should print nothing if backups/ is really gone from history
git log --all --oneline -- backups/
At this point the backups/ directory is gone entirely from the repo history. That is the goal. If I still want a local backups/ directory for my own use, I recreate it and let .gitignore keep it out of version control.
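To check the whole hit list rather than one folder, a tiny helper can ask Git whether any commit on any ref still touches a path. A minimal sketch, with path_in_history as a made-up name:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if any commit reachable from any ref
# still touches the given path.
path_in_history() {
    [ -n "$(git rev-list --all --max-count=1 -- "$1")" ]
}
```

After the rewrite, path_in_history backups/ and path_in_history legacy-assets/ should both fail. If one of them succeeds, something in the path list was off.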
Step 5: force push with intent and communication
After git filter-repo, every commit from the first rewritten one onward has a new hash. The remote must be replaced.
For ideebv-2026 we are a small team. I can coordinate without bureaucracy, but I still treat this as a breaking change.
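# filter-repo removes the origin remote as a safety measure, so add it back first
git remote add origin <remote-url>
# Overwrite the remote main branch
git push origin main --force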
If there are other active branches, I also force push those.
git push origin experimental-layout --force
Then I tell teammates, explicitly, what they need to do.
# Example message
"I rewrote history to remove backups and binary junk.
If you have a local clone of ideebv-2026, do this:
1. git fetch origin
2. git checkout main
3. git reset --hard origin/main
If you have local branches with work, stash or patch them before you reset."
I prefer clear, slightly annoying instructions over some mysterious Git error days later.
Ongoing hygiene: hooks, scripts, and habits
Once the big cleanup is done, I focus on not repeating the same mistakes.
Pre-commit check for obvious junk
I use a small pre-commit hook that shouts if I try to add zip files, SQL dumps, or anything in backups/. It is ugly, but effective.
#!/bin/sh
# .git/hooks/pre-commit
# Block obvious junk from getting committed
# --diff-filter=ACMR skips deletions, so removing a stray zip is still allowed
if git diff --cached --name-only --diff-filter=ACMR | grep -E '\.(zip|sql|psd)$|^backups/'; then
  printf '\n[BLOCKED] You are trying to commit backup or binary junk.\n'
  echo "Remove it from the index or add it to .gitignore."
  exit 1
fi
exit 0
I keep this script in the repo as scripts/pre-commit and have a one-liner in the README telling teammates to copy it into .git/hooks if they want the guard rails.
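The copy step can be wrapped in a small function so nobody has to remember the path or the chmod. A sketch, assuming the hook lives at scripts/pre-commit as described above; the function name install_pre_commit is made up:

```shell
#!/bin/sh
# Hypothetical helper: copy the shared hook into this clone's hooks directory.
# Assumes it is called from somewhere inside the repo and that
# scripts/pre-commit exists at the repo root.
install_pre_commit() (
    cd "$(git rev-parse --show-toplevel)" || exit 1
    hooks_dir=$(git rev-parse --git-path hooks) || exit 1
    mkdir -p "$hooks_dir"
    cp scripts/pre-commit "$hooks_dir/pre-commit" &&
        chmod +x "$hooks_dir/pre-commit"
)
```

Asking Git for the hooks path instead of hardcoding .git/hooks means the copy lands in the right place even if a teammate has core.hooksPath configured.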
Short script to re-check repo size
Every few weeks I run a quick script to see if the repo inflated again. Nothing fancy.
#!/usr/bin/env bash
# scripts/git-size
git count-objects -vH | sed -n '1,6p'
Then I call it.
chmod +x scripts/git-size
./scripts/git-size
If I see the size creeping up, I check what landed recently. Usually it is someone adding a giant PNG or an export dump. Easy to fix if you catch it early.
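Checking what landed recently can be scripted too. A rough sketch, assuming a made-up helper recent_additions that lists files added on the current branch in a time window, with their current size:

```shell
#!/bin/sh
# Hypothetical helper: list files added on the current branch since a given
# date as "<size-in-bytes> <path>", biggest first. Default window: 3 weeks.
# Files that were added and deleted again are skipped.
recent_additions() {
    git log --since="${1:-3 weeks ago}" --diff-filter=A --format= --name-only |
        sort -u |
        while IFS= read -r path; do
            [ -n "$path" ] || continue
            size=$(git cat-file -s "HEAD:$path" 2>/dev/null) || continue
            printf '%s %s\n' "$size" "$path"
        done |
        sort -n -r
}
```

The first few lines of recent_additions "1 month ago" are usually enough to spot the giant PNG.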
Handling the “oops I committed a secret” moment
Even with this setup, mistakes happen. Someone commits an API key to .env.local. Or I paste a production URL with a token in a config file.
For ideebv-2026 I treat leaked secrets differently from random backups.
- Revoke or rotate the secret immediately.
- Then scrub it from Git history with filter-repo.
Here is what that looks like in practice for a simple text replacement.
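# --replace-text takes a file with one "literal==>replacement" rule per line
# (without ==>, filter-repo substitutes ***REMOVED***)
printf '%s\n' 'production-api-key-12345==>@@@@@REDACTED_API_KEY@@@@@' > /tmp/replacements.txt
# Add --force if you are not working in a fresh clone
git filter-repo --replace-text /tmp/replacements.txt
rm /tmp/replacements.txt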
That walks history and replaces every instance of production-api-key-12345 with a placeholder. I still rotate the key, but now the repo does not drag the old secret forever.
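It is worth confirming that the string is really gone before telling anyone the repo is clean. A minimal sketch using Git's pickaxe search, with secret_in_history as a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if any commit on any ref added or removed
# the given literal string (Git's "pickaxe" search, git log -S).
secret_in_history() {
    [ -n "$(git log --all --max-count=1 --format=%H -S"$1")" ]
}
```

After the rewrite, secret_in_history production-api-key-12345 should fail. This only covers refs in the local clone. Copies on the remote or on teammates' machines still hold the old commits until they are replaced.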
Why I like this workflow
This is not the only way to keep a Git repo lean. It is just the one that has worked for ideebv-2026 without turning Git into a full-time job.
- .gitignore is aggressive. I would rather say no by default and manually commit the rare big file I really need.
- filter-repo is my sledgehammer. When something ugly lands in history, I fix it fast and properly instead of pretending it is fine.
- Force pushes are coordinated. No silent history rewrites at 23:00 without telling anyone.
The payoff is simple.
- New teammates clone the repo quickly.
- CI does not waste time transferring junk.
- I am not afraid to experiment inside the project because I know I can clean up after myself.
If your repo feels heavy and slow, you probably already know which folders or files are to blame. Build a hit list. Clone your repo into a sandbox. Run git filter-repo on it and see what happens.
Once you see that you can surgically cut out backups, exports, and old binaries without breaking your actual work, Git stops feeling like a black box and starts feeling like a tool again.
That is all this strategy is. A bit of discipline up front, a sledgehammer in the drawer, and a refusal to let “just this once” junk live in the repo forever.