First you need to rewrite history:
git filter-branch --index-filter "git rm -r --cached --ignore-unmatch *.gem" --tag-name-filter cat -- --all
Note the -r and the use of wildcards inside the index-filter command. Together with the other options, this means that all *.gem files in all commits and tags are found and removed. The command prints every object it has deleted. If it doesn't print anything useful, you have made an error!
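To know in advance what the rewrite should report, you can list every matching path that was ever added to history. This is only a minimal sketch, assuming a POSIX shell; list_history_paths is a hypothetical helper name, not a git command.

```shell
#!/bin/sh
# Sketch: list every path that was ever added to history and matches a
# pathspec, so you know what the filter-branch run is expected to remove.
# list_history_paths is a hypothetical helper name, not part of git.
list_history_paths() {
    # --all              walk every branch and tag
    # --diff-filter=A    only report commits that add a matching path
    # --name-only        print paths; --pretty=format: hides commit headers
    git log --all --diff-filter=A --name-only --pretty=format: -- "$1" |
        sed '/^$/d' | sort -u
}

# Run the check only when the current directory is inside a repository.
if git rev-parse --git-dir >/dev/null 2>&1; then
    list_history_paths '*.gem'
fi
```

If this prints nothing, the pattern matches no file in any commit, and the filter-branch run would have nothing to remove either.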
Now delete the backup created by git filter-branch:
rd /q /s ".git/refs/original"
(That is the Windows command; on Linux or macOS use rm -rf .git/refs/original instead.)
Some magic to get rid of orphaned objects inside the git repository:
git reflog expire --expire=now --all
git gc --prune=now
Verify that all files are really gone with git log -- *.gem and then repack your repository.
git gc --prune=now --aggressive
Finally, push your shrunk repository to the origin.
git push origin --force
The next time you clone the repository, you get the shrunk version.
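The local part of this procedure can be rehearsed in a throwaway repository before you touch the real one. The following is only a sketch under assumptions: it uses a POSIX shell (so rm -rf stands in for rd /q /s), the file names big.gem and README and the tag v1 are made up for the demo, and the push step is left out because there is no remote.

```shell
#!/bin/sh
# Rehearsal in a throwaway repository; all file names below are made up.
set -e
# Newer git versions print a warning and pause before filter-branch runs.
export FILTER_BRANCH_SQUELCH_WARNING=1

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# History with one unwanted binary and one file to keep.
echo "pretend this is a large binary" > big.gem
echo "keep me" > README
git add big.gem README
git commit -q -m "add gem and readme"
blob=$(git rev-parse HEAD:big.gem)   # remember the object we want gone
git tag v1

# 1. Rewrite history (the pattern is single-quoted so only git expands it).
git filter-branch --index-filter \
    "git rm -r --cached --ignore-unmatch '*.gem'" \
    --tag-name-filter cat -- --all

# 2. Delete the backup refs created by filter-branch.
rm -rf .git/refs/original

# 3. Drop reflog entries, then prune and repack.
git reflog expire --expire=now --all
git gc --quiet --prune=now --aggressive

# 4. Verify: no commit on any ref touches a *.gem path,
#    and the blob itself is no longer in the object store.
remaining=$(git log --all --oneline -- '*.gem')
if [ -z "$remaining" ] && ! git cat-file -e "$blob" 2>/dev/null; then
    echo "history is clean"
else
    echo "gem files are still reachable" >&2
fi
```

The sketch adds --all to the verification step so that tags and other branches are checked as well, not only the current branch.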
Update: But as soon as you do a git pull (--rebase), all the unneeded and painfully removed objects are downloaded to your hard disk again. The only way to prevent this is to delete the repository on GitHub and replace it with the shrunk one (without changing names or URLs). Astonishingly, existing clones continued to work with the replaced repository.
Update2: GitHub itself now has a nice article explaining the process of cleaning/shrinking repositories, including a link to a tool called BFG Repo-Cleaner that is specialized for this task.