Creating a Personal Budget App and Converting it to Open Source
The Context
For several years I used Google Sheets to track spending and set up a budget, but I found it very frustrating to enter all of my transactions manually. I eventually wrote some Ruby scripts to reformat statement data from multiple credit cards so I could copy and paste it into Google Sheets. That was still a pain, because pasting caused formatting issues that I had to spend time fixing. Finally, one day I decided it was time to automate more of it and have a UX I had more control over, so I built my own app.
I chose Ruby on Rails for two reasons. First, I wanted something easy to work with that I was already familiar with. Second, I wanted to learn Rails more deeply by building an app from scratch, since I generally spend most of my time working on existing apps.
The Challenge with Open Source
Fast forward to now, and the app is working well for me: it imports transactions from various banks, categorizes expenses using regex patterns, and helps me manage budgets and financial goals with far fewer manual steps than before.
Putting the code out publicly as is was a problem, though: the transaction categorization patterns contained personal information. Patterns like /local_business/i => 'Kid Stuff' and /jasons_favorite_restaurant/i => 'Entertainment' revealed not just my location, but also family details and personal habits. This is certainly the kind of data you don't want in a public repository.
I wanted to open source the application. The Rails architecture, import services, and financial management features could be interesting to other folks, or help them build their own budget tracking tools. But how could I share the code while protecting my privacy, without a ton of manual work?
The Solution: Automated Sanitization
The answer was building an automated sync system that maintains two repositories:
- Private repo: Contains my real transaction patterns and automation scripts
- Public repo: Contains sanitized example patterns suitable for public consumption
Key Components
1. Privacy Assessment
First, I audited exactly what personal data was embedded in the code:
# PRIVATE: Real personal patterns (kept private)
CATEGORY_PATTERNS = {
  /local_childrens_center/i => 'Kid Stuff',
  /jasons_favorite_spot/i => 'Entertainment',
  /local_grocery_store/i => 'Food'
}
# PUBLIC: Generic examples (safe to share)
CATEGORY_PATTERNS = {
  /daycare/i => 'Kid Stuff',
  /restaurant/i => 'Entertainment',
  /grocery/i => 'Food'
}
The private patterns revealed my city, family structure, and specific businesses I frequent. The public examples provide the same functionality without personal exposure.
2. File Exclusion
I created a sync configuration that excludes sensitive files:
# Files to exclude from public repo
EXCLUDE_FILES=(
  "lib/shared/transaction_category_patterns.rb"
  "lib/shared/transaction_skip_patterns.rb"
)
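A minimal sketch of how a list like this might be checked for each file. The helper name is_excluded is hypothetical and does not come from the actual sync script; it only illustrates an exact-path match against the array.

```shell
#!/usr/bin/env bash
# Sketch only: is_excluded is a hypothetical helper, not the original script's code.

# Files to keep out of the public repo (same list as the configuration above).
EXCLUDE_FILES=(
  "lib/shared/transaction_category_patterns.rb"
  "lib/shared/transaction_skip_patterns.rb"
)

# Returns 0 (true) when the given repo-relative path is on the exclusion list.
is_excluded() {
  local candidate="$1" excluded
  for excluded in "${EXCLUDE_FILES[@]}"; do
    if [[ "$candidate" == "$excluded" ]]; then
      return 0
    fi
  done
  return 1
}
```

With this in place, a call such as is_excluded "lib/shared/transaction_skip_patterns.rb" succeeds, while the matching *_example.rb file passes through because the comparison is on the full path.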
3. Automated File Processing
The sync script automatically:
- Copies all git-tracked files from private to public repo
- Excludes sensitive files using the configuration
- Renames *_example.rb files to their active counterparts to ensure the app can run without much initial effort
- Creates a public-specific .gitignore
- Commits and pushes changes automatically
# Smart sync using git ls-files (respects .gitignore)
git ls-files | while IFS= read -r file; do
  # Check exclusion rules and copy accordingly
  process_file "$file"
done
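The body of process_file is not shown in the loop above. One plausible shape, covering only the exclusion and copy steps, is sketched here. PUBLIC_DIR and its default path are assumptions made for illustration, as is the inline exclusion list.

```shell
#!/usr/bin/env bash
# Sketch only: PUBLIC_DIR, its default path, and this process_file body are assumptions.

PUBLIC_DIR="${PUBLIC_DIR:-../budget-app-public}"
EXCLUDE_FILES=(
  "lib/shared/transaction_category_patterns.rb"
  "lib/shared/transaction_skip_patterns.rb"
)

# Copy one tracked file into the public checkout unless it is excluded.
process_file() {
  local file="$1" excluded
  for excluded in "${EXCLUDE_FILES[@]}"; do
    # Sensitive file: skip it without copying anything.
    [[ "$file" == "$excluded" ]] && return 0
  done
  # Recreate the file's directory in the public checkout, then copy it.
  mkdir -p "$PUBLIC_DIR/$(dirname "$file")"
  cp -p "$file" "$PUBLIC_DIR/$file"
}
```

Because the paths come from git ls-files, they are relative to the repository root, so the same relative path can be reused on the public side to preserve the directory structure.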
4. Git Hook Automation
A pre-push hook ensures the public repo stays synchronized:
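# Runs before every push to main branch
# Assumed: current_branch is the branch currently checked out
current_branch="$(git rev-parse --abbrev-ref HEAD)"
if [[ "$current_branch" == "main" ]]; then
  echo "🔄 Running automatic sync to public repo..."
  ./private/sync-to-public.sh
fi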
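Git does not version the contents of .git/hooks, so a hook like this has to be installed on each clone. The post lists an install-hooks.sh but does not show it; the following is a hedged sketch of what such a script could do. The function name install_pre_push_hook and the exact hook body are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: the real install-hooks.sh is not shown in this post.

# Write a pre-push hook into the given repository and make it executable.
install_pre_push_hook() {
  local repo_root="$1"
  local hook_path="$repo_root/.git/hooks/pre-push"
  mkdir -p "$(dirname "$hook_path")"
  # Quoted heredoc delimiter: nothing is expanded while writing the hook file.
  cat > "$hook_path" <<'HOOK'
#!/usr/bin/env bash
current_branch="$(git rev-parse --abbrev-ref HEAD)"
if [[ "$current_branch" == "main" ]]; then
  ./private/sync-to-public.sh
fi
HOOK
  # Git skips hook files that are not executable.
  chmod +x "$hook_path"
}
```

Running install_pre_push_hook "$(git rev-parse --show-toplevel)" from inside the private repo would set the hook up for that clone.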
Technical Implementation
The sync system uses several key techniques:
- Git-aware copying: Uses git ls-files instead of rsync to respect .gitignore rules. This limits the number of exclusions I have to add to the sync configuration
- Pattern-based exclusion: Excludes files by name patterns while preserving directory structure
- File transformation: Automatically renames example files to production names in public repo
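The file transformation step can be sketched with plain bash parameter expansion. The helper name active_name_for is hypothetical; it only shows how an example filename could be mapped to the name the app loads.

```shell
#!/usr/bin/env bash
# Sketch only: active_name_for is a hypothetical helper name.

# Map "foo_example.rb" to "foo.rb"; print nothing for files that are not examples.
active_name_for() {
  local file="$1"
  if [[ "$file" == *_example.rb ]]; then
    # Strip the "_example.rb" suffix, then put ".rb" back.
    printf '%s\n' "${file%_example.rb}.rb"
  fi
}
```

A sync script could call this for each copied file and, when the result is non-empty, copy the example to that second path as well, so the public repo ends up with both the example and a working default.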
Results
The system works seamlessly:
- I push changes to my private repo (with real transaction patterns)
- Pre-push hook automatically triggers sync script
- Public repo updates with sanitized code and generic examples
- Other developers can clone the repo and use the generic patterns as starting points
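The final step of that flow, committing to the public repo only when something actually changed, might look like the sketch below. The function name publish_changes, the commit message, and the remote name origin are assumptions rather than details from the real script.

```shell
#!/usr/bin/env bash
# Sketch only: publish_changes, the commit message, and "origin" are assumptions.

# Stage everything in the public checkout and commit only if something changed.
publish_changes() {
  local public_dir="$1"
  git -C "$public_dir" add -A
  # --quiet makes diff exit 0 when nothing is staged, 1 when there are changes.
  if git -C "$public_dir" diff --cached --quiet; then
    echo "Public repo already up to date"
    return 0
  fi
  git -C "$public_dir" commit -q -m "Sync from private repo"
  # Push only when a remote named origin is configured.
  if git -C "$public_dir" remote get-url origin >/dev/null 2>&1; then
    git -C "$public_dir" push -q origin HEAD
  fi
}
```

Checking the staged diff first avoids a failed empty commit on pushes that did not touch any public file.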
Repository Structure
Private Repo:
├── lib/shared/
│   ├── transaction_category_patterns.rb          # Real personal patterns
│   ├── transaction_category_patterns_example.rb  # Generic examples
│   └── transaction_skip_patterns_example.rb
├── private/                                      # Automation scripts (git-ignored)
│   ├── sync-to-public.sh
│   ├── sync-config.sh
│   └── install-hooks.sh
└── .gitignore                                    # Excludes private/ folder
Public Repo:
├── lib/shared/
│   ├── transaction_category_patterns.rb          # Copied from *_example.rb
│   ├── transaction_category_patterns_example.rb
│   └── transaction_skip_patterns_example.rb
├── LICENSE                                       # CC BY-NC 4.0
├── PRIVACY.md                                    # Privacy guidelines
└── PUBLIC_REPO_NOTE.md                           # Notice about sanitization
Privacy Protections
Licensing
I chose Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) to:
- Allow personal and educational use
- Prevent commercial use
- Require attribution while protecting my interests
Documentation
I added clear privacy guidelines for contributors:
- How to customize transaction patterns safely
- What personal data to avoid in pull requests
- Guidelines for creating generic examples
Lessons Learned
1. Privacy by Design
Consider privacy implications before the first commit. Personal finance apps can be particularly sensitive: they can reveal lifestyle, location, family structure, and spending habits.
2. Automation Prevents Errors
Manual sanitization is error-prone. Automated systems ensure consistency and prevent accidental data exposure.
3. Git Hooks Enable Seamless Workflow
The pre-push hook makes the dual-repo system invisible. I work normally on my private repo, and the public repo stays automatically synchronized.
4. Generic Examples Have Value
The sanitized examples aren't just privacy protection - they're documentation. They show other developers how the system works without exposing personal details.
Open Source Impact
The public repository at github.com/jgodson/budget-app-public now provides:
- Rails 7 app architecture
- Multi-bank import services for some Canadian financial institutions
- Generic transaction categorization that users can customize
- Goal tracking and loan management features
- Docker deployment configuration
Other developers can fork the repository, customize the transaction patterns for their region and needs, and build their own budget management tool, or just take the parts they want and learn from the rest. That's the beauty of open source.
Conclusion
Converting a personal application to open source doesn't require sacrificing privacy. With careful planning, automated sanitization, and smart git practices, you can share valuable code while protecting sensitive personal data.
The key is building systems that handle the complexity automatically. A few hours of automation setup now prevents years of manual work and potential privacy mistakes.