From Zero to Production: How I Built a Full-Stack ML Application with 88.62% Accuracy
A comprehensive technical deep-dive into building, optimizing, and deploying a multi-model machine learning system
THE JOURNEY
In 18 days, I built a production-ready loan prediction system featuring:
- 4 ML models, with a best accuracy of 88.62%
- REST API with Flask and PostgreSQL
- 83% test coverage with pytest
- Professional Swagger documentation
- 3-4x performance optimization
- Live deployment on Render
Table of Contents
- Introduction
- The Problem
- System Architecture
- Week 1: Foundation
- Week 2: Production
- Week 3: Polish
- Technical Deep-Dives
- Challenges & Solutions
- Performance Optimization
- Lessons Learned
- What's Next
Introduction
Three weeks ago, I had an idea: build a production-grade machine learning system that demonstrates not just ML skills, but full-stack engineering, DevOps, and software craftsmanship.
The result? A loan prediction API that's currently serving real predictions with 88.62% accuracy, complete with professional documentation, comprehensive testing, and performance optimizations that make it 3-4x faster than the initial version.

The landing page showing live statistics and model performance
This post is a technical deep-dive into how I built it, the challenges I faced, and the lessons I learned along the way.
The Problem
Challenge: Build a system that predicts loan approval decisions based on applicant information.
Requirements:
- High accuracy (>85%)
- Production-ready code
- Comprehensive testing
- Professional documentation
- Performance optimization
- Real-time predictions
- Scalable architecture
Constraints:
- 18-day timeline
- Solo developer
- Limited budget (free tier hosting)
System Architecture
The system follows a layered architecture pattern:
```
┌─────────────────────────────────────┐
│            Client Layer             │
│     (Web Browser, API Clients)      │
└──────────────────┬──────────────────┘
                   │
┌──────────────────▼──────────────────┐
│         Presentation Layer          │
│     (Flask Routes, Swagger UI)      │
└──────────────────┬──────────────────┘
                   │
┌──────────────────▼──────────────────┐
│          Application Layer          │
│ (Validation, Caching, Rate Limiting)│
└──────────────────┬──────────────────┘
                   │
         ┌─────────┴─────────┐
         │                   │
   ┌─────▼──────┐      ┌─────▼──────┐
   │  ML Layer  │      │ Data Layer │
   │ (4 Models) │      │(PostgreSQL)│
   └────────────┘      └────────────┘
```

High-level system architecture showing separation of concerns
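To make the layering concrete, here is a minimal sketch of how the presentation layer can be separated from the rest in Flask; the blueprint and function names are illustrative, not the project's actual module layout:

```python
from flask import Flask, Blueprint, jsonify

# Presentation layer: HTTP routes live in a blueprint
api = Blueprint('api', __name__)

@api.route('/health')
def health():
    return jsonify({"status": "ok"})

def create_app():
    """Application factory: wires the presentation, application, and data layers."""
    app = Flask(__name__)
    app.register_blueprint(api)
    # Application-layer concerns (validation, caching, rate limiting) and the
    # data layer (SQLAlchemy) are attached here; see the tech stack below.
    return app
```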
Tech Stack:
- Backend: Flask 3.0, Python 3.10
- ML: scikit-learn, pandas, numpy
- Database: PostgreSQL (production), SQLite (dev)
- Testing: pytest, coverage
- Documentation: Flasgger (Swagger/OpenAPI)
- Performance: Flask-Caching, Flask-Limiter, Flask-Compress
- Deployment: Render, Gunicorn
- Frontend: Vanilla JavaScript, CSS3, HTML5
Week 1: Foundation (Days 1-6)
Day 1-2: API Setup & Data Exploration
I started with the basics: a Flask API and understanding the data.
```python
# Initial Flask setup
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json()
    # TODO: Add ML model
    return jsonify({"prediction": "pending"})
```
The dataset had 614 loan applications with 12 features:
- Applicant demographics (Gender, Married, Dependents, Education)
- Financial data (ApplicantIncome, CoapplicantIncome, LoanAmount)
- Loan details (Loan_Amount_Term, Credit_History)
- Property information (Property_Area)
Key insights from EDA:
- 69% approval rate in training data
- Credit_History was the strongest predictor
- Income and loan amount had right-skewed distributions
- Missing values in multiple columns (~10-15%)

Exploratory data analysis showing feature distributions and correlations
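These insights came out of a quick pandas pass over the raw CSV. A minimal sketch of the checks, assuming the standard Loan_Status target column and an illustrative file path:

```python
import pandas as pd

df = pd.read_csv('data/loan_train.csv')  # path is illustrative

# Approval rate (~69% approvals in the training data)
print(df['Loan_Status'].value_counts(normalize=True))

# Share of missing values per column (~10-15% in several columns)
print(df.isnull().mean().sort_values(ascending=False))

# Right-skew in the income and loan amount columns
print(df[['ApplicantIncome', 'CoapplicantIncome', 'LoanAmount']].skew())
```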
Day 3-4: Data Preprocessing & Feature Engineering
Missing values were handled strategically:
```python
# Categorical: Mode imputation
df['Gender'] = df['Gender'].fillna(df['Gender'].mode()[0])
df['Married'] = df['Married'].fillna(df['Married'].mode()[0])

# Numerical: Median imputation
df['LoanAmount'] = df['LoanAmount'].fillna(df['LoanAmount'].median())
```
Feature engineering made a huge difference:
```python
import numpy as np

# Created features that improved accuracy by 8%
# (Dependents is assumed to already be numeric here, e.g. '3+' mapped to 3)
df['Total_Income'] = df['ApplicantIncome'] + df['CoapplicantIncome']
df['Loan_to_Income_Ratio'] = df['LoanAmount'] / (df['Total_Income'] + 1)
df['Income_per_Dependent'] = df['ApplicantIncome'] / (df['Dependents'] + 1)
df['Log_ApplicantIncome'] = np.log1p(df['ApplicantIncome'])
df['Log_LoanAmount'] = np.log1p(df['LoanAmount'])
```
Day 5-6: First ML Model
Started with Random Forest:
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    min_samples_split=5,
    random_state=42
)
model.fit(X_train, y_train)
```
Results:
- Initial accuracy: 79.83%
- After feature engineering: 88.62% ✅
- Precision: 0.89
- Recall: 0.96
- F1-Score: 0.92

Performance metrics for Random Forest model
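These numbers come from the held-out test set. A short sketch of how they can be reproduced with scikit-learn, assuming the model and the X_test/y_test split from the snippet above:

```python
from sklearn.metrics import accuracy_score, classification_report

y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.4f}")
# Per-class precision, recall, and F1 (assumes 1 = Approved, 0 = Rejected)
print(classification_report(y_test, y_pred, target_names=['Rejected', 'Approved']))
```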
Week 2: Production (Days 7-13)
Day 7-8: ML Integration & Validation
Integrated the model with comprehensive input validation:
```python
class LoanApplicationValidator:
    MIN_INCOME = 0
    MAX_INCOME = 100000
    MIN_LOAN_AMOUNT = 0
    MAX_LOAN_AMOUNT = 10000

    def validate_loan_application(self, data):
        errors = []
        warnings = []

        # Required field validation
        if 'ApplicantIncome' not in data:
            errors.append("ApplicantIncome is required")
        elif data['ApplicantIncome'] <= self.MIN_INCOME:
            errors.append("ApplicantIncome must be greater than 0")

        # Range validation
        if data.get('LoanAmount', 0) > self.MAX_LOAN_AMOUNT:
            warnings.append("Loan amount is unusually high")

        return len(errors) == 0, errors, warnings
```
This prevented bad predictions and improved user experience.
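Wiring the validator into the route looks roughly like this; the exact response shape and the make_prediction helper are illustrative assumptions, not the project's exact code:

```python
from flask import request, jsonify

validator = LoanApplicationValidator()

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json() or {}
    is_valid, errors, warnings = validator.validate_loan_application(data)
    if not is_valid:
        # Fail fast with actionable messages instead of returning a bad prediction
        return jsonify({"error": "Validation failed",
                        "validation_errors": errors}), 400
    result = make_prediction(data)  # hypothetical helper wrapping the model
    result["warnings"] = warnings
    return jsonify(result)
```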

User-friendly validation error messages in the UI
Day 9-10: Database Integration
Added PostgreSQL for prediction history:
```python
import json
from datetime import datetime
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy(app)

class Prediction(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    timestamp = db.Column(db.DateTime, default=datetime.utcnow)
    prediction = db.Column(db.String(20))
    confidence = db.Column(db.Float)
    input_data = db.Column(db.JSON)
    warnings = db.Column(db.Text)
    ip_address = db.Column(db.String(45))  # long enough for IPv6

    @classmethod
    def from_request(cls, input_data, result, warnings, ip):
        return cls(
            prediction=result['prediction'],
            confidence=result['confidence'],
            input_data=input_data,
            warnings=json.dumps(warnings),
            ip_address=ip
        )
```
This enabled analytics and monitoring.
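Building on the model above, each successful request can be persisted and later aggregated. A sketch of both halves; the exact statistics query is an assumption:

```python
from flask import request
from sqlalchemy import func

# Inside the /predict route, after a successful prediction
record = Prediction.from_request(input_data, result, warnings, request.remote_addr)
db.session.add(record)
db.session.commit()

# Aggregation used by a statistics endpoint
def get_statistics():
    total = db.session.query(func.count(Prediction.id)).scalar() or 0
    approved = (db.session.query(func.count(Prediction.id))
                .filter(Prediction.prediction == 'Approved')
                .scalar() or 0)
    return {"total_predictions": total,
            "approval_rate": approved / total if total else 0.0}
```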

Database-backed prediction history with filtering and pagination
Day 11-12: Testing
Achieved 83% test coverage with pytest:
```python
def test_predict_success(client):
    """Test successful prediction"""
    data = {
        "ApplicantIncome": 5000,
        "LoanAmount": 150,
        "Credit_History": 1
    }
    response = client.post('/predict', json=data)

    assert response.status_code == 200
    assert 'prediction' in response.json
    assert response.json['prediction'] in ['Approved', 'Rejected']
    assert 0 <= response.json['confidence'] <= 1


def test_predict_validation_error(client):
    """Test validation error handling"""
    data = {"ApplicantIncome": -1000}  # Invalid
    response = client.post('/predict', json=data)

    assert response.status_code == 400
    assert 'validation_errors' in response.json
```

pytest coverage report showing 83% code coverage
Day 13: Deployment
Deployed to Render with:
```yaml
# render.yaml
services:
  - type: web
    name: loan-predictor-api
    env: python
    buildCommand: pip install -r requirements.txt
    startCommand: gunicorn app:app
    envVars:
      - key: PYTHON_VERSION
        value: 3.10.0
```
Deployment challenges:
- PostgreSQL connection string format differences
- Environment variable management
- Cold start optimization
Week 3: Polish (Days 14-18)
Day 14: Frontend Development
Built a responsive UI with vanilla JavaScript:
```javascript
async function makePrediction() {
    const data = {
        ApplicantIncome: parseFloat(document.getElementById('income').value),
        LoanAmount: parseFloat(document.getElementById('loan').value),
        Credit_History: parseInt(document.getElementById('credit').value)
    };

    const response = await fetch(`${API_URL}/predict`, {
        method: 'POST',
        headers: {'Content-Type': 'application/json'},
        body: JSON.stringify(data)
    });

    const result = await response.json();
    displayResult(result);
}
```
Design principles:
- Mobile-first responsive design
- Clear visual hierarchy
- Immediate feedback
- Error handling with helpful messages

Mobile and desktop views showing responsive layout
Day 15: API Documentation
Added Swagger/OpenAPI documentation:
```python
from flasgger import Swagger, swag_from

swagger = Swagger(app, template=swagger_template)

@app.route('/predict', methods=['POST'])
@swag_from('docs/swagger/predict.yml')
def predict():
    """Make loan approval prediction"""
    # Implementation
```
This made the API self-documenting and easy to test.
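The swagger_template passed to Swagger() above is just a dictionary of API-level metadata. A minimal sketch of what it might contain; the values are illustrative:

```python
swagger_template = {
    "swagger": "2.0",
    "info": {
        "title": "Loan Predictor API",
        "description": "Predicts loan approval decisions from applicant data.",
        "version": "1.0.0",
    },
    "basePath": "/",
    "schemes": ["https"],
}
```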

Interactive API documentation with Swagger UI
Day 16: Performance Optimization
Implemented caching, rate limiting, and compression:
```python
from flask_caching import Cache
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address
from flask_compress import Compress

cache = Cache(app, config={'CACHE_TYPE': 'simple'})
limiter = Limiter(get_remote_address, app=app)  # Flask-Limiter 3.x takes key_func first
compress = Compress(app)

@app.route('/statistics')
@cache.cached(timeout=60)  # Cache for 1 minute
def statistics():
    return jsonify(get_statistics())

@app.route('/predict', methods=['POST'])
@limiter.limit("100 per hour")  # Rate limit
def predict():
    # Implementation
    ...
```
Performance improvements:
- Average response time: 500ms → 150ms (3.3x faster)
- P95 response time: 800ms → 250ms (3.2x faster)
- Cache hit rate: 85%
- Response size: 70% smaller with compression
Day 17: Multi-Model System
Trained and compared 4 different models:
| Model | Accuracy | Precision | Recall | F1-Score | Training Time |
|---|---|---|---|---|---|
| Random Forest | 88.62% | 0.89 | 0.96 | 0.92 | 2.3s |
| Logistic Regression | 84.55% | 0.85 | 0.94 | 0.89 | 0.1s |
| Gradient Boosting | 87.80% | 0.88 | 0.95 | 0.91 | 5.7s |
| SVM | 83.74% | 0.84 | 0.93 | 0.88 | 1.2s |
Random Forest won on accuracy, but Logistic Regression was 23x faster to train.
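The comparison itself was a loop over candidate estimators. A minimal sketch using the same X_train/X_test split as earlier; the hyperparameters for the non-Random-Forest models are assumptions:

```python
import time

from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.svm import SVC

candidates = {
    'random_forest': RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42),
    'logistic_regression': LogisticRegression(max_iter=1000),
    'gradient_boosting': GradientBoostingClassifier(random_state=42),
    'svm': SVC(probability=True, random_state=42),
}

for name, clf in candidates.items():
    start = time.time()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(f"{name}: accuracy={accuracy_score(y_test, y_pred):.4f} "
          f"f1={f1_score(y_test, y_pred):.2f} train_time={time.time() - start:.1f}s")
```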

Side-by-side comparison of four different ML models
Added model comparison endpoint:
```python
@app.route('/models/benchmark', methods=['POST'])
def benchmark_models():
    """Test all models with the same input"""
    # df_processed: the request payload run through the same preprocessing pipeline
    results = {}
    for model_name, model in available_models.items():
        prediction = model.predict(df_processed)[0]
        results[model_name] = {
            'prediction': 'Approved' if prediction == 1 else 'Rejected',
            'confidence': float(max(model.predict_proba(df_processed)[0]))
        }
    return jsonify(results)
```
Day 18: Demo Video & Documentation
Created comprehensive portfolio assets:
- 2-minute demo video
- 15+ screenshots
- Architecture diagrams
- Technical documentation
Technical Deep-Dives
1. Feature Engineering Impact
Feature engineering improved accuracy from 79.83% to 88.62% (+8.79%).
Most impactful features:
- Credit_History (original) - 42% importance
- Total_Income (engineered) - 18% importance
- Loan_to_Income_Ratio (engineered) - 15% importance
- Log_ApplicantIncome (engineered) - 12% importance
```python
# Feature importance analysis
importances = model.feature_importances_
feature_importance_df = pd.DataFrame({
    'feature': feature_names,
    'importance': importances
}).sort_values('importance', ascending=False)

print(feature_importance_df.head(10))
```

Feature importance chart showing engineered features in top 5
2. Handling Class Imbalance
The dataset had 69% approvals vs 31% rejections.
Solution: Stratified train-test split
```python
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42,
    stratify=y  # Maintains class distribution
)
```
This ensured both sets had similar approval rates.
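A quick sanity check that the split preserved the class balance (assuming y is a pandas Series):

```python
# Both should print roughly 69% approvals / 31% rejections
print(y_train.value_counts(normalize=True))
print(y_test.value_counts(normalize=True))
```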
3. Model Serialization
Used joblib for efficient model saving:
```python
import joblib

# Save
joblib.dump(model, 'models/loan_model_v2.pkl')

# Load
model = joblib.load('models/loan_model_v2.pkl')
```
joblib serializes objects containing large numpy arrays faster than plain pickle, and with compression enabled the saved files for large models can be roughly 10x smaller.
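If file size matters, joblib can compress on save at a small cost in save/load time; a short sketch:

```python
import joblib

# compress=3 is a reasonable middle ground between file size and speed
joblib.dump(model, 'models/loan_model_v2.pkl', compress=3)
model = joblib.load('models/loan_model_v2.pkl')
```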
4. Database Query Optimization
Optimized statistics endpoint with eager loading:
```python
from sqlalchemy.orm import joinedload

# Before: N+1 queries (slow)
predictions = Prediction.query.all()
for pred in predictions:
    user = User.query.get(pred.user_id)  # Separate query each time

# After: Single query with join (fast)
predictions = Prediction.query.options(
    joinedload(Prediction.user)
).all()
```
Result: 10x faster for 100+ predictions.
5. Caching Strategy
Different cache times for different data volatility:
```python
# Rarely changes - long cache
@cache.cached(timeout=3600)  # 1 hour
def model_info():
    return model_metadata

# Changes occasionally - medium cache
@cache.cached(timeout=60)  # 1 minute
def statistics():
    return get_statistics()

# Changes frequently - short cache
@cache.cached(timeout=30)  # 30 seconds
def recent_predictions():
    return get_recent_predictions(10)
```
Challenges & Solutions
Challenge 1: PostgreSQL Connection Issues
Problem: Render uses postgres:// but SQLAlchemy 1.4+ requires postgresql://
Solution:
```python
import os

database_url = os.environ.get('DATABASE_URL')
if database_url and database_url.startswith('postgres://'):
    database_url = database_url.replace('postgres://', 'postgresql+psycopg2://', 1)
```
Challenge 2: Cold Start Performance
Problem: First request after inactivity took 5+ seconds
Solution:
- Implemented health check endpoint
- Added model preloading on startup
- Used Render's always-on plan
```python
# Load model on startup, not on first request
with app.app_context():
    load_model()
```
Challenge 3: Missing Value Handling
Problem: Real-world data had missing values not in training
Solution: Defensive preprocessing with defaults
```python
import pandas as pd

def preprocess_input(data):
    # Provide sensible defaults
    if 'LoanAmount' not in data or pd.isna(data['LoanAmount']):
        data['LoanAmount'] = 128.0  # Median from training
    if 'Credit_History' not in data:
        data['Credit_History'] = 1.0  # Most common value
    return data
```
Challenge 4: Test Coverage Gaps
Problem: Initial coverage was only 45%
Solution: Systematic test writing
```python
import pytest

# Test matrix approach
test_cases = [
    # (input, expected_status, expected_keys)
    ({'ApplicantIncome': 5000}, 200, ['prediction', 'confidence']),
    ({'ApplicantIncome': -100}, 400, ['error', 'validation_errors']),
    ({}, 400, ['error']),
]

@pytest.mark.parametrize("input_data,status,keys", test_cases)
def test_predict(client, input_data, status, keys):
    response = client.post('/predict', json=input_data)
    assert response.status_code == status
    for key in keys:
        assert key in response.json
```
Result: 45% → 83% coverage
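For reference, the coverage figure comes from pytest-cov; a typical invocation (the app package name is an assumption about the project layout):

```
pytest --cov=app --cov-report=term-missing
```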
Performance Optimization
Before vs After
| Metric | Before | After | Improvement |
|---|---|---|---|
| Avg Response Time | 500ms | 150ms | 3.3x faster |
| P95 Response Time | 800ms | 250ms | 3.2x faster |
| Throughput | 10 req/s | 35 req/s | 3.5x higher |
| Response Size | 2.5 KB | 0.7 KB | 72% smaller |
| Cache Hit Rate | 0% | 85% | ∞ improvement |
Optimization Techniques
1. Response Caching
```python
@cache.cached(timeout=60, query_string=True)
def history():
    # Expensive database query
    return get_recent_predictions()
```
2. Database Indexing
```python
class Prediction(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    timestamp = db.Column(db.DateTime, index=True)     # Index for sorting
    prediction = db.Column(db.String(20), index=True)  # Index for filtering
```
3. Compression
```python
compress = Compress(app)  # Automatic gzip compression
```
4. Connection Pooling
```python
app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {
    'pool_size': 10,
    'pool_recycle': 3600,
    'pool_pre_ping': True
}
```

Response time comparison before and after optimization
Lessons Learned
1. Feature Engineering > Algorithm Selection
I spent 2 days trying different algorithms (Random Forest, XGBoost, Neural Networks) and got marginal improvements (1-2%).
Then I spent 4 hours on feature engineering and got +8% accuracy.
Lesson: Understand your data first. Good features beat fancy algorithms.
2. Testing Saves Time
Initially, I skipped tests to "move faster." Then I spent 3 days debugging production issues.
After adding comprehensive tests, I could refactor confidently and caught bugs before deployment.
Lesson: Tests are not overhead. They're insurance.
3. Documentation is for Future You
I didn't document my preprocessing steps initially. Two weeks later, I couldn't remember why I used median vs mean imputation.
Good documentation saved me hours of re-research.
Lesson: Document decisions, not just code.
4. Performance Matters
My initial API took 500ms per request. Users noticed. After optimization (150ms), the experience felt instant.
Lesson: Performance is a feature.
5. Production ≠ Development
What worked locally didn't always work in production:
- Database connection strings were different
- Environment variables needed careful management
- Cold starts were a real issue
Lesson: Test in production-like environments early.
What's Next
Immediate Improvements
- [ ] Add user authentication
- [ ] Implement A/B testing framework
- [ ] Add model monitoring and drift detection
- [ ] Create admin dashboard
- [ ] Add email notifications
Future Features
- [ ] Explainable AI (SHAP values)
- [ ] Real-time model retraining
- [ ] Multi-language support
- [ ] Mobile app (React Native)
- [ ] Batch prediction API
Technical Debt
- [ ] Migrate to Redis for distributed caching
- [ ] Add Celery for background tasks
- [ ] Implement proper logging (ELK stack)
- [ ] Add monitoring (Prometheus + Grafana)
- [ ] Set up CI/CD pipeline
Conclusion
Building this system taught me that production ML is 20% modeling and 80% engineering.
The model was working on Day 5. Making it production-ready took 13 more days.
But that's what separates a Jupyter notebook from a real product:
- Comprehensive testing
- Professional documentation
- Performance optimization
- Error handling
- Monitoring and logging
- Security considerations
- User experience
If you're building an ML system, focus on these fundamentals. The fancy algorithms can wait.
Resources
- Live Demo: https://loan-predictor-api-91xu.onrender.com/app
- API Documentation: https://loan-predictor-api-91xu.onrender.com/docs
- GitHub Repository: https://github.com/olatunjitobiloba/loan-predictor-api
- Demo Video: https://www.youtube.com/watch?v=4NUUZVpKvy8
Connect with me:
- LinkedIn: https://www.linkedin.com/in/olatunjioluwatobiloba/
- GitHub: https://github.com/olatunjitobiloba
- Email: olatunjitobiloba05@gmail.com
Appendix: Complete Tech Stack
Backend
- Python 3.10
- Flask 3.0
- Gunicorn 21.2
- scikit-learn 1.3
- pandas 2.0
- numpy 1.24
Database
- PostgreSQL 14
- SQLAlchemy 2.0
- psycopg2-binary 2.9
Testing
- pytest 7.4
- pytest-cov 4.1
- coverage 7.3
Performance
- Flask-Caching 2.1
- Flask-Limiter 3.5
- Flask-Compress 1.14
Documentation
- Flasgger 0.9.7
- Swagger UI
Frontend
- HTML5
- CSS3 (Grid, Flexbox)
- JavaScript (ES6+)
- Fetch API
DevOps
- Git/GitHub
- Render
- Environment variables
- Logging
Thanks for reading! If you found this helpful, please share it and star the repo on GitHub.
Questions? Comments? Reach out on LinkedIn or open an issue on GitHub.