About & Contact
Learn more about the author and get in touch
About This Project
This portfolio project presents a machine learning analysis of Airbnb listing data from New York City. Originally completed as part of CPSC 330: Applied Machine Learning, the analysis demonstrates an end-to-end ML workflow, from data exploration to model deployment.
The project predicts listing popularity, using reviews per month as a proxy metric, and achieves an R² of 0.6956 on the test set. This website turns the original Jupyter notebook into an interactive, portfolio-ready presentation for both technical and non-technical audiences.
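For readers unfamiliar with the metric, R² (the coefficient of determination) measures how much of the variance in the target a model explains: 1.0 is a perfect fit, and 0.0 is no better than always predicting the mean. The sketch below computes it with the standard library only. The function name and the toy numbers are illustrative assumptions, not taken from the project notebook, which most likely relied on a library implementation.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


# Toy values for illustration only (not project data).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
score = r_squared(actual, predicted)
```

Read this way, the reported test score of 0.6956 means the model accounts for roughly 70% of the variance in reviews per month on held-out listings.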
Technical Stack
Analysis: Python, scikit-learn, pandas, matplotlib, seaborn, LightGBM, SHAP
Website: Next.js, TypeScript, TailwindCSS, Recharts, Vercel
Key Achievements
Academic Excellence
- Comprehensive ML workflow implementation
- Advanced feature engineering techniques
- Model interpretation with SHAP
- Statistical validation and testing
Professional Skills
- Full-stack web development
- Interactive data visualization
- Responsive design principles
- Modern deployment practices
Benjamin Gerochi
Data Science & Machine Learning Enthusiast
Project Resources
Complete analysis with code, visualizations, and detailed explanations
Project Methodology
1. Data Exploration
Comprehensive analysis of 48,895 NYC Airbnb listings with 16 features.
- Missing value pattern analysis
- Feature distribution examination
- Correlation analysis
- Geographic pattern identification
2. Feature Engineering
Created derived features to capture domain-specific insights.
- Minimum payment calculation
- Recency metrics
- Categorical binning
- Geographic encoding
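The derived features listed above could look roughly like the following sketch. Everything specific here is an assumption made for illustration: the column names (`price`, `minimum_nights`, `last_review`, `availability_365`), the definition of minimum payment as nightly price times minimum stay, the recency metric as days since the last review, and the bin cut points. The notebook's actual definitions may differ, and geographic encoding is left out.

```python
from datetime import date


def engineer_features(listing, reference_date):
    """Return a copy of `listing` with three illustrative derived features."""
    features = dict(listing)

    # Assumed definition: cheapest possible booking = nightly price x minimum stay.
    features["min_payment"] = listing["price"] * listing["minimum_nights"]

    # Assumed recency metric: days since the most recent review
    # (None when the listing has never been reviewed).
    last_review = listing.get("last_review")
    features["days_since_last_review"] = (
        (reference_date - last_review).days if last_review is not None else None
    )

    # Assumed binning: coarse availability buckets with hypothetical cut points.
    availability = listing["availability_365"]
    if availability == 0:
        features["availability_bin"] = "none"
    elif availability <= 90:
        features["availability_bin"] = "low"
    elif availability <= 270:
        features["availability_bin"] = "medium"
    else:
        features["availability_bin"] = "high"

    return features


# Hypothetical listing, not a row from the dataset.
example = {
    "price": 120,
    "minimum_nights": 3,
    "last_review": date(2019, 5, 1),
    "availability_365": 200,
}
result = engineer_features(example, reference_date=date(2019, 7, 8))
```

Keeping the reference date as an explicit argument makes the recency feature reproducible, since it does not depend on the day the code happens to run.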
3. Model Development
Systematic comparison and optimization of multiple algorithms.
- Cross-validation framework
- Hyperparameter optimization
- Performance evaluation
- Model interpretation
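To make the cross-validation step concrete, here is a minimal k-fold sketch written with the standard library only. It is not the project's code (the analysis most likely used scikit-learn's utilities), and the helper names `k_fold_indices`, `cross_validate`, `fit_mean`, and `mean_abs_error` are hypothetical. It covers fold splitting and score averaging, not hyperparameter search.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Split range(n_samples) into n_folds shuffled (train, test) index pairs."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding keeps fold sizes within one sample of each other.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train_idx, test_idx))
    return splits


def cross_validate(xs, ys, fit, score, n_folds=5, seed=0):
    """Average `score` over folds; `fit` is retrained on each training split."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), n_folds, seed):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(
            score(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx])
        )
    return sum(scores) / len(scores)


def fit_mean(train_x, train_y):
    """Baseline 'model': always predict the training mean."""
    return sum(train_y) / len(train_y)


def mean_abs_error(model, test_x, test_y):
    """Mean absolute error of the constant prediction `model`."""
    return sum(abs(y - model) for y in test_y) / len(test_y)
```

A mean-predicting baseline like `fit_mean` is a useful reference point: any candidate model should beat its cross-validated error before it is worth tuning further.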
Website Development
This portfolio website was built to showcase the analysis in an interactive, accessible format. The development process focused on creating a professional presentation suitable for recruiters, collaborators, and technical audiences.
Design Principles
- Clean, professional aesthetic
- Responsive design for all devices
- Interactive visualizations
- Accessible navigation structure
- SEO optimization
- Performance optimization
Technical Implementation
- Next.js 15 with App Router
- TypeScript for type safety
- TailwindCSS for styling
- Recharts for data visualization
- Vercel for deployment
- Modern web standards
Built with Next.js, TypeScript, and TailwindCSS. Deployed on Vercel.
© 2024 Benjamin Gerochi. This project is for educational and portfolio purposes.