Here are some of my recent projects that showcase my expertise in data engineering, machine learning, and analytics dashboard development.
Airbnb Data Science Project
End-to-end analysis of 50,000+ Airbnb listings using Random Forest and XGBoost, achieving 85% accuracy in price prediction. Created interactive visualizations with Seaborn and Plotly for market insights.
Python, Scikit-learn, XGBoost, Seaborn, Plotly
Reddit Data Pipeline
Comprehensive ETL pipeline for Reddit data using Apache Airflow, Celery, and AWS services (S3, Glue, Athena, Redshift). Implemented automated data extraction and warehouse loading processes.
Airflow, AWS, PostgreSQL, Python, Redshift
Tesla Stock Forecasting
LSTM neural network for Tesla stock price prediction. Learns patterns in historical data to forecast future price movements.
Python, TensorFlow, LSTM, Pandas, Matplotlib
Real-Time Data Streaming
End-to-end data engineering pipeline using Apache Kafka, Spark, and Cassandra. Containerized with Docker for scalable deployment and real-time data processing.
Kafka, Spark, Cassandra, Docker, Python
International Debt Analysis
Analysis of the World Bank's international debt statistics, examining global economic patterns and debt distributions across developing countries.
Python, Pandas, SQL, Tableau, Excel
Job Market Analytics Dashboard
Dashboard built on 20,000+ data science job postings. Features salary distributions, geographical trends, and skill demand patterns across major tech companies.
Python, React, D3.js, PostgreSQL, Flask