Featured Projects

Solutions I've built to solve real-world problems

IdeaGenie
2nd Prize Winner

LLM-Powered Innovation Ranking Engine

Architected an end-to-end idea evaluation engine using Flask, Docker, and Qwen LLM via Ollama, implementing the ReAct (Reasoning + Action) framework to simulate iterative reasoning steps for multi-criteria scoring.

Key Achievements

  • Won 2nd prize at a hackathon with 50+ competing teams

  • Handled 100+ idea inputs with sub-180ms latency

  • Hybrid pipeline combining vector similarity with LLM evaluations

  • Production-ready REST API with input validation and concurrency
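
The hybrid pipeline above can be illustrated with a small sketch. This is not the project's actual code: it assumes each idea already has an embedding vector and an LLM-assigned score in the range 0 to 1, and it blends the two with a hypothetical weight `alpha`. The function names and the weighting scheme are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def hybrid_score(idea_vec, reference_vec, llm_score, alpha=0.5):
    """Blend vector similarity with an LLM score (both assumed in [0, 1]).

    alpha is a hypothetical weight: 1.0 uses similarity only,
    0.0 uses the LLM score only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    similarity = cosine_similarity(idea_vec, reference_vec)
    return alpha * similarity + (1 - alpha) * llm_score


def rank_ideas(ideas, reference_vec, alpha=0.5):
    """Rank (name, embedding, llm_score) tuples by hybrid score, best first."""
    scored = [
        (name, hybrid_score(vec, reference_vec, llm_score, alpha))
        for name, vec, llm_score in ideas
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In a real deployment the embeddings and LLM scores would come from the model served through Ollama; here they are plain inputs so the blending logic can be read in isolation.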

Technologies

Flask
Docker
Qwen LLM
Ollama
ReAct Framework
Python
REST API
React
ShieldScraper

AWS-Based Automated Data Pipeline

Developed a secure, fully automated web scraping pipeline leveraging Scrapy for data extraction, containerized with Docker, and deployed on AWS ECS Fargate for scalability and reliability.

Key Achievements

  • Fully automated web scraping pipeline

  • Real-time monitoring with CloudWatch and SNS

  • Interactive analytics dashboards with QuickSight

  • Daily automated workflows with EventBridge

Technologies

Scrapy
Docker
AWS ECS Fargate
AWS Glue
Lambda
DynamoDB
Athena
QuickSight
CloudWatch
ETL Pipeline for Weather Data

Airflow and Docker Implementation

Developed an automated ETL pipeline that extracts real-time weather data from the OpenWeather API, transforms it with Python, and loads it into a PostgreSQL database, processing data for 10+ locations daily.

Key Achievements

  • Automated ETL pipeline for 10+ locations daily

  • Containerized deployment with Docker

  • Modular DAGs for workflow automation

  • Seamless analysis and visualization capabilities
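
As a rough illustration of the transform step, the sketch below flattens one current-weather payload into a row ready for loading. It is a minimal example, not the pipeline's real code. It assumes the payload uses OpenWeather's current-weather field names (`name`, `dt`, `main.temp`, `main.humidity`, `weather[0].description`) with temperature in Kelvin. The output column names are hypothetical.

```python
from datetime import datetime, timezone


def transform_reading(payload):
    """Flatten one OpenWeather-style payload into a flat row dict.

    Assumes temperature arrives in Kelvin; output column names
    are hypothetical and chosen for illustration only.
    """
    observed = datetime.fromtimestamp(payload["dt"], tz=timezone.utc)
    return {
        "city": payload["name"],
        "observed_at": observed.isoformat(),
        # Convert Kelvin to Celsius, rounded to two decimals
        "temp_c": round(payload["main"]["temp"] - 273.15, 2),
        "humidity_pct": payload["main"]["humidity"],
        "conditions": payload["weather"][0]["description"],
    }
```

In an Airflow DAG, a function like this would typically sit in its own task between the extract and load tasks, so each stage can be retried independently.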

Technologies

Apache Airflow
Docker
PostgreSQL
Python
OpenWeather API
ETL
Data Visualization Dashboard

Interactive Analytics Platform

Built a data visualization platform with interactive charts and real-time analytics. It features dynamic dashboards, advanced filtering, and multi-source data integration for business intelligence.

Key Achievements

  • Interactive dashboards with real-time data updates

  • Multi-source data integration and processing

  • Advanced filtering and drill-down capabilities

  • Responsive design with mobile optimization

Technologies

D3.js
JavaScript
Python
Flask
PostgreSQL
Data Visualization

More Projects

Term Deposit Subscription Prediction

Machine learning model to predict customer term deposit subscription using classification algorithms and data analysis.

Python
Scikit-learn
Pandas
Machine Learning
Classification
Stock Market Prediction System

Advanced stock market prediction system using time series analysis and machine learning for financial forecasting.

Python
TensorFlow
Time Series
Financial Analysis
Deep Learning
Rock Paper Scissors Game

Interactive rock paper scissors game with modern UI design and game logic implementation for entertainment and learning.

JavaScript
HTML5
CSS3
Game Development
DOM Manipulation

Explore More Projects

Check out my GitHub profile for more projects, contributions, and open-source work.

View GitHub Profile

Experience

My professional journey and key achievements

Data Engineer Intern
San Diego, USA
May 2025 - Aug 2025
Internship

Key Achievements

  • Enabled real-time, cross-system data syncing by building data pipelines from PostgreSQL to both PostgreSQL and BigQuery and implementing change data capture (CDC) with Google Cloud Datastream, eliminating manual handoffs (a 100% reduction) and speeding up reporting cycles.

  • Improved query maintainability and execution reliability by refactoring legacy SQL into modular SQLx files and integrating them into Python-based BigQuery pipelines, cutting SQL runtime errors by 40% and smoothing developer handoffs.
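
To show the idea behind change data capture in general terms, the sketch below replays a stream of change events against an in-memory copy of a table. The event shape (`op`, `key`, `row`) is hypothetical and is not Datastream's actual output format. It only illustrates how inserts, updates, and deletes keep a replica in sync without manual handoffs.

```python
def apply_change(table, event):
    """Apply one change event to an in-memory table keyed by primary key.

    The event layout ("op", "key", "row") is a hypothetical shape
    used for illustration only.
    """
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        # Upsert: merge the changed columns into any existing row
        table[key] = {**table.get(key, {}), **event["row"]}
    elif op == "delete":
        table.pop(key, None)  # ignore deletes for rows we never saw
    else:
        raise ValueError(f"unknown operation: {op}")
    return table


def replay(events, table=None):
    """Apply events in order and return the resulting table state."""
    table = {} if table is None else table
    for event in events:
        apply_change(table, event)
    return table
```

Applying events strictly in order matters here: replaying an update before its insert would still work because of the upsert, but a delete applied out of order could resurrect a removed row.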

Technologies Used

PostgreSQL
BigQuery
Google Cloud Datastream
SQLx
Python
Plotly
Technical Lead
Mar 2022 - Jul 2023
Leadership

Key Achievements

  • Implemented structured project management for 4 hackathons and 8+ coding competitions, strengthening teamwork and bringing attention to detail to requirements gathering across technical challenges.

  • Coordinated 10+ member teams for 5 industry speaker sessions and 7 technical workshops, defining clear project scope and success metrics while demonstrating communication, organization, and problem-solving abilities.

Technologies Used

Project Management
Team Leadership
System Architecture
API Design
RESTful Services
Containerization
Data Science Intern
Pune, Maharashtra, India
Feb 2022 - Mar 2022
Internship

Key Achievements

  • Led the project 'Predicting Term Deposit Subscription,' showcasing proficiency in data science through extensive research and advanced analytical techniques.

  • Employed web scraping for comprehensive data acquisition and implemented advanced data cleaning techniques to ensure high-quality datasets.

Technologies Used

Data Analytics
Statistics
Python
Machine Learning
Data Science
Web Scraping
Ethical Hacking & Cyber Security Intern
Aug 2020 - Sep 2020
Internship

Key Achievements

  • Specialized in Ethical Hacking and Cyber Security, gaining expertise in identifying system vulnerabilities through intrusion evasion, firewall management, and honeypot analysis.

  • Honed skills in proposing effective mitigation strategies through comprehensive ethical hacking exercises and security assessments.

Technologies Used

Kali Linux
Ethical Hacking
Cybersecurity
Nmap
Wireshark
Tenable Nessus
Currently seeking full-time opportunities

Technical Skills

Technologies and tools I use to build scalable solutions

Cloud & Infrastructure
AWS
Azure
Docker
Kubernetes
Terraform
CI/CD
Data Engineering
SQL
PostgreSQL
MySQL
MongoDB
Redis
Data Modeling
Programming Languages
Python
JavaScript
Java
C++
TypeScript
R
ML & LLMs
Scikit-learn
TensorFlow
LangChain
Prompt Engineering
ReAct
Big Data & ETL
Apache Airflow
PySpark
Spark
Hadoop
Batch Processing
Visualization & Tools
Power BI
Tableau
Git
Linux
VS Code
System Design

About Me

My journey from curiosity to scalable engineering

I began my journey with a strong foundation in Information Technology at Savitribai Phule Pune University, graduating with honors in Data Science. Today, I'm pursuing my Master's in Computer Science at Indiana University Bloomington (GPA 3.8/4.0), focusing on Cloud Computing, Advanced Databases, and Applied Machine Learning.

From writing SQL queries to designing cloud-scale ETL pipelines, my story is about turning curiosity into scalable engineering. Along the way, I’ve gone from building projects at hackathons to winning them. I thrive at the intersection of data engineering and cloud architecture, where I focus on building systems that not only work but also scale beautifully.

Current Focus

Software Development
Data Science
Cloud Computing
Advanced Databases
Applied Machine Learning
Data Engineering
System Design

Publications

Implementation Paper: Forecasting Stock Price using Machine Learning

IJARSCT • May 16, 2023

The Review: Forecasting Stock Price using Machine Learning

IJARSCT • May 4, 2023

Education

Master of Science in Computer Science

Indiana University, Bloomington

Aug 2024 - May 2026 • GPA: 3.8/4.0

Bachelor of Engineering in Information Technology

Savitribai Phule Pune University

Aug 2019 - May 2023 • GPA: 8.90/10.00

Certifications

AWS Certified Developer – Associate

Valid until Aug 4, 2027

Active

Azure AI Fundamentals

Issued Jul 22, 2022

Certified

Get In Touch

Let's discuss opportunities and ideas

Contact Information


© 2025 Varun Sonawane. All rights reserved.