Hello! I'm
Aspiring Data Analyst & Data Scientist
CS grad · Chapman University · San Francisco, CA
I'm a data-focused CS grad, born and raised in San Francisco, half Mexican and half Filipino, who turns messy datasets into clear, decision-ready insights.
I work with Python, SQL, Tableau, Power BI, and APIs, building pipelines and analyses that people actually use. I treat data like a product: clean structure, clear logic, and outputs that make sense to non-technical stakeholders.
Outside of work, I'm building the Resale Market Price Tracker, a project at the intersection of fashion, data, and market intelligence. I care deeply about expanding access and representation in tech for Black and Latinx communities.
- Collected and structured operational data from robotic systems performing household task automation to support ML model development and training pipelines.
- Worked with engineering teams to apply standardized data collection protocols, keeping datasets consistent and reliable across evaluation cycles.
- Contributed to iterative model evaluation by surfacing behavioral patterns and performance gaps through systematic data review.
- Processed and sorted 350+ packages daily across Amazon and major carriers with 99% accuracy, maintaining throughput under time constraints.
- Maintained detailed records of all incoming mail and packages to support inventory tracking and data integrity across 4+ campus buildings.
- Built and deployed a website using HTML, CSS, and JavaScript to highlight housing insecurity and social injustices facing San Francisco's unhoused population.
- Documented and analyzed tech employee work culture to surface insights on industry accessibility and barriers for underrepresented communities.
- Joined two tables covering 200+ countries to analyze infection rates, death counts, and vaccination rollout over time.
- Wrote queries using window functions, CTEs, and temp tables to track rolling vaccination totals and infection rates by country.
- Built SQL views to store key metrics by continent and country for downstream visualization.
- Analyzed 4,600+ Spotify songs and found that Playlist Count (R² = 0.74) was a substantially stronger predictor of streams than Playlist Reach (R² = 0.32).
- Compared polynomial regression models of degree 2, 3, and 4 on an 80/20 train/test split; degree 2 generalized best, while the higher-degree models severely overfit the training data.
- Applied GMM clustering with AIC-based model selection and compared DBSCAN with K-Means on 2,600+ songs, selecting K-Means (k=2, silhouette = 0.49) for its cleaner cluster separation.
- Cleaned and structured 22,000+ employee records spanning 2000 to 2020 in MySQL Workbench, standardizing date formats, normalizing categorical fields, and creating calculated columns for age, tenure, and termination metrics.
- Wrote SQL queries to aggregate workforce data by department, job title, race, gender, and state, then exported cleaned data into Power BI for visualization.
- Built an interactive Power BI dashboard covering gender and race distribution, age group breakdowns, HQ vs. remote splits, employee count trends over time, and turnover rates by department.
- Built an end-to-end pipeline pulling live eBay sold listings for luxury fashion brands like Bottega Veneta and Acne Studios.
- Parsed JSON API responses and cleaned and structured pricing data with Pandas to surface market trends for resale intelligence.
- Actively expanding brand coverage and the visualization layer for deeper price trend analysis.
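
As a rough illustration of the parse-and-clean step in the resale tracker, here is a minimal sketch. It is not the project's actual code: the payload and its field names (`items`, `brand`, `price.value`) are hypothetical stand-ins rather than the real eBay response schema, the helper names are invented for this example, and it uses only the Python standard library instead of Pandas to stay self-contained.

```python
import json
from collections import defaultdict
from statistics import median

# Hypothetical payload: field names and values are illustrative assumptions,
# not the actual eBay API response schema or real sold-listing data.
SAMPLE_RESPONSE = """
{
  "items": [
    {"title": "Cassette Bag", "brand": "Bottega Veneta", "price": {"value": "1450.00"}},
    {"title": "Pouch Clutch", "brand": "Bottega Veneta", "price": {"value": "1,250.00"}},
    {"title": "Wool Scarf", "brand": "Acne Studios", "price": {"value": "180.00"}},
    {"title": "Denim Jeans", "brand": "Acne Studios", "price": {"value": null}}
  ]
}
"""


def parse_price(raw):
    """Convert a price string such as '1,250.00' to a float; return None if unusable."""
    if raw is None:
        return None
    try:
        return float(str(raw).replace(",", "").replace("$", "").strip())
    except ValueError:
        return None


def median_price_by_brand(payload):
    """Parse the JSON payload, drop listings without a usable price, and
    return the median sold price per brand."""
    data = json.loads(payload)
    prices = defaultdict(list)
    for item in data.get("items", []):
        price = parse_price((item.get("price") or {}).get("value"))
        brand = item.get("brand")
        if price is not None and brand:
            prices[brand].append(price)
    return {brand: median(values) for brand, values in prices.items()}


result = median_price_by_brand(SAMPLE_RESPONSE)
print(result)
```

The median is used here instead of the mean because a single mispriced or bundled listing can skew an average badly in a thin resale market; that choice is an assumption of this sketch, not a statement about the project.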