Certified Azure Data Engineer 🎓

Contact me on: LinkedIn

Location: Monterrey, Mexico.

Technical Skills: Azure, Synapse, Data Factory, Databricks, Python, SQL Server, SSIS, Power BI, AWS SageMaker, Tableau, GitHub

Azure Projects

Productivity Predictor

Created a medallion schema with Dimension and Fact tables.
Ingested SAP price update requests and raw material prices (REST API) via Azure Data Factory pipelines to calculate productivity impacts using Databricks.
Used Keras to develop neural networks that predict part costs and estimate the productivity impact.
Used and optimized Synapse SQL Pools, allowing direct connections from Power BI and Tableau.
Managed CI/CD using GitHub.

Appbot Datawarehouse

Implemented an ELT solution using medallion architecture to track mobile app data from the Appbot REST API and extract insightful reviews and ratings data.
Performed data profiling analysis and ad hoc transformations in the silver layer using delta loads.
Created and optimized gold layer in Synapse SQL Pool.
Used GitHub Copilot to accelerate the development process.
Managed CI/CD using GitHub and Jenkins.
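
As an illustration of the delta loads mentioned for the silver layer, here is a minimal sketch. It assumes they are watermark-based incremental loads, which is an assumption and not a description of the project's actual logic. The field names (`review_id`, `updated_at`) and the sample records are hypothetical.

```python
from datetime import datetime


def delta_load(rows, watermark):
    """Return the rows changed after `watermark` and the new watermark.

    Assumption: every row carries an `updated_at` timestamp (hypothetical field).
    """
    new_rows = [r for r in rows if r["updated_at"] > watermark]
    # If nothing changed, keep the previous watermark.
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark


# Hypothetical review records, for illustration only.
reviews = [
    {"review_id": 1, "rating": 5, "updated_at": datetime(2024, 1, 1)},
    {"review_id": 2, "rating": 2, "updated_at": datetime(2024, 1, 3)},
    {"review_id": 3, "rating": 4, "updated_at": datetime(2024, 1, 5)},
]

changed, watermark = delta_load(reviews, datetime(2024, 1, 2))
print([r["review_id"] for r in changed], watermark)
```

Only rows newer than the stored watermark are reprocessed, so each run touches a small slice of the data instead of the full history.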

Azure Migration Project

Migrated a project in Azure to align with new business requirements.
Explored the old setup and implemented a new one, with a Synapse SQL Pool replacing the Databricks Delta table.
Partitioned tables with millions of rows by day.
Managed CI/CD using GitHub and Jenkins.
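
The daily partitioning mentioned above could be set up along these lines. This is a rough sketch, not the project's actual code: it assumes an integer `yyyymmdd` partition column (named `load_date_key` here, a hypothetical name) and generates the boundary values for a `PARTITION (... RANGE RIGHT FOR VALUES (...))` clause, the form used in a Synapse dedicated SQL pool table definition.

```python
from datetime import date, timedelta


def daily_boundaries(start: date, end: date) -> list[int]:
    """Return one yyyymmdd integer boundary per day in [start, end]."""
    days = (end - start).days
    return [
        int((start + timedelta(days=i)).strftime("%Y%m%d"))
        for i in range(days + 1)
    ]


def partition_clause(column: str, boundaries: list[int]) -> str:
    """Build the PARTITION clause text for the given boundary values."""
    values = ", ".join(str(b) for b in boundaries)
    return f"PARTITION ({column} RANGE RIGHT FOR VALUES ({values}))"


# `load_date_key` is a hypothetical column name.
clause = partition_clause(
    "load_date_key",
    daily_boundaries(date(2024, 1, 30), date(2024, 2, 2)),
)
print(clause)
```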

Part Number Matcher

Ingested SAP data via Data Factory and Databricks to build an NLP-powered part number matcher at scale, used downstream by the business to improve the grouping quality of part numbers.

Other Data Projects

UNSPSC Categorizer

An NLP tool developed in Python with RegEx, NLTK, and scikit-learn (TF-IDF, SVM) that retrieves the UNSPSC code for a given part description, using a CSV file of description-UNSPSC label pairs as its data source.

See repository.
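
The TF-IDF plus SVM approach described for the categorizer can be sketched roughly as follows. This is a simplified illustration, not the repository's code: it skips the NLTK step, the training pairs are made up, and the labels are placeholders instead of real UNSPSC codes.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def clean(text: str) -> str:
    """Lowercase and replace anything that is not a letter or digit with a space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


# Hypothetical training pairs; labels are placeholders, not real UNSPSC codes.
TRAINING = [
    ("Steel hex bolt M8, zinc plated", "FASTENERS"),
    ("Hex bolt stainless M10", "FASTENERS"),
    ("Ball bearing, sealed, 6204", "BEARINGS"),
    ("Tapered roller bearing", "BEARINGS"),
    ("Hydraulic hose 1/2 in", "HOSES"),
    ("Reinforced rubber hose", "HOSES"),
]

descriptions = [clean(text) for text, _ in TRAINING]
labels = [label for _, label in TRAINING]

# TF-IDF turns each description into a weighted term vector; a linear SVM
# then learns one decision boundary per label.
model = make_pipeline(TfidfVectorizer(), LinearSVC(random_state=0))
model.fit(descriptions, labels)


def categorize(description: str) -> str:
    """Return the predicted label for a free-text part description."""
    return str(model.predict([clean(description)])[0])


print(categorize("HEX BOLT, M12 steel"))
```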

Sea Level Predictor

The objective of this project is to predict sea levels for 2050 based on a dataset of global sea levels since 1880.

This Jupyter Notebook project uses Pandas, NumPy, SciPy, and Matplotlib.

See repository.
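
The core idea of the prediction, fitting a trend line to historical levels and extending it to 2050, can be sketched as below. The project itself uses SciPy; this sketch swaps in a plain-Python least-squares fit so it has no dependencies. The data points are synthetic placeholders in arbitrary units, not real measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, perfectly linear placeholder points (not real sea level data).
years = [1880, 1900, 1920, 1940]
levels = [0.0, 2.0, 4.0, 6.0]

slope, intercept = fit_line(years, levels)
predicted_2050 = slope * 2050 + intercept
print(f"Extrapolated level for 2050: {predicted_2050:.2f}")
```

A straight-line extrapolation assumes the historical rate of change stays constant, which is a simplification.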

Cardiac Diseases

These modules were developed to analyze a dataset of 70k patients. Variables such as blood pressure, height, weight, and pathologies were examined.

By analyzing demographic, medical, and habit data, we want to determine whether these factors can lead to cardiac disease.

See repository.

Blacklist System

This project was made for a Debt Collection team.

The objective of this project is to keep track of blocked phone numbers and emails for clients with credit problems.

It was implemented with a Windows Forms GUI automated with Excel macros and connected to a SQL Server database secured by Windows Authentication.

Git and GitHub for version control in all projects.

General automations with Power Automate, dataloader.io, DemandTools and Web Scraper Platform.