Data Engineer — Cairo, Egypt

Ahmed Refat

Data Engineer

Building scalable data pipelines and turning raw data into clear business value. Passionate about ETL architecture, big data systems, cloud-based workflows, and analytics-ready platforms.

Who I Am

I am Ahmed Refat, an aspiring Data Engineer with a strong focus on building scalable and efficient data systems. I have hands-on experience designing data workflows using Python, SQL, PySpark, Airflow, and modern data tools. Through my projects, I work on ETL pipelines, streaming workflows, and analytics-ready data models that transform raw data into clear business insight.

1+ Years Experience
4+ Projects Built
5+ Certifications
2 Internships

Technical Stack

Data Engineering

Apache Spark, Apache Airflow, Apache Kafka, dbt, ETL Pipelines, Data Warehousing

Databases

PostgreSQL, MySQL, MongoDB, SQL Server, Data Modeling

Programming

Python, SQL, PySpark, Java, C#

Cloud & DevOps

AWS EC2, S3, Redshift, Glue, IAM, RDS, Snowflake, Docker, Git / GitHub, Linux

BI & Analytics

Power BI, Grafana, Reporting, Dashboarding

Academic Background

Bachelor of Computer Science and Information Systems

Kafr El-Sheikh University, Faculty of Computers and Information – Information Systems Department

Graduation: May 2023

Selected Work

Images: Architecture, Dashboard, DAG Run, Data Warehouse
01

NYC Smart Traffic Pipeline

End-to-End Batch & Streaming Data Pipeline

This project implements a scalable data engineering platform that processes both historical and real-time traffic-related data to analyze road incidents, weather impact, and urban mobility patterns.

Kafka, Snowflake, S3, Power BI, Grafana, Airflow, PySpark, Structured Streaming, Docker
Image: Architecture
02

Flight & Airline Analytics Pipeline

Batch & Real-Time Data Processing

This project is a scalable data platform for flight and airline analytics across major New York airports. It combines batch and real-time processing and integrates flight, delay, airline, and weather data.

Kafka, Snowflake, S3, Power BI, Grafana, Airflow, PySpark, Structured Streaming, Docker
Images: Architecture, Dashboard, Data Warehouse, DAG Run
03

E-commerce Data Engineering Pipeline

End-to-End Batch Pipeline

This end-to-end pipeline extracts data from MongoDB and CSV files, processes it with PySpark in jobs orchestrated by Airflow, and loads it into PostgreSQL using a star schema. It runs in a Dockerized environment for scalability and easy deployment.

MongoDB, PySpark, PostgreSQL, Power BI, Airflow, Docker
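
As a rough illustration of the star-schema step in the e-commerce pipeline, the sketch below splits flat order records into a customer dimension and an orders fact table. It is not the project's actual code: the sample rows, the column names, and the `to_star_schema` helper are all hypothetical, and it uses plain Python where the project itself uses PySpark and PostgreSQL.

```python
# Hypothetical flat records, standing in for rows extracted from MongoDB or CSV.
ORDERS = [
    {"order_id": 1, "customer": "Mona", "city": "Cairo", "amount": 120.0},
    {"order_id": 2, "customer": "Omar", "city": "Giza", "amount": 80.5},
    {"order_id": 3, "customer": "Mona", "city": "Cairo", "amount": 42.0},
]


def to_star_schema(rows):
    """Split flat rows into a customer dimension and an orders fact table."""
    dim_customer = {}  # natural key (customer, city) -> dimension row
    fact_orders = []
    for row in rows:
        natural_key = (row["customer"], row["city"])
        if natural_key not in dim_customer:
            # Assign a surrogate key the first time a customer is seen.
            dim_customer[natural_key] = {
                "customer_key": len(dim_customer) + 1,
                "customer": row["customer"],
                "city": row["city"],
            }
        # The fact row keeps only the measure and a reference to the dimension.
        fact_orders.append({
            "order_id": row["order_id"],
            "customer_key": dim_customer[natural_key]["customer_key"],
            "amount": row["amount"],
        })
    return list(dim_customer.values()), fact_orders


dim_customer, fact_orders = to_star_schema(ORDERS)
```

Keeping descriptive attributes in the dimension and only keys and measures in the fact table is what lets a BI tool such as Power BI aggregate orders by customer or city without repeating those attributes on every row.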

Experience & Training

Nov 2025 — Present

Microsoft Data Engineer Trainee

Digital Egypt Pioneers Initiative (DEPI)
  • Gained hands-on experience building data pipelines using Python and SQL.
  • Worked on data modeling, ETL concepts, and cloud-based data platforms.
  • Contributed to designing scalable data workflows and improving data processing efficiency.
Jan 2026 — Mar 2026

Cloud Services Management & Operations Trainee

National Telecommunication Institute (NTI)
  • Learned cloud fundamentals, networking basics, and cloud service operations.
  • Practiced deploying and managing services on AWS and Linux environments.
  • Strengthened skills in system administration, troubleshooting, and cloud infrastructure monitoring.

Credentials

Cloud Services Management and Operation

National Telecommunication Institute (NTI) — Mar 2026

Linux Command Line & Docker

Udemy — Jan 2026

Analyzing and Visualizing Data with Microsoft Power BI

Udemy — Dec 2025

Introduction to MongoDB

MongoDB — Sep 2025

AWS Certified Solutions Architect

AWS — Apr 2026

AWS Cloud Foundations

AWS — Apr 2026