Mohammed AbdElhakim | Data Engineer & ETL Specialist

About Me

I'm a university student with a deep fascination for leveraging data to solve complex problems. Driven by the potential of Artificial Intelligence and Data Science to shape the future, I've been building hands-on experience with the tools and thinking that power modern data systems.

My expertise spans the full data lifecycle — from ingesting and cleaning messy real-world data, to building automated pipelines, to generating actionable insights through visualization. I believe the best insights come from pipelines you can trust, and I write code with that reliability in mind.

I'm currently progressing through the Data Science career track on DataCamp, and I was selected for the Digital Egypt Pioneers Initiative (DEPI), a prestigious traineeship under the Ministry of Communications and Information Technology focused on Data Engineering, AI, and Big Data Processing.

My Skills

The tools and technologies I use to build reliable data solutions.

Language, Data / DB, Viz / BI, Infra / Cloud

My Education

Where I built my foundation — from research-driven STEM to hands-on computer science.

University (Current)

Egypt-Japan University of Science & Technology

B.Sc. Computer Science

"Learning by Doing" — a Japanese-inspired, research-oriented curriculum emphasizing practical application from day one.

Relevant Coursework

Data Structures & Algorithms
Database Systems
Web Development
Computer Architecture (MIPS/Assembly)
Advanced Mathematics

Python SQL Algorithms Web Dev Assembly Research

High School (Graduate)

Beni Suef STEM School

Science, Technology, Engineering & Mathematics

Part of a highly selective national network for gifted students. Curriculum built around solving Egypt's "Grand Challenges" through scientific research and engineering prototypes.

Key Outcomes

Rigorous foundation in the scientific method
Collaborative engineering & rapid prototyping
Project-Based Learning (PBL)
Research & technical writing

STEM Research Prototyping PBL Engineering

My Experience

2024 — Present

Digital Egypt Pioneers Initiative (DEPI)

Ministry of Communications & IT (MCIT)

Selected for a prestigious national traineeship focused on building expertise in core Data Engineering, AI, and Big Data Processing methodologies.

Advanced Python for Data Engineering & scalable software foundations
SQL, Database Management & Microsoft Azure Data Engineer concepts
Data Pipeline design, lifecycle management & Big Data processing
Prompt Engineering & AI for Data Engineers
Comprehensive Capstone project — a portfolio-ready, real-world deployment

Python SQL Azure Big Data AI

2024 — Present

Data Science Career Track

DataCamp

Actively building practical data science fluency through hands-on courses covering analysis, visualization, and machine learning fundamentals.

Pandas Seaborn Data Analysis Visualization

Featured Projects

Café Sales ETL Pipeline

A full Extract → Transform → Load pipeline that cleans dirty data, loads it into SQL Server, and generates visual reports.

Python Pandas SQL Server

Pipeline Architecture

Extract (Raw CSV ingestion) → Transform (Clean, impute, validate) → Load (SQL Server via SQLAlchemy) → Visualize (Matplotlib & Seaborn reports)

Key Techniques

Algorithmic back-calculation for data recovery
Forward-fill imputation for categorical data
Dual-path cleaning (Deletion vs. Repair)
Automated verification reporting
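
The back-calculation, forward-fill, and dual-path ideas listed above can be illustrated with a minimal sketch. It is written in plain Python rather than Pandas to stay dependency-free, and it rests on assumptions that do not come from the project itself: the column names (quantity, price_per_unit, total_spent, payment_method) are hypothetical, and the recovery rule assumes total_spent = quantity × price_per_unit. The actual pipeline's schema and rules may differ.

```python
# Illustrative sketch only. Column names and the
# total_spent == quantity * price_per_unit rule are assumptions,
# not taken from the actual project.

def back_calculate(row):
    """Recover one missing numeric field from the other two."""
    q = row.get("quantity")
    p = row.get("price_per_unit")
    t = row.get("total_spent")
    fixed = dict(row)
    if t is None and q is not None and p is not None:
        fixed["total_spent"] = q * p
    elif q is None and t is not None and p:
        fixed["quantity"] = t / p
    elif p is None and t is not None and q:
        fixed["price_per_unit"] = t / q
    # If two or more fields are missing, nothing can be recovered.
    return fixed


def forward_fill(rows, column):
    """Replace a missing value with the last non-missing value seen."""
    last = None
    out = []
    for row in rows:
        fixed = dict(row)
        if fixed.get(column) is None:
            fixed[column] = last  # stays None until a value has been seen
        else:
            last = fixed[column]
        out.append(fixed)
    return out


def is_complete(row):
    return all(value is not None for value in row.values())


def clean(rows, repair):
    """Dual-path cleaning: drop incomplete rows as-is (deletion path),
    or attempt recovery first and drop only what is still incomplete."""
    if repair:
        rows = forward_fill([back_calculate(r) for r in rows], "payment_method")
    return [r for r in rows if is_complete(r)]


raw = [
    {"quantity": 2, "price_per_unit": 2.0, "total_spent": None, "payment_method": "Cash"},
    {"quantity": None, "price_per_unit": 1.5, "total_spent": 4.5, "payment_method": None},
    {"quantity": None, "price_per_unit": None, "total_spent": 3.0, "payment_method": "Card"},
]

# A simple verification report: rows kept by each path.
print(len(clean(raw, repair=False)), len(clean(raw, repair=True)))  # prints: 0 2
```

Comparing the two row counts is one simple way to report how many rows the repair path saves over plain deletion.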

Impact Numbers

90% Data Recovery Rate

$55K Revenue Reclaimed

~7K Rows Saved

Python Pandas NumPy SQLAlchemy PyODBC Matplotlib Seaborn SQL Server

View on GitHub

Telecom CDR Data Warehouse

An SSIS ETL pipeline that processes telecom Call Detail Records from CSV flat files into a SQL Server data warehouse — with lookup transforms, IMEI parsing, and error handling.

SSIS SQL Server Star Schema

Pipeline Architecture

Batch Input (Foreach Loop over CSVs) → Lookup (IMSI → subscriber_id) → Transform (Derive TAC & SNR from IMEI) → Load (fact_transaction + error log)

Key Techniques

Foreach Loop Container for batch CSV processing
Dimension lookup joins (IMSI reference table)
Derived Column transforms (TAC/SNR from IMEI)
Error row redirection & auditing
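
The lookup, derived-column, and error-redirection steps above are built in SSIS, but their logic can be sketched in a few lines of Python. This is an illustration, not the package itself: it assumes the standard 15-digit IMEI layout (8-digit TAC, 6-digit SNR, 1 check digit), and the record keys, the route_records helper, and the sample IMSI values are hypothetical.

```python
# Illustrative sketch of the SSIS data-flow logic in plain Python.
# Record keys, helper names, and sample values are hypothetical.

def parse_imei(imei):
    """Split a 15-digit IMEI into TAC (first 8 digits),
    SNR (next 6 digits), and the trailing check digit."""
    if len(imei) != 15 or not (imei.isascii() and imei.isdigit()):
        raise ValueError(f"not a 15-digit IMEI: {imei!r}")
    return {"tac": imei[:8], "snr": imei[8:14], "check_digit": imei[14]}


def route_records(records, imsi_lookup):
    """Send good rows to the fact table and bad rows to an error log,
    mirroring a lookup followed by error row redirection."""
    loaded, errors = [], []
    for rec in records:
        subscriber_id = imsi_lookup.get(rec.get("imsi"))
        if subscriber_id is None:
            errors.append({**rec, "error": "IMSI not found"})
            continue
        try:
            parts = parse_imei(str(rec.get("imei") or ""))
        except ValueError as exc:
            errors.append({**rec, "error": str(exc)})
            continue
        loaded.append(
            {"subscriber_id": subscriber_id, "tac": parts["tac"], "snr": parts["snr"]}
        )
    return loaded, errors


lookup = {"602010000000001": 17}
records = [
    {"imsi": "602010000000001", "imei": "490154203237518"},  # valid
    {"imsi": "602019999999999", "imei": "490154203237518"},  # unknown IMSI
    {"imsi": "602010000000001", "imei": "12345"},             # malformed IMEI
]
loaded, errors = route_records(records, lookup)
print(len(loaded), len(errors))  # prints: 1 2
```

Every input row ends up in exactly one of the two outputs, which is the property the "error rows captured" figure relies on.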

Architecture Highlights

3 Star Schema Tables

10+ Batch Files Processed

100% Error Rows Captured

SSIS SQL Server T-SQL Data Warehouse Star Schema Visual Studio

View on GitHub

Steam Games Market Analysis

An end-to-end Big Data pipeline and BI suite that ingests, refines, and visualizes 122,000+ Steam marketplace records using a containerized ELT architecture built on the Medallion pattern.

PySpark Docker Power BI

Medallion Architecture

Bronze (Raw CSV ingestion) → Silver (Schema realignment & validation) → Gold (Feature engineering & metrics) → Power BI (3 interactive dashboards)

Engineering Challenges Solved

13-column structural dislocation fix via indexed realignment
Casting layer to resolve Spark-to-PostgreSQL type mismatches
Custom derived metrics (est. revenue, sentiment %, age rating)
Full Docker containerization (Spark + HDFS + PostgreSQL)
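
As a rough idea of what Gold-layer derived metrics like those listed above might look like, here is a minimal sketch in plain Python rather than PySpark. Everything specific in it is an assumption: the field names (price, estimated_owners, positive, negative) are hypothetical, estimated revenue is taken as price × estimated owners, and sentiment % as the share of positive reviews. The project's actual formulas may be different.

```python
# Illustrative sketch only. Field names and both formulas are assumptions,
# not the project's actual Gold-layer definitions.

def sentiment_pct(positive, negative):
    """Share of positive reviews as a percentage, or None if there are no reviews."""
    total = positive + negative
    if total == 0:
        return None  # avoid dividing by zero for unreviewed titles
    return round(100 * positive / total, 1)


def to_gold(silver_row):
    """Build a Gold-layer record from a validated Silver-layer row."""
    return {
        "name": silver_row["name"],
        # Assumed estimate: list price times estimated owner count.
        "est_revenue": silver_row["price"] * silver_row["estimated_owners"],
        "sentiment_pct": sentiment_pct(silver_row["positive"], silver_row["negative"]),
    }


silver = {
    "name": "Example Game",
    "price": 5.0,
    "estimated_owners": 1000,
    "positive": 90,
    "negative": 10,
}
gold = to_gold(silver)
```

Returning None for titles without reviews keeps missing sentiment distinct from a genuine 0% score when the data reaches the dashboards.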

Scale & Output

122K+ Records Processed

3 Power BI Dashboards

6+ Docker Containers

PySpark Docker PostgreSQL HDFS Power BI Jupyter JDBC Medallion Architecture

View on GitHub

What I Can Do

From raw data to actionable dashboards — here's how I can help.

ETL Pipelines

End-to-end Extract, Transform, Load pipelines that ingest messy data, clean and reshape it, and deliver it reliably to databases — fully automated and reproducible.

Python Pandas SQLAlchemy

Data Cleaning & Wrangling

Turn noisy, incomplete datasets into analysis-ready tables. Imputation, deduplication, format standardization, and validation — with full transparency on what changed and why.

Pandas NumPy Regex

Power BI Dashboards

Interactive, visually compelling dashboards that surface the metrics that matter. From data modeling to calculated measures to polished drill-through reports.

Power BI DAX Data Modeling

Task Automation

Automate the repetitive stuff — file processing, report generation, data syncs, web scraping. If it's boring and you do it every day, I can script it away.

Python Scripting Scheduling

Data Visualization

Publication-quality charts and visual stories that reveal patterns in your data. Custom color palettes, annotation, and multi-panel layouts for reports and presentations.

Matplotlib Seaborn Power BI

Database Design

Schema design, table relationships, and optimized queries for SQL Server and relational databases. Clean structure that makes your future queries fast and painless.

SQL SQL Server PyODBC

Let's Connect

I'm currently looking for internship or entry-level opportunities where I can contribute to a team, grow as a data professional, and help turn data into valuable insights.

GitHub

LinkedIn

Email
