Hi, I'm

Omar Mahmoud

AI Engineer & Applied ML Scientist

PhD candidate in Trustworthy AI with 5+ years building production LLM systems, RAG pipelines, and evaluation frameworks. I bring research depth to engineering challenges — and an engineering mindset to research problems.

Who I Am

5+ Years of Experience
PhD Candidate, Trustworthy AI
12+ Publications
500+ Users Served

I'm a Senior AI Engineer and Applied Scientist specialising in taking ideas from experimentation through to real-world deployment — fine-tuning LLMs, designing evaluation frameworks, and building reliable AI systems that work at scale.

My research focuses on AI safety, privacy, and model alignment — specifically LLM memorisation, dememorisation, hallucination mitigation, and multilingual behaviour dynamics — with publications at NAACL, EMNLP, EACL, and LREC.

I am currently completing my PhD at Deakin University's Applied AI Institute (A2I2) while taking on freelance AI engineering work.

Technical Skills

LLMs & GenAI

Fine-tuning (PEFT, LoRA, QLoRA) · RAG System Design · Agentic AI Workflows · LLM Evaluation (RAGAS, TruLens) · Prompt Engineering

Trustworthy AI

AI Alignment · Privacy-Preserving ML · LLM Dememorisation · Hallucination Mitigation · Adversarial Robustness · Harmful Content Detection

NLP & Speech

Text Classification · NER · Semantic Search · Summarisation · Multilingual NLP · ASR (Whisper, wav2vec2)

ML & Deep Learning

PyTorch · TensorFlow · HuggingFace Transformers · Scikit-learn · Computer Vision · Experiment Design

Data & Retrieval

FAISS · Pinecone · ChromaDB · Elasticsearch · MLflow · Weights & Biases

Engineering & Cloud

Python · FastAPI · Docker · AWS (S3, SageMaker, Lambda) · CI/CD · REST APIs · SQL

Experience

AI Engineer

Deakin University (A2I2) · Geelong, VIC

Feb 2025 – Jun 2025
  • Designed and shipped a production RAG system (LangChain, LlamaIndex) used by 500+ students and researchers, combining hybrid retrieval with re-ranking and achieving a 20% accuracy improvement over baseline.
  • Fine-tuned open-source LLMs using PEFT/LoRA on domain corpora, reducing inference latency by 15% and improving task accuracy by 10% across classification and summarisation.
  • Built LLM evaluation pipelines (RAGAS, TruLens) benchmarking hallucination, faithfulness, and alignment — establishing reusable protocols for responsible model deployment.
  • Developed and evaluated dememorisation techniques to prevent LLMs from leaking training data, contributing findings to EMNLP 2023 and NAACL 2025.
  • Deployed and managed models via Docker and AWS SageMaker with experiment tracking and automated testing.

AI Engineer

Upwork · Remote

May 2022 – Present
  • Designed and shipped 10+ production AI systems for clients across legal, healthcare, and e-commerce — including document Q&A, semantic search, and content classification pipelines on real-world multilingual data.
  • Built end-to-end RAG applications with custom chunking, embedding, and retrieval tuning (FAISS, Pinecone), reducing irrelevant results by ~30% over naive baselines.
  • Fine-tuned and deployed LLMs and NLP models via FastAPI on AWS; implemented monitoring, auth, and data-handling best practices for production reliability.
  • Delivered ASR pipelines (Whisper, wav2vec2), achieving 15–20% WER improvement over off-the-shelf solutions on client audio data.

Data Scientist

iNetworks · Cairo, Egypt

Jan 2020 – Apr 2022
  • Built a real-time NLP-driven recommendation system using embeddings and text classification, increasing user engagement by 25%.
  • Engineered automated web scraping and preprocessing pipelines for structured and unstructured data, reducing manual effort by 40%.

Undergraduate Research Assistant

RSSCI Lab, Helwan University · Cairo, Egypt

Oct 2019 – 2021
  • Designed and executed experimental protocols for model evaluation across Computer Vision (COVID-19 chest X-ray detection — 1st place, UGRF) and NLP tasks using PyTorch and TensorFlow.
  • Led large-scale data acquisition and preprocessing for image and text datasets, ensuring integrity and feature readiness for model training.

Publications

2026

The Unintended Trade-Off of AI Alignment: Balancing Hallucination Mitigation and Safety in LLMs

EACL 2026

Omar Mahmoud*, Ali Khalil, Buddhika Laknath Semage, Thommen George Karimpanal, Santu Rana

Aligning Multilingual Representations: Unveiling Multilingual Behavior Dynamics

LREC 2026

Omar Mahmoud*, Buddhika Laknath Semage, Thommen George Karimpanal, Santu Rana

2025

Alpaca Against Vicuna: Using LLMs to Uncover Memorization of LLMs

NAACL 2025

Aly M. Kassem, Omar Mahmoud*, Niloofar Mireshghallah, Hyunwoo Kim, Yulia Tsvetkov, Yejin Choi, Sherif Saad, Santu Rana

2023

Preserving Privacy Through Dememorization: An Unlearning Technique for Mitigating Memorization Risks in Language Models

EMNLP 2023

Aly M. Kassem, Omar Mahmoud*, Sherif Saad

An Ensemble Transformer-Based Model for Arabic Sentiment Analysis

SNAM 2023

Omar Mohamed*, Aly M. Kassem, Ali Ashraf, Salma Jamal, Ensaf Hussein Mohamed

2022

GoF at Arabic Hate Speech 2022: Breaking the Loss Function Convention for Data-Imbalanced Arabic Offensive Text Detection

LREC 2022 · 1st Place

Aly Mostafa, Omar Mohamed Ahmed*, Ali Ashraf

GoF at Qur'an QA 2022: Efficient Question Answering for the Holy Qur'an Using Deep Learning

LREC 2022 · 3rd Place

Aly Mostafa, Omar Mohamed Ahmed*

On the Arabic Dialects' Identification: Overcoming Challenges of Geographical Similarities and Imbalanced Datasets

NADI 2022

Salma Jamal, Aly Mostafa, Omar Mohamed Ahmed*, Ali Ashraf

2021

COVID-19 Patient Chest X-Rays Automatic Detection Using Deep Learning

AMLTA 2021 · Springer · 1st Place, UGRF

Aly Mostafa, Ahmed Elbehery, Ali Ashraf, Omar Mohamed Ahmed*, Ali Mahmoud

Arabic Speech Emotion Recognition Employing Wav2Vec2.0 and HuBERT

TMLAI 2021

Omar Mohamed Ahmed*, Salah Aly Ahmed

OCFormer: A Transformer-Based Model for Arabic Handwritten Text Recognition

IEEE MIUCC 2021

Aly Mostafa, Omar Mohamed Ahmed*, Ali Ashraf, Ahmed Elbehery, Salma Jamal, Ghada Khoriba, Amr S. Ghoneim

An End-to-End OCR Framework for Robust Arabic Handwriting Recognition (270M-word corpus)

Pre-print

* Corresponding / lead author

Education

Ph.D. in Artificial Intelligence
Deakin University (A2I2) · Melbourne, VIC
Sep 2023 – Jun 2026 (expected)

Research: LLM alignment, privacy, memorisation, and multilingual NLP. Published at NAACL, EMNLP, EACL, and LREC.

B.Sc. Computer Science
Helwan University, Faculty of Computers and AI · Cairo, Egypt
Graduated Oct 2021