United Nations World Food Programme - Enterprise AI Data Platform

Senior Full Stack Data Engineer

About This Project

Built in the role of Senior Full Stack Data Engineer for the United Nations World Food Programme (WFP), this enterprise data platform underpins humanitarian operations in which speed and accuracy directly affect how many people receive life-saving food assistance. The platform ingests, processes, and serves beneficiary registration data at scale. It enables WFP's Enterprise Deduplication Service (EDS), an AI-powered system that identifies duplicate records in seconds using photo-based analysis, biographic matching, and open-source machine learning models. In fast-moving emergency settings, where families are displaced, records are incomplete, and needs change hourly, the platform helps teams ensure the right people receive the right support, with human judgment remaining central to every final decision. The architecture spans batch and real-time pipelines, governed data lakes, operational dashboards built with Next.js, and ML workflows on SageMaker, all orchestrated on AWS with strict data protection, privacy assessments, and culturally sensitive collection practices.

Challenges

  • Processing fragmented beneficiary records across 120+ countries with inconsistent naming, languages, and incomplete data
  • Detecting duplicate registrations in fast-moving emergency contexts where manual spreadsheet comparison took weeks
  • Building AI pipelines that flag potential duplicates while keeping final assistance decisions with humanitarian staff
  • Orchestrating complex ETL/ELT workflows across batch ingestion, ML inference, and operational reporting at UN scale
  • Ensuring data privacy, external audit compliance, and culturally sensitive identity verification in sensitive field contexts
  • Unifying operational analytics for supply chain, country offices, and central data teams on a single governed platform
  • Scaling cost-efficiently — EDS targets 50% lower cost than traditional biometric solutions using open-source AI models
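To make the inconsistent-naming challenge concrete, here is a minimal sketch of a name normalization step using only the Python standard library. It is an illustration under stated assumptions, not the platform's actual code: the function name `normalize_name` and the specific rules (strip accents, case-fold, drop punctuation) are hypothetical, and stripping diacritics is a lossy choice that would need review for each language and script.

```python
import re
import unicodedata


def normalize_name(raw: str) -> str:
    """Return a comparison key for a name (hypothetical helper).

    The key is accent-free, case-folded, and has punctuation and
    repeated whitespace removed. The original value should still be
    stored unchanged alongside the key.
    """
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", raw)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-fold, then replace anything that is not a word character
    # or whitespace with a space.
    cleaned = re.sub(r"[^\w\s]", " ", stripped.casefold())
    # Collapse runs of whitespace and trim both ends.
    return " ".join(cleaned.split())
```

For example, `normalize_name("  Aïssata   TRAORÉ ")` and `normalize_name("aissata traore")` produce the same key, so the two spellings can be compared downstream.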

Solutions

  • Architected an end-to-end AWS data platform with S3 data lakes, Glue jobs, Step Functions, and zero-ETL integrations into Redshift
  • Built Python-based ingestion and transformation pipelines to normalize beneficiary, biographic, and photo metadata from field systems
  • Developed SageMaker-powered ML workflows and notebook-driven experimentation for duplicate detection and data quality scoring
  • Implemented DynamoDB and RDS layers for low-latency operational lookups alongside Redshift for large-scale analytics
  • Created Next.js operational dashboards for review queues, duplicate flagging, and country-level data quality monitoring
  • Designed human-in-the-loop review workflows where AI surfaces candidates and staff make final entitlement decisions
  • Established governed data access patterns with privacy assessments before rollout in new humanitarian contexts
  • Automated orchestration with Step Functions to coordinate ingestion, model scoring, review routing, and reporting pipelines
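The human-in-the-loop pattern described in the list above can be sketched as follows. This is a simplified stand-in, not the EDS matching logic: it uses the standard library's `difflib.SequenceMatcher` on names instead of photo-based analysis or ML models, and the `Registration` fields, the `REVIEW_THRESHOLD` value, and the function names are all hypothetical. The point it illustrates is that the code only surfaces candidate pairs for staff review and never merges or removes a record.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass(frozen=True)
class Registration:
    record_id: str
    full_name: str   # assumed to be normalized upstream
    birth_year: int


REVIEW_THRESHOLD = 0.85  # hypothetical cut-off, not a WFP value


def similarity(a: Registration, b: Registration) -> float:
    """Name similarity in [0, 1]; zero when birth years differ by more than one."""
    if abs(a.birth_year - b.birth_year) > 1:
        return 0.0
    return SequenceMatcher(None, a.full_name, b.full_name).ratio()


def candidates_for_review(records):
    """Return (id_a, id_b, score) tuples for staff review, highest score first.

    Nothing is merged or deleted here; the final decision stays with a person.
    """
    flagged = []
    for a, b in combinations(records, 2):
        score = similarity(a, b)
        if score >= REVIEW_THRESHOLD:
            flagged.append((a.record_id, b.record_id, round(score, 3)))
    return sorted(flagged, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    sample = [
        Registration("r1", "aissata traore", 1990),
        Registration("r2", "aisata traore", 1990),
        Registration("r3", "moussa diallo", 1985),
    ]
    for pair in candidates_for_review(sample):
        print(pair)
```

Comparing every pair is quadratic in the number of records, so a sketch like this only suits small batches; at larger volumes some form of blocking or indexing would be needed before scoring.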

Results

  • Enabled WFP's Enterprise Deduplication Service (EDS) — among the first UN agency AI deployments for beneficiary matching
  • Mali pilot saved over US$431,000 in six months by reducing duplicated assistance; projected US$4.7M savings globally in 2026
  • The projected savings equate to roughly 6.7 million additional meals at an estimated 70 cents per meal (US$4.7M ÷ US$0.70)
  • Reduced duplicate analysis from weeks of manual spreadsheet work to hours through automated AI-assisted review
  • Deployed in Mali with pilots across Afghanistan, Burkina Faso, Cameroon, Mozambique, Niger, Somalia, and Uganda
  • Strengthened fairness and trust — ensuring people receive what they are entitled to, no more and no less
  • Delivered a scalable, open-source AI foundation that is up to 50% cheaper than licensed biometric alternatives
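The meal-equivalent figure in the list above can be checked with simple arithmetic. The sketch below assumes the ~6.7 million meals figure is derived from the projected US$4.7M savings at 70 cents per meal; the source does not state the derivation explicitly, and the function name `meals_from_savings` is hypothetical.

```python
def meals_from_savings(savings_usd: float, cost_per_meal_usd: float = 0.70) -> float:
    """Number of meals a given saving would fund at a fixed cost per meal."""
    return savings_usd / cost_per_meal_usd


if __name__ == "__main__":
    # Projected 2026 global savings quoted in the results list.
    projected = meals_from_savings(4_700_000)
    print(f"{projected / 1e6:.1f} million meals")
```

Under the same assumption, the US$431,000 saved in the Mali pilot would correspond to a little over 600,000 meals.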

Technologies

Frontend

Next.js
Operational Dashboards
Review Queue UI
Analytics Views

Backend

Python
Data Pipeline Services
ML Inference Orchestration
API Layer

Database

Amazon Redshift
Amazon DynamoDB
Amazon RDS
Zero-ETL Integrations

Infrastructure

AWS Step Functions
AWS Glue Jobs
Amazon S3
Amazon SageMaker
SageMaker Notebooks
Cloud Data Lake

Tools

Enterprise Deduplication Service (EDS)
Open-source AI Models
Data Privacy Assessments
Human-in-the-loop Review
Batch & Real-time Pipelines
Field Data Ingestion
