As a Data Engineer, your main goal is to build and maintain the systems that process and store data for the Events & Exhibitions ecosystem.
You will take scattered, vendor-specific data (from registration systems, apps, marketing tools) and transform it into a unified, AI-ready dataset using a Medallion Architecture on Azure.
Think of it as organizing raw data into a structured pipeline that’s ready for analysis and machine learning.

1. Data Ingestion & API Integration (Bronze Layer)
Build and manage robust ETL/ELT pipelines using Azure Data Factory to ingest data from third-party vendors (REST APIs, webhooks, SFTP).
Ensure raw data is landed securely in Azure Data Lake Gen2 (Bronze Layer) without data loss.
Implement error handling and logging to monitor the health of real-time and batch ingestion jobs.
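As an illustration of the error handling and logging described above, here is a minimal sketch in plain Python. It assumes a hypothetical `fetch` callable that pulls one batch from a vendor source. The function name, logger name, and retry parameters are all illustrative and not part of our actual pipelines, where Azure Data Factory's own activity-level retry settings would typically play this role.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("bronze_ingest")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying with exponential backoff and logging every outcome."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = fetch()
            logger.info("ingest succeeded on attempt %d", attempt)
            return result
        except Exception as exc:
            logger.warning(
                "ingest attempt %d/%d failed: %s", attempt, max_attempts, exc
            )
            if attempt == max_attempts:
                # Surface the failure so the job is marked unhealthy
                # instead of silently dropping the batch.
                logger.error("ingest giving up after %d attempts", max_attempts)
                raise
            # Delay doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Re-raising after the final attempt is a deliberate choice here: a failed ingestion should be visible to monitoring rather than hidden, which supports the "without data loss" goal.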

2. Transformation & Modeling (Silver & Gold Layers)
Use PySpark (Azure Databricks/Synapse) and SQL to clean, deduplicate, and standardize data in the Silver Layer.
Execute Identity Resolution logic to stitch together visitor and exhibitor profiles from multiple touchpoints into a "Golden Record."
Develop optimized datasets in the Gold Layer for high-performance reporting and predictive AI models.
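To give a feel for the Identity Resolution work described above, here is a simplified sketch in plain Python rather than PySpark. It assumes records are matched on a normalized email address and that newer non-empty values win. The field names (`source`, `email`, `updated_at`) and the matching rule are illustrative assumptions, not our production logic, which may use additional identifiers and fuzzier matching.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def normalize_email(email: str) -> str:
    """Standardize an email so the same person matches across systems."""
    return email.strip().lower()


def build_golden_records(records: Iterable[dict]) -> Dict[str, dict]:
    """Merge records that share a normalized email into one profile each."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for rec in records:
        email = rec.get("email")
        if email:  # records without an email cannot be matched in this sketch
            grouped[normalize_email(email)].append(rec)

    golden: Dict[str, dict] = {}
    for key, recs in grouped.items():
        merged = {"email": key, "sources": sorted({r["source"] for r in recs})}
        # Process oldest first so newer non-empty values overwrite older ones.
        # ISO-8601 date strings sort correctly as plain strings.
        for rec in sorted(recs, key=lambda r: r["updated_at"]):
            for field, value in rec.items():
                if field in ("email", "source"):
                    continue
                if value not in (None, ""):
                    merged[field] = value
        golden[key] = merged
    return golden
```

For example, a registration record for " Ana@Example.com " carrying a name and an event-app record for "ana@example.com" carrying a company would collapse into a single profile holding both values and listing both source systems.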

3. Infrastructure & Performance Optimization
Optimize SQL queries and Spark jobs to reduce Azure compute costs and minimize data latency.
Maintain the Data Dictionary and technical documentation to ensure the "Engine Room" logic is transparent and scalable.
Implement data masking and security protocols to ensure GDPR and internal compliance.
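As a rough illustration of the data masking mentioned above, the sketch below shows two common techniques in plain Python: keyed hashing to pseudonymize an identifier so it can still be joined on, and partial masking for display. The function names and the choice of HMAC-SHA256 are assumptions made for this example, not a description of our actual protocols. Note also that pseudonymized data is still treated as personal data under GDPR.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a stable keyed hash of `value` (HMAC-SHA256, hex encoded).

    The same input and key always give the same token, so tables can
    still be joined on it without exposing the raw identifier.
    """
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest for display."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"  # not a recognizable email, so reveal nothing
    return local[0] + "***@" + domain
```

In practice the secret key would come from a managed secret store rather than from code, and which technique applies to which column would follow the internal compliance rules.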

4. Business Enablement
Support the Senior Data Manager in building the Semantic Layer that feeds our Power BI "Data Window."
Collaborate with the Events Tech team to troubleshoot data discrepancies between front-end apps and back-end tables.

Technical Requirements
Experience: 3–5 years in Data Engineering with a focus on Cloud environments.
Core Azure Stack: Proven expertise in Azure Data Factory, Azure Synapse Analytics, and Data Lake Gen2.
Coding: High proficiency in SQL (complex joins/optimizations) and Python/PySpark.
Architectural Knowledge: Practical experience with the Medallion Architecture (Bronze/Silver/Gold).
Integration: Strong experience working with REST APIs and JSON/XML data formats.