Job Description
Roles & Responsibilities
Resiliency Engineering (SRE): Implement "Chaos Engineering" and load testing to ensure web/mobile backends can handle banking-scale traffic. Maintain high availability through automated recovery scripts.
Automated Regression: Build CI/CD-integrated test suites using Python that validate both the application logic and the infrastructure state (IaC validation).
Observability & SLIs: Define and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs). Set up advanced alerting in Azure Monitor or AWS CloudWatch to catch performance degradation before users notice it.
Security & Compliance Testing: Automate security scans and compliance checks to ensure all AI data handling meets strict banking data residency and privacy protocols.
Desired Candidate Profile
Technical & Professional Requirements:
Automation Stack: High proficiency in Python (for AI testing) and test automation frameworks (pytest, Selenium, or Robot Framework).
Cloud Infrastructure: Strong hands-on experience with Azure or AWS, specifically regarding networking, scaling, and serverless reliability.
AI/ML Understanding: Familiarity with prompt engineering and with methods for evaluating AI model outputs (RAG evaluation, ROUGE/BLEU scores, or custom LLM benchmarks).
Monitoring Tools: Experience with Grafana, Prometheus, or native cloud monitoring tools to build real-time reliability dashboards.
FinOps Awareness: Ability to identify "expensive" failing tests or inefficient cloud resource usage during the testing phase.
Recommended Skillset & Tools:
Languages: Python (Mandatory), Bash scripting.
Tools: GitHub Actions (CI/CD), Terraform (reading/validating), k6 or JMeter (performance testing).
AI Frameworks: DeepEval, Ragas, or LangSmith (for automated AI evaluation).