Full Time
Dicetek LLC

Job Details

Job Description

Roles & Responsibilities

  • Design, build, and maintain scalable and resilient microservices infrastructure, focusing on automation and self-healing capabilities.
  • Implement and manage CI/CD pipelines for microservices, ensuring rapid and reliable deployments with robust rollback strategies.
  • Proactively monitor microservices health, performance, and security, establishing SLOs/SLIs and implementing effective alerting mechanisms.
  • Develop and maintain comprehensive disaster recovery and business continuity plans for critical microservices applications.

Desired Candidate Profile

  • 5+ years of experience in SRE or DevOps roles, building and managing large-scale, high-availability systems across banking, fintech, e-commerce, or other data-intensive digital ecosystems.

  • Bachelor’s degree in Computer Science or equivalent technical experience.

  • Strong experience with Linux environments and performance troubleshooting.

  • Proven expertise in Terraform and Infrastructure as Code (IaC) methodologies.

  • Proficiency with Kubernetes and container orchestration in microservices environments.

  • Hands-on experience with AWS (preferred); exposure to Azure or GCP is an advantage.

  • Deep knowledge of Dynatrace (AIOps, Davis AI), Prometheus, Grafana, and the ELK stack.

  • Experience implementing AI/ML-driven reliability or automation solutions (AIOps, anomaly detection, predictive alerting).

  • Practical understanding of CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI/CD, or Azure DevOps).

  • Experience with Kafka, RabbitMQ, Redis, Aurora, and RDS databases.

  • Strong scripting or programming skills in Python, Bash, or Go.

The Ideal Candidate

  • Organized, structured, and meticulous in approach.

  • Experienced in cross-functional collaboration and working with distributed teams.

  • Strong analytical mindset with excellent troubleshooting skills for complex production systems.

  • Calm and composed communicator under pressure, capable of leading during high-impact incidents.

  • Proactive problem-solver who anticipates issues and drives preventive improvements.

  • Passionate about AI-driven automation, observability, and reliability engineering.

  • Continuous learner who keeps up to date with cloud-native, microservices, and SRE best practices.

  • A collaborative and adaptable team player who thrives in a fast-paced, regulated environment and is passionate about building reliable, scalable systems that empower digital banking innovation.

About Dicetek LLC
UAE, Abu Dhabi
Information Technology and Services