Site Reliability Engineer (SRE)

4 days ago


United Arab Emirates BHFT Full time

We are looking for a Site Reliability Engineer who will be responsible for ensuring the reliable operation of our platform, working with metrics to improve production process efficiency, and participating in testing new product versions.

Responsibilities

  • Production Stability Management: Ensure continuous compliance with external regulatory requirements and internal standards, including risk, security, technology, and trader needs. Support and automate validation and monitoring processes for adherence to necessary standards.
  • Incident Monitoring & Management: Develop and improve monitoring and alerting systems to detect anomalies in key production metrics. Implement rapid response mechanisms and efficient solutions to maintain strategy performance.
  • Release & Change Management: Enforce standards for managing releases and changes to minimize deployment risks. Implement strict acceptance testing for all releases.
  • Process Management: Develop and maintain Standard Operating Procedures (SOPs) for the team, manage task queues, and organize shift schedules to ensure continuous support and high availability of trading strategies.
  • Integration Projects: Lead initiatives to connect with new exchanges, brokers, and trading platforms, ensuring smooth and secure service integration.
  • Technical Performance Optimization: Continuously improve system availability and resilience (MTTR, MTBF) and reduce latency, while optimizing data exchange performance and order routing to maximize profitability.

Qualifications and Requirements

  • Deep understanding of trading processes and market microstructure, including colocation, trading on native exchange protocols, and algorithmic trading.
  • Experience with monitoring, alerting systems, and incident management for high-load environments.
  • Knowledge of regulatory compliance and security standards.
  • Proficiency in monitoring and incident management tools such as Grafana, ClickHouse, Prometheus, Opsgenie, Grafana OnCall, PagerDuty, etc.
  • Experience developing and managing SOPs and KPIs for service teams.
  • Experience managing integration projects with brokers and exchanges.
  • Strong technical skill set, including Linux systems administration and optimization; TCP/UDP multicast networking; FIX-based and native exchange protocols; colocation infrastructure setup and management; and Python scripting for automation and monitoring.
  • English proficiency at C1 level or higher.

Remote Work: Yes
Employment Type: Full-time



  • United Arab Emirates Xenon7 Full time

    Description About us: Where elite tech talent meets world-class opportunities! At Xenon7, we work with leading enterprises and innovative startups on exciting, cutting‑edge projects that leverage the latest technologies across various domains of IT including Data, Web, Infrastructure, AI, and many others. Our expertise in IT solutions development and...


  • Abu Dhabi, United Arab Emirates Astra Tech Full time

    Job Description Role Summary We are looking for a Senior Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of botim's real-time communication and open platform infrastructure, supporting millions of active users globally. In this role, you will lead automation initiatives, operate and optimize large-scale Kubernetes...


  • United Arab Emirates Xenon7 Full time

    Description About us: Where elite tech talent meets world‑class opportunities! At Xenon7, we work with leading enterprises and innovative startups on exciting, cutting‑edge projects that leverage the latest technologies across various domains of IT including Data, Web, Infrastructure, AI, and many others. Our expertise in IT solutions development and...


  • United Arab Emirates Xenon7 Full time

    About us: Where elite tech talent meets world-class opportunities! At Xenon7, we work with leading enterprises and innovative startups on exciting, cutting-edge projects that leverage the latest technologies across various domains of IT including Data, Web, Infrastructure, AI, and many others. Our expertise in IT solutions development and on-demand resources...


  • United Arab Emirates BHFT Full time

    We are looking for a Site Reliability Engineer who will be responsible for ensuring the reliable operation of our platform working with metrics to improve production process efficiency and participating in testing new product versions. Responsibilities Production Stability Management: Ensure continuous compliance with external regulatory requirements and...


  • United Arab Emirates Xenon7 Full time

    About us: Where elite tech talent meets world-class opportunities! At Xenon7, we work with leading enterprises and innovative startups on exciting, cutting-edge projects that leverage the latest technologies across various domains of IT including Data, Web, Infrastructure, AI, and many others. Our expertise in IT solutions development and on-demand resources...


  • United Arab Emirates Xenon7 Full time

    A leading tech solutions provider in the United Arab Emirates seeks a Senior Site Reliability Engineer to design and maintain architecture for critical banking applications. You will lead SRE initiatives, mentor engineers, and ensure high availability and security for platforms. The role requires over 5 years of SRE experience, specifically with OpenShift...


  • United Arab Emirates Xenon7 Full time

    A leading tech solutions provider in the United Arab Emirates seeks a Senior Site Reliability Engineer to design and maintain architecture for critical banking applications. You will lead SRE initiatives, mentor engineers, and ensure high availability and security for platforms. The role requires over 5 years of SRE experience, specifically with OpenShift...

  • Platform Engineer

    10 hours ago


    Abu Dhabi, United Arab Emirates Dautom Full time

    Job Description We are seeking a Platform Engineer with a strong DevOps, Azure and Site Reliability Engineering (SRE) background to support our Solution Architects in implementing, operating, and maintaining scalable, reliable, and highly available production systems. This role is heavily focused on SRE practices, production support, end-to-end debugging,...


  • United Arab Emirates Vng Solutions Full time

    VSOL is a digital enabler with a mission to help public and private organizations evolve their businesses through data and technology. We provide an end-to-end service from consulting to execution that drives the growth and innovation of our clients. As VSOL is in a phase of rapid expansion, we offer a dynamic, creative environment that accelerates your...