Senior Site Reliability Engineer

Regie

Software Engineering
Vancouver, BC, Canada
Posted on Apr 5, 2025

Responsibilities

We’re seeking a Senior Site Reliability Engineer (SRE/DevOps) who is passionate about building excellent infrastructure and keeping our systems healthy.

  • Design and maintain scalable, secure, and reliable infrastructure to support Regie.ai's SaaS platform and AI/data workloads.
  • Architect a unified monitoring and alerting system that lets engineering teams continuously track and improve system availability, reliability, and performance.
  • Drive infrastructure automation and CI/CD improvements to reduce operational overhead and deployment risk.
  • Optimize infrastructure costs, support compliance efforts (e.g., SOC 2), and enforce security best practices.

Required Skills & Qualifications

  • 6+ years of experience in SRE, DevOps, or infrastructure engineering roles.
  • Extensive hands-on experience with AWS and its core services.
  • Strong experience with Terraform (or similar IaC tools), Docker and containerization, and modern CI/CD systems.
  • Proficient in scripting or programming languages such as Python and Bash.
  • Deep experience with monitoring and alerting tools (e.g., New Relic, Prometheus, Grafana, PagerDuty).
  • Strong hands-on experience with both SQL and NoSQL databases (e.g., MongoDB, PostgreSQL, MySQL).
  • Proven track record of designing and maintaining production-grade infrastructure with high availability and low latency.
  • Excellent troubleshooting abilities, along with strong communication and collaboration skills.
  • Solid understanding of cloud security and compliance best practices, including SOC 2 readiness and audit support.