We are on a mission to power the data productivity of our customers and the world, by helping teams get data business ready, faster. Our technology allows customers to load, transform, sync and orchestrate their data.

We are looking for passionate, high-integrity individuals to help us scale up our growing business. Together, we can make a dent in the universe bigger than ourselves.

We are now looking for a Senior Staff Site Reliability Engineer to join our SRE team.

About the Role

The SRE org at Matillion is made up of multiple teams which, combined, own the operation and efficiency of our cloud platforms and services. It covers everything from the build, provisioning and maintenance of our cloud infrastructure to the reliability, capability management, observability, monitoring and metrics of our SaaS platform.

Reporting to the Director of SRE and Observability, you will use your experience across all pillars of Site Reliability Engineering to drive best practices that enhance our ability to build truly reliable, observable and performant infrastructure for all our core services. Your experience building modern, multi-cloud platforms will play a pivotal role as we continue to modernise our stack and implement a wide range of new tools around logging, monitoring, metrics and alerting.

We value in-person collaboration here at Matillion, so this role follows our hybrid work structure, where employees work 2 days a week in the Madrid office.

Our core day is Wednesday, and the second day will be determined together.

Technologies You’ll Use: Kubernetes, AWS, ArgoCD, Terraform, Datadog, Prometheus, Golang/Python.

What You’ll Be Doing:

Leading the design of major software components, systems, and features to improve the availability, scalability, latency, and efficiency of Matillion’s SaaS services
Driving the design, implementation and management of our expanding observability infrastructure, keeping up to date with new tools and technologies, and being a recognised member of the broader observability community
Leading sustainable incident response, blameless postmortems, and production improvements that result in direct business opportunities for Matillion
Defining and documenting best practices across all pillars of SRE
Providing guidance and mentorship to other team members on managing end-to-end availability and performance of critical services, design techniques and coding standards to cultivate innovation and collaboration across the business
Balancing competing priorities as you manage a range of individual projects, deadlines, and deliverables

What we’re looking for:

A passion for everything performance, observability, availability, scalability and security
Extensive experience with Kubernetes and the surrounding ecosystem with tools such as Linkerd, Traefik and ArgoCD
An adopter and champion of core SRE principles including SLAs, SLOs, automation, proactive monitoring, release and deployment
Exposure to working with high-traffic, large-scale web operations in AWS
Ability to manage and provision infrastructure using code with Terraform or CloudFormation as well as build internal tooling with the likes of Go or Python
A solid understanding of networking systems and protocols

At Matillion, we are committed to providing competitive salaries in line with market standards. Our estimated compensation range for this position is €74,000 - €111,000, but the final salary will be based on the relevant skills, experience and qualifications you demonstrate in the hiring process.

Matillion has fostered a culture that is collaborative, fast-paced, ambitious, and transparent, and an environment where people genuinely care about their colleagues and communities.

Our 6 core values guide how we work together and with our customers and partners. We operate a truly flexible and hybrid working culture that promotes work-life balance, and are proud to be able to offer the following benefits:

Our Benefits

- A truly flexible & hybrid working culture

- A culture that promotes work-life balance

- 30 days holiday + bank holidays

- 5 days paid volunteering leave

- Access to mental health support

- Career development with access to a Udemy account, Blinkist and much more!

More about Matillion

Thousands of enterprises including Cisco, DocuSign, Pacific Life, Slack, and TUI trust Matillion technology to load, transform, sync, and orchestrate their data for a wide range of use cases from insights and operational analytics, to data science, machine learning, and AI.

With over $300M raised from top Silicon Valley investors, we are on a mission to power the data productivity of our customers and the world.

We are passionate about doing things in a smart, considerate way. We’re honoured to be named a great place to work for several years running by multiple industry research firms.

We are dual headquartered in Manchester, UK and Denver, Colorado.

We are keen to hear from prospective employees, so please apply and a member of our Talent Acquisition team will be in touch. Alternatively, if you are interested in Matillion but don't see a suitable role, please email talent@matillion.com.

Matillion is an equal opportunity employer. We celebrate diversity and we are committed to creating an inclusive environment for all of our team. Matillion prohibits discrimination and harassment of any type and does not discriminate on the basis of race, colour, religion, age, sex, national origin, disability status, genetics, sexual orientation, gender identity or expression, or any other characteristic protected by law.

This job is no longer accepting applications
