Manager IT, Observability and Reliability
Posted Yesterday
Job Description
We are looking for a Manager IT, Observability and Reliability to manage the day-to-day activities of our observability and reliability function.
The successful candidate will have real-world experience in building resilient systems and enabling teams to deliver reliable, observable, and performant services. You will lead a team dedicated to ensuring the health, performance, and visibility of our cloud-native platforms and applications.
You will play a critical role in shaping our observability strategy, driving operational excellence, and fostering a culture of continuous improvement.
In this role, you'll work under the direction of the Director of Enterprise Development and Operations, Marius de Beer, along with 9 like-minded Managers, and collaborate closely with Architecture, Security, Platform Engineering, and Product teams to ensure our systems are observable, scalable, and aligned with business goals.
How you'll make a difference: As a Manager, Observability and Reliability, you'll be using leading-edge technology to help connect British Columbians to healthy and safe workplaces.
Where you'll work
At WorkSafeBC, we offer a hybrid work model that combines working remotely and working in our offices, based on the operational needs of the position.
In this role, you will work primarily from our Richmond Office with some flexibility to work from your home in B.C.
What you'll do
- Lead and grow a high-performing SRE team focused on observability, monitoring, alerting, and incident response.
- Define and implement observability strategies, including metrics, logs, traces, dashboards, and synthetic monitoring.
- Collaborate with leadership, Architecture, Common Engineering, Delivery, and Security teams to define and implement secure, high-quality, 24x7 applications, ensuring end-to-end visibility into system health and performance.
- Drive adoption of observability tools and practices across engineering teams, ensuring alignment with enterprise standards.
- Establish and maintain SLIs, SLOs, and error budgets to guide reliability efforts and service quality.
- Oversee the design and implementation of automated monitoring and alerting systems to proactively detect and resolve issues.
- Partner with Architecture and Platform teams to ensure observability is embedded in the design of new systems and services.
- Participate in post-incident reviews and drive improvements in incident response, root cause analysis, and system resilience.
- Manage vendor relationships and evaluate new tools and technologies to enhance observability capabilities.
- Demonstrate problem solving by identifying problems and risks, thinking innovatively, and using data to support mitigation and decision-making.
- Contribute to budget planning, resource allocation, and strategic roadmap development for SRE and observability initiatives.
- Represent or stand in for the Director, Enterprise DevOps when required.
- Consistently model the organizational behaviours expected of all WorkSafeBC employees: responsive, respectful, fair, collaborative, accountable, and forward thinking.
- Inspire and lead a team through change, fostering a culture of ownership, learning, and continuous improvement.
- Think strategically and apply an Agile mindset to deliver scalable, secure, and observable systems.
- Communicate effectively with technical and non-technical stakeholders, translating complex concepts into actionable insights.
- Champion a service-oriented approach to internal and external customers, ensuring reliability and transparency.
Qualifications
- Bachelor's degree in computer science, or a related field (or equivalent experience).
- 5+ years of experience in technical leadership roles, including managing SRE, DevOps, or infrastructure teams.
- Proven experience implementing observability platforms (e.g., Azure Monitor, Dynatrace, Splunk, OpenTelemetry).
- Strong understanding of cloud-native architecture, especially in Azure environments.
- Hands-on experience with monitoring, logging, tracing, and alerting in distributed systems.
- Familiarity with Agile, DevOps, and SAFe methodologies.
- Experience with incident management, on-call rotations, and operational readiness practices.
- Certifications in Azure, SRE, or observability tools are considered assets.
Important to know
Before we can finalize any offer of employment, you must:
- Consent to a criminal record check
- Confirm you're legally entitled to work in Canada
At WorkSafeBC, we promote safe and healthy workplaces across British Columbia. We partner with workers and employers to save lives and prevent injury, disease, and disability. When work-related injuries or diseases occur, we provide compensation and support injured workers in their recovery, rehabilitation, and safe return to work. We're honoured to serve the 2.49 million workers and 263,000 registered employers in our province.
What's it like to work at WorkSafeBC?
It's challenging, stimulating, and rewarding. Our positions offer diversity and opportunities for professional growth. Every day, the work we do impacts people and changes lives. What we do is important, and so are the people we do it for.
Our ability to make a difference relies on building a team with a rich variety of skills, knowledge, backgrounds, abilities, and experiences that reflect the diversity of the people we serve. We are committed to fostering a welcoming, inclusive, and supportive work culture where everyone can contribute as their best, authentic self.
Learn more: Discover who we are.
Our benefits
As a member of our team, you'll have access to services and benefits that help you get the most out of work - and life. Along with a competitive salary, your total compensation package includes:
- Defined benefit pension plan that provides you with a lifetime monthly pension when you retire
- 4 weeks of vacation in your first year, with regular increases based on years of service
- Benefits package that includes customizable options for health care and dental benefits, additional days off, and a health care spending account
- Optional leave arrangements
- Development opportunities (tuition reimbursement, leadership development, and more)
Salary: $127,804 - $155,281 annually
Want to apply?
- Applications are welcomed immediately; however, they must be received no later than 4:30 p.m. PST on the closing date.
- Please note that we will be starting assessments prior to the closing date.
Any additional application materials must be received by email to HR Talent Acquisition (SM) by 4:30 p.m. PST on the closing date of the competition.
About WorkSafeBC
Industry
Government
Company Size
1001-5000 employees
Application closing date is 2025-09-27