Site Reliability Engineer

Toronto, ON
Full Time
11 days ago

4100 Gordon Baker Road

This role is hybrid (2 days per week in the office at the 4100 Gordon Baker Road site).

At BMO, Horizontal Site Reliability Engineers (SREs) are entrusted with implementing SRE practices and fostering a shift-left mindset. Their core responsibility is to instill an SRE culture across various CIOs by applying SRE principles to reduce Mean Time to Resolve (MTTR) and Mean Time to Identify (MTTI) while maintaining Service Level Objectives (SLOs), Service Level Agreements (SLAs), and Service Level Indicators (SLIs). The SRE team is dedicated to enabling observability across end-to-end components and using chaos engineering practices to build highly available, resilient, and reliable applications and infrastructure that are ready for production. SREs collaborate closely with cross-functional teams spanning diverse applications, hybrid clouds, common services/middleware, and infrastructure throughout the development and engineering lifecycle. This role involves designing, developing, executing, and maintaining infrastructure, common services/middleware, data, and applications across various environments.

Objectives of this role:
  • Achieve Observability by maximizing the potential of Application Performance Management (APM) tools like Dynatrace and correlating upstream and downstream systems spread across hybrid environments.
  • Contribute to minimizing MTTR and MTTI, and to increasing Mean Time Between Failures (MTBF), by addressing current application and infrastructure issues.
  • Embrace a cloud-adaptive mindset, showcasing and proactively implementing solutions on cloud platforms for cost optimization, log enrichment, and preventive analysis.
  • Enhance reliability, quality, and time-to-market of the suite of software solutions.
  • Measure and optimize system performance, striving for continual improvement and staying ahead of customer needs.
  • Provide primary operational support and engineering for multiple large-scale distributed software applications.
  • Gather and analyze metrics from operating systems and applications to aid in performance tuning and early-stage fault detection, in line with the shift-left mindset.
  • Collaborate with development teams to enhance services through rigorous testing and release procedures, leveraging logs from multiple layers for better situational awareness.
  • Engage in system design consulting, platform management, and capacity planning.
  • Develop sustainable systems and services through automation and uplifts.
  • Strike a balance between feature development speed and reliability while adhering to well-defined service-level objectives.
  • Undertake software development tasks to automate repetitive operations work (toil).
Required Skills and Qualifications:
  • Bachelor's degree (or equivalent) in computer science or related discipline.
  • 3-5 years of experience as an SRE supporting enterprise-level services/applications.
  • Strong understanding of the SRE golden signals, logs, metrics, and traces.
  • Proficiency in dissecting various logging frameworks to enable end-to-end log correlation for transactions or application cycles.
  • Exposure to DevOps practices, tools, methods, and CI/CD pipeline deployment, including experience with CloudWatch logs, alarms, and events.
  • Ability to script using one or more high-level languages, such as Python.
  • Experience with APM tools (e.g., Dynatrace) or similar tools.
  • Familiarity with middleware technologies such as MQ and Kafka.
  • Experience with distributed storage technologies such as Amazon S3 and serverless tools (API Gateway, Lambda, Step Functions).
  • Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
Job Category:
Individual Contributor
We're here to help

At BMO we are driven by a shared Purpose: Boldly Grow the Good in business and life. It calls on us to create lasting, positive change for our customers, our communities and our people. By working together, innovating and pushing boundaries, we transform lives and businesses, and power economic growth around the world.

As a member of the BMO team you are valued, respected and heard, and you have more ways to grow and make an impact. We strive to help you make an impact from day one - for yourself and our customers. We'll support you with the tools and resources you need to reach new milestones, as you help our customers reach theirs. From in-depth training and coaching, to manager support and network-building opportunities, we'll help you gain valuable experience, and broaden your skillset.

BMO is committed to an inclusive, equitable and accessible workplace. By learning from each other's differences, we gain strength through our people and our perspectives. Accommodations are available on request for candidates taking part in all aspects of the selection process. To request accommodation, please contact your recruiter.
Information Technology