Krila Consultancy & Recruitment

Site Reliability Engineer

Ottawa, ON, CA

5 days ago

Summary

Site Reliability Engineer

Location: Onsite – Kanata, Ontario



About Our Client

Imagine a startup delivering real-time data insights that empower businesses to make smarter, faster decisions. Backed by one of the world’s top tech groups, we blend cutting-edge technology with deep expertise to help companies stay agile and ahead of the curve. With the strength of a powerhouse behind us, we drive innovation and create transformative solutions for today’s dynamic markets.

Edge Signal provides a full-fledged edge computing platform powering computer-vision applications across Retail, Hospitality and Warehousing. They run entirely on AWS, ingesting and analyzing massive fleets of on-premise devices with Datadog monitoring.

We’re looking for an experienced Site Reliability Engineer to keep their cloud and edge infrastructure running flawlessly, and to help their customers get up and running smoothly.

This position is based at their head office in Kanata, Ottawa, reporting to the Director of Technology.

What You’ll Do

Operations

  • Ensure highly available, fault-tolerant AWS services (auto-scaling, disaster recovery, capacity planning).
  • Build and maintain Datadog dashboards, monitors and alerts for cloud resources and edge devices; author runbooks and automation scripts for incident response.
  • Develop tooling to provision, update and health-check thousands of edge devices; ingest device telemetry into Datadog for unified observability.
  • Automate routine ops tasks (onboarding steps, incident remediation) using shell, Python, etc.

Onboarding

  • Lead customer installations by configuring IP cameras, NVRs, and Edge Signal agents on-site.
  • Guide network, security and firmware setups to ensure seamless data flow from device to cloud.

Support

  • Triage and resolve Freshdesk tickets; conduct root-cause analysis and drive timely closure.
  • Convert complex issues into Jira epics/stories and collaborate with product teams to ship fixes.

Compliance

  • Manage AWS IAM (users, roles, policies, SSO) and enforce security best practices.
  • Monitor and optimize AWS spend: set budgets, report usage and recommend cost-saving strategies.
  • Integrate secrets management, vulnerability scanning and other compliance controls.

What You Will Bring

  • A minimum of a Bachelor's degree in Computer Science or a related engineering field is required.
  • 3+ years as an SRE or DevOps engineer supporting production AWS environments.
  • Proven expertise in Datadog (APM, Infrastructure, Logs, Synthetic checks).
  • Strong Linux administration skills and proficient scripting ability (Bash, Python, or Go).
  • Experience with AWS IAM, SSO, Control Tower, cost-management tools, and billing dashboards.
  • Excellent communicator with a bias toward collaboration and customer empathy.

Bonus Points

  • Prior work with edge computing or IoT device fleets
  • Experience configuring IP cameras, RTSP streams, and NVR systems
  • Freshdesk and Jira administration experience
  • AWS DevOps or Solutions Architect certification
