Insight Global

Site Reliability Engineer

Plano, TX, US

9 days ago

Summary

Required Skills & Experience

• 10+ years overall experience in application engineering

• 7+ years of SRE experience (architect or engineer) with SRE/observability toolsets such as Dynatrace/AppDynamics/New Relic, Splunk/Elastic

• 3+ years of experience monitoring applications under various SDLC methodologies, preferably Agile

• 3+ years of technology design expertise which includes Performance, Security, Availability, as well as Operations, Monitoring and Support

• 2+ years of experience with SQL and database management systems such as MSSQL, MySQL, PostgreSQL, or MongoDB

• 2+ years of experience in a scripting language such as Unix shell scripting, Python, or PowerShell

• 2+ years of experience in technology design expertise which includes Containerization, Performance, Security, Availability, Operations, Monitoring, and Support

• Experience in systems architecture; in-depth knowledge of SRE, IT operations, and cloud; coding and scripting experience with Java, JavaScript, Python, and .NET; understanding of AI/ML

• Experience in a regulated industry; financial services experience ideal

• Bachelor's degree in MIS, computer science, math, or another science field required; advanced degree in a related field preferred

Job Description

Insight Global is looking for a Site Reliability Engineer to join its client's team.

• Design, configure, and set up observability platform tools (Splunk and Dynatrace), both on-premises and in the cloud, to guide application development efficiencies and improve the operational stability of applications

• Work with the Observability Manager and Architect to develop the monitoring capabilities strategy and roadmaps and accomplish agreed-upon priorities

• Develop tooling and processes to increase automation of monitoring and adherence to security and audit systems and controls

• Integrate and configure additional tools/frameworks to support and enable automation of various monitoring activities across the enterprise

• Perform analytics on incidents and usage patterns to better predict issues and take proactive actions

• Collaborate across the departments to gauge the effectiveness and efficiency of existing systems

• Foster the adoption of Observability tools and capabilities across Technology groups

• Partner with Service owners to implement Service Level Metrics & Service Level Objectives that act as service level health indicators

• Measure, communicate, and deliver on the stability and scalability of enterprise platforms and the technology organization's maturity in DevOps

• Resolve issues, alerts, and incidents based on predefined service level agreements regarding system availability, performance, and service levels

• Analyze monitoring requirements in close collaboration with the architect and translate them into tasks for engineers to develop

• Deliver presentations to managers and other technology and business partners

• Be a mentor to engineers, providing assistance, guidance and training
