Globant

Site Reliability Engineer

Makati, NCR, PH

8 days ago

Summary

We are seeking a motivated and experienced Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a strong background in application performance monitoring, logging and tracing, and web performance optimization. You will play a crucial role in ensuring system reliability, scalability, and performance across our applications.


Key Responsibilities:

  • Application Performance Monitoring (APM): Utilize tools such as Splunk Cloud, New Relic, Dynatrace, and AppDynamics to monitor application performance and ensure optimal user experience.
  • Logging & Tracing: Implement and manage logging and tracing solutions using OpenTelemetry, AWS X-Ray, and Fluentd for effective performance tracking.
  • Performance Debugging: Identify and resolve performance bottlenecks in React and Node.js applications through memory leak detection, CPU profiling, and load testing.
  • Database Optimization: Optimize MongoDB and DynamoDB performance by implementing indexing, caching strategies, and TTL management.
  • Adobe Experience Manager (AEM) Optimization: Fine-tune AEM setups by improving caching mechanisms and dispatcher performance.
  • API Gateway & Serverless Architecture: Manage AWS API Gateway, Lambda, EventBridge, and other serverless services to ensure high availability and fault tolerance.
  • Circuit Breaker & Fault Tolerance: Design and implement resilience patterns, including circuit breakers, retries, and backoffs, to enhance system reliability.
  • Monitoring & Incident Management: Use Sentry and LogRocket for frontend and backend error monitoring, and manage incident response, including defining SLOs and SLIs and running postmortems.
  • Web Performance Optimization: Enhance web performance through CDN implementation, lazy loading, and caching strategies.


Qualifications:

Education: Bachelor’s degree in Computer Science or a related field.


Experience:

  • Proven experience in an SRE or related role, with a focus on application performance monitoring and system reliability.
  • Familiarity with cloud platforms (AWS preferred) and serverless architectures. Experience with Splunk Cloud is a MUST.


Skills:

  • Strong analytical and problem-solving skills.
  • Proficiency in scripting languages (e.g., Python or Bash) and performance debugging tools.
  • Excellent communication and collaboration skills to work effectively in a cross-functional team.


Preferred Qualifications:

  • Experience with frontend technologies such as React.
  • Knowledge of APM tools and performance tuning techniques.
  • Familiarity with incident management processes and tools.
