We are seeking an experienced and highly motivated Senior Cloud Infrastructure Engineer to design, build, and maintain secure, scalable, and resilient cloud infrastructure. You will play a key role in cloud architecture, automation, and operational excellence. While AWS is the primary platform, experience with Azure is a strong plus.
Scope & Responsibilities
Design and implement scalable, secure, and highly available AWS infrastructure.
Build and maintain infrastructure as code using Terraform or CloudFormation.
Collaborate with development, security, and operations teams on cloud solutions.
Ensure the highest levels of system availability, performance, security, and cost optimization.
Implement and manage network security for cloud assets.
Coordinate and engineer Availability Zone and regional architecture for data protection, disaster recovery, and redundancy with manual failover.
Produce daily, weekly, and monthly integrated service management reports for all solutions.
Optimize cloud costs by monitoring usage and recommending right-sizing or architectural improvements.
Lead incident response for infrastructure and network issues, including root cause analysis and post-mortem reviews.
Support various IT projects, services, and cloud implementations.
Document infrastructure architecture, procedures, and policies.
Participate in on-call duty as required.
Required Skills
Deep technical knowledge of AWS services, including EC2, VPC, Elastic Load Balancing, Auto Scaling, Lambda, S3, EBS, RDS, and AWS Systems Manager automation.
Significant experience with monitoring tools and Infrastructure as Code (IaC) tools such as Terraform and CloudFormation.
Professional experience in managing and operating Linux systems (RHEL/Ubuntu).
Strong automation and scripting skills in languages like Bash, Python, Perl, or similar.
Ability to develop and maintain comprehensive documentation, including network diagrams, configuration guides, and standard operating procedures.
Ability to provide technical support and guidance to internal teams and to troubleshoot issues in cloud infrastructure and network systems, both cloud and on-premises.
Strong problem determination and source identification skills for resolving issues involving APIs, application services, IaaS, PaaS, SaaS, microservices, containers (including AWS ECS), network, security, and infrastructure, including triaging and routing incidents to the appropriate support levels when necessary.
Familiarity with cloud cost optimization strategies.