Dice is the leading career destination for tech experts at every stage of their careers. Our client, Kforce Technology Staffing, is seeking the following. Apply via Dice today!
RESPONSIBILITIES:
Kforce has a client that is seeking a Principal Site Reliability Engineer in Portland, OR.
Summary:
We are seeking a Principal Site Reliability Engineer to join our skilled team. In this role, you will manage and maintain our production Cloud environment, ensuring a top-notch SaaS experience. If you enjoy innovation, providing technical vision, and working with a team to build reliable, scalable frameworks, this role is for you. You will analyze and improve our services and processes to enhance reliability, performance, scalability, and cost efficiency. You will also advocate for reliability methodologies and collaborate with various teams to integrate these practices into our platform and products.
What You'll Do:
- Architect, build, and maintain highly available, fault-tolerant systems using AWS and other services
- Use Terraform to define infrastructure as code, enabling scalable, repeatable, and secure deployments
- Continuously review and make recommendations on the design, development, implementation, deployment, maintenance, and support of our SaaS production platform, using Docker and other modern web technologies
- Set up and enforce guardrails for databases, infrastructure, and applications, ensuring consistency and adherence to best practices
- Support operationally critical environments using monitoring tools, scripts, and logging
- Document designs and implementations
- Design and manage secure networking solutions, including AWS VPCs and firewalls
- Partner with SRE and Engineering teams to embed reliability and security best practices into the application life cycle
- Collaborate with fellow Engineers, Product Managers, and Quality Assurance Engineers to develop and deliver services that meet or exceed enterprise customer reliability and quality expectations
- Participate effectively in pair/mob programming and code reviews, both giving and receiving feedback
REQUIREMENTS:
- Bachelor's degree in Computer Science or equivalent years of experience
- 5+ years of experience working with Java
- Expert-level proficiency, with 5+ years of experience, in AWS components such as EC2, CloudFormation, RDS/Aurora, IAM roles, etc.
- Expert-level proficiency with 5+ years of experience in operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring into your code, tweaking dashboards, defining alerts, writing runbooks, etc.
- 4+ years of systems engineering/administration and/or support experience with web applications, especially J2EE technologies and tools such as Docker, Tomcat, and Nginx
- Experience with programmatic manipulation of cloud infrastructure such as AWS
- Experience with network and web application monitoring tools (Datadog preferred)
- Experience with DBMS (e.g. MySQL, MS SQL, Postgres, RDS), as well as graph databases (Neo4j, ArangoDB)
- Experience with REST
- Deep understanding of modern Cloud infrastructure, programming expertise, and operational experience
- Understanding of network fundamentals, including TCP/IP, VPN, DNS, SMTP, and HTTP(S)
- Scripting and automation skills using common scripting languages such as Python and Bash
Nice to Have:
- Experience with SaaS application and/or product hosting, preferably in an enterprise software development environment
- Hands-on experience working with AWS Glue, Bedrock, and SageMaker, supporting customer-facing AI applications
- Experience within the Scrum/Agile framework
- Passionate, driven, intelligent, team-oriented and hard-working with the ability to raise the performance of those around you
- Strong automation skills and independent time management skills
- Strong interpersonal and communication skills, both written and spoken, including the ability to actively listen, build trust with stakeholders, identify needs, and proactively maintain positive relationships
The pay range is the lowest to highest compensation we reasonably in good faith believe we would pay at posting for this role. We may ultimately pay more or less than this range. Employee pay is based on factors like relevant education, qualifications, certifications, experience, skills, seniority, location, performance, union contract and business needs. This range may be modified in the future.
We offer comprehensive benefits including medical/dental/vision insurance, HSA, FSA, 401(k), and life, disability & AD&D insurance to eligible employees. Salaried personnel receive paid time off. Hourly employees are not eligible for paid time off unless required by law. Hourly employees on a Service Contract Act project are eligible for paid sick leave.
Note: Pay is not considered compensation until it is earned, vested and determinable. The amount and availability of any compensation remains in Kforce's sole discretion unless and until paid and may be modified in its discretion consistent with the law.
This job is not eligible for bonuses, incentives or commissions.
Kforce is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status.
By clicking "Apply Today" you agree to receive calls, AI-generated calls, text messages or emails from Kforce and its affiliates, and service providers. Note that if you choose to communicate with Kforce via text messaging the frequency may vary, and message and data rates may apply. Carriers are not liable for delayed or undelivered messages. You will always have the right to cease communicating via text by using key words such as STOP.