The Software Foundations platform is an ecosystem of tools that helps value teams create beautiful software that drives business outcomes. Our customers are developers who want to perform their jobs as efficiently as possible. If you enjoy the thought that your work enables other people to solve important problems that help save and improve lives, and if you sleep well because you know that what you implemented is rock solid, we'd love to have you on our team.
As a foundational platform resiliency engineer, you will:
- Be responsible for the availability of foundational platform services: respond to incidents, investigate root causes, and collaborate with platform engineers
- Prevent incidents from recurring by treating them as software bugs
- Operate cloud infrastructure with Ansible, Terraform, awscli, or via APIs
- Define and maintain monitoring dashboards, metrics, and general observability
- Set up monitoring and alerting that alert on conditions rather than merely detect outages
- Document everything, so your findings turn into repeatable actions and then into automation.
- Improve the upgrade and deployment process to eliminate human touch points from it
- Identify and remediate single points of failure
- Design, build, and maintain core infrastructure pieces that allow scaling to support hundreds of thousands of concurrent users.
- Debug production issues across services and levels of the stack.
- Plan for the scalability of the foundation platform infrastructure and the services it provides
You may be a good fit for this role if you:
- Go after the root cause of a problem like a hungry wolf
- Think about the theories behind operating models and how they influence system architecture
- Think about system edge cases, potential security breaches, behaviors, specific implementations, and full-stack impact on overall availability
- Know your way around Linux and the Unix shell, and know at least one scripting language
- Thrive while doing data analysis and deep dives into numbers and metrics
- Know how your data needs to be organized to make it comprehensible and presentable
- Know what configuration management systems like Ansible are used for
- Have strong programming skills - choose your language
- Collaborate and communicate asynchronously across multiple teams and projects.
- Have attention to detail and a gift for spotting things others cannot see
- Stand your ground against requirements that would lead to increased toil
- Share our company values and work in accordance with those values.
- Don't remember the last time you logged on to a server to make an operating-system-related change
US and Puerto Rico Residents Only: Our company is committed to inclusion, ensuring that candidates can engage in a hiring process that exhibits their true capabilities. Please click here if you need an accommodation during the application or hiring process.
For more information about personal rights under Equal Employment Opportunity, visit:
EEOC GINA Supplement
OFCCP EEO Supplement
Pay Transparency Nondiscrimination
We are proud to be a company that embraces the value of bringing diverse, talented, and committed people together. The fastest way to breakthrough innovation is when diverse ideas come together in an inclusive environment. We encourage our colleagues to respectfully challenge one another’s thinking and approach problems collectively. We are an equal opportunity employer, committed to fostering an inclusive and diverse workplace.