A bit about us:
- Raisely lets people take one-off and recurring donations to their own Stripe/PayPal accounts. Our system needs to check these donations for attempted fraud, process them atomically and quickly add them to live tiered fundraising totals.
- Raisely has a comprehensive page-builder that our customers use to build their own fundraising websites. We supply a set of React components, campaign templates and page-building constructs. We also allow our customers to write their own SCSS styles and custom React components, which means our engineering team needs to support a lot of permutations and deliver them all performantly.
- Raisely provides free, flexible and powerful tools to charities, but this has made us the target of some creative attempts to misuse our platform which require robust defences that balance our limited engineering time with delivering new features for our customers.
- Raisely needs to operate at an ever-increasing scale. We already facilitate millions of API requests each day, and need to handle spikes when our clients launch a new campaign or are featured in the news. We need to support this scaling while enabling a growing and fast moving engineering team to deliver new features that are secure & reliable.
A bit about the role:
We’ve got a big codebase and a growing team that strives to produce great results for our customers. You’ll be joining an engineering team that takes our clients and their uptime seriously, and we’re excited to learn how your work will support that commitment. You’ll take the lead on thinking about reliability and security, and you’ll collaborate with our product engineering teams as they work through feature cycles to deliver code that’s secure and keeps us within our error budget (which you’ll help us to set).
The performance and security issues that crop up are not frequent enough yet to justify separate roles for security and reliability, and we hope this role will keep it that way for some time yet. Over the long term we expect you’ll likely specialise in security or reliability.
While you monitor metrics and fix immediate issues, you’re also looking to the future and developing a long-term vision for DevSecOps and tooling to support our engineers in building quickly while releasing high-performance, secure, reliable code.
Every minute of uptime that you secure and every security threat that your work protects us against helps our charity customers raise dollars that support the wellbeing of people and planet.
Location: Remote (we’ll help you set up your home office!)
Timezone: 3 hours overlap with Australian east-coast business hours.
If you worked here over the last year, you might have:
- Improved or added to monitoring & alerting systems
- Proposed changes to cloud platforms and underlying architecture to enhance security & resilience to failure
- Identified future performance bottlenecks and helped design and implement solutions
- Monitored our cloud infrastructure and identified opportunities to improve speed, latency, and hosting costs
- Implemented automated tooling to ensure we’re shipping secure code
- Taken responsibility for resolving a production reliability incident, then run the postmortem and implemented mitigations for the future
- Helped facilitate the Raisely Cybergames, a red team exercise where all staff across the company were invited to propose attacks & exploits to test
- Taken part in reviewing our business continuity plan
- Coordinated third-party penetration testing and developed fixes for the issues identified
- Provided input to engineering teams on security & reliability aspects of a feature cycle
- Improved anything – you’ll take the initiative to find opportunities to move our processes from reactive, to proactive, to automatic
Now about you…
You’ve been part of an SRE team before and have a clear sense of what well-implemented SRE should look like. You’re keen to work with the engineering team to get there. You’re comfortable managing production infrastructure and writing your own scripts and tools to manage it for you.
Emphasis on engineer
You’re not here to tinker with operations and systems config; you architect systems and build them to be scalable and durable. You’re looking for ways to reduce toil for the team, both in our ops and in responding to and remedying alerts. You’re happy to jump in and fix an issue, but you’re even happier when you finish the job by building automation that makes sure it stays fixed. Your favourite kind of policy is Policy as Code.
An eye on security
You recognise that security and performance go hand in hand. Whether it’s a service failure or a service hack, downtime is downtime. You account for this as you plan out your metrics and processes, and you apply your SRE experience to solving for security. You help the team think through the security implications of architecture choices and how a bad actor could abuse a new feature or change of configuration.
You’re a manager-of-one & a collaborator with many
You don’t need a manager to check in and direct your workflow; you’d prefer to work to broad expectations and manage your own time within that framework. You can identify when you need a hand, and won’t hesitate to ask. At the same time, you recognise that to go far, we need to go together. You’re looking forward to sharing new tricks with your engineering colleagues as you help us upgrade our DevSecOps processes.
You care about making a difference
Yeah, we’re all here because we want to make the world better (and by that we mean a carbon-neutral utopia with world peace and just laws, where all people have equal opportunity to thrive). So you’ve gotta want that too!
Ok, and why work with us?
How to apply:
- What is it about this job that made you want to apply?
- Tell us about a DevOps or DevSecOps automation you helped build. What problem did you solve? How did it reduce toil? What would you do differently if you had to build it again?
- Tell us about a time you optimised how an application was running on a cloud provider. What was the setup beforehand, and how did you go about designing a new architecture and successfully migrating to it?