Statuspage is used by many of the world's top 1,000 SaaS providers, such as New Relic, Twilio, Dropbox, and GitHub, for uptime status and incident communication. This means that we have a responsibility to be up when these providers may be down. Resiliency is a core feature for Statuspage, as the nature of the business means inconsistent, spiky traffic. During a large Internet-wide event, we can easily see our traffic increase a thousandfold. Building a service that can withstand these events and stay up while the rest of the Internet is down presents a unique set of technical challenges for our team.
Statuspage Reliability Engineering is a team of software engineers who strive for operational excellence. We partner with development teams to scale and maintain Statuspage and keep it highly available. We are looking for a senior engineer who specializes in developing and running large-scale web applications in the cloud. You will join a tight-knit agile team of experienced reliability engineers and developers who are passionate about building and scaling the Statuspage platform.
What you will be doing
-Partner with development and product to design and build highly available and scalable infrastructure using Atlassian's Docker-based PaaS on AWS
-Implement effective monitoring and logging strategies for all services (we rely on Datadog and Splunk, plus Terraform for infrastructure as code)
-Maintain and improve CI/CD pipelines for the services (we use Bamboo)
-Take part in the on-call rotation for service availability, remediating and resolving critical outages with minimal MTTR
-Remediate vulnerabilities across the application stack (we treat security as a top priority at Atlassian)
-Own mission-critical projects that focus on scalability, security and reliability for the product
-Stay laser-focused on service performance and uptime, which may include debugging and shipping live code to production (our application is built with Ruby on Rails)
In return, you will have the opportunity to:
-Provide technical leadership over initiatives concerning scalability, security and resiliency for Statuspage
-Leverage your deep understanding of technologies like AWS Elasticsearch, Amazon SQS, ElastiCache Redis, RDS Postgres, Varnish, and Ruby on Rails
-Mentor engineers around operational excellence and influence other stakeholders
-Build incredible software on a highly effective TEAM
More about you
On your first day, we'll expect you to have:
-Extensive experience developing and deploying applications in languages such as Ruby, Python, Go, or C++ in cloud environments
-Expertise setting up monitoring and logging for large-scale web applications
-Experience working with CI/CD tools
-Experience working with container deployment and orchestration technologies at scale with strong knowledge of fundamentals
-The ability to prioritize existing technical and infrastructure debt, and the experience to build and execute a plan to pay it off
-Experience being on-call for service availability and familiarity with processes around incident management/PIR analysis
-Strong verbal and written communication skills
It's great, but not required, if you have:
-Experience working with caching technologies such as Varnish and Memcached
-Experience tuning and optimizing databases in cloud environments such as AWS RDS
-Experience establishing SLOs and SLAs for large-scale applications
-Experience building disaster-recovery solutions for distributed systems in cloud environments
-Expertise working with microservice architecture patterns
More about our benefits
Whether you work in an office or on a distributed team, Atlassian is highly collaborative and, yes, fun! To support you at work (and play), we offer some fantastic perks: ample time off to relax and recharge, flexible working options, five paid volunteer days a year for your favourite cause, an annual allowance to support your learning and growth, unique ShipIt days, a company-paid trip after five years, and lots more.
More about Atlassian
Software is changing the world, and we’re at the center of it all. With a customer list that reads like a who's who in tech, and a highly disruptive business model, we’re advancing the art of team collaboration with products like Jira Software, Confluence, Bitbucket, and Trello. Driven by honest values, an amazing culture, and consistent revenue growth, we’re out to unleash the potential of every team. From Amsterdam and Austin to Sydney and San Francisco, we’re looking for people who are powered by passion and eager to do the best work of their lives in a highly autonomous yet collaborative, no B.S. environment.
We believe that the unique contributions of all Atlassians are the driver of our success. To make sure that our products and culture continue to incorporate everyone's perspectives and experience, we never discriminate on the basis of race, religion, national origin, gender identity or expression, sexual orientation, age, or marital, veteran, or disability status.
All your information will be kept confidential according to EEO guidelines.
Atlassian, Inc., will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of SFPC Art.49.