Working as a Site Reliability Engineer at Linode involves managing infrastructure for the largest independent open cloud provider in the world. We have 10+ global data centers and thousands upon thousands of physical machines, and we aim to deliver a world-class experience to customers. To uphold these standards, Linode must remain creative in both problem solving and technological innovation.
As we continue to grow, we’re looking for a passionate, highly skilled Site Reliability Engineer to drive automation and operational excellence and to support product development for our customer fleet. This will require creative thinking combined with deep domain expertise in Linux, DevOps, and System Administration.
To find success in this role, you are a natural problem solver and a motivated individual: someone who lives to put out fires before they start and continuously improves upon the operational posture our customers have come to love and expect. Most of all, you’re generally curious and thrive when presented with an opportunity to learn something new.
You don’t have to have experience with all of these, but you should have experience with some of them and an interest in learning the others:
- Linux and Virtualization - Debian, KVM, and QEMU
- Alerting & Monitoring - Nagios, PagerDuty, AlertManager, Sentry
- Metrics - Percona Monitoring and Management (PMM), Prometheus / InfluxDB, Grafana, New Relic
- Logging - ElasticSearch, Logstash, Kibana, Loki
- Database Technologies - MySQL, SQLite
- Orchestration - SaltStack, Docker, Kubernetes, Packer
- Infrastructure Services - DHCP, DNS, SSL, NTP
- Programming Languages - Python, Perl, Golang, Bash
- Version Control - Git, GitHub
The must-haves:
- Ability to gather insights across multiple monitoring tools and data sources
- Perform analysis that will lead to stability and performance improvements across our infrastructure, interpret user behavior, improve KPIs and SLAs, and increase client retention
- Design KPIs and create SLA dashboards to measure aspects of operational success
- Improve our monitoring and alerting capabilities using current tooling and by proposing new tools
- Track and correlate infrastructure/product issues at their earliest onset
- Collaborate with other teams to provide insights via metrics collection and logging
- Strong automation and scripting abilities
- Superb communication, organization, and documentation abilities
- Professional experience in a DevOps, Development, or SysAdmin role, preferably working with large-scale distributed systems
- Experience running mission-critical Linux servers in virtualized environments
- Experience with designing software and infrastructure at scale
- Comfort maintaining live production systems
- Ability to work in an agile organization
- Ability to participate in Linode’s 24/7 incident response on-call rotation, including being responsive and available to quickly troubleshoot and resolve issues
- Experience provisioning and deploying servers, switches, and infrastructure management solutions
- Familiarity with best practices of systems architecture, design, and high-performance tuning
- Experience in a hosting environment or other cloud-based IaaS or SaaS
- Presence in the open-source world. Contributions to open-source projects, as well as your own portfolio to show off, are a huge plus
Work With Us
- Philadelphia Office: HQ is one of the coolest tech buildings in Philly; join us on N3RD Street!
- Flexible work hours: We offer a flexible work schedule, a generous paid time off package, and two work-from-home days per week.
- Unbelievable benefits: We provide comprehensive health insurance, 401(k) contributions, a profit-sharing program, and pension plans.
- Monthly wellness reimbursements: up to $100 towards gym memberships, diet plans, massages, etc.
- A MacBook Pro: to use around the office and at home.
- Free hosting service: Take advantage of some Linode services - we’ll pick up the tab.
- Linode Lunch: What goes better with technology than food? Nothing. We bring in a catered lunch every week.
- Competitive salary: It all begins with fair compensation. We believe in paying people well and rewarding those who go the extra mile.
Equal Employment, Equal Treatment, No Judgment
Linode is committed to a culture that creates a sense of inclusion and belonging. We understand that teams perform their best when they include people with diverse backgrounds and differing perspectives, but also that to achieve greatness, people need to feel like they can be themselves; they need to be equal, included, and comfortable in order to perform at their best. Linode stands for equal pay, equal treatment, and equal experiences for all of our people, past, present, and future, regardless of age, race, ethnicity, religion, gender, sexuality, socioeconomic class, disability status, or any other differentiating factor. We strive to make sure every last person who we interact with feels like they belong and has the same opportunities as everyone else.
Since 2003, Linode has been providing cloud computing services to customers around the world. Linode offers compute, storage, and networking services from data centers in regions spanning North America, Europe, Asia, and Oceania. We are committed to making Linode the simplest, most powerful, and most reliable hosting provider, one that thousands of customers, from the fastest-growing startups to established enterprises, trust. This industry moves fast, but we strive to hire the kind of people who can stay a step ahead and keep us - and themselves - at the top. We are an equal opportunity employer and we are committed to building a diverse, inclusive, and welcoming workplace for all.