Site Reliability Engineer
Job Description
About Us
LoyaltyLion is a data-driven loyalty and engagement platform trusted by thousands of ecommerce brands worldwide. Merchants use LoyaltyLion when they want a loyalty program that is proven to increase customer engagement, retention and spend. Stores using LoyaltyLion typically generate at least $15 for every $1 they spend on the platform.
Today LoyaltyLion works with over 10,000 small and medium-sized retailers. Our mission is to help them succeed in the age of Amazon, where they may not be able to compete on price and logistics but can offer a better customer experience: one where customers feel valued, rather than just another number.
It’s been an incredible two years for LoyaltyLion. We closed $12.5m early last year and another $12m this year, and we’ve grown from 40 employees to over 100. We’ve built out our Leadership team, recruiting a CTO, CFO and Director of Product amongst other senior hires, and we continue to scale quickly, achieving spots in both the Deloitte Fast 50 and the FT1000. This is just the beginning of our inflection point.
Please note we can only consider candidates who live in the UK (and have the right to work in the UK) or Europe, as we do not offer sponsorship now or in the future.
The Role
We are looking for a Site Reliability Engineer to join our team and support LoyaltyLion's growth. Working with our SRE Lead, you will be responsible for ensuring the reliability, availability, and performance of our platform's infrastructure and systems. You'll also support our Data team in the provisioning and tuning of our Data platform, and our development teams in optimising their applications and CI/CD pipelines for peak performance and efficiency.
Please note this is a fully remote position, based between the UTC+0 and UTC+2 time zones.
Some of the things you'll be doing
- Delivering clean, architecturally sound, maintainable and secure infrastructure
- Working closely with AWS infrastructure, particularly focusing on data services, to support database scalability and availability
- Working with LoyaltyLion engineering teams to support the infrastructure they need and the platforms on which their services run
- Conducting performance tuning to optimise database performance and enhance data processing efficiency
- Implementing observability systems for infrastructure and data to ensure reliability and availability, find areas for improvement, and proactively assess risks to the stability or security of our platform
- Maintaining new and existing infrastructure as code, writing well-designed Terraform to make the best use of our AWS infrastructure
- Documenting and driving the adoption of DevOps best practices across the wider engineering team
- Conducting proofs-of-concept on new and emerging technologies and evaluating the fit to LoyaltyLion
- Taking part in honest and transparent blame-free post-mortems on incidents we have, so we can learn from them and prevent them from happening again
- Automate and accelerate - reducing manual tasks so that all of LoyaltyLion's engineers can concentrate on building exciting new features
- Build, measure, learn - implementing the best observability tools to continually improve LoyaltyLion's performance
What We’re Looking For
- 4+ years of experience with AWS
- In-depth knowledge of defining infrastructure as code using Terraform
- Experience in agile development practices
- Experience with observability and monitoring using Datadog, CloudWatch or similar systems
- Bonus points for experience with real-time, low-latency, high-frequency transaction-based systems
- Extra bonus points for experience with Redshift, Glue, Airflow, Athena or any other frameworks and tools for data engineering
- Ability to diagnose problems at any level (Client, HTTP/Network, Server, Database, OS)
- Ability to write clear, concise documentation
Our Stack
- AWS
- Docker
- ECS (Fargate)
- DataDog
- PagerDuty
- Postgres, Redshift
- Infra as code: Terraform, Ansible, Packer
- Scripting: Bash, Ruby, Python
- Buildkite
The Engineering team works on a fully remote basis. However, you do have the option of working from our shiny new HQ in Farringdon.
Interview Process
TA Screen
Technical Overview + Value-based interview
Tech Session + Q&A with Engineering Team
Meet the CTO
Benefits
Our Technology team is fully remote (though team members are more than welcome to work from our office in Farringdon), and we offer the following benefits to our permanent employees:
- Remote and flexible working
- International remote working (up to 30 days in each holiday year)
- 25 days holiday plus bank holidays, with the option to carry 5 days over into the next holiday year
- Equity for all permanent employees, to recognise the valuable contribution you'll make to our growth
- Company days out and events, and team socials
- Home office budget
- Cycle scheme
- Employee Assistance Program
- Private medical insurance
- Competitive learning and development budget
- The opportunity to join at a major inflection point – ecommerce is booming and with it, the demand for loyalty software like LoyaltyLion
- MacBook, Magic Keyboard, and any other tech or equipment you need to do a great job