Site Reliability Engineer
Posted on Wednesday, June 28, 2023
Presto Automation Inc. (NASDAQ: PRST) is a public company in voice recognition AI technology. It applies that technology in the restaurant drive-through setting, serving well-known customers such as Del Taco and Checkers, and in the sit-down setting, serving customers such as Red Lobster, Applebee’s, and Chili’s. The company foresaw the rise of AI and the value of data analytics in our rapidly advancing technological society and has benefited from an early-mover advantage. It was one of the relatively few technology companies to go public successfully in 2022. Founded out of MIT in 2008, the company has grown beyond its scrappy roots and is now focused on expanding into all household-name restaurant chains, at the thousands-of-locations level.
We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join Presto AI’s Cloud Ops Engineering team! As an SRE, you will be responsible for ensuring the reliability, availability, and performance of our systems and applications. You will collaborate with software engineers, system administrators, and other cross-functional teams to design, build, and operate scalable and resilient infrastructure. This role requires a strong background in software development and systems engineering, and a passion for automating operational tasks.
Responsibilities
- Design, implement, and maintain highly available, horizontally scalable systems and applications that are easy to monitor, with the goal of predicting and avoiding downtime and, where it cannot be avoided, minimizing it.
- Collaborate with software engineers to ensure that new systems and applications are built with reliability and operability in mind.
- Develop tools and automation to improve system reliability, deployment, monitoring, and incident response, using tools such as New Relic, Grafana, and Sentry.
- Conduct system performance and capacity planning to meet service level objectives.
- Identify and address system and application bottlenecks, performance issues, and reliability risks.
- Maintain Cloud Ops and SRE dashboards in New Relic.
- Monitor production systems and applications, respond to alerts, and troubleshoot incidents to minimize downtime and ensure a high level of service availability.
- Participate in an on-call rotation to provide 24/7 support for critical systems and applications related to both our Tabletop and AI platforms.
- Implement and enforce best practices for system and application security, data protection, and disaster recovery.
- Collaborate with cross-functional teams to define and improve operational processes, including incident management, change management, and configuration management.
- Continuously analyze system and application performance metrics to identify areas for improvement and optimize resource utilization.
- Reduce risk associated with Disaster Recovery planning and execution.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent work experience).
- 4+ years of experience as a Site Reliability Engineer, DevOps Engineer, or a similar role.
- Demonstrable programming skills in languages such as Python, Go, or Java.
- Proficiency in designing, deploying, and managing cloud-based infrastructure (AWS, Azure, or GCP; we currently use AWS).
- Experience with configuration management tools and infrastructure-as-code frameworks (we use Terraform and CloudFormation for both!).
- Solid understanding of networking concepts and protocols.
- Deep knowledge of Linux/Unix and Windows systems administration and troubleshooting.
- Experience with containerization technologies (e.g., Docker) and container orchestration platforms (e.g., Kubernetes).
- Familiarity with monitoring and logging tools (we use New Relic, Prometheus, Grafana, ELK stack).
- Relevant certifications (e.g., AWS Certified DevOps Engineer, Certified Kubernetes Administrator) are a plus.
Presto (NASDAQ: PRST) has a compensation strategy that aims to reward high performers and retain them for the long term. Other benefits for U.S.-based employees include medical, dental, and vision insurance, a 401(k) program, and paid time off (PTO). To learn more, please visit: www.presto.com.
We value people from all walks of life and are committed to creating an inclusive hiring process and work environment. We especially encourage historically underrepresented candidates to apply. We are an equal employment opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, or any other characteristic protected by law. If you need an accommodation to access the job application or interview process, please contact firstname.lastname@example.org.