Staff DevOps Engineer

  • Pune
  • Houghton Mifflin Harcourt
HMH is a learning technology company committed to delivering connected solutions that engage learners, empower educators, and improve student outcomes. As a leading provider of K–12 core curriculum, supplemental and intervention solutions, and professional learning services, HMH partners with educators and school districts to uncover solutions that unlock students' potential and extend teachers' capabilities. HMH serves more than 50 million students and 4 million educators in 150 countries.

Our technical infrastructure:

  • AWS EC2, Terraform Enterprise, Docker, Aurora, Mesos, Kubernetes, ELK (Elasticsearch, Logstash & Kibana)
  • Grafana, Prometheus, Datadog, Telegraf, Runscope, Apollo, GraphQL
  • Microservices architecture, Spring, Java & Node.js, React, Koa, Express.js
  • Amazon RDS, DynamoDB, Postgres, Oracle, MySQL, InfluxDB, Linux, Jenkins, GitHub

You can read more on our Engineering Blog.

About the role:

You will constantly be asking: what are the most important infrastructure problems we need to solve today to increase the reliability and performance of our applications and infrastructure?

  • You will apply your deep technical knowledge, taking a broad look at our technology infrastructure
  • You’ll help us identify and validate common and systemic issues, prioritizing which to address first
  • We value collaboration, so you will partner with our SRE/DevOps team, discussing and refining your ideas and preparing proofs of concept
  • You’ll present and validate these across technology teams, figuring out the best solution
  • And you’ll be given ownership to engineer and implement your solutions

There are lots of interesting technology problems for you to solve, so you will constantly be applying the latest thinking. These include implementing Canary, designing a new automated pipeline solution, extending Kubernetes capabilities, implementing machine learning to build load testing, ensuring mutability of containerization, etc.

You’ll get to evaluate existing technologies and design the future state, without being afraid to challenge the status quo. And you’ll regularly review existing infrastructure, looking for opportunities to improve, e.g. service improvement, cost reduction, security, and performance.

You’ll also get to automate everything necessary, combining reliability with a pragmatic approach: doing it right the first time.

We’re continuing our journey of making our code and configuration deployments self-serve for our development teams.

  • You’ll help us build and maintain the right tooling
  • And you’ll have ownership to design and implement the infrastructure needed
  • You’ll also be involved in the daily management of our AWS infrastructure
  • This means working with our Agile development teams to troubleshoot server, application, and performance issues

Skills & Experience:

  • 5 to 8 years of hands-on SRE/DevOps experience in an Agile environment
  • You’ll be able to collaborate effectively with both engineers and operations and be comfortable recommending best practices
  • You bring substantial experience using AWS in a production environment
  • You have the expertise and skills to navigate the AWS ecosystem and know when and where to recommend the most appropriate service and/or usage pattern
  • You have experience resolving outages, can quickly diagnose issues, and have been instrumental in restoring normal service levels

You’ll also have significant experience and/or an interest in the following:

  • Experience managing cloud infrastructure as code
  • Application container management
  • Expertise with an RDBMS: you’ll know how to tune and scale it, and how performance and reliability are achieved
  • Experience working with Linux
  • Experience managing messaging queues and event-driven systems
  • Experience working with firewalls, network and application load balancing, and secret management
  • Experience with CI/CD tools
  • Experience with scripting languages
  • A strong and informed point of view on monitoring tools and how best to use them