Site Reliability Engineer

  • Pune
  • Arista Networks
Site Reliability Engineers at Arista are critical team members who have a breadth of knowledge encompassing all aspects of service delivery. They develop software solutions to enhance, harden, and support our service delivery processes. This can include building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering, and auto-remediation.

The SRE should have an “automate everything” mindset, helping us bring value to our customers by deploying services with incredible speed, consistency, and availability. The SRE constantly evaluates products and services before and after production releases to prevent, identify, and fix problems that impact service availability in deploying, configuring, releasing, monitoring, recovering, and scaling.

This position starts as a 1-year contract, with conversion to full-time employment based on performance.

Responsibilities:

  • Ensure the scalability, performance, and resilience of our suite of products
  • Work with the development and product teams to establish the right monitoring and alerting strategy
  • Develop build, test, and deployment automation that seamlessly targets multiple cloud regions
  • Define and implement standards and best practices related to system architecture, service delivery, metrics, and the automation of operational tasks
  • Optimize the telemetry platform to identify customer-impacting events while providing relevant data to drive debugging
  • Partner with the engineering team to optimize the performance of services for cloud architecture
  • Debug live site events and conduct follow-up post-mortems and root cause analysis (RCA)

Qualifications:

  • B.E./B.Tech in Computer Science or equivalent
  • 7+ years of relevant experience
  • Proficiency in scripting languages such as Bash and Python
  • Exposure to, and operational knowledge of, managing applications in AWS/GCP
  • Experience automating software build, deployment, and server configuration management using tools such as Puppet, Chef, and Jenkins
  • Hands-on experience with Linux/Unix administration
  • Good understanding of containerization concepts and tools: Docker, ECS, EKS, Kubernetes
  • Experience with build tools such as Jenkins
  • Working experience with NoSQL and relational databases such as MongoDB and PostgreSQL
  • Understanding of basic networking concepts