Job Description
Responsibilities:
Configure, fine-tune, and optimize Spark and YARN for high performance
and resource efficiency.
Set up, configure, and manage data lakehouses using Hudi and/or Delta Lake for
efficient data storage and processing.
Collaborate with cross-functional teams to design and maintain scalable and reliable
data lake architectures.
Monitor and troubleshoot Spark and Data Lake environments to ensure continuous
operation and performance.
Provide technical guidance, training and support to team members and stakeholders on
Spark and Data Lake administration best practices.
Requirements:
Proven experience configuring and tuning Spark and YARN in large-scale
environments. Kubernetes experience is a plus.
Solid understanding of Data Lake architectures and Big Table technologies, data
modeling, and management principles. Experience setting up, configuring and working
with Hudi and/or Delta Lake.
Experience in Python and Spark development.
Strong problem-solving skills and the ability to troubleshoot complex technical issues.
Excellent communication and collaboration skills.