Data Scientist

  • Gurugram
  • Axtria Ingenious Insights

Necessary Skills:

  • 3+ years of experience in model development using Python/PySpark libraries. Development experience in a Databricks or Dataiku DSS (Data Science Studio) environment would be a plus.
  • Strong experience with Spark using Scala/Python/Java
  • Strong proficiency in building, training, evaluating, and deploying state-of-the-art machine learning models
  • Proficiency in statistical and probabilistic methods such as SVM, decision trees, bagging and boosting techniques, and clustering
  • Proficiency in Core NLP techniques like Text Classification, Named Entity Recognition (NER), Topic Modeling, Sentiment Analysis, etc. Understanding of Generative AI / Large Language Models / Transformers would be a plus.
  • Hands-on experience with Python data science and math packages such as NumPy, pandas, scikit-learn, Seaborn, PyCaret, and Matplotlib
  • Proficiency in Python and common machine learning frameworks (TensorFlow, NLTK, Stanford NLP, PyTorch, LingPipe, Caffe, Keras, Spark ML, OpenAI, etc.)
  • Experience working in large teams and using collaboration tools like Git, Jira, and Confluence
  • Good understanding of at least one cloud platform: AWS, Azure, or GCP
  • Understanding of the commercial pharma landscape and patient data/analytics would be a huge plus
  • Should be willing to learn, comfortable in a challenging environment, and confident in delivering results within timelines. Should be self-motivated and self-driven in finding solutions to problems.


Required Experience:

  • Real-world experience in implementing machine learning/statistical/econometric models/advanced algorithms (ideally, 3+ years of experience involving machine learning)
  • Breadth of machine learning domain knowledge
  • Experience in application of machine learning algorithms (classification, regression, deep learning, NLP, etc.)
  • Experience with an ML/data-centric programming language (such as Python, Scala, or R) and ML libraries (pandas, NumPy, scikit-learn, etc.)
  • Experience with Apache Hadoop / Spark (or an equivalent cloud-computing/MapReduce framework)