Databricks Data Engineer (2 Open Roles)

India (Offshore Office) · Full-Time · 4+ Years (1+ year focused on Databricks)

About Frisco Analytics

At Frisco Analytics, we are redefining how enterprises manage and trust their data. We are building a next-generation, AI-native Master Data Management (MDM) platform designed to solve the world’s most complex data challenges. By combining cutting-edge AI with robust engineering, we help organizations turn fragmented data into a single, reliable source of truth.

The Role

We are looking for two high-caliber Databricks Data Engineers to join our offshore team in India. In this role, you won't just be moving data; you will architect the core pipelines that power our AI-native engine. You will work at the intersection of Big Data and AI, building scalable ingestion, transformation, and optimization workflows within the Databricks Lakehouse environment to support advanced Entity Resolution and machine learning models.

Why Join Frisco Analytics?

  • Build the Future: Play a pivotal role in creating an AI-native product from the ground up.
  • Modern Stack: Work with the latest advancements in Databricks, AI, and Cloud technologies.
  • Global Impact: Collaborate with a high-performing international team across our US and India offices.
  • Career Growth: High-visibility roles with clear paths for technical leadership and ownership.

What You’ll Do
  • Pipeline Architecture: Design, build, and maintain highly scalable end-to-end data pipelines using PySpark, Delta Lake, and SQL.
  • ETL Engineering: Develop sophisticated ETL workflows to ingest and process both structured and unstructured data from diverse sources.
  • AI/ML Collaboration: Work closely with AI/ML Engineers and QA teams to curate and optimize high-fidelity datasets required for complex entity resolution.
  • Data Integrity: Implement robust schema evolution handling, automated data validation frameworks, and unit testing to ensure high data quality.
  • Performance Engineering: Proactively optimize pipeline performance, fine-tune Databricks cluster configurations, and manage cloud cost efficiency.
  • Operational Excellence: Troubleshoot complex data lineage issues and performance bottlenecks across development, staging, and production environments.

Your Toolkit
  • Professional Experience: 4+ years of hands-on experience in Data Engineering, with at least one year of dedicated experience in the Databricks ecosystem.
  • Stack Proficiency: Expert-level knowledge of PySpark, Spark SQL, and the Delta Lake storage layer.
  • Architecture Knowledge: Proven experience working within Data Lakehouse architectures and a deep understanding of Medallion (Bronze/Silver/Gold) architecture patterns.
  • Engineering Best Practices: Familiarity with CI/CD pipelines (Azure DevOps, GitHub Actions, or Jenkins) for automating data workflows and deployments.
  • Problem Solving: Strong debugging skills and the ability to perform root-cause analysis on complex data transformations.

Bonus Points (Nice to Have)

  • Experience with real-time data ingestion tools (Kafka, Amazon Kinesis, or Azure Event Hubs).
  • Prior experience in Master Data Management (MDM) or Entity Resolution projects.
  • Databricks Certified Data Engineer (Associate or Professional).

Interested?

Apply directly here or send your CV/GitHub to contact@friscoanalytics.com.