Data Engineer with PySpark

Hexplora

Full-time · Remote · US · Rocky Hill, CT, United States · Posted: 2026-05-11 · Until: 2026-07-11


Job Description

Title: Data Engineer (PySpark)
Location: Rocky Hill, CT (Onsite)

Job Summary

We are seeking a highly skilled Data Engineer with strong experience in PySpark, Databricks, and Apache Spark to design, build, and optimize scalable data pipelines. The ideal candidate will have a solid background in big data processing, cloud platforms, and distributed systems, with a focus on delivering high-quality, reliable data solutions.

Key Responsibilities

- Design, develop, and maintain scalable data pipelines using PySpark and Apache Spark
- Build and optimize ETL/ELT workflows on Databricks
- Collaborate with data scientists, analysts, and stakeholders to understand data requirements
- Ensure data quality, integrity, and governance across data platforms
- Optimize performance of Spark jobs and large-scale data processing systems
- Work with structured and unstructured data from multiple sources
- Implement data storage solutions using cloud platforms (AWS, Azure, or GCP)
- Monitor, troubleshoot, and resolve data pipeline issues
- Maintain documentation for data architecture and workflows

Required Qualifications

- Bachelor's degree in Computer Science, Engineering, or related field
- Strong hands-on experience with PySpark and Apache Spark
- Proven experience working with Databricks
- Proficiency in Python and SQL
- Hands-on experience with AI querying in Databricks platform
- Experience with distributed data processing and big data technologies
- Familiarity with cloud platforms (AWS, Azure, or GCP)
- Experience with data warehousing solutions (e.g., Snowflake, Redshift, BigQuery)
- Strong problem-solving and analytical skills

Infowave Systems is an equal opportunity employer that is committed to diversity and inclusion in the workplace.