โ† Back to jobs

Data Engineer (PySpark + AWS)

Diverse Lynx
FULL_TIME · Remote (US) · Chicago, IL, US · Posted: 2026-05-11 · Until: 2026-06-10
Apply Now →
You will be redirected to the original job posting on BeBee.
Apply directly with the employer.
Job Description
Job Title: Data Engineer (PySpark + AWS)
Location: Chicago, IL (5 days onsite; local or nearby states only)
Experience: 10+ years
RTTO: 5 days onsite

Job Summary
We are looking for a skilled Data Engineer to design and build scalable data solutions using PySpark and AWS services. The ideal candidate will have hands-on experience building modern data platforms with Apache Iceberg and implementing the Medallion architecture on AWS.

Key Responsibilities
- Design and implement end-to-end data solutions using PySpark, ensuring scalability and performance.
- Build and manage data pipelines using AWS services such as AWS Glue, EMR, and Lambda.
- Develop data products using the PySpark + AWS Glue stack.
- Implement Medallion Architecture (Bronze, Silver, Gold layers) for structured data processing.
- Work with Apache Iceberg tables for efficient data storage, versioning, and schema evolution.
- Ensure data quality, governance, and optimization across pipelines.
- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.
- Optimize data processing jobs to improve performance and cost-efficiency on AWS.

Required Skills & Experience
- Strong experience with PySpark for data processing and pipeline development.
- Hands-on experience with the AWS ecosystem (Glue, EMR, Lambda, S3).
- Experience implementing Medallion Architecture.
- Practical knowledge of Apache Iceberg or similar table formats.
- Strong understanding of distributed data processing and big data frameworks.
- Experience designing scalable and reliable data pipelines.
- Good understanding of data modeling and ETL/ELT concepts.

Preferred Qualifications
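For candidates unfamiliar with the term, the Medallion Architecture named above refers to refining data through Bronze (raw), Silver (cleaned/validated), and Gold (business-level aggregates) layers. A minimal conceptual sketch in plain Python is below; the field names and records are hypothetical, and in the role described these layers would be PySpark DataFrames stored as Iceberg tables on S3, not in-memory lists.

```python
# Hypothetical illustration of Bronze -> Silver -> Gold refinement.
# Real pipelines would use PySpark DataFrames and Iceberg tables instead of lists.

def bronze_ingest(raw_records):
    # Bronze: land raw data as-is, tagging each record with its layer.
    return [dict(r, _layer="bronze") for r in raw_records]

def silver_clean(bronze):
    # Silver: validate and standardize; drop records missing required fields.
    silver = []
    for r in bronze:
        if r.get("order_id") is not None and r.get("amount") is not None:
            silver.append({
                "order_id": r["order_id"],
                "region": (r.get("region") or "unknown").strip().lower(),
                "amount": float(r["amount"]),
            })
    return silver

def gold_aggregate(silver):
    # Gold: business-level aggregate, e.g. revenue per region.
    totals = {}
    for r in silver:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
    return totals

raw = [
    {"order_id": 1, "region": " East ", "amount": "100.0"},
    {"order_id": 2, "region": "west", "amount": "50.5"},
    {"order_id": None, "region": "east", "amount": "10.0"},  # rejected in Silver
]
gold = gold_aggregate(silver_clean(bronze_ingest(raw)))
print(gold)  # {'east': 100.0, 'west': 50.5}
```

The same shape carries over to PySpark: Bronze is an append-only landing table, Silver applies schema enforcement and deduplication, and Gold serves curated aggregates to downstream consumers.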