Job Description
Raleigh, NC / Hybrid

About The Team
LexisNexis is a leading global provider of legal, regulatory, and business information and analytics that help customers increase productivity and improve decision-making and outcomes. We help lawyers win cases, manage their work more efficiently, serve their clients better, and grow their practices. We feel privileged to work in a business with a higher purpose - advancing the rule of law around the world - which is vital to building peace and prosperity in society.

About The Role
We are looking for a Data Engineer III to join our Data Engineering team at LexisNexis. This role is ideal for a highly skilled, experienced data engineer who can independently design and deliver scalable data platforms, optimize cloud-based analytics systems, and support mission-critical production pipelines.

In this role, you will provide technical leadership across Databricks development, Redshift administration, engineering analytics, and behavior analytics ingestion. You will play a key role in shaping data architecture, improving platform reliability, and driving best practices for performance, observability, and operational excellence. You will also partner closely with cross-functional teams to deliver data solutions that support engineering, product, and analytics use cases at scale.

Responsibilities
- Design, develop, optimize, and maintain large-scale data pipelines using Databricks, Spark, and Delta Lake.
- Define and implement scalable ETL/ELT patterns for structured and semi-structured data across multiple sources and domains.
- Improve the performance, cost efficiency, reliability, and maintainability of Databricks jobs, workflows, and clusters.
- Contribute to data architecture decisions and establish engineering best practices for cloud-based data processing.
- Perform Redshift administration activities, including cluster management, database administration, workload tuning, and operational support.
- Build, optimize, and maintain Redshift schemas, tables, and complex SQL transformations to support analytics and reporting needs.
- Design, develop, and maintain data pipelines powering engineering analytics dashboards, metrics, and reporting solutions.
- Partner with stakeholders to define, compute, and evolve engineering KPIs such as deployment frequency, MTTR, reliability, and quality metrics.
- Own ingestion and processing pipelines for behavior analytics and product telemetry, including:
  - Matomo - event and tracking data ingestion
  - FullStory - behavioral analytics and session replay data
  - Pendo - product usage, feature analytics, and event telemetry
- Ensure data quality, completeness, and usability for downstream analytics consumers.
- Provide L2/L3 support for critical data pipelines and platform components, ensuring high availability and timely incident resolution.
- Lead root cause analysis efforts and implement corrective and preventive actions for recurring issues.
- Establish and improve monitoring, alerting, logging, and observability standards across data systems.
- Partner with platform and infrastructure teams to ensure compliance with LexisNexis operational and security standards.

Requirements
- Strong hands-on experience with Databricks, Spark, and Delta Lake.
- Advanced proficiency in SQL, including performance tuning and optimization at scale.
- Strong experience with AWS, especially S3, Lambda, IAM, Redshift, Glue, and Step Functions.
- Strong experience in Redshift administration, including performance tuning, distribution and sort key strategy, workload management, and cluster optimization.
- Solid Python programming skills for ETL/ELT pipelines, automation, and data engineering solutions.
- Experience working with telemetry, clickstream, or behavioral analytics data.
- Familiarity with Matomo, FullStory, and/or Pendo data processing.
- Experience supporting production data environments, including incident management, RCA, and monitoring.
- Strong understanding of data quality, lineage, governance, and cataloging frameworks.
- Ability to independently drive technical solutions and influence design decisions across teams.
- Experience with Kafka, Kinesis, or other event streaming technologies.
- Familiarity with CI/CD pipelines, DevOps practices, and automated testing frameworks.
- Exposure to engineering analytics, including DORA metrics and developer productivity insights.
- Experience with AWS EMR for distributed data processing.