Job Description
We are a global trading firm built on a strong research culture and an advanced technology platform. Our teams across the US, Europe, Asia Pacific, and India work together to provide liquidity in financial markets. We prioritize innovation, collaboration, and continuous improvement, both in our trading strategies and in the technology that powers them.

We believe technology is the foundation of our competitive edge, and machine learning is increasingly central to how we trade. In recent years, we’ve been building out our machine learning capabilities by developing infrastructure, expanding our in-house GPU cluster, deploying models into production, and partnering closely with quant researchers and traders to drive measurable impact. We’re now scaling the team, evolving our systems, and accelerating the use of deep learning across research and execution workflows.

We’re hiring a Principal Machine Learning Engineer to help shape the next phase of our platform. This role will influence architecture, establish best practices, and tackle high-impact technical challenges. You’ll work closely with researchers and engineers to design systems for experimentation, training, and deployment, while helping define how machine learning scales across the organization.

If you’ve built ML infrastructure at scale and are looking for a role where your ideas directly shape direction and outcomes, this is an opportunity to make a meaningful impact.

Your Core Responsibilities:
- Design and build end-to-end infrastructure for training, evaluation, and production deployment of ML models, in partnership with HPC engineers managing on-prem compute
- Influence core decisions across data access, orchestration, experiment tracking, model versioning, and deployment pipelines
- Partner with quant researchers to accelerate iteration and move models from prototype to production
- Support the application of modern architectures (e.g., transformers, state-space models, temporal convolutions, graph neural networks) to high-frequency, noisy data
- Advance approaches to reproducibility, continual learning, and production monitoring at petabyte scale
- Establish standards across teams and geographies; mentor engineers and contribute to technical culture
- Stay current with developments in deep learning and ML infrastructure, bringing relevant ideas into production

Your Skills and Experience:
- 8+ years building ML platforms or infrastructure in a leading tech, research, or quantitative environment
- Proven ownership of large-scale training and inference systems, including architectural design
- Strong Python expertise, with experience in CUDA or C++
- Hands-on experience with modern frameworks (PyTorch, TensorFlow, or JAX) and architectures such as transformers or sequence models
- Deep understanding of optimization, regularization, and large-scale training trade-offs
- Experience with distributed training (e.g., Horovod, NCCL) and GPU optimization (cuDNN, TensorRT)
- Track record of deploying models with strong observability, reproducibility, and monitoring
- Comfort working across the full ML stack, from data pipelines to serving systems

Why This Role:
- Shape foundational systems rather than maintaining legacy infrastructure
- Work on a strategically backed initiative with real investment
- Direct impact on trading through production ML systems
- Collaborate with globally distributed teams and influence scalable practices
- Culture driven by strong ideas, collaboration, and technical rigor
- Close partnership with researchers in a highly iterative environment