Staff Network Engineer (Menlo Park, CA) #4765

BioSpace

Full-time · Remote · US · Menlo Park, CA, City of Overland Park, US · USD 135000–179000 / month · Posted: 2026-05-11 · Until: 2026-07-10

Job Description

Our mission is to detect cancer early, when it can be cured. We are working to change the trajectory of cancer mortality and bring stakeholders together to adopt innovative, safe, and effective technologies that can transform cancer care. We are a healthcare company, pioneering new technologies to advance early cancer detection. We have built a multi-disciplinary organization of scientists, engineers, and physicians, and we are using the power of next-generation sequencing (NGS), population-scale clinical studies, and state-of-the-art computer science and data science to overcome one of medicine’s greatest challenges. GRAIL is headquartered in the Bay Area of California, with locations in Washington, D.C., North Carolina, and the United Kingdom. It is supported by leading global investors and pharmaceutical, technology, and healthcare companies. For more information, please visit grail.com.

As a Staff Network Engineer at GRAIL, you will be a hands‑on technical leader responsible for building, operating, and evolving our cloud and hybrid network infrastructure. You’ll spend a significant portion of your time designing, implementing, and troubleshooting secure, scalable, and highly available network solutions in AWS (centered on Amazon VPC), while also owning critical on‑prem and data center networking (Juniper/Aruba) and Palo Alto firewalls. You will both execute (design, configure, implement, monitor, and debug) and provide architectural leadership, standards, and mentorship across teams. A key part of the role includes robust monitoring, logging, dashboarding, and capacity planning to ensure reliable, predictable network performance.

This is a hybrid role based in Menlo Park, CA (moving to Sunnyvale, CA in Fall 2026). Our current flexible work arrangement policy requires that a minimum of 80%, or 32 hours, of your total work week be on-site for this role.
Your specific schedule, determined in collaboration with your manager, will align with team and business needs and could exceed the 80% requirement for the site.

Responsibilities

Staff Network Engineering - AWS and Hybrid Cloud

AWS VPC Engineering
- Design, build, and maintain Amazon VPCs, including CIDR planning, subnet design (public/private), route tables, Internet Gateways (IGW), NAT gateways, and VPC endpoints (Interface/Gateway).
- Configure and manage security controls such as Security Groups, NACLs, AWS Network Firewall, and AWS WAF for defense‑in‑depth across environments.

Hybrid Connectivity
- Implement and support hybrid connectivity using AWS Direct Connect, Site‑to‑Site VPNs, and AWS Transit Gateway for scalable VPC‑to‑VPC and on‑prem connectivity.

Traffic Management & DNS
- Configure Amazon Route 53 for internal and external DNS, routing policies, health checks, and failover.
- Deploy and manage Elastic Load Balancing (ALB/NLB/GWLB) to provide high availability, SSL termination, path‑based routing, and/or TCP/UDP load balancing.

On‑Prem & Data Center Networking
- Operate and troubleshoot on‑prem and data center networks using Juniper and Aruba platforms (switching, routing, VLANs, VRFs, BGP/OSPF).
- Configure, manage, and tune Palo Alto Networks firewalls, including security policies, NAT, VPN, and content inspection.

Monitoring, Logging & Dashboards
- Design and implement end‑to‑end monitoring, alerting, and dashboards for network health, performance, and security, leveraging tools such as:
  - VPC Flow Logs, CloudWatch metrics/logs, and Route 53 health checks.
  - Firewall logs and on‑prem device telemetry.
- Build and maintain dashboards for:
  - Link utilization, latency, packet loss, and error rates (DX, VPN, TGW, campus links).
  - Load balancer health, connection metrics, and capacity.
  - DNS performance and resolution issues.
- Establish actionable alerting thresholds and runbooks to support rapid incident triage and resolution.
Capacity Planning & Performance
- Perform ongoing capacity planning for AWS networking (VPCs, TGW, DX, VPN, load balancers) and on‑prem links, forecasting growth and identifying bottlenecks.
- Analyze traffic patterns and utilization data to right‑size connectivity, optimize routing, and plan upgrades before they become constraints.
- Run performance tests and baselines (throughput, latency, failover behavior) and tune configurations accordingly.

Incident Response & Troubleshooting
- Lead network‑related incident response, including real‑time troubleshooting across layers (DNS, TCP/IP, TLS, HTTP, internal app protocols).
- Drive root‑cause analysis (RCA) and implement corrective and preventive actions (runbooks, automation, design changes).

Architecture & Design (Significant Component)