Job Description
Viridien (www.viridiengroup.com) is an advanced technology, digital and Earth data company that pushes the boundaries of science for a more prosperous and sustainable future. With our ingenuity, drive and deep curiosity we discover new insights, innovations, and solutions that efficiently and responsibly resolve complex natural resource, digital, energy transition and infrastructure challenges.

Job Details
We are seeking a highly experienced and skilled HPC Data Center Senior Linux IT Specialist to join our IT team. This role will play a vital part in contributing to the global HPC and Digital Platform team, ensuring seamless HPC and Cloud infrastructure performance. The successful candidate will have a proven track record in Linux administration, with a focus on storage systems, as well as a strong understanding of system administration, troubleshooting, and IT service management.

Key Responsibilities
- Develop and maintain technical documentation, including system diagrams, configuration files, and troubleshooting guides, to facilitate knowledge sharing and operational continuity.
- Install, configure, maintain, and repair Linux-based hardware and software, ensuring high functionality, performance, and reliability.
- Actively participate in the Change Management process, ensuring all proposed changes are documented, reviewed, and assessed in accordance with organizational and ITIL standards.
- Organize and schedule upgrades, perform routine maintenance, and ensure system security and privacy compliance.
- Provide expert-level technical support to users and junior team members, including mentoring and training to enhance skills and knowledge.
- Respond to support tickets, troubleshoot complex issues, and resolve system crashes, performance degradation, and security breaches.
- Monitor systems and services through performance monitoring, log analysis, and proactive issue detection.
- Continuously improve services through automation, optimization, and standardization, while staying updated on technology trends and best practices.
- Configure and manage SLURM job queues, troubleshoot job scheduling issues, and support high-performance computing environments.
- Collaborate with cross-functional teams to align IT services with user and business needs and ensure compliance with company IT policies and standards.
- Demonstrate strong project and time management skills to prioritize tasks effectively in a dynamic environment.
- Support in-house software applications, adhering to organizational standards and procedures.

Essential Skills & Competencies
- 5+ years of experience in Linux administration, preferably in an HPC environment.
- Strong understanding of system administration, troubleshooting, and IT service management.
- Experience deploying and administering data storage systems (SAN, NAS, tape archives).
- Experience with automation/configuration management using Puppet, Chef, Salt, Ansible, GitLab, or an equivalent.
- Ability to use a wide variety of open-source technologies and cloud services.
- Experience with Docker and container orchestration.
- Familiarity with coding and scripting (Bash, Python, Perl), including shell scripting.
- Excellent troubleshooting and problem-solving skills.
- Ability to create clear, accurate, and comprehensive technical documentation for systems, procedures, and workflows.
- Ability to follow established change control procedures to ensure system stability and reduce operational risk.

Desirable
- Experience in virtualization and hardware maintenance (Storage/CPU/GPU).
- Certifications like CCNA or CompTIA Network+ are a plus.
- ITIL Foundation level certification.
- Knowledge of GPUs.
- Experience with High Performance Computing (HPC) and clustering technology (object storage, parallel file systems, RAID storage).
- Understanding of networks, RAID, and tape subsystems.
- Experience in a high-volume critical production service environment.

Qualifications And Experience
- Bachelor's degree in IT, Computer Science, Computer Engineering, or a related field (or equivalent work experience).
- 5+ years of extensive experience in Linux administration, preferably in an HPC environment.
- Experience with scripting and automation.
- Knowledge of internet security and data privacy principles.
- Excellent communication, presentation, and customer service skills, with an outstanding track record of meeting customer expectations.
- Must be detail-oriented and work well in a team environment.
- Must have leg