Tower Research Capital LLC, a high-frequency proprietary trading firm founded in 1998, seeks a Linux System Administrator to join our Server Reliability Engineering team. The Server Reliability Engineering organization is responsible for providing innovative processes and tools for the operation of Tower's high-frequency Linux-based trading platforms and High Performance Computing (HPC) environment. You will also be expected to propose and drive the adoption of Infrastructure as Code (IaC) practices to make our storage solutions scalable and manageable, and to help meet our growing GPU computing needs, balancing on-premises and cloud-based resources.
Responsibilities
Supporting, maintaining, and enhancing the firm's trading Linux infrastructure
Supporting, maintaining, and enhancing the firm's HPC infrastructure for research
Providing support specifically for the Linux and HPC environments, including:
Emergency response
Execution of planned changes, updates, and deployment projects within the Linux server infrastructure
Managing HPC systems and the Condor job scheduler to support trading operations
Advanced profiling and troubleshooting of performance issues specifically within the Linux server environment
Contributing to the development and refinement of tools and systems to automate provisioning, configuration, and monitoring of thousands of Linux servers
Management of essential core services such as DHCP, LDAP, DNS, and NFS for on-prem and hosted data centers as well as public clouds
Participating in an on-call rotation and occasional weekend shifts
Engaging in daily direct communication with trading teams and core engineering
Staying up to date with the latest technologies and best practices in HPC, storage, and GPU computing
Qualifications
Experience in maintenance, operation, and administration of a sufficiently advanced Linux environment
Daily use of and contribution to developing automation and monitoring tools
Comprehensive understanding of Linux OS concepts and internals
Working knowledge of Intel-based hardware and server components
Good knowledge of Python and expert knowledge of Bash for scripting and automation tasks in a Linux environment
Understanding of Linux server-side networking and typical network protocols
Participation in open source or personal projects is a plus
Understanding of Linux configuration management, source control, CI/CD, and automated deployment
Strong communication skills and the ability to work effectively in a team
Preferred Qualifications
Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
Familiarity with cloud computing platforms and hybrid cloud environments.
Knowledge of parallel file systems (e.g., GPFS), batch systems (e.g., Slurm, Grid Engine, Condor), and high-performance network interconnects.
Experience with VAST and Weka storage solutions is highly desirable.
Solid understanding of trading infrastructure and low-latency systems.
Excellent problem-solving skills and the ability to work in a fast-paced, dynamic environment.
Skills in managing hybrid cloud/on-premises environments.
Experience proposing and implementing Infrastructure as Code (IaC) practices from the ground up.
Anticipated New York annual base salary range: $100,000-$130,000, plus eligibility for a discretionary bonus.
Benefits
Tower’s headquarters are in FiDi, the heart of downtown Manhattan, at the historic Equitable Building. While we work hard, Tower’s cubicle-free workplace, jeans-clad workforce, and well-stocked kitchens reflect the value the firm places on quality of life. Benefits include:
401(k) with company matching
5 weeks of paid vacation per year plus 11 paid holidays
Free breakfast, lunch, and snacks on a daily basis
Reimbursement for health and wellness expenses
Free events and workshops
Donation matching program
Tower Research Capital is an equal opportunity employer.