Machine Learning Engineer, Ads Training Platform

Reddit, Inc.
🇺🇸 United States – Remote
Full-time
$125K–$157K
Estimated
Remote

Required Skills

⏰ Full Time
🟑 Mid-level
🟠 Senior
πŸ€– Machine Learning Engineer
Distributed Systems
PyTorch
Ray
TensorFlow
Cloud
Python
scikit-learn
🔴 Lead
🦅 H1B Visa Sponsor
Airflow
AWS
Azure
Docker
Google Cloud Platform
Keras
Kubernetes
Terraform
NumPy
Pandas
Machine Learning
Deep Learning
RAG
R

Job Description

📋 Description

• Design, build, and maintain large-scale distributed training infrastructure for Ads ML models.
• Develop tools and frameworks on top of the Ray platform.
• Build tools to debug, profile, and tune distributed training jobs for performance and reliability.
• Integrate with object storage systems and improve data access patterns.
• Collaborate with ML engineers to improve model training time, efficiency, and GPU training costs.
• Drive improvements in scheduling, state management, and fault tolerance within the training platform to enhance overall performance.

🎯 Requirements

• 3+ years in infrastructure/platform engineering or large-scale distributed systems.
• 2+ years of hands-on experience with the Ray platform.
• Strong understanding of distributed computing principles (task scheduling, fault tolerance, state management).
• Experience with distributed storage systems and large-scale data processing.
• Proven ability to debug and profile distributed jobs.
• Experience with deep learning frameworks (PyTorch, TensorFlow) is a big plus.
• Bonus: model optimization for distributed training, Ads ML experience.

🏖️ Benefits

• Comprehensive Healthcare Benefits and Income Replacement Programs
• 401k Match
• Family Planning Support
• Gender-Affirming Care
• Mental Health & Coaching Benefits
• Flexible Vacation & Reddit Global Days off
• Generous paid Parental Leave
• Paid Volunteer time off

Job Details

Employment Type

Full-time

Salary Range

$125K–$157K

Estimated

Location

🇺🇸 United States – Remote

Remote Work

Remote Friendly