Lead AI Infrastructure Engineer (Python/ML)

USA, Remote

Type: Long-term contract
Location: Remote (overlap with PST)

At Sphere, we partner with a global logistics company leveraging AI, Machine Learning, and Data Engineering to optimize warehouse operations, predictive maintenance, and route planning.

Role: Build and maintain scalable AI infrastructure, enabling teams to run ML experiments, deploy machine learning models, and implement MLOps pipelines for production-grade AI.

Key Responsibilities:

  • Design distributed training pipelines for large-scale machine learning and deep learning models.
  • Optimize compute and storage resources for cloud-based AI/ML workloads on AWS, GCP, or Azure.
  • Collaborate with data scientists and ML engineers to deploy models in production efficiently.
  • Implement monitoring, logging, and alerting for model performance and AI workflows.
  • Ensure scalable, maintainable, and reliable AI infrastructure to support real-time and batch ML applications.

Requirements:

  • 5+ years of experience with Python and ML infrastructure.

  • Experience with cloud AI platforms (AWS SageMaker, GCP AI Platform, Azure ML).

  • Experience with containerization (Docker), orchestration (Kubernetes), and CI/CD for ML.

  • Experience with distributed systems, data pipelines, and high-performance computing for AI.

  • Hands-on experience with deep learning frameworks such as TensorFlow or PyTorch.
