
We are looking for a Software/Production ML Engineer to own and evolve real-world, production-grade AI systems within a fast-paced insurance technology company.
This is a hands-on engineering role focused on building, deploying, and operating customer-facing and internal AI services in production. Our team owns multiple live systems, including real-time decisioning pipelines, AI-driven operational automations, chatbots, and the ML infrastructure that powers them.
This role is not focused on offline modeling or research-only machine learning. We are looking for engineers who take end-to-end ownership of ML systems: from data and features to inference services, deployment, monitoring, and on-call support in production environments. Candidates whose experience is mostly in offline modeling, experimentation, or handoff-based deployment workflows will not be a good fit for this role.
Design and build APIs and pub/sub event streams to support real-time machine learning inference and automated agentic processes.
Help develop and maintain both online and offline feature stores for machine learning.
Gain familiarity with the property and casualty (P&C) insurance sector, including key policyholder and product attributes, to help improve model effectiveness.
Implement industry-standard MLOps and LLMOps techniques to monitor ML models, feature sets, and agentic systems for performance degradation and data drift.
Support the ongoing development of our core MLOps platform, as well as the codebase and infrastructure for serverless AI applications.
Validate the performance of machine learning models through rigorous training and testing methodologies.
Collaborate with Data Science teams to engineer new features, construct transformation pipelines, integrate custom loss functions, and experiment with novel inference strategies such as chaining and shadow deployments.
Create and scale new agentic AI automations, guiding them from initial proof-of-concept through to full production deployment.
Construct evaluation frameworks designed to rigorously test AI applications, covering not only standard workflows but also the complex, real-world scenarios common to the car insurance domain.
Use the Python data ecosystem to deliver machine learning projects and initiatives.
Take part in the team's weekly on-call rotation, addressing alerts promptly to maintain high service availability for both customers and internal stakeholders.
Experience writing production-quality Python code.
Experience with Python data science and machine learning libraries, including scikit-learn, pandas, numpy, and related tools.
Experience deploying, operating, and supporting ML or AI services in production, including monitoring, incident response, and iterative improvement.
Hands-on experience with AWS (e.g., Lambda, Step Functions, DynamoDB, IAM, containerized services).
Experience with Kafka or other event-driven / pub-sub systems.
Experience with Git and CI/CD pipelines in production environments.
Experience building or operating MLOps platforms or ML infrastructure.
Experience with real-time data pipelines and streaming architectures.
Experience with AI chatbots, LLM-based systems, or retrieval-augmented generation (RAG).
Familiarity with feature stores, model monitoring, and deployment strategies such as A/B or shadow deployments.