Roots is seeking a motivated AI Engineer to join our growing team and help shape the future of how businesses leverage AI for speed, accuracy, and scale.
We’re building cutting-edge infrastructure that powers real-time document processing, custom model deployment, and multimodal intelligence, bringing AI from research to production in one of the world’s most critical and fast-moving markets. Our platform automates complex insurance workflows quickly and with 98%+ accuracy, helping carriers and brokers reach value in weeks rather than months.
About the team:
We specialize in fine-tuning state-of-the-art models and developing our own custom multimodal systems that outperform leading reasoning models on our specific use cases. Our work spans everything from creating low-latency inference endpoints to embedding these models directly into customer workflows. Every line of code we write is in service of speed, scalability, and delivering meaningful value.
Responsibilities:
- Develop and maintain scalable LLM inference systems, optimizing for latency, throughput, and resource consumption across GPU clusters
- Implement model compression techniques (e.g., quantization, pruning) and streaming optimizations to reduce memory usage and maintain low-latency inference for both base and fine-tuned models
- Design failover strategies and resource allocation mechanisms to ensure uptime SLAs while managing GPU workloads effectively
- Integrate and benchmark model-serving frameworks (e.g., vLLM, Triton, ONNX Runtime) to identify performance trade-offs and drive targeted optimizations
- Apply HPC techniques to profile and optimize GPU workloads, identifying and eliminating performance bottlenecks
- Collaborate with data engineering and ML research teams to align model architecture changes with inference performance goals
Qualifications:
- Graduate degree in Computer Science, Electrical Engineering, Mathematics, Statistics, or a related field, or equivalent relevant experience.
- Proven experience with LLM inference, including optimizing and deploying models (e.g., DeepSeek) on multi-GPU setups with tools such as vLLM, TensorRT, and ONNX Runtime to ensure real-time, high-performance model serving in production environments.
- Proficiency in Python and familiarity with machine learning libraries and frameworks such as PyTorch, Hugging Face, scikit-learn, pandas, and NumPy.
- Experience with foundational machine learning algorithms, with hands-on application in developing and optimizing large language models.
- Excellent written and verbal communication skills, with a strong emphasis on the written word. We highly appreciate public articles or blogs that highlight communication skills.
- Demonstrated ability to work independently, prioritize tasks, and manage multiple projects simultaneously in a fast-paced and dynamic environment.
Our cash compensation range for this role is $160,000 to $220,000.
Roots is building agentic AI to transform insurance.
We’re trusted by 3 of the Top 5 P&C insurance carriers and 3 of the Top 10 brokers, and we’ve built InsurGPT™, the most advanced Generative AI model trained on proprietary, insurance-specific data. It reads, reasons, and infers like a human, with purpose-built models tuned for real-world performance and precision.
Roots has raised $44M from top-tier investors including Harbert Growth Partners, MissionOG, and Liberty Mutual Strategic Ventures. We’re growing fast, tackling massive industry challenges, and shaping the future of AI in one of the world’s largest markets.
Join a world-class team building multimodal AI systems, shipping production-grade models, and redefining what’s possible through collaboration, speed, and curiosity.
Final offer amounts are determined by multiple factors, including experience and expertise, and may vary from the amounts listed above.
Equity: In addition to the base salary, equity may be part of the total compensation package.
Roots Automation is an Equal Opportunity Employer. All applicants will be considered for employment without regard to race, color, religion, sexual orientation, gender identity, national origin, veteran status, or disability status. Roots Automation is a progressive and open-minded workplace where we do not tolerate discrimination or harassment in any form. If you are smart, passionate, and good at what you do, come as you are.
Roots Automation is also committed to providing reasonable accommodations to individuals with disabilities throughout the application process and employment. If you need assistance or an accommodation, please contact us for more information.