Job Description
Must-Have Skills:
LLM, MLOps, PyTorch, RAG, Vector Database, Cloud Server (Google / AWS), Python
Good-to-Have Skills:
CI/CD, GCP Vertex, LangChain, SageMaker, TensorFlow
About the Role:
Pentimenti AI (one of Uplers' clients) is looking for a Senior Machine Learning Engineer who is passionate about their work, eager to grow, and committed to delivering exceptional results. If you are a team player with a positive attitude, this is the opportunity for you!
Agentic platforms represent the third wave of AI, enabling complex multi-step work via autonomous LLM-powered agents. You'll own the stack, from research to deployment, helping ship magical, high-impact features for users.
Key Responsibilities:
- Own the agentic & RAG roadmap: design, prototype, and launch LLM agents (planner-executor, multi-agent, tool-calling) with sub-second latency.
- Productionize RAG pipelines: embedding strategy, vector-DB design (Weaviate, Pinecone), hybrid search, evaluations, and guardrails.
- Fine-tune models with PEFT/LoRA, RLHF, and safety alignment; publish impactful research.
- Optimize inference: quantization (INT4/INT8), speculative decoding, and serving with TensorRT-LLM, vLLM, or Ray Serve to reduce token costs.
- Lead and mentor a high-agency team; establish MLOps, CI/CD, observability, and governance standards.
- Partner with product & design to turn research into scalable, user-facing features.
Core Qualifications:
- Experience: 5+ years in software/ML, including 2+ years in LLM/NLP product delivery.
- Deep Learning Stack: Python and PyTorch (TensorFlow/JAX welcome). CUDA/Triton knowledge is a plus.
- Agentic & RAG Frameworks: Experience with LangChain, LlamaIndex, CrewAI, and vector DBs like Weaviate, Pinecone, Qdrant.
- Model Optimization: Quantization, distillation, AWS Neuron, GPU kernel tuning.
- Cloud & MLOps: Kubernetes, Ray, SageMaker, or GCP Vertex. Familiar with Terraform/Pulumi and observability tools.
- Communication & Leadership: Strong design documentation and cross-functional leadership skills.
Bonus Skills:
- Multimodal agent systems (vision-language, audio-language)
- Privacy-preserving ML (federated learning, differential privacy)
- Open-source contributions (LangChain, Pinecone, Triton, etc.)