Computer Vision/Machine Learning Engineer
Job ID: 75557
Posted today
Cupertino, California
47 - 72/hr
Contract
On-Site
Job Details
Computer Vision / Multimodal AI Engineer
The Select Group is looking to add a Computer Vision and Multimodal AI Engineer to a highly innovative AI team to help shape the next generation of intelligent, AI-powered business solutions.
This is an opportunity to work on complex, real-world challenges at the intersection of Computer Vision, Vision-Language Models (VLMs), NLP, and agentic AI. Rather than maintaining existing systems, this individual will help build and scale new capabilities that transform visual information into actionable insights and business value.
We are seeking someone who enjoys experimentation, thrives in ambiguity, and can take an idea from concept through production deployment. Success in this role will come from a combination of technical depth, curiosity, and the ability to translate cutting-edge AI technologies into practical solutions that users can rely on every day.
What You'll Be Helping Build
- Intelligent systems that analyze and understand visual data at scale.
- Multimodal AI solutions that combine images, language, and reasoning to deliver richer insights.
- Agentic AI workflows capable of surfacing recommendations and automating decision-making processes.
- Production-grade AI applications that support both operational efficiencies and customer-facing experiences.
- Scalable architectures that enable rapid experimentation while supporting long-term growth.
What Success Looks Like
The right person will be comfortable owning the full lifecycle of machine learning initiatives, from designing experiments and validating assumptions to deploying models and continuously improving performance.
This role will have significant influence on how AI solutions are architected, evaluated, and scaled. Strong candidates are naturally inquisitive, enjoy investigating why models succeed or fail, and can turn those findings into measurable improvements.
Technical Environment
- Computer Vision and Machine Learning
- Vision-Language Models (CLIP, BLIP, Gemini, and similar models)
- Multimodal AI and reasoning systems
- Agentic AI workflows
- Python
- PyTorch, TensorFlow, Core ML, or similar ML frameworks
- Cloud-based AI deployment environments
- Model evaluation, optimization, and performance tuning
Who Tends to Excel in This Environment
- Engineers who have successfully deployed machine learning solutions into production.
- Professionals who enjoy building and testing new ideas rather than simply maintaining existing systems.
- Individuals who can move comfortably between research, experimentation, engineering, and business conversations.
- Practitioners who are excited about emerging technologies in multimodal AI, visual reasoning, and autonomous agents.
- Self-starters who can operate independently while collaborating effectively across engineering, product, and business teams.
Why This Opportunity Stands Out
- Work on some of the most rapidly evolving areas of artificial intelligence.
- Opportunity to influence architecture, strategy, and technical direction.
- Exposure to Computer Vision, multimodal AI, VLMs, and agentic systems in production environments.
- Collaborative team culture that values experimentation, innovation, and measurable business impact.
- Long-term engagement supporting the development of next-generation AI capabilities.