Senior AI Engineer Job, Houston area, Texas, USA, Software Development

Overview

Role:
Senior AI Engineer

Location:
Houston, TX 77007 (mostly remote)

Duration:
Direct Hire

Work Authorization: US citizens, green card holders, or others authorized to work in the US

Job Description

We are seeking a talented Senior AI Engineer with deep expertise in Large Language Model (LLM) engineering and design. The ideal candidate will be fluent in manipulating and integrating pre-trained LLMs within complex codebases to tackle practical challenges in data extraction, processing, and interactive systems. This role prioritizes hands-on application—such as customizing LLMs for specific tasks, including retrieval-augmented generation (RAG) pipelines and vision-based workflows—over building models from scratch.

You'll focus on leveraging LLMs for solutions like advanced chatbots, natural language interfaces, semantic search, and creative problem-solving across data-intensive scenarios, while collaborating in a fast-paced team to push our AI products forward.

Key Responsibilities

Integrate and fine-tune pre-trained LLMs into our codebase using APIs, frameworks, and orchestration tools to enable features like natural language querying, automated summarization, and intelligent anomaly detection in data streams.
Design, build, and optimize chatbot systems and conversational AI, incorporating LLMs for seamless user experiences, including multi-turn dialogues, context-aware responses, and integration with external data sources.
Implement RAG architectures to enhance LLM performance by combining retrieval from vector databases with generation, enabling accurate responses grounded in large-scale document corpora.
Apply advanced LLM techniques—such as prompt engineering, chain-of-thought prompting, retrieval-augmented generation (RAG), and agentic workflows—to solve general problems like automating workflows, debugging data pipelines, or generating insights from unstructured inputs.
Work with open-source LLMs (e.g., Gemma, Llama) for local deployment and inference, optimizing for on-premises or edge environments to ensure low-latency performance and data sovereignty.
Incorporate computer vision tasks, such as OCR for text extraction from images or documents, and broader CV techniques for processing visual data in hybrid LLM pipelines.
Experiment iteratively with LLM configurations, hyperparameters, and embeddings to boost performance metrics like accuracy, latency, and cost-efficiency in real-world scenarios, including vector search optimizations.
Maintain scalable codebase integrations, conduct thorough testing (e.g., unit tests for LLM outputs, A/B evaluations), and ensure compliance with AI best practices, including bias mitigation and data security.
Collaborate with cross-functional teams on code reviews, rapid prototyping, and knowledge sharing, while leveraging AI coding assistants to accelerate development.

Required Qualifications

Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field.
3+ years of professional experience in AI engineering, specifically manipulating and deploying LLMs in production (e.g., via Hugging Face Transformers, LangChain, LlamaIndex, or OpenAI/Groq APIs), including hands-on work with open-source models like Gemma and Llama for local deployment (e.g., using Ollama, vLLM, or direct PyTorch inference setups).
Advanced proficiency in Python, including scripting for LLM pipelines, handling dependencies with tools like Poetry or Pipenv, and integrating with libraries such as Sentence Transformers for embeddings, FAISS for vector search, or Streamlit/Gradio for prototyping interfaces.
Experience with vector databases and semantic search (e.g., Pinecone, Weaviate, or FAISS) to support efficient retrieval in LLM applications.
Demonstrated expertise in RAG systems, from building retrieval components to integrating them with LLMs for enhanced reasoning and factuality.
Proven track record building and optimizing chatbots or conversational agents (e.g., using Rasa, Dialogflow, or custom LLM-based setups), with examples of deploying them in user-facing applications.
Strong general problem-solving abilities, demonstrated through projects…

