AI / DATA ENGINEER
The AI/Data Engineer designs and implements the data ingestion, processing, and indexing pipelines for our RAG Search application. The role is responsible for secure and efficient data management, from ingestion to retrieval, and for building advanced retrieval and retrieval-augmented generation (RAG) capabilities. The successful candidate will have strong hands-on experience in data engineering, search technologies, and AI-powered search applications, with a focus on Python and enterprise search platforms.
- Design and implement scalable data ingestion pipelines for structured, semi-structured, and unstructured content from various sources.
- Develop connectors and ingestion jobs for batch, incremental, and near-real-time indexing.
- Implement mechanisms to track document versions, provenance, permissions, and deletion events.
- Extract content from various file formats and convert it into standard Markdown that AI models can read reliably.
- Design semantic chunking strategies, including chunk size and overlap, that optimize retrieval quality.
- Implement metadata extraction, enrichment, and deduplication during ingestion.
- Build hybrid search capabilities combining keyword, semantic vector, and metadata-based retrieval.
- Develop re-ranking pipelines that use advanced techniques to improve the relevance of results.
- Implement security controls that restrict each user's access to the information they are authorized to see.
- Build RAG pipelines, including agentic workflows, prompt engineering, and response grounding.
QUALIFICATIONS
- Hands-on experience in data engineering, search technologies, and AI-powered search applications.
- Proficiency in Python and data processing frameworks/libraries.
- Strong experience with enterprise search platforms (e.g., Elasticsearch, OpenSearch).
- Knowledge of index mappings, metadata filters, ranking profiles, and relevance tuning.
- Experience with vector search, hybrid retrieval, and semantic search techniques.
- Ability to design ingestion pipelines that convert enterprise documents into structured Markdown.
- Familiarity with embedding models, re-ranking models, and LLM orchestration frameworks.
- Experience with tool calling, agentic workflows, and AI application orchestration.
- Working knowledge of commercial/open-source LLMs (e.g., Azure OpenAI, OpenAI).
- First-level university degree with a minimum of 5 years of professional experience, including 3 years of hands-on experience building AI-powered search applications.