Senior Software Engineer/Developer - AI
Overview
The Senior Principal Software Engineer/Developer – AI serves as a senior, hands-on full-stack AI engineer and technical authority, leading the technical strategy, design, and delivery of large-scale mission-critical AI systems supporting federal programs (e.g., HUD, AIR platform). This role combines senior technical leadership, hands-on expertise in Python-based AI/ML systems (including large language models), and ownership of enterprise architecture, governance, and innovation.
Responsibilities
- Serve as the primary technical authority, defining AI and application architecture across multiple programs
- Establish enterprise modernization roadmaps aligned to mission outcomes, compliance, and scalability
- Lead architecture for distributed, cloud-native, and hybrid AI systems
- Define and enforce reference architectures, standards, and reusable frameworks
- Drive cross-program technical decision-making to ensure interoperability, security, and long-term sustainability
- Advise senior federal stakeholders (SES-level and above) on AI adoption, modernization, and risk management
- Lead design, development, and deployment of advanced AI solutions using Python as the primary development language, including large language models (LLMs) and foundation models, Retrieval-Augmented Generation (RAG) systems, agentic workflows, and orchestration frameworks
- Architect and implement scalable ML systems and services built on Python-based frameworks and APIs
- Build full-stack AI applications end to end, from user-facing interfaces to back-end services, APIs, and data layers
- Integrate AI and LLM capabilities into existing enterprise applications and legacy platforms (e.g., content management, case management, and records systems) via APIs, middleware, and event-driven patterns
- Define and implement distributed training strategies (GPU/TPU clusters, parallelization, optimization)
- Oversee the full ML lifecycle in partnership with the Senior Data Scientist: data pipelines, feature engineering, training, evaluation, deployment, and monitoring
- Drive model optimization techniques (quantization, distillation, caching) to improve performance and cost
- Establish robust MLOps practices leveraging Python-driven automation, pipelines, and tooling
- Stand up the enterprise CI/CD-to-AI/MLOps pipeline, beginning with time-boxed proofs of concept and MVP implementations that mature into production systems
- Serve as subject matter expert in federal AI policy (e.g., NIST AI RMF, OMB M-25-21 and M-25-22, Executive Order 14179)
- Define and operationalize Responsible AI frameworks, including model validation and evaluation; bias mitigation and fairness; and explainability, auditability, and safety
- Ensure compliance with FISMA, FedRAMP, NIST 800-53, privacy, and Section 508 requirements
- Lead large-scale modernization initiatives (e.g., legacy-to-cloud migration, microservices transformation), including Python-based refactoring and re-platforming efforts
- Define repeatable modernization frameworks and accelerators
- Oversee DevSecOps pipelines, CI/CD automation, zero-trust architectures, and secure software supply chain practices
- Ensure delivery of resilient, high-availability systems in regulated federal environments
- Lead multiple concurrent engineering efforts across integrated teams
- Provide technical leadership to architects, engineers, and DevSecOps specialists, including establishing Python coding standards and engineering best practices
- Mentor senior engineers and technical leaders; elevate engineering excellence and code quality
- Support technical strategy in proposals, captures, and client engagements
- Contribute to thought leadership (whitepapers, architecture patterns, platform strategy)
Required Skills
- Expert-level proficiency in Python, including building large-scale AI/ML systems, APIs, and data pipelines
- Full-stack engineering skills, including front-end frameworks, back-end services, RESTful APIs, microservices, and cloud-native deployment (e.g., containers, Kubernetes)
- Deep expertise in machine learning and deep learning, particularly transformer-based models and LLMs
- Hands-on experience with ML frameworks (PyTorch, TensorFlow, JAX) and distributed training (DeepSpeed, FSDP, Horovod)
- Proven ability to integrate AI capabilities into existing and legacy enterprise systems (e.g., legacy CMS or COTS platforms) using APIs, middleware, connectors, and event-driven architectures
- Strong understanding of large-scale data systems and ML evaluation methodologies
- Experience working with sensitive data, including PII safeguards such as anonymization, masking, and data loss prevention
- Experience with enterprise integration technologies, including REST/SOAP services, message queues, ETL pipelines, and SQL/NoSQL databases
- Expertise designing AI systems in cloud-native, distributed environments across AWS, Azure, and GCP
- Proficiency with managed generative AI services (e.g., AWS Bedrock, Azure OpenAI Service, Google Vertex AI) and integrating frontier models such as GPT, Claude, and Gemini
- Hands-on experience with LLM application stacks, including orchestration frameworks (e.g., LangChain, LlamaIndex, Semantic Kernel), embeddings, vector databases, and prompt engineering
- Executive communication skills with experience influencing senior leaders
- Demonstrated ability to own solutions end to end, from discovery and prototyping through production deployment, integration, and ongoing support
- Ability to balance strategic vision with deep hands-on technical execution
Qualifications
- U.S. citizenship required
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
- 12–15+ years of software engineering experience, including significant leadership responsibility
- 8+ years of applied AI/ML experience, including building and deploying production systems (LLMs, generative AI, and large-scale or distributed model systems)
- Expert-level Python development experience, including designing production-grade ML systems, data pipelines, and microservices-based architectures
- Deep experience with cloud platforms (Azure, AWS, GCP), including FedRAMP environments
- Experience with AI platforms and architectures (e.g., AWS Bedrock, Azure OpenAI Service, Google Vertex AI, RAG, agents)
- Proven success delivering enterprise-scale systems and modernization programs
- Strong background in microservices, APIs, distributed systems, and DevSecOps practices
- Experience managing GPU-based infrastructure or high-performance ML environments
- Demonstrated ability to translate AI research into production systems
- Active clearance (Public Trust, Secret, or higher) preferred
- Experience with HUD or federal civilian agencies preferred