Anthropic Application
Solutions Engineer
building a production
RAG assistant
a sovereign HR assistant for French public sector teams
EF
Edouard Foussier · AI Engineer
The Challenge
the problem
helping HR managers navigate 150k+ pages of regulations
Building an assistant for French public sector teams to navigate complex employment regulations.
Every answer must be traceable to source documents.
Context
standard RAG refresher
a simple pipeline with hidden trade-offs
Ingest
PDF, DOCX, HTML
→
Chunking
Fixed-size split
Failure Point: Facts split across chunks
→
Embedding
Vector store
Failure Point: Domain vocabulary limits
→
Retrieval
Top-K similarity
Failure Point: K too small = missed evidence
→
Generation
LLM answer
Failure Point: Hallucination
Every box is a trade-off. How we chunk, embed, choose K, and assemble context decides
whether the LLM sees the right evidence or a noisy window.
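The chunking failure point above is easy to reproduce. A minimal sketch (the sample text and chunk size are illustrative, not from the deployed system) of how a fixed-size split cuts a fact in half, so a similarity search may retrieve only part of it:

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Naive fixed-size chunking: split every `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = ("Article 12: agents get 25 days of annual leave. "
       "Part-time agents get a prorated amount.")
chunks = fixed_size_chunks(doc, 40)
# "annual leave" now straddles the chunk boundary: the first chunk ends
# with "annual" and the second starts with " leave", so a query about
# annual leave may match neither chunk well.
```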
My Solution
the pipeline
how the pieces connect
User Query
BGE-M3 Embedding
Qdrant Vector Search
Reranker (Optional)
LLM Generation
Citation Linkification
Sovereign Embeddings
- BGE-M3: multilingual, open-source
- No external API dependency
- Full control over data
HNSW Indexing
- Sub-100ms retrieval on 150k chunks
- Tuned ef=128 for recall/speed
Auto Citations
- Every claim linked to source URL
- Regex post-processing: [1] → links
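A hedged sketch of what the citation post-processing step could look like (the function name, markdown link format, and source map are assumptions for illustration):

```python
import re

def linkify_citations(answer: str, sources: dict[int, str]) -> str:
    """Replace [n] citation markers with markdown links to source URLs.

    `sources` maps citation number -> URL; unknown numbers are left as-is.
    """
    def repl(m: re.Match) -> str:
        n = int(m.group(1))
        url = sources.get(n)
        return f"[[{n}]]({url})" if url else m.group(0)
    return re.sub(r"\[(\d+)\]", repl, answer)

answer = "Annual leave is 25 days [1], prorated for part-time agents [2]."
sources = {1: "https://example.org/article-12"}  # [2] has no known URL
linked = linkify_citations(answer, sources)
```

Leaving unknown markers untouched (rather than dropping them) keeps the answer auditable even when a source mapping is missing.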
Decisions
technical choices
trade-offs I made and why
BGE-M3 over OpenAI
Multilingual, open-source embeddings. No API costs, no data leaving the infra. Critical for sovereignty requirements.
Qdrant + HNSW
Vector DB with tunable HNSW. ef=128 gives 98% recall with sub-100ms latency on 150k chunks.
Optional Reranking
Cross-encoder reranker on complex queries only. +15% relevance when needed, saves compute otherwise.
Citation Linkification
Regex post-processing turns [1], [2] into clickable source links. Trust through transparency.
# retrieval.py: the core search logic
from qdrant_client.models import SearchParams

hits = qdrant.search(                          # qdrant: a QdrantClient instance
    collection_name="rag_rh_chunks",
    query_vector=embed(query),                 # BGE-M3 dense embedding
    limit=32,
    search_params=SearchParams(hnsw_ef=128),
)
if use_rerank:
    hits = rerank(query, hits, top_k=8)        # cross-encoder reranker
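The deck doesn't say how `use_rerank` is decided, only that reranking runs "on complex queries only". One plausible, purely illustrative gate (the heuristic and keyword list below are my assumptions, not the deployed logic) is a cheap query-complexity check:

```python
# Illustrative rerank gate: cross-encode only when the query looks complex
# enough that top-K vector similarity alone may misorder the hits.
COMPLEX_KEYWORDS = {"compare", "versus", "vs", "difference",
                    "pourquoi", "comment"}  # hypothetical keyword list

def should_rerank(query: str, max_simple_tokens: int = 6) -> bool:
    """Return True for long or comparative/how-why questions."""
    tokens = query.lower().split()
    return (len(tokens) > max_simple_tokens
            or any(t in COMPLEX_KEYWORDS for t in tokens))
```

A gate like this keeps the cross-encoder off the hot path for short lookup queries, which is where the "saves compute otherwise" claim comes from.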
Demo
see it live
the system is deployed and serving real users
// live deployment
Self-hosted on my own infrastructure
Caddy reverse proxy · Docker Compose · Automatic TLS · Qdrant vector DB
Try it → rag.edouardfoussier.com
Why Me
what I'd bring to Anthropic
from building production AI systems
- Technical depth with production mindset: I've shipped AI to real users, not just prototypes
- Customer empathy: I've sat with HR managers to understand their pain points
- Trust-first approach: citations, auditability, and sovereignty are core to my work
- Iterative improvement: data-driven decisions, not guesswork
Ready to help European enterprises deploy Claude with confidence.
thank you
Let's build AI that enterprises can trust.
EF
Edouard Foussier · AI Engineer · Paris