Anthropic Application Solutions Engineer

building a production
rag assistant

a sovereign HR assistant for French public sector teams

Edouard Foussier · AI Engineer
The Challenge

the problem

helping HR managers navigate 150k+ pages of regulations

150k+ pages of HR docs
100+ HR managers
<2s response target
100% citable answers
Building an assistant for French public sector teams to navigate complex employment regulations. Every answer must be traceable to source documents.
Context

standard rag refresher

a simple pipeline with hidden trade-offs

📄 Ingest: PDF, DOCX, HTML
→
✂️ Chunking: fixed-size split
Failure point: facts split across chunks
→
🧮 Embedding: vector store
Failure point: general-purpose embeddings hit domain limits
→
🔍 Retrieval: top-K similarity
Failure point: K too small misses the evidence
→
🤖 Generation: LLM answer
Failure point: hallucination
Every box is a trade-off. How we chunk, embed, choose K, and assemble context decides whether the LLM sees the right evidence or a noisy window.
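To make the chunking trade-off concrete, here is a minimal sketch of a fixed-size splitter; the window size, overlap, and example text are illustrative assumptions, not the production settings:

# Naive fixed-size chunking: simple, but a fact can straddle a boundary.
def chunk_fixed(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows of `size` characters."""
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]

doc = "Article 12: the probation period lasts three months. It may be renewed once."
# With a small window, the duration can land in a different chunk than the rule
# it belongs to, so retrieval may surface one without the other.
print(chunk_fixed(doc, size=48, overlap=8))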
My Solution

the pipeline

how the pieces connect

User Query → BGE-M3 Embedding → Qdrant Vector Search → Reranker (Optional) → LLM Generation → Citation Linkification

🔤 Sovereign Embeddings

  • BGE-M3 — multilingual, open-source
  • No external API dependency
  • Full control over data

⚡ HNSW Indexing

  • Sub-100ms retrieval on 150k chunks
  • Tuned ef=128 for recall/speed (index build sketched below)

📎 Auto Citations

  • Every claim linked to source URL
  • Regex post-processing [1] → links
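These components meet at ingestion time: chunks are embedded with BGE-M3 and upserted into Qdrant, with the source URL kept in the payload so citations can point back to it. A sketch of that index build, assuming BGE-M3 via the FlagEmbedding package and a local Qdrant instance; the HNSW build parameters (m, ef_construct) and the chunk payload layout are assumptions:

from FlagEmbedding import BGEM3FlagModel
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, HnswConfigDiff, PointStruct, VectorParams

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)   # open-source, multilingual, runs in-house
client = QdrantClient(url="http://localhost:6333")     # self-hosted Qdrant, default port

client.create_collection(
    collection_name="rag_rh_chunks",
    vectors_config=VectorParams(size=1024, distance=Distance.COSINE),  # BGE-M3 dense vectors are 1024-d
    hnsw_config=HnswConfigDiff(m=16, ef_construct=256),                # build-time graph params (assumed)
)

def index_chunks(chunks: list[dict]) -> None:
    """Embed chunk texts with BGE-M3 and upsert them with their source metadata."""
    vectors = model.encode([c["text"] for c in chunks])["dense_vecs"]
    client.upsert(
        collection_name="rag_rh_chunks",
        points=[
            PointStruct(id=i, vector=vec.tolist(), payload=c)   # payload keeps text + source URL
            for i, (c, vec) in enumerate(zip(chunks, vectors))
        ],
    )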
Decisions

technical choices

trade-offs I made and why

🔤 BGE-M3 over OpenAI

Multilingual, open-source embeddings. No API costs, no data leaving the infra. Critical for sovereignty requirements.

⚡ Qdrant + HNSW

Vector DB with tunable HNSW. ef=128 gives 98% recall with sub-100ms latency on 150k chunks.

🎯 Optional Reranking

Cross-encoder reranker on complex queries only. +15% relevance when needed, saves compute otherwise.

📎 Citation Linkification

Regex post-processing turns [1], [2] into clickable source links. Trust through transparency.
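A minimal sketch of that linkification step; the [1], [2] marker format comes from the slides, while the helper name and the source-URL lookup are assumptions:

import re

def linkify_citations(answer: str, sources: list[dict]) -> str:
    """Turn [N] markers in the model's answer into markdown links to the cited source."""
    def to_link(match: re.Match) -> str:
        idx = int(match.group(1)) - 1            # [1] points at sources[0]
        if 0 <= idx < len(sources):
            return f"[[{idx + 1}]({sources[idx]['url']})]"
        return match.group(0)                    # leave unknown markers untouched
    return re.sub(r"\[(\d+)\]", to_link, answer)

Applied to the final answer, every bracketed reference becomes a clickable link to the regulation it cites.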


# retrieval.py — the core search logic
from qdrant_client import QdrantClient
from qdrant_client.models import SearchParams

qdrant = QdrantClient(url="http://localhost:6333")   # self-hosted instance, default port

hits = qdrant.search(
    collection_name="rag_rh_chunks",
    query_vector=embed(query),        # BGE-M3
    limit=32,
    search_params=SearchParams(hnsw_ef=128),
)
if use_rerank:
    hits = rerank(query, hits, top_k=8)   # cross-encoder
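The snippet above leaves embed() and rerank() implicit. A sketch of what those helpers could look like, assuming BGE-M3 dense vectors via FlagEmbedding and a cross-encoder loaded through sentence-transformers; the reranker checkpoint and the "text" payload field are assumptions:

from FlagEmbedding import BGEM3FlagModel
from sentence_transformers import CrossEncoder

_embedder = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)
_reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")     # assumed cross-encoder checkpoint

def embed(query: str) -> list[float]:
    """Dense BGE-M3 vector for a single query."""
    return _embedder.encode([query])["dense_vecs"][0].tolist()

def rerank(query: str, hits: list, top_k: int = 8) -> list:
    """Re-score retrieved chunks with the cross-encoder and keep the best top_k."""
    pairs = [(query, hit.payload["text"]) for hit in hits]   # assumes payload carries the chunk text
    scores = _reranker.predict(pairs)
    ranked = sorted(zip(hits, scores), key=lambda pair: pair[1], reverse=True)
    return [hit for hit, _ in ranked[:top_k]]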
        
Demo

see it live

the system is deployed and serving real users

// live deployment

Self-hosted on my own infrastructure

Caddy reverse proxy · Docker Compose · Automatic TLS · Qdrant vector DB

Try it → rag.edouardfoussier.com
Why Me

what I'd bring to Anthropic

from building production AI systems

  • 🔧 Technical depth with a production mindset — I've shipped AI to real users, not just prototypes
  • 🤝 Customer empathy — I've sat with HR managers to understand their pain points
  • 🔒 Trust-first approach — citations, auditability, and sovereignty are core to my work
  • 📈 Iterative improvement — data-driven decisions, not guesswork
Ready to help European enterprises deploy Claude with confidence.

thank you

Let's build AI that enterprises can trust.

Edouard Foussier · AI Engineer · Paris