Anthropic Application
Solutions Engineer
building a production
RAG assistant
a sovereign HR assistant for French public sector teams
EF
Edouard Foussier · AI Engineer
The Challenge
the problem
helping HR managers navigate 150k+ pages of regulations
Building an assistant for French public sector teams to navigate complex employment regulations.
Every answer must be traceable to source documents.
Context
standard RAG refresher
a simple pipeline with hidden trade-offs
Ingest
PDF, DOCX, HTML
→
Chunking
Fixed-size split
Failure Point: Facts split across chunks
→
Embedding
Vector store
Failure Point: Domain vocabulary limits
→
Retrieval
Top-K similarity
Failure Point: K too small = missed evidence
→
Generation
LLM answer
Failure Point: Hallucination
Every box is a trade-off. How we chunk, embed, choose K, and assemble context decides
whether the LLM sees the right evidence or a noisy window.
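The chunking failure point above is easy to reproduce. A minimal sketch (the sample text and chunk size are illustrative, not from the deployed system) of how a fixed-size split cuts a fact in half, so a similarity search may retrieve only part of it:

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Naive fixed-size chunking: split every `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = ("Article 12: agents get 25 days of annual leave. "
       "Part-time agents get a prorated amount.")
chunks = fixed_size_chunks(doc, 40)
# "annual leave" now straddles the chunk boundary: the first chunk ends
# with "annual" and the second starts with " leave", so a query about
# annual leave may match neither chunk well.
```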
My Solution
the pipeline
how the pieces connect
User Query
BGE-M3 Embedding
Qdrant Vector Search
Reranker (Optional)
LLM Generation
Citation Linkification
Sovereign Embeddings
- BGE-M3: multilingual, open-source
- No external API dependency
- Full control over data
HNSW Indexing
- Sub-100ms retrieval on 150k chunks
- Tuned ef=128 for recall/speed
Auto Citations
- Every claim linked to source URL
- Regex post-processing: [1] → links
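A hedged sketch of what the citation post-processing step could look like (the function name, markdown link format, and source map are assumptions for illustration):

```python
import re

def linkify_citations(answer: str, sources: dict[int, str]) -> str:
    """Replace [n] citation markers with markdown links to source URLs.

    `sources` maps citation number -> URL; unknown numbers are left as-is.
    """
    def repl(m: re.Match) -> str:
        n = int(m.group(1))
        url = sources.get(n)
        return f"[[{n}]]({url})" if url else m.group(0)
    return re.sub(r"\[(\d+)\]", repl, answer)

answer = "Annual leave is 25 days [1], prorated for part-time agents [2]."
sources = {1: "https://example.org/article-12"}  # [2] has no known URL
linked = linkify_citations(answer, sources)
```

Leaving unknown markers untouched (rather than dropping them) keeps the answer auditable even when a source mapping is missing.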
Decisions
technical choices
trade-offs I made and why
BGE-M3 over OpenAI
Multilingual, open-source embeddings. No API costs, no data leaving the infra. Critical for sovereignty requirements.
Qdrant + HNSW
Vector DB with tunable HNSW. ef=128 gives 98% recall with sub-100ms latency on 150k chunks.
Optional Reranking
Cross-encoder reranker on complex queries only. +15% relevance when needed, saves compute otherwise.
Citation Linkification
Regex post-processing turns [1], [2] into clickable source links. Trust through transparency.
# retrieval.py: the core search logic
from qdrant_client.models import SearchParams

hits = qdrant.search(                          # qdrant: a QdrantClient instance
    collection_name="rag_rh_chunks",
    query_vector=embed(query),                 # BGE-M3 dense embedding
    limit=32,
    search_params=SearchParams(hnsw_ef=128),
)
if use_rerank:
    hits = rerank(query, hits, top_k=8)        # cross-encoder reranker
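The deck doesn't say how `use_rerank` is decided, only that reranking runs "on complex queries only". One plausible, purely illustrative gate (the heuristic and keyword list below are my assumptions, not the deployed logic) is a cheap query-complexity check:

```python
# Illustrative rerank gate: cross-encode only when the query looks complex
# enough that top-K vector similarity alone may misorder the hits.
COMPLEX_KEYWORDS = {"compare", "versus", "vs", "difference",
                    "pourquoi", "comment"}  # hypothetical keyword list

def should_rerank(query: str, max_simple_tokens: int = 6) -> bool:
    """Return True for long or comparative/how-why questions."""
    tokens = query.lower().split()
    return (len(tokens) > max_simple_tokens
            or any(t in COMPLEX_KEYWORDS for t in tokens))
```

A gate like this keeps the cross-encoder off the hot path for short lookup queries, which is where the "saves compute otherwise" claim comes from.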
Demo
see it live
the system is deployed and serving real users
// live deployment
Self-hosted on my own infrastructure
Caddy reverse proxy · Docker Compose · Automatic TLS · Qdrant vector DB
Try it → rag.edouardfoussier.com
Why Me
what I'd bring to Anthropic
from building production AI systems
- Technical depth with production mindset: I've shipped AI to real users, not just prototypes
- Customer empathy: I've sat with HR managers to understand their pain points
- Trust-first approach: citations, auditability, and sovereignty are core to my work
- Iterative improvement: data-driven decisions, not guesswork
Ready to help European enterprises deploy Claude with confidence.
thank you
Let's build AI that enterprises can trust.
EF
Edouard Foussier · AI Engineer · Paris