How Criterion Works
Criterion combines modern AI with authentic Islamic texts to help you explore Islam. Here's how our technology works in simple terms.
The Big Picture
When you ask a question, Criterion doesn't just look for keywords. Instead, it:
- Understands the meaning of your question
- Searches through 6,236 Quran verses and 12,416 Hadiths
- Retrieves the most relevant passages with context
- Generates a response grounded in these authentic sources
- Cites every reference so you can verify
Semantic Search (Understanding Meaning)
Traditional Search (Keywords)
Most search engines look for exact words. If you search "afterlife," they only find verses with the word "afterlife."
Semantic Search (Meaning)
Criterion understands concepts. Search for "afterlife" and we'll also find:
- "Day of Judgment"
- "Paradise and Hell"
- "Resurrection"
- "The Hereafter"
This works because we convert all text into mathematical representations called "embeddings" that capture meaning, not just words.
RAG Technology (Retrieval Augmented Generation)
RAG is the secret sauce that makes Criterion reliable. Here's the difference:
❌ Regular AI Chatbots
Generate answers from training data → can hallucinate → no sources → unreliable
✅ Criterion (RAG-Powered)
Retrieve authentic texts first → generate answer from retrieved texts → cite sources → reliable
Step-by-Step: What Happens When You Ask
1. Question Analysis
Your question is converted into an embedding (a list of numbers representing its meaning).
Example: "What does Islam say about patience?" → [0.23, -0.45, 0.78, ...]
2. Similarity Search
We compare your question's embedding with embeddings of all 6,236 Quran verses and 12,416 Hadiths using vector similarity (cosine distance).
The most similar passages are retrieved. For Quran, we use hybrid search:
- Vector search - finds semantic matches
- Keyword search - finds exact terms (important for names like "Moses" or "Abu Bakr")
- Both results are merged using Reciprocal Rank Fusion
3. Context Enhancement
For top Quran results, we fetch surrounding verses (±2 verses) to avoid out-of-context interpretations. A verse about patience might make more sense with the verses before and after it.
4. Response Generation
The AI (GPT-4o Mini) receives:
- Your original question
- Retrieved verses and hadiths with context
- System instructions to act as a knowledgeable Da'i
It generates a response using only the retrieved information—no making things up!
5. Citation & Verification
Every verse and hadith mentioned includes:
- Surah:Ayah reference (e.g., Al-Baqarah 2:153)
- Direct link to Quran.com or Sunnah.com
- Full Arabic text + English translation
- Hadith grading (Sahih, Hasan, etc.)
Our Data Sources
Quran (6,236 verses)
- Arabic: Tanzil Quran Text (v1.1)
- English: Verified translation
- Structure: 114 Surahs, organized by Surah:Ayah
Hadith (12,416 narrations)
- Sahih Bukhari: 7,558 hadiths
- Sahih Muslim: 2,920 hadiths
- 40 Hadith Nawawi: 42 hadiths
- Riyadh as-Salihin: 1,896 hadiths
All hadiths include narrator chains, grading (authenticity), and references.
Quality Safeguards
1. Authenticity Filter
By default, we only show Sahih (most authentic) hadiths. You can adjust this if needed.
2. Context Windows
Top Quran results include ±2 surrounding verses to prevent misinterpretation.
3. Similarity Threshold
Only passages with >30% similarity to your question are shown. If we can't find relevant passages, we say so rather than making things up.
4. Citation Required
Our AI is instructed to always cite sources. If it can't find relevant texts, it admits it rather than guessing.
Performance
- Search speed: 100-150ms per query
- Accuracy: Top result typically >75% relevant
- Coverage: Complete Quran + major authentic hadith collections
Limitations & Honesty
We're transparent about what Criterion can't do:
- Not a scholar: Complex legal (fiqh) questions need human scholars
- English-focused: Arabic queries may not work as well (coming soon!)
- No Tafsir yet: We show verses but not scholarly commentary (planned)
- Limited to texts: Can't access knowledge outside our database
Future Improvements
We're constantly improving Criterion:
- Multilingual support (Arabic, Urdu, French, etc.)
- Tafsir (commentary) integration
- More hadith collections
- Contextual chunk embeddings (+35% accuracy)
- Advanced reranking
Try It Yourself
Chat Assistant
Ask natural questions and get answers with citations
Theme Search
See RAG in action - search by topic
Technical Deep Dive
Developers can explore our open-source codebase: