How Criterion Works

Criterion combines modern AI with authentic Islamic texts to help you explore Islam. Here's how our technology works in simple terms.

The Big Picture

When you ask a question, Criterion doesn't just look for keywords. Instead, it:

Understands the meaning of your question
Searches through 6,236 Quran verses and 12,416 Hadiths
Retrieves the most relevant passages with context
Generates a response grounded in these authentic sources
Cites every reference so you can verify

Semantic Search (Understanding Meaning)

Traditional Search (Keywords)

Most search engines look for exact words. If you search "afterlife," they only find verses with the word "afterlife."

Semantic Search (Meaning)

Criterion understands concepts. Search for "afterlife" and we'll also find:

"Day of Judgment"
"Paradise and Hell"
"Resurrection"
"The Hereafter"

This works because we convert all text into mathematical representations called "embeddings" that capture meaning, not just words.

RAG Technology (Retrieval Augmented Generation)

RAG is the secret sauce that makes Criterion reliable. Here's the difference:

❌ Regular AI Chatbots

Generate answers from training data → can hallucinate → no sources → unreliable

✅ Criterion (RAG-Powered)

Retrieve authentic texts first → generate answer from retrieved texts → cite sources → reliable

Step-by-Step: What Happens When You Ask

1. Question Analysis

Your question is converted into an embedding (a list of numbers representing its meaning).

Example: "What does Islam say about patience?" → [0.23, -0.45, 0.78, ...]

2. Similarity Search

We compare your question's embedding with embeddings of all 6,236 Quran verses and 12,416 Hadiths using vector similarity (cosine distance).

The most similar passages are retrieved. For Quran, we use hybrid search:

Vector search - finds semantic matches
Keyword search - finds exact terms (important for names like "Moses" or "Abu Bakr")
Both results are merged using Reciprocal Rank Fusion

3. Context Enhancement

For top Quran results, we fetch surrounding verses (±2 verses) to avoid out-of-context interpretations. A verse about patience might make more sense with the verses before and after it.

4. Response Generation

The AI (GPT-4o Mini) receives:

Your original question
Retrieved verses and hadiths with context
System instructions to act as a knowledgeable Da'i

It generates a response using only the retrieved information—no making things up!

5. Citation & Verification

Every verse and hadith mentioned includes:

Surah:Ayah reference (e.g., Al-Baqarah 2:153)
Direct link to Quran.com or Sunnah.com
Full Arabic text + English translation
Hadith grading (Sahih, Hasan, etc.)

Our Data Sources

Quran (6,236 verses)

Arabic: Tanzil Quran Text (v1.1)
English: Verified translation
Structure: 114 Surahs, organized by Surah:Ayah

Hadith (12,416 narrations)

Sahih Bukhari: 7,558 hadiths
Sahih Muslim: 2,920 hadiths
40 Hadith Nawawi: 42 hadiths
Riyadh as-Salihin: 1,896 hadiths

All hadiths include narrator chains, grading (authenticity), and references.

Quality Safeguards

1. Authenticity Filter

By default, we only show Sahih (most authentic) hadiths. You can adjust this if needed.

2. Context Windows

Top Quran results include ±2 surrounding verses to prevent misinterpretation.

3. Similarity Threshold

Only passages with >30% similarity to your question are shown. If we can't find relevant passages, we say so rather than making things up.

4. Citation Required

Our AI is instructed to always cite sources. If it can't find relevant texts, it admits it rather than guessing.

Performance

Search speed: 100-150ms per query
Accuracy: Top result typically >75% relevant
Coverage: Complete Quran + major authentic hadith collections

Limitations & Honesty

We're transparent about what Criterion can't do:

Not a scholar: Complex legal (fiqh) questions need human scholars
English-focused: Arabic queries may not work as well (coming soon!)
No Tafsir yet: We show verses but not scholarly commentary (planned)
Limited to texts: Can't access knowledge outside our database

Future Improvements

We're constantly improving Criterion:

Multilingual support (Arabic, Urdu, French, etc.)
Tafsir (commentary) integration
More hadith collections
Contextual chunk embeddings (+35% accuracy)
Advanced reranking

Try It Yourself

Chat Assistant

Ask natural questions and get answers with citations

Theme Search

See RAG in action - search by topic

Technical Deep Dive

Developers can explore our open-source codebase: