Matching an ayah to its place in the Quran is one problem. Matching a voice to a specific qari is another, and it is much harder. The text of the Quran is fixed. The way each reciter delivers it (pace, breath control, melodic line, characteristic madds) is not. This article walks through how voice identification works in RecitID, what it needs to work well, and the honest edge cases.
Short version: the feature is called Reciter Identification, it covers 200+ qaris, and it is a separate model from the verse-matching model. The product page has the full capability list.
Text match vs voice match: two different jobs
RecitID's Detect feature does both jobs in parallel. The text-match path transcribes the Arabic you are hearing and looks it up against every ayah in the Quran, a large search over a fixed corpus. The voice-match path asks a separate question: whose voice produced this audio? That is a similarity search against a set of reference speaker embeddings.
Either path can succeed on its own. A clip of someone reciting Al-Fatiha over a noisy café will match on text even if the voice is unidentifiable. A four-second clean sample of Abdul Basit can match on voice even if the ayah is ambiguous. You see both results when both succeed, and an honest partial answer when only one does.
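As a rough illustration of the two independent paths, here is a minimal Python sketch. The names `DetectResult`, `detect`, `match_verse`, and `match_voice` are hypothetical and are not RecitID's actual API; the two matchers are passed in as plain functions so the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DetectResult:
    verse: Optional[str]    # e.g. "1:2", or None if the text match failed
    reciter: Optional[str]  # reciter name, or None if the voice match declined


def detect(audio: bytes,
           match_verse: Callable[[bytes], Optional[str]],
           match_voice: Callable[[bytes], Optional[str]]) -> DetectResult:
    """Run both matchers concurrently and keep whatever each one returns."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        verse_future = pool.submit(match_verse, audio)
        voice_future = pool.submit(match_voice, audio)
        return DetectResult(verse=verse_future.result(),
                            reciter=voice_future.result())


# Stand-in matchers (assumptions): a noisy clip where only the text path succeeds.
result = detect(b"...", lambda audio: "1:2", lambda audio: None)
print(result)
```

Because neither field depends on the other, a partial answer is simply a result with one field left as `None`.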
The speaker-verification model, in plain terms
The voice-match side uses ECAPA-TDNN, a neural network architecture designed for speaker verification. It turns any audio clip into a fixed-length vector (192 numbers) that encodes the speaker's acoustic signature while stripping out what they actually said. Two clips from the same person map to nearby vectors; two clips from different people map to distant ones. We precompute these vectors for every reciter in our reference set and store them as a searchable index.
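One common way to build such an index is to average several normalised clip embeddings per speaker into a single reference vector. The sketch below assumes that approach; the article does not say how RecitID aggregates its reference samples, and `build_index` and the toy vectors are hypothetical. Three-dimensional vectors stand in for the 192-dimensional embeddings.

```python
import math
from typing import Dict, List

Vector = List[float]


def l2_normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so only its direction matters."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v


def build_index(samples: Dict[str, List[Vector]]) -> Dict[str, Vector]:
    """Average each reciter's normalised clip embeddings into one reference vector."""
    index: Dict[str, Vector] = {}
    for name, vectors in samples.items():
        normed = [l2_normalize(v) for v in vectors]
        dim = len(normed[0])
        mean = [sum(v[i] for v in normed) / len(normed) for i in range(dim)]
        index[name] = l2_normalize(mean)
    return index


# Toy embeddings (made up): two clips each for two placeholder reciters.
index = build_index({
    "reciter_a": [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]],
    "reciter_b": [[0.0, 1.0, 0.2], [0.1, 0.9, 0.0]],
})
```

Normalising before and after averaging keeps a single loud or long clip from dominating the reference vector.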
When you hit Detect, we compute the vector for your clip and find the closest match. If the closest match is close enough (above a similarity threshold we tuned on held-out data), we return the reciter name. If not, we return nothing. False positives (Qari A confidently labelled as Qari B) are much worse than honest uncertainty, so we lean conservative on the threshold.
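The lookup described above can be sketched as a nearest-neighbour search with a cut-off. This assumes cosine similarity as the distance measure and uses a placeholder threshold of 0.6; the article does not state the metric or the shipped value, and `identify` is a hypothetical name.

```python
import math
from typing import Dict, List, Optional

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(clip: Vector, index: Dict[str, Vector],
             threshold: float = 0.6) -> Optional[str]:
    """Return the closest reciter, or None when the best score is below the threshold."""
    best_name: Optional[str] = None
    best_score = -1.0
    for name, reference in index.items():
        score = cosine(clip, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Toy reference vectors (made up); real embeddings would have 192 dimensions.
toy_index = {"reciter_a": [1.0, 0.0, 0.0], "reciter_b": [0.0, 1.0, 0.0]}
print(identify([0.9, 0.1, 0.0], toy_index))  # close to reciter_a
print(identify([0.0, 0.0, 1.0], toy_index))  # far from both, so no answer
```

Raising the threshold trades more declined clips for fewer wrong names, which is the conservative direction described above.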
The model is indifferent to the words. It could be reciting Al-Ikhlas or Ya-Sin. What it latches onto is how the voice is produced: register, breathiness, timbre, vibrato.
One fair question: does this mean the model can identify any voice, even non-recitation? Technically yes: it is a general speaker-verification model. But we only index reciters, so clips from outside the set should return "no match" rather than a random name.
The reference set: 200+ reciters
We selected 200+ reciters based on three criteria: recording availability, recognisability, and coverage across styles and regions. The lineup includes Haramain imams (Sudais, Shuraim, Maher Al-Muaiqly, Juhany, Yasser Al-Dosari), Egyptian mujawwad legends (Abdul Basit, Minshawi, Husary), contemporary murattal favourites (Mishary Alafasy, Saad Al-Ghamdi, Abdul Rahman Al-Ossi), and younger voices with strong followings (Fares Abbad, Raad Al-Kurdi, Idris Abkar, Khalifa Al-Tunaiji).
What you cannot identify: a qari from a small regional mosque with no commercial recordings. We can only match against voices we have reference samples for. If someone you want is missing and has enough public recordings, tell us. We add reciters on a rolling basis.
If you are curious about the reciters we do cover, the reciters page has the current list, and the top 20 article profiles the most-listened voices with notes on what makes each distinctive.
What makes one reciter sound different from another
Three things, mostly. First, style. Murattal (measured, used for daily reading and memorisation) sounds completely different from Mujawwad (ornamented, with extended melodic phrases, used in public recitation and competitions). Second, maqam. Egyptian reciters tend to move through bayyati, rast, hijaz, and saba; Saudi reciters often stay within a narrower set. Third, personal habits: how long a particular reciter holds a madd, whether they breathe on the ending of an ayah or push through, the breath attack on a new phrase.
The model picks up on acoustic correlates of all three, not on the category labels. You do not need to tell it "this is Mujawwad". It figures that out from the audio.
Step by step: using it in the app
- Open Detect on the home screen. That is the big mic button.
- Play or expose the audio. Hold the phone near a speaker, turn up a video, or bring it close to someone reciting aloud. Four to six seconds is the sweet spot.
- Read the result card. The top block shows the surah and verse. That is the verse match. Below it, if we identified the voice, you see the reciter card with their name, a confidence indicator, and a short profile link.
- Tap through to the reciter profile. You get their bio, the recordings we have licensed, and the option to set them as your default playback reciter.
- Save the detection if you want to come back to it. Pro+ includes session save and replay. We keep the clip, the transcript, and the reciter attribution together.
If you want to catch multiple ayahs in one sitting (during tarawih, a khatm recording, or a study circle), switch to Auto-Detect instead. It runs continuously and logs every verse as it comes.
When it will not identify the reciter
An honest list of cases where the voice match fails or refuses to answer:
- The clip is too short: under four seconds of clean recitation is often not enough. Verse match can still succeed; voice match cannot.
- Heavy background noise: car engines, a crowd, echo from a large masjid. We do not run aggressive noise reduction before embedding because it distorts the voice signature.
- The reciter is not in our reference set. We do not guess.
- Two reciters sound similar: Alafasy and Fares Abbad, for example, share enough acoustic space that on short clips the model may decline rather than risk a wrong call.
- The loudest voice is not the reciter: if someone is speaking in the room while a recitation plays on speakers, the voice closest to the microphone is the person in the room, not the one in the recording.
- Autotune or pitch-shifted clips (social-media edits): the voice signature is distorted enough that we will usually refuse.
In those cases you still get the verse, the translation, and the option to play back the ayah in any of our licensed reciters. That is often the thing you wanted in the first place.
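Several of the refusal cases above can be expressed as simple checks before a name is returned. The sketch below is one possible way to do that, not RecitID's actual logic: the four-second minimum comes from the list above but is treated here as a hard gate, while `THRESHOLD`, `MARGIN`, and the function `decide` are hypothetical.

```python
from typing import Dict, Optional

MIN_SECONDS = 4.0  # from the list above; a hard cut-off is an assumption
THRESHOLD = 0.6    # hypothetical minimum similarity for any answer
MARGIN = 0.05      # hypothetical minimum gap between the top two candidates


def decide(scores: Dict[str, float], clip_seconds: float) -> Optional[str]:
    """Return a reciter name only when the clip is long enough, the best
    score clears the threshold, and the runner-up is not too close."""
    if clip_seconds < MIN_SECONDS or not scores:
        return None
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_name, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else float("-inf")
    if best_score < THRESHOLD or best_score - runner_up < MARGIN:
        return None
    return best_name


# Placeholder similarity scores (made up) for two similar-sounding reciters.
print(decide({"reciter_a": 0.80, "reciter_b": 0.50}, clip_seconds=5.0))
print(decide({"reciter_a": 0.80, "reciter_b": 0.78}, clip_seconds=5.0))
```

The margin check is what lets two acoustically close reciters produce a refusal instead of a coin flip.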
How accurate is it, really?
On a held-out evaluation set (reciters in the reference set, clean four-second clips, no augmentation) the top-1 accuracy is above 95%. Add realistic background noise (60 dB of café ambience) and that drops into the mid-80s. Add compression (a downsampled WhatsApp voice note) and you can expect low 80s. At the threshold we ship, we decline about 10% of in-set clips rather than return a wrong name, which is the tradeoff we chose.
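To make the tradeoff concrete, here is a small sketch of how decline rate and wrong-name rate move with the threshold. The trial data is invented for illustration and does not reproduce the figures above; the article also does not say exactly how its accuracy and decline numbers are combined, so the metric names here are assumptions.

```python
from typing import Dict, List, Tuple

# Each trial: (true reciter, top-1 predicted reciter, similarity score).
Trial = Tuple[str, str, float]


def evaluate(trials: List[Trial], threshold: float) -> Dict[str, float]:
    """Summarise how often we decline, and how often an answer is right or wrong."""
    answered = [(truth, guess) for truth, guess, score in trials if score >= threshold]
    correct = sum(1 for truth, guess in answered if truth == guess)
    total = len(trials)
    return {
        "decline_rate": (total - len(answered)) / total,
        "accuracy_when_answered": correct / len(answered) if answered else 0.0,
        "wrong_name_rate": (len(answered) - correct) / total,
    }


# Made-up trials: one low-confidence mistake and one low-confidence correct guess.
trials = [
    ("a", "a", 0.90), ("a", "a", 0.80), ("b", "b", 0.70),
    ("b", "a", 0.55), ("a", "a", 0.50),
]
strict = evaluate(trials, threshold=0.6)
loose = evaluate(trials, threshold=0.5)
print(strict)
print(loose)
```

On this toy data the stricter threshold declines more clips but returns no wrong names, while the looser one answers everything and mislabels one clip.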
The best way to judge it is to try it on recitations where you already know the answer. Play a Sudais recording, play a Mishary recording, play a clip of someone not in our set. The app should name the first two correctly and decline the third.
What we do not do
We do not claim to be a tajweed grader. The model does not know the rules of makhraj or sifaat. It measures acoustic similarity between voices, not compliance with tajweed. If you want structured feedback on your own recitation, Tajweed Reader is the product for that (and it is colour-coded by rule).
We also do not identify Quran verses from non-Arabic speech. If someone translates a verse into English and reads the translation aloud, we will not match it. Detect matches Arabic recitation against Arabic text.
Frequently asked
Can I identify a reciter from a video without audio?
No. We need sound. If the video is muted, unmute it and turn up the device volume so RecitID can hear the clip. If the video itself has no audio track, there is nothing to match.
Does the model know the difference between Mujawwad and Murattal?
Not as categories. It knows that a particular reciter sounds a particular way. If one qari recites both styles in different recordings, we include both in their reference samples. The embedding is forgiving of intra-speaker style variation.
Why did it get the verse but not the reciter?
Voice match needs a cleaner, longer clip than verse match. If the ambient noise was loud or the clip was short, that is the usual cause.
How do I add a reciter who is not in the set?
Email us with a name and two or three public recording links (full ayahs, the same reciter, no other voices). Details are on the Contact page.
Is this on Android as well as iOS?
Yes. Both platforms have the same Reciter Identification feature. Install links are on the home page.
Try it on a clip you already know
Pick a reciter you have saved, maybe a favourite Sudais Al-Fatiha, or a Mishary Al-Baqarah clip. Play it aloud and tap Detect. You should see the verse on top and the reciter name below. Then try a random YouTube clip from a reciter you do not know; you will either learn who they are or find out that we still need to add them.
Related reading: how RecitID works at the surah-and-verse level, and a guided tour of the 20 most popular Quran reciters.