By Turni Saha | March 2026
Every enrichment vendor says they search "20+ data providers." Almost none of them explain what happens between the first API call and the result that shows up in your CRM.
Here's how waterfall enrichment actually works under the hood.
Why it's called a waterfall
The name comes from the query pattern. Your enrichment tool sends a request to Provider A. If Provider A doesn't have the data, it flows down to Provider B. Then C. Then D. Each step has rules about when to move down, when to stop, and how confident the system is in whatever it found.
Think of it like calling a list of references when you're hiring. You start with the best one. If they don't pick up, you try the next. If the second one gives you an answer that sounds off, you might call a third to confirm. You don't call all 20 references every time, because that would take forever and cost a fortune.
Waterfall enrichment works the same way. The order matters. The rules for moving to the next step matter. The timeout thresholds matter. And most tools don't let you see any of it.
The anatomy of a waterfall query
Here's what happens when you click "enrich" on a LinkedIn profile, broken into the actual steps most tools run through:
Step 1: Input normalization. The system takes whatever you gave it (a LinkedIn URL, a name and company, an email domain) and standardizes it into a format the downstream providers can accept. This sounds trivial. It isn't. "VP of Sales," "Vice President, Sales," and "VP Sales & Partnerships" need to resolve to the same title, or the same person never matches across databases. Some tools are better at this than others, and the ones that are bad at it lose matches before the waterfall even starts.
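Here's a minimal sketch of what title normalization might look like. The abbreviation map and connector-word list are illustrative; real systems use much larger dictionaries and fuzzier matching.

```python
import re

# Hypothetical abbreviation map; production systems are far more complete.
TITLE_ABBREVIATIONS = {
    "vp": "vice president",
    "svp": "senior vice president",
    "dir": "director",
}

def normalize_title(title: str) -> str:
    """Lowercase, strip punctuation, drop connector words, and expand
    abbreviations so 'VP of Sales' and 'Vice President, Sales' compare equal."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())  # punctuation -> spaces
    words = []
    for word in cleaned.split():
        if word in ("of", "and", "the"):  # drop connector words
            continue
        words.append(TITLE_ABBREVIATIONS.get(word, word))
    return " ".join(words)

print(normalize_title("VP of Sales"))            # vice president sales
print(normalize_title("Vice President, Sales"))  # vice president sales
```

Note that even this simple version still misses "VP Sales & Partnerships," which normalizes to a longer string. That's exactly the kind of gap that loses a match before the waterfall starts.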
Step 2: Provider selection. Not every provider is good at everything. Some have strong US phone data but weak European coverage. Some are fast but return stale emails. Some are expensive per call but have high match rates on enterprise contacts. The waterfall logic decides which provider to try first based on what it knows about the input. A US-based VP at a Fortune 500 company might route to a different first provider than a startup founder in Berlin.
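The routing step can be sketched as a few rules keyed on what's known about the input. The provider names and thresholds below are invented for illustration, not any vendor's actual logic.

```python
def pick_first_provider(contact: dict) -> str:
    """Route the first query based on what we know about the input.
    Provider names and rules are hypothetical."""
    region = contact.get("region", "unknown")
    size = contact.get("company_size", 0)
    if region == "US" and size >= 1000:
        return "provider_enterprise_us"  # strong on large US companies
    if region == "EU":
        return "provider_eu_coverage"    # better European match rates
    return "provider_general"            # safe default for everything else

print(pick_first_provider({"region": "US", "company_size": 50000}))
print(pick_first_provider({"region": "EU", "company_size": 12}))
```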
Step 3: The first call. The system sends the query to Provider A. This is where timeout thresholds matter. How long does the system wait? 2 seconds? 5 seconds? 30 seconds? If Provider A is slow that day (and APIs have bad days constantly), the system has to decide: keep waiting, or move on. Set the timeout too short and you miss data that was coming. Set it too long and the user is staring at a spinner wondering if the tool crashed.
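One common way to enforce a timeout around a provider call, sketched here with Python's standard library (the sleeping provider stands in for an API having a bad day; the timeout value is arbitrary):

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def slow_provider(query):
    """Stand-in for an external API having a bad day."""
    time.sleep(1.0)
    return {"email": "sarah.chen@acme.com"}

def call_with_timeout(provider_fn, query, timeout_seconds=2.0):
    """Wait at most timeout_seconds for a provider, then give up
    and let the waterfall move on to the next source."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(provider_fn, query)
        try:
            return future.result(timeout=timeout_seconds)
        except TimeoutError:
            return None  # signal the waterfall to try the next provider

result = call_with_timeout(slow_provider, {"name": "Sarah Chen"}, timeout_seconds=0.3)
print(result)  # None -- provider was too slow, move down the waterfall
```

Whether None here means "no data" or "data was coming but we didn't wait" is exactly the tradeoff described above, and the caller can't tell the difference.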
Step 4: Response evaluation. Provider A comes back with something. Maybe an email. Maybe a phone number. Maybe both. Maybe nothing. The system now has to evaluate the response. Is this a verified email or a pattern-generated guess? Does the phone number's area code match the contact's location? Is the company name in the response the same company the contact actually works at, or did they change jobs two months ago?
This is where match confidence scoring comes in, and it's one of the least-understood parts of waterfall logic.
Step 5: Continue or stop. If the confidence score is high enough, the waterfall stops. You got your answer. If it's below the threshold, the system moves to Provider B and repeats steps 3 and 4. If Provider B returns a different email than Provider A, now you have a conflict, and the system needs rules for resolving it.
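Steps 3 through 5 come together in a single loop. This is a simplified sketch: the threshold, the toy providers, and the toy scorer are all invented, and real systems layer in timeouts and conflict rules on top.

```python
CONFIDENCE_THRESHOLD = 0.80  # illustrative; real systems tune this per field

def run_waterfall(query, providers, score_fn):
    """Call each provider in priority order, score the response,
    and stop at the first result that clears the threshold."""
    best = None
    for provider in providers:
        response = provider(query)             # step 3: the call (timeouts omitted)
        if response is None:
            continue                           # miss: move down the waterfall
        confidence = score_fn(query, response)  # step 4: evaluate
        if confidence >= CONFIDENCE_THRESHOLD:
            return response, confidence        # step 5: good enough, stop
        if best is None or confidence > best[1]:
            best = (response, confidence)      # remember the best so far
    return best if best else (None, 0.0)       # exhausted every provider

# Toy providers and a toy scorer: an email on the company's own domain
# scores higher than one on an unrelated domain.
def provider_a(q): return None
def provider_b(q): return {"email": "s.chen@acmecorp.com"}
def provider_c(q): return {"email": "sarah.chen@acme.com"}

def toy_score(query, response):
    return 0.9 if query["domain"] in response["email"] else 0.6

result, conf = run_waterfall({"domain": "acme.com"},
                             [provider_a, provider_b, provider_c], toy_score)
print(result, conf)  # {'email': 'sarah.chen@acme.com'} 0.9
```

Notice that Provider B's answer is found first but scores below the threshold, so the loop keeps falling. If the threshold were 0.55 instead, the waterfall would have stopped at B and returned the wrong email.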
How match confidence gets calculated
Most enrichment tools report a confidence score. Few explain where it comes from.
Here's the typical approach. The system checks multiple signals against the input data and scores each one. Did the provider return a name that matches the input name? Does the company match? Does the job title match, or at least come close? Is the email domain the company's actual domain, not a personal Gmail? Is the phone number in the right country, the right state, the right area code?
Each signal gets a weight. A matching company domain on the email is a strong signal. A matching first name is weaker (lots of people named Sarah). A phone number in the right country but wrong state is somewhere in between.
The weights vary by provider, because some providers are more reliable on certain data types. If Provider A historically returns accurate emails but shaky phone numbers, the system might trust its email at 85% confidence but its phone number at only 60%.
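A rough sketch of what this scoring might look like. Every number here is made up for illustration; real systems tune weights against measured match accuracy.

```python
# Illustrative signal weights; stronger signals get larger weights.
SIGNAL_WEIGHTS = {
    "email_domain_matches_company": 0.40,
    "company_name_matches": 0.25,
    "full_name_matches": 0.20,
    "title_close_match": 0.10,
    "phone_country_matches": 0.05,
}

# Hypothetical per-provider trust by data type: Provider A's emails are
# historically solid, its phone numbers less so.
PROVIDER_TRUST = {
    "provider_a": {"email": 0.85, "phone": 0.60},
}

def match_confidence(signals, provider, data_type):
    """Weighted sum of boolean match signals, scaled by how much we
    trust this provider for this data type (default 0.7 when unknown)."""
    raw = sum(w for name, w in SIGNAL_WEIGHTS.items() if signals.get(name))
    trust = PROVIDER_TRUST.get(provider, {}).get(data_type, 0.7)
    return raw * trust

score = match_confidence(
    {"email_domain_matches_company": True,
     "company_name_matches": True,
     "full_name_matches": True},
    provider="provider_a", data_type="email",
)
print(score)
```

Three strong signals here produce a raw score of 0.85, scaled down by the 0.85 email trust for this provider. The same signals on a phone lookup from the same provider would score noticeably lower.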
This is why a tool can return "no result" even when the data exists somewhere. The system found something, but the confidence was below the threshold, and no subsequent provider scored high enough to override it. The data was there. The system just didn't trust it enough to show you.
And this is the part that frustrates people. You know the email exists. Your colleague found it manually. But the tool returned nothing because the confidence math didn't clear the bar.
What happens when providers disagree
This is the scenario most vendors pretend doesn't exist.
You query Provider A. It returns sarah.chen@acme.com. You query Provider B. It returns s.chen@acmecorp.com. Different local part, different domain. Which one is right?
There are a few approaches, and they all have tradeoffs:
First-match wins. Whatever the highest-priority provider returned, that's the answer. Simple, fast, but you're trusting one source completely. If that source is wrong, you're wrong. This is how a lot of the cheaper tools work. It's also why they sometimes return outdated data from a provider that was once good but hasn't updated in months.
Majority vote. Query three providers. If two of them agree, go with that answer. More reliable, but slower and more expensive (you're paying for three API calls instead of one). Also fails when the data is genuinely ambiguous, like someone who has two valid work emails.
Confidence-weighted selection. Each provider's response gets scored, and the highest-confidence response wins regardless of query order. This is the most sophisticated approach, but it depends entirely on how good the confidence scoring is. Bad weights mean bad selections.
Recency-based tiebreaking. When two providers return different data with similar confidence, prefer the one with a more recent verification timestamp. This helps with job changers but only works if providers actually report when they last verified their data. Not all of them do.
Most tools use some combination of these. The problem is that when they get it wrong, you have no way to know why. The tool just gives you an email. You don't know if three providers agreed on it or if one low-confidence source was the only hit.
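One plausible combination, sketched below: confidence-weighted selection with a recency tiebreak. The tie window, field names, and the choice to rank missing timestamps last are all assumptions for illustration.

```python
from datetime import date

def resolve_conflict(candidates):
    """Pick among conflicting provider responses: highest confidence wins;
    near-ties (within 0.05) fall back to the most recent verification date."""
    best = max(candidates, key=lambda c: c["confidence"])
    tied = [c for c in candidates if best["confidence"] - c["confidence"] <= 0.05]
    if len(tied) > 1:
        # Missing timestamps sort last, so providers that actually report
        # verification dates win tiebreaks by default.
        tied.sort(key=lambda c: c.get("verified_on") or date.min, reverse=True)
        return tied[0]
    return best

winner = resolve_conflict([
    {"email": "sarah.chen@acme.com", "confidence": 0.82,
     "verified_on": date(2026, 1, 10)},
    {"email": "s.chen@acmecorp.com", "confidence": 0.80,
     "verified_on": date(2025, 6, 2)},
])
print(winner["email"])  # sarah.chen@acme.com -- near-tie resolved by recency
```

Even a resolver this simple makes visible decisions: a 0.02 confidence gap counts as a tie, and a seven-month-fresher verification breaks it. A tool that hides this logic gives you the same email with no way to know why.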
Why some tools return nothing when the data exists
This is probably the most common complaint we hear. "We know this person's email is out there. Why didn't the tool find it?"
Three usual reasons:
The confidence threshold is set high to protect accuracy metrics. If a tool advertises 95% accuracy, it can't afford to show you uncertain results. So it hides them. A lower confidence threshold would find more data, but the accuracy number on the marketing page would drop. Most vendors choose the higher accuracy number.
The provider order missed the right source. If Provider C has the email but the waterfall stopped at Provider B because B returned a high-confidence (but wrong) result, Provider C never gets queried. The data was one step away, but the logic said stop.
The input was slightly off. The person's LinkedIn name doesn't exactly match what's in the provider's database. Maybe they go by Mike on LinkedIn but Michael in corporate records. Maybe the company name on LinkedIn says "Acme" but the provider has "Acme Corporation." Input normalization should catch this, but it doesn't always.
What you can actually control
Most enrichment tools don't expose their waterfall logic to customers. But some do, and even with the ones that don't, you can ask better questions during your evaluation.
Ask about provider order. "Which provider do you query first for US-based contacts? Does that change for European contacts?" If the vendor can't answer this, they either don't know or don't want you to know. Neither is great.
Ask about timeout thresholds. "How long do you wait for a provider response before moving to the next one?" Short timeouts mean faster results but more missed data. Long timeouts mean slower enrichment but potentially better coverage. There's no right answer, but knowing the tradeoff helps you understand why enrichment sometimes feels slow.
Ask about conflict resolution. "When two providers return different emails for the same person, how do you decide which one to use?" If the answer is "we use the first one" or a vague "our algorithm handles it," push harder.
Ask about confidence thresholds. "Can I adjust the minimum confidence level? Can I see results that fell below your default threshold?" Some tools let you lower the bar and see more data at the cost of accuracy. That might be worth it if you're doing high-volume prospecting where a 70% accurate email is better than no email.
Ask what happens on a miss. "If your waterfall exhausts all providers and finds nothing, do you retry later? Do you tell me which providers were tried?" A tool that tells you "we tried 8 providers and none had data" is more useful than one that just says "not found."
The honest version
Waterfall enrichment is better than single-source enrichment. Querying multiple providers sequentially gives you more coverage and usually better accuracy than relying on one database.
But it's not magic. The quality of the waterfall depends on which providers are in it, what order they're queried, how timeouts are handled, how conflicts are resolved, and where the confidence thresholds are set. Two tools can both say "we use 20+ providers" and produce wildly different results because the logic between those providers is completely different.
The vendors who explain their logic are the ones who've actually thought about it. The ones who won't are either using a basic first-match approach and hoping you don't ask, or they genuinely don't understand their own system well enough to explain it.
If you want to test ShareCo's waterfall, it's free for 10 email enrichments and 3 phone enrichments through the Chrome Web Store.
