How to Deduplicate Salesforce Data: Step-by-Step Guide

22 min read
Artyom Bazyk
Founder & CTO of No Duplicates | Salesforce ISV Architect

Every Salesforce org accumulates duplicates — from web-to-lead forms, list imports, manual entry, and integrations. According to Harvard Business Review, performing a unit of work costs 10x more when data is flawed — and duplicates are one of the most common flaws. They inflate pipeline reports, waste sales reps' time, trigger duplicate outreach that damages customer relationships, and make compliance with GDPR or HIPAA significantly harder.

The longer duplicates sit in your CRM, the harder they are to resolve — related records pile up, field values diverge, and users lose trust in the data. That's why deduplication isn't a one-time cleanup but an ongoing process.

Short answer: To deduplicate Salesforce data, follow these 7 steps:

  1. Audit your data to understand the duplicate scope
  2. Set up Duplicate Rules to prevent new duplicates
  3. Configure Matching Rules to define what counts as a duplicate
  4. Find existing duplicates using reports or a dedup tool
  5. Merge duplicates manually or with auto-merge
  6. Automate ongoing deduplication with scheduled scans
  7. Monitor and maintain data quality over time

Salesforce's native tools handle steps 2–4, but for steps 5–7 at scale, you'll need an AppExchange tool like No Duplicates — free on sandboxes.

This guide walks you through the complete Salesforce deduplication process — from initial audit to fully automated, ongoing data cleanup.


Why Salesforce Data Gets Duplicated

Before you start cleaning duplicates, it helps to understand how they get there in the first place. The most common sources:

Web-to-Lead forms. Every form submission creates a new Lead — even if that person already exists. A prospect who downloads three whitepapers creates three Lead records. Multiply that across all your landing pages, and the duplicates add up fast.

Manual data entry. Sales reps create records on the fly — often without checking whether the person or company already exists. Slight variations in spelling ("Acme Corp" vs "Acme Corporation" vs "ACME") slip past basic duplicate checks.

List imports and data loads. Marketing imports event attendee lists, SDR teams upload purchased lists, and data migration projects bring in records from legacy systems. Each import is a potential source of duplicates, especially when there's no deduplication step built into the process.

Integrations and syncs. Marketing automation tools (Pardot, HubSpot, Marketo), ERP systems, and third-party data enrichment services all push records into Salesforce. If records aren't matched correctly during sync, you get duplicates.

Lead-to-Contact conversion. When a Lead is converted without checking for existing Contacts, Salesforce creates a new Contact — even if one already exists for that person. This creates cross-object duplicates that native tools struggle to detect.

Understanding these sources matters because effective deduplication isn't just cleanup — it's also prevention. The steps below address both.


Step 1: Audit Your Data

Before you configure anything, assess the scope of the problem. You need to know:

  • How many duplicates exist? Is this hundreds or hundreds of thousands?
  • Which objects are affected? Leads? Contacts? Accounts? Custom objects?
  • What fields can identify duplicates? Email, phone, name + company, address?
  • Where are duplicates entering? Web forms, imports, integrations, manual entry?

Quick Audit Using Reports

Create a Salesforce report grouped by a field that should be unique (like Email):

  1. Go to Reports → New Report
  2. Choose the object (e.g., Contacts)
  3. Add the Email field as a grouping
  4. Add a Record Count column
  5. Filter to show groups with count > 1

This gives you a rough count of duplicates based on email matching. Repeat for other objects and match fields (phone, name + company).
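The same exact-match audit can be sketched offline against an exported file. This is a minimal illustration, not a Salesforce feature: the sample rows and the function name `duplicate_groups_by_email` are hypothetical, and the sketch assumes you want emails compared case-insensitively with surrounding whitespace trimmed.

```python
from collections import defaultdict

def duplicate_groups_by_email(records):
    """Group record Ids by normalized email; keep only groups with 2+ records."""
    groups = defaultdict(list)
    for record in records:
        email = (record.get("Email") or "").strip().lower()
        if email:  # skip blank emails so they do not form one giant group
            groups[email].append(record["Id"])
    return {email: ids for email, ids in groups.items() if len(ids) > 1}

# Hypothetical sample rows, shaped like a Contact export
contacts = [
    {"Id": "003A", "Email": "john@acme.com"},
    {"Id": "003B", "Email": "John@Acme.com "},
    {"Id": "003C", "Email": "mary@example.com"},
    {"Id": "003D", "Email": ""},
]

dupes = duplicate_groups_by_email(contacts)
# Records that would disappear if every group were merged down to one
extra_records = sum(len(ids) - 1 for ids in dupes.values())
print(dupes, extra_records)  # {'john@acme.com': ['003A', '003B']} 1
```

Counting the extra records per group, rather than the groups themselves, gives a closer estimate of how many records a cleanup would actually remove.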

Deeper Audit with a Dedup Tool

For a more thorough audit, install a deduplication tool on a sandbox. No Duplicates is free on sandboxes with no restrictions — you can run full scans with fuzzy matching to see the true scope of duplicates that exact-match reports miss.

Tip: Always back up your data before any deduplication work. Export affected objects via Data Export or a tool like Data Loader for Salesforce.


Step 2: Set Up Duplicate Rules (Prevention)

Salesforce's Duplicate Rules are your first line of defense. They detect potential duplicates when a record is being created or edited and either alert the user or block the save entirely.

How Duplicate Rules Work

A Duplicate Rule consists of three parts:

  1. Object — which object this rule applies to (Account, Contact, Lead, or custom object)
  2. Matching Rule — the criteria used to find duplicates (see Step 3)
  3. Action — what happens when a match is found: Alert (warn but allow save) or Block (prevent save)

Setting Up a Duplicate Rule

  1. Go to Setup → Duplicate Rules
  2. Click New Rule and select the object
  3. Choose an action: Alert or Block
  4. Select which Matching Rule(s) to use
  5. Set the record-level security (enforce or bypass sharing rules), then choose whether the action applies when records are created, edited, or both
  6. Activate the rule

Alert vs. Block: Which Should You Choose?

Action | When to use
Alert | Early adoption: lets reps see potential duplicates without disrupting their workflow. Good for training users about data quality.
Block | Mature data governance: prevents duplicates entirely. Use after your team is familiar with the system and matching rules are well-tuned.

Recommendation: Start with Alert while you tune your matching rules. Once you're confident in the match accuracy, switch to Block for high-confidence matches (like exact email). Keep Alert for fuzzy matches where false positives are possible.

Important: Duplicate Rules only work on new and edited records. They don't scan your existing database. For that, you need Step 4.


Step 3: Configure Matching Rules

Matching Rules define what counts as a duplicate. Salesforce comes with standard matching rules for Accounts, Contacts, and Leads — but you can create custom rules for more precision.

Standard vs. Custom Matching Rules

Standard rules use a combination of fields with fuzzy matching algorithms. For example, the standard Contact matching rule compares:

  • First Name (Exact or Fuzzy)
  • Last Name (Exact or Fuzzy)
  • Email (Exact)
  • Phone (Exact)
  • Mailing Street (Exact or Fuzzy)

Custom rules let you define your own field combinations and matching methods. This is where you tailor deduplication to your specific data patterns.

Matching Methods

Salesforce supports these matching methods natively:

Method | Description | Example
Exact | Fields must match exactly | "john@acme.com" = "john@acme.com"
Fuzzy: First Name | Handles nicknames and variations | "Bob" ≈ "Robert"
Fuzzy: Last Name | Handles minor typos | "Smith" ≈ "Smyth"
Fuzzy: Company Name | Handles abbreviations | "Acme Corp" ≈ "Acme Corporation"
Fuzzy: Title | Handles title variations | "VP" ≈ "Vice President"
Fuzzy: Street | Handles address formatting | "123 Main St" ≈ "123 Main Street"
Fuzzy: City | Handles city name variations | "NY" ≈ "New York"
Fuzzy: Phone | Handles formatting differences | "(555) 123-4567" ≈ "5551234567"

For more advanced matching — including Jaro-Winkler distance, Double Metaphone (phonetic matching), and Levenshtein distance — you'll need a third-party tool. No Duplicates supports all three through configurable matching rules.
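To make one of these algorithms concrete, here is a minimal sketch of Levenshtein (edit) distance in plain Python. It is a textbook implementation for illustration only, not the code any particular tool runs; the `similarity` helper and its 0-to-1 scaling are assumptions added for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum inserts, deletes, and substitutions to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]

def similarity(a: str, b: str) -> float:
    """Scale the distance to 0..1, ignoring case and outer whitespace."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

print(levenshtein("Smith", "Smyth"))                 # 1
print(similarity("Acme Corp", "Acme Corporation"))   # 0.5625
```

The second result shows why edit distance alone is a weak fit for abbreviations: "Acme Corp" and "Acme Corporation" are seven edits apart, so a rule built only on this score would need a low threshold to catch them. Typo-style variations such as "Smith" and "Smyth" are where it works best.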

Tips for Effective Matching Rules

  1. Start broad, then narrow. Begin with a rule that catches obvious duplicates (exact email match), then add fuzzy rules incrementally.
  2. Combine fields. A single-field match (name only) produces too many false positives. Combine fields: name + company, or email + phone.
  3. Test before activating. Run your matching rule on a sandbox first to see what it catches — and what it misses.
  4. Account for data quality. If your data has inconsistent formatting (mixed case, abbreviations, extra spaces), fuzzy matching becomes essential.
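Tips 1 and 2 can be sketched as a two-tier check: an exact email match first, then a fuzzy match that only counts when both name and company agree. This is an illustrative sketch using Python's standard `difflib`, not Salesforce's matching engine; the field names, the 0.75 threshold, and the function names are assumptions chosen for the example.

```python
from difflib import SequenceMatcher

def normalize(value):
    """Lowercase and collapse whitespace so formatting noise does not affect matching."""
    return " ".join((value or "").lower().split())

def fuzzy_equal(a, b, threshold=0.75):
    """True when both values are non-empty and similar enough (threshold is an assumption)."""
    a, b = normalize(a), normalize(b)
    if not a or not b:
        return False
    return SequenceMatcher(None, a, b).ratio() >= threshold

def is_probable_duplicate(rec_a, rec_b):
    # Tier 1: exact email match is treated as high confidence
    email_a, email_b = normalize(rec_a.get("Email")), normalize(rec_b.get("Email"))
    if email_a and email_a == email_b:
        return True
    # Tier 2: a fuzzy name match alone is too noisy, so also require the company to match
    return (fuzzy_equal(rec_a.get("LastName"), rec_b.get("LastName"))
            and fuzzy_equal(rec_a.get("Company"), rec_b.get("Company")))

a = {"LastName": "Smith", "Company": "Acme Corp", "Email": "j.smith@acme.com"}
b = {"LastName": "Smyth", "Company": "ACME  Corp", "Email": ""}
c = {"LastName": "Smith", "Company": "Globex", "Email": "smith@globex.com"}

print(is_probable_duplicate(a, b))  # True
print(is_probable_duplicate(a, c))  # False
```

Records `a` and `c` share a last name but nothing else, which is exactly the false positive a name-only rule would produce.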

Step 4: Find Existing Duplicates

Duplicate Rules prevent new duplicates, but they won't find the ones already in your database. You need to actively scan your existing data.

Option A: Duplicate Record Sets (Native)

When Duplicate Rules are active and a match is found, Salesforce creates a Duplicate Record Set — a grouping of records that may be duplicates. You can view these in:

  1. Setup → Duplicate Record Sets
  2. Or via a custom report type: "Duplicate Record Sets"

Limitation: Duplicate Record Sets are only created when a record is saved while a Duplicate Rule is active. They don't retroactively scan your existing database. If you activated Duplicate Rules after your data was already populated, the existing duplicates won't appear in Duplicate Record Sets.

Option B: Duplicate Reports

Salesforce's Duplicate Record Report lets you see all records flagged by Duplicate Rules. This requires:

  1. A custom report type based on "Duplicate Record Sets"
  2. Active Duplicate Rules that have been running

Option C: AppExchange Deduplication Tool

For the most comprehensive scan, use a dedicated tool. No Duplicates scans your entire database — including records created before you activated any rules — using exact and fuzzy matching across any combination of fields.

The key advantage: a dedup tool can scan cross-object (Leads against Contacts), use advanced fuzzy matching (phonetic algorithms, edit distance), and process hundreds of thousands of records in minutes.
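As a rough illustration of what cross-object matching means, the sketch below indexes Contacts by email and then looks up each Lead against that index. It is a simplified, exact-match-only example on hypothetical data; a real tool would add fuzzy matching and more fields, and the function name is invented for this sketch.

```python
def match_leads_to_contacts(leads, contacts):
    """Return (lead_id, contact_id) pairs where a Lead's email already exists on a Contact."""
    contacts_by_email = {}
    for contact in contacts:
        email = (contact.get("Email") or "").strip().lower()
        if email:
            # Keep the first Contact seen for each email
            contacts_by_email.setdefault(email, contact["Id"])

    matches = []
    for lead in leads:
        email = (lead.get("Email") or "").strip().lower()
        if email in contacts_by_email:
            matches.append((lead["Id"], contacts_by_email[email]))
    return matches

# Hypothetical sample data
leads = [
    {"Id": "00Q1", "Email": "Ann@Example.com"},
    {"Id": "00Q2", "Email": "new.person@example.com"},
]
contacts = [{"Id": "0031", "Email": "ann@example.com"}]

print(match_leads_to_contacts(leads, contacts))  # [('00Q1', '0031')]
```

Building the index once keeps the comparison linear in the number of records instead of comparing every Lead with every Contact.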

Found duplicates? Now resolve them automatically. No Duplicates is free on sandboxes — test fuzzy matching and auto-merge with your real data before going to production.


Step 5: Merge Duplicates

Once you've identified duplicates, it's time to merge them. You have two approaches: manual and automated.

Manual Merge (Native Salesforce)

Salesforce lets you merge up to 3 records at a time for Accounts, Contacts, and Leads. For a detailed walkthrough with screenshots, see: How to Merge Duplicate Records in Salesforce.

Manual merge makes sense for small numbers of duplicates (under 50) or records that require human judgment.

Automated Merge (AppExchange Tools)

Manual merging breaks down at scale. At 10–20 minutes per merge, 1,000 merges would take roughly 170 to 330 hours of manual work. Automated merge tools solve this by letting you define rules for how duplicates should be resolved, then processing thousands of merges unattended.
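The arithmetic behind that estimate is easy to rerun for your own numbers. The sketch below assumes one merge operation per duplicate and uses the 10–20 minute range quoted above; the function name is just for illustration.

```python
def manual_merge_hours(merges, minutes_per_merge):
    """Total hands-on time in hours for a given number of manual merges."""
    return merges * minutes_per_merge / 60

low = manual_merge_hours(1000, 10)
high = manual_merge_hours(1000, 20)
print(f"{low:.0f} to {high:.0f} hours")  # 167 to 333 hours
```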

With No Duplicates auto-merge, you configure two things:

  1. Master record strategy — which record survives (e.g., most complete, newest, most attachments, or a custom Salesforce Flow)
  2. Field-level strategies — how each field is populated on the surviving record (e.g., longest text, newest date, combine multi-select values)

With 24+ built-in strategies plus custom Flow support, you get field-by-field control over how duplicates are resolved — without touching each record manually.
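To show how a master record strategy and field-level strategies fit together, here is a small conceptual sketch. It is not the No Duplicates implementation: the strategy functions, field names, and sample records are all hypothetical, and it covers only "most complete" for the master record plus "longest text" and "newest date" for fields.

```python
def completeness(record):
    """Count populated fields (ignoring Id) to approximate 'most complete'."""
    return sum(1 for key, value in record.items()
               if key != "Id" and value not in (None, ""))

def longest_text(values):
    """Field strategy: keep the longest non-empty value."""
    return max((v for v in values if v), key=len, default=None)

def newest(values):
    """Field strategy: keep the latest value; ISO dates sort correctly as strings."""
    return max((v for v in values if v), default=None)

def merge_group(records, field_strategies):
    """Pick the most complete record as master, then apply per-field strategies."""
    master = max(records, key=completeness)
    merged = dict(master)
    for field, strategy in field_strategies.items():
        merged[field] = strategy([r.get(field) for r in records])
    return merged

# Hypothetical duplicate group
group = [
    {"Id": "003A", "Name": "J. Smith", "Phone": "",
     "Description": "VIP", "LastActivityDate": "2025-01-10"},
    {"Id": "003B", "Name": "John Smith", "Phone": "5551234567",
     "Description": "VIP customer since 2019", "LastActivityDate": "2024-11-02"},
]

merged = merge_group(group, {"Description": longest_text, "LastActivityDate": newest})
print(merged["Id"], merged["LastActivityDate"])  # 003B 2025-01-10
```

Record 003B survives because it has more populated fields, but it still picks up the newer activity date from 003A. Separating "which record survives" from "which value wins per field" is what makes that possible.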


Step 6: Automate Ongoing Deduplication

Deduplication isn't a one-time project. New duplicates enter your system constantly through the sources described earlier. You need ongoing automation to keep data clean.

Scheduled Scans

Set up scheduled deduplication scans to run at regular intervals — daily, weekly, or on a custom cron schedule. Each scan identifies new duplicates that entered since the last run.

Combine scheduled scans with auto-merge rules, and your Salesforce org stays clean automatically:

  1. Weekly scan detects new duplicates
  2. Auto-merge rules resolve exact matches automatically
  3. Fuzzy matches are flagged for manual review
  4. Email notifications alert admins to new duplicate groups
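The routing logic in steps 2 to 4 can be sketched in a few lines. This is a conceptual illustration only: the `score` field, the rule that only a score of 1.0 counts as an exact match, and the notification text are assumptions, not the behavior of any specific tool.

```python
def triage(groups, auto_merge_score=1.0):
    """Split duplicate groups into auto-merge candidates and a manual review queue."""
    auto_merge, review = [], []
    for group in groups:
        if group["score"] >= auto_merge_score:
            auto_merge.append(group["ids"])   # exact match: safe to merge unattended
        else:
            review.append(group["ids"])       # fuzzy match: needs a human decision
    return auto_merge, review

# Hypothetical output of a weekly scan
scan_results = [
    {"ids": ["003A", "003B"], "score": 1.0},
    {"ids": ["003C", "003D"], "score": 0.86},
]

auto_merge, review = triage(scan_results)
notification = f"{len(review)} duplicate group(s) need manual review"
print(auto_merge, notification)
```

Keeping the threshold as a parameter lets you start conservatively and lower it only after reviewing how accurate the fuzzy matches turn out to be.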

Lead-to-Contact Auto-Conversion

One of the most common duplicate scenarios — a Lead that already exists as a Contact — can be automated entirely. No Duplicates auto-conversion matches incoming Leads against existing Contacts and automatically converts matches, eliminating cross-object duplicates before they accumulate.

Ready to automate your Salesforce deduplication? No Duplicates is free on sandboxes with no restrictions — test auto-merge, scheduled scans, and fuzzy matching with your real data.


Step 7: Monitor and Maintain

Deduplication is an ongoing process, not a one-time cleanup. Build these monitoring habits:

Track Key Metrics

  • Duplicate rate: What percentage of new records are duplicates? Track this monthly — it should trend downward.
  • Merge volume: How many records are being merged per week? A sudden spike may indicate a new duplicate source (import, integration).
  • Match accuracy: Are your matching rules catching real duplicates, or generating false positives? Review a sample of flagged matches periodically.
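If you track these numbers in a spreadsheet or script, the calculations are simple. The sketch below uses hypothetical monthly figures and invented function names; "match accuracy" is approximated here as the share of a reviewed sample that turned out to be real duplicates.

```python
def duplicate_rate(new_records, flagged_duplicates):
    """Percentage of new records that were flagged as duplicates."""
    if new_records == 0:
        return 0.0
    return 100 * flagged_duplicates / new_records

def match_accuracy(sample_size, confirmed_duplicates):
    """Share of a manually reviewed sample that were true duplicates."""
    if sample_size == 0:
        return 0.0
    return confirmed_duplicates / sample_size

# Hypothetical figures: (new records, flagged duplicates) per month
monthly = {"2026-01": (1200, 96), "2026-02": (1100, 66)}

rates = {month: duplicate_rate(*counts) for month, counts in monthly.items()}
trending_down = rates["2026-02"] < rates["2026-01"]
accuracy = match_accuracy(50, 45)
print(rates, trending_down, accuracy)
```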

No Duplicates includes a duplicate analytics dashboard and merge reports (CSV export) to track these metrics automatically.

Refine Your Rules

As your data evolves, your matching rules should evolve too:

  • New fields become available. If you add a "Company Domain" field, add it to your matching rules for better Account deduplication.
  • New duplicate sources appear. A new integration or marketing tool may introduce duplicates in a pattern your current rules don't catch.
  • False positive rate changes. If users complain about too many false alerts, tighten your matching criteria. If duplicates are slipping through, loosen them.

Document Your Process

Create a runbook that covers:

  • Which matching rules are active and why
  • The auto-merge strategy for each object
  • Who reviews flagged duplicates and how often
  • The backup process before major merge operations
  • How to handle merge exceptions (e.g., records that should remain separate)

Salesforce Deduplication Best Practices

Based on working with Salesforce orgs of all sizes — from startups to enterprises with millions of records — here are the practices that consistently lead to the best results:

1. Always back up before bulk operations. Merging is permanent in native Salesforce — there is no undo. Export your data before any major merge operation.

2. Start with prevention, then clean up. Activate Duplicate Rules first so new duplicates stop entering while you work on the backlog. Otherwise, you're cleaning a mess while it's still being made.

3. Deduplicate on sandbox first. Test your matching rules and merge strategies on a sandbox with real data before running them in production. No Duplicates is free on sandboxes — use this to validate your approach risk-free.

4. Prioritize by impact. Deduplicate Accounts first (they affect the most related records), then Contacts, then Leads. Within each object, start with high-confidence matches (exact email) before moving to fuzzy matches.

5. Normalize data before matching. Standardize formats before running dedup: consistent phone formatting, standard state abbreviations, trimmed whitespace. Better data in = better matches out.

6. Don't merge during business hours. Large merge operations can trigger workflow rules, process builders, and integrations. Run bulk merges during off-hours to avoid disrupting live users and downstream systems.

7. Involve stakeholders. Data quality isn't just an admin's problem. Sales, marketing, and ops teams all create and consume CRM data. Get buy-in on matching rules and merge strategies before implementation.
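Practice 5 (normalize before matching) is the easiest of these to prototype. The sketch below is illustrative only: it assumes US phone numbers, uses a deliberately partial state lookup table, and the function names are invented for this example.

```python
import re

# Partial, illustrative lookup; extend it for your own data
STATE_ABBREVIATIONS = {"california": "CA", "new york": "NY", "texas": "TX"}

def normalize_phone(phone):
    """Keep digits only; assumes US numbers and drops a leading country code 1."""
    digits = re.sub(r"\D", "", phone or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits

def normalize_state(state):
    """Trim whitespace and map full state names to two-letter codes where known."""
    cleaned = " ".join((state or "").split())
    if cleaned.lower() in STATE_ABBREVIATIONS:
        return STATE_ABBREVIATIONS[cleaned.lower()]
    return cleaned.upper() if len(cleaned) == 2 else cleaned

print(normalize_phone("(555) 123-4567"))   # 5551234567
print(normalize_phone("+1 555.123.4567"))  # 5551234567
print(normalize_state("  california "))    # CA
```

After normalization, two phone values that looked different become identical strings, so even an exact-match rule can catch them.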


When Native Tools Aren't Enough

Salesforce's built-in Duplicate Rules and Matching Rules are a solid starting point — but they have hard limitations that become blockers at scale: no auto-merge, no cross-object matching, no scheduled scans, and a 3-record-per-merge cap.

No Duplicates fills these gaps while keeping your data inside Salesforce — 100% native, so your data never leaves the platform (which simplifies compliance with HIPAA, GDPR, and SOC 2). It offers 24+ auto-merge strategies with custom Flow support, cross-object matching, advanced fuzzy algorithms (Jaro-Winkler, Double Metaphone, Levenshtein), and cron-based scheduled automation. All features included from $240/year.

For a comparison with other tools, see: Best Salesforce Deduplication Tools in 2026.

Want to try it? No Duplicates is completely free on sandboxes with no restrictions. Install from AppExchange and test with your real data.


Frequently Asked Questions

What is Salesforce deduplication?

Salesforce deduplication is the process of finding and merging duplicate records in your CRM. It has three parts: prevention (Duplicate Rules alert on or block new duplicates), detection (Matching Rules define which records count as duplicates), and resolution (merging duplicates into one master record).

Can Salesforce deduplicate data automatically?

Salesforce automates detection and prevention, but not resolution. Here's what each step looks like:

  1. Prevention — Duplicate Rules automatically alert or block users when they create a matching record.
  2. Detection — Matching Rules flag potential duplicates in real time.
  3. Merging — Fully manual in native Salesforce (up to 3 records at a time).

To automate the merge step, you need an AppExchange tool that supports scheduled scans and rule-based auto-merge.

How do I find duplicates in Salesforce?

Three approaches, from basic to comprehensive:

  1. Reports — Group records by Email or Name and filter for groups with count > 1. Quick but limited to exact matches.
  2. Duplicate Record Sets — When Duplicate Rules are active, Salesforce creates sets of potential matches. Only catches duplicates created after rules were activated.
  3. AppExchange tools — Run full-database scans with fuzzy matching, cross-object detection, and scheduled automation. This is the most thorough approach.

What is the best way to deduplicate Salesforce data?

The best approach combines prevention and cleanup: use Salesforce's native Duplicate Rules to prevent new duplicates from entering, then use an AppExchange tool for bulk detection and automated merging of existing duplicates. Set up scheduled scans to catch duplicates continuously — deduplication should be an ongoing process, not a one-time project.

How long does Salesforce deduplication take?

It depends on scale:

  • Setting up rules: 1–2 hours for basic Duplicate Rules and Matching Rules
  • Manual merging: 10–20 minutes per merge (3 records max per merge in native Salesforce)
  • Automated merging: No Duplicates can process 10,000+ records per hour with auto-merge (based on internal testing; actual throughput depends on org complexity)
  • Ongoing maintenance: Minutes per week with scheduled automation

Should I deduplicate before or after a data migration?

Both. Deduplicate your source data before migration to minimize the number of duplicates you import. Then run deduplication again after migration to catch cross-system duplicates — records that exist in both the source system and the target Salesforce org. This two-pass approach is standard practice for data migration projects.

What's the difference between Duplicate Rules and Matching Rules?

Matching Rules define what counts as a duplicate — the field combinations and matching methods (exact, fuzzy) used to compare records. Duplicate Rules define what happens when a match is found — alert the user, block the save, or log the match. You need both: Matching Rules without Duplicate Rules don't do anything, and Duplicate Rules require at least one Matching Rule to work.

Can I deduplicate custom objects in Salesforce?

Yes — the same 7-step process applies. Salesforce's native Duplicate Rules and Matching Rules work with custom objects for detection and prevention. The key limitation: there is no native merge UI for custom objects (only Accounts, Contacts, Leads, and Cases have one). To merge custom object duplicates, you'll need either custom Apex code or an AppExchange tool like No Duplicates that supports custom object merging.


Disclosure: This guide is published by the team behind No Duplicates, a 100% native Salesforce deduplication app rated 5.0 stars on AppExchange. All information about Salesforce's native tools is sourced from official Salesforce documentation.

Accurate as of February 2026.