Generally, it’s best practice to put unique constraints on a table to prevent duplicate rows. However, you may find yourself working with a database where duplicate rows have been created through human error, a bug in your application, or uncleaned data from external sources. This tutorial will teach you how to find these duplicate rows.

To follow along, you’ll need read access to your database and a tool to query your database.

Identify Duplicate Criteria

The first step is to define your criteria for a duplicate row. Do you need a combination of two columns to be unique together, or are you simply searching for duplicates in a single column? In this example, we are searching for duplicates across two columns in our Users table: username and email.

The general pattern for finding duplicates is to group by the columns that define a duplicate and keep only the groups that contain more than one row. For a table tbl with the columns name and section, that query is: SELECT name, section FROM tbl GROUP BY name, section HAVING COUNT(*) > 1. The same pattern applies to the username and email columns used in this tutorial.

Write Query to Verify Duplicates Exist

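The first query we’re going to write is a simple query to verify whether duplicates do indeed exist in the table. For our example, the query groups the users table by username and email, counts the rows in each group, and keeps only the groups whose count is greater than 1.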
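A minimal sketch of that query, run against an in-memory SQLite database through Python's built-in sqlite3 module so it can be tried without touching a real database. The choice of SQLite, the table definition, and the sample rows (including the non-duplicated user Ana) are assumptions made for illustration; the SELECT statement itself is the part that matters and should need little or no change on other databases.

```python
import sqlite3

# Assumption: a throwaway in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, email TEXT)"
)

# Hypothetical sample rows; Ana is added as a user with no duplicate.
conn.executemany(
    "INSERT INTO users (id, username, email) VALUES (?, ?, ?)",
    [
        (1, "Pete", "pete@example.com"),
        (2, "Miles", "miles@example.com"),
        (3, "Ana", "ana@example.com"),
        (6, "Pete", "pete@example.com"),
        (9, "Miles", "miles@example.com"),
        (12, "Jessica", "jessica@example.com"),
        (13, "Jessica", "jessica@example.com"),
    ],
)

# Group by the columns that define a duplicate, then keep only the
# groups that contain more than one row.
FIND_DUPLICATES = """
    SELECT username, email, COUNT(*) AS count
    FROM users
    GROUP BY username, email
    HAVING COUNT(*) > 1
"""

duplicates = conn.execute(FIND_DUPLICATES).fetchall()
for row in duplicates:
    print(row)
```

Without an ORDER BY clause the order of the returned groups is not guaranteed, so the rows may print in a different order than the results listed in this tutorial.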

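HAVING is important here because, unlike WHERE, which filters individual rows before they are grouped, HAVING filters the groups after aggregation, so it can test the result of an aggregate function such as COUNT(*).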

If any rows are returned, that means we have duplicates. In this example, our results look like this:

username | email               | count
Pete     | pete@example.com    | 2
Jessica  | jessica@example.com | 2
Miles    | miles@example.com   | 2

List All Rows Containing Duplicates

In the previous step, our query returned a list of duplicates. Now, we want to return the entire record for each duplicate row.

To accomplish this, we’ll select every row in the table and join it to the duplicate groups found in the previous step, matching on both username and email.
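A minimal sketch of such a join, again using an in-memory SQLite database through Python's sqlite3 module. The database engine, table definition, and sample rows are assumptions for illustration; the aliases a and b follow the naming used in this tutorial, and the ORDER BY clause is an addition that only serves to make the output order predictable.

```python
import sqlite3

# Assumption: a throwaway in-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, username, email) VALUES (?, ?, ?)",
    [
        (1, "Pete", "pete@example.com"),
        (2, "Miles", "miles@example.com"),
        (3, "Ana", "ana@example.com"),  # no duplicate, should not appear
        (6, "Pete", "pete@example.com"),
        (9, "Miles", "miles@example.com"),
        (12, "Jessica", "jessica@example.com"),
        (13, "Jessica", "jessica@example.com"),
    ],
)

# a is the full users table; b is the set of duplicated
# (username, email) pairs from the verification query.
LIST_DUPLICATE_ROWS = """
    SELECT a.*
    FROM users a
    JOIN (
        SELECT username, email, COUNT(*) AS count
        FROM users
        GROUP BY username, email
        HAVING COUNT(*) > 1
    ) b
      ON a.username = b.username
     AND a.email = b.email
    ORDER BY a.username, a.id
"""

rows = conn.execute(LIST_DUPLICATE_ROWS).fetchall()
for row in rows:
    print(row)
```

One caveat with this approach: GROUP BY places NULL values in the same group, but the equality comparisons in the join condition never match NULL, so duplicates with a NULL username or email would be counted in the first query yet missing from this one.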

If you look closely, you’ll see that this query is not so complicated. The initial SELECT simply selects every column in the users table, and then inner joins it to the result of our initial duplicate-finding query, used as a subquery. Because we’re joining the table to a result derived from the same table, it’s necessary to use aliases (here, we’re using a and b) to label the two versions.

Here is what our results look like for this query:

id | username | email
1  | Pete     | pete@example.com
6  | Pete     | pete@example.com
12 | Jessica  | jessica@example.com
13 | Jessica  | jessica@example.com
2  | Miles    | miles@example.com
9  | Miles    | miles@example.com

Because this result set includes all of the row ids, we can use it to help us deduplicate the rows later.
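As one possible illustration of how the ids help, the sketch below lists the ids of the surplus rows in each duplicate group without deleting anything, so it still needs only read access. It assumes a policy of keeping the row with the lowest id in each group, which is a hypothetical choice and not something this tutorial prescribes; the SQLite database and sample rows are likewise assumptions.

```python
import sqlite3

# Assumption: a throwaway in-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, username, email) VALUES (?, ?, ?)",
    [
        (1, "Pete", "pete@example.com"),
        (2, "Miles", "miles@example.com"),
        (3, "Ana", "ana@example.com"),
        (6, "Pete", "pete@example.com"),
        (9, "Miles", "miles@example.com"),
        (12, "Jessica", "jessica@example.com"),
        (13, "Jessica", "jessica@example.com"),
    ],
)

# Hypothetical policy: the lowest id in each (username, email) group is
# the row to keep; every other id in the group is surplus.
SURPLUS_IDS = """
    SELECT id
    FROM users
    WHERE id NOT IN (
        SELECT MIN(id)
        FROM users
        GROUP BY username, email
    )
    ORDER BY id
"""

surplus = [row[0] for row in conn.execute(SURPLUS_IDS)]
print(surplus)
```

Reviewing this list before any cleanup makes it easier to confirm that the rows marked as surplus really are safe to remove.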