MySQL - Delete Duplicate Records
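Duplicate records are common in databases, MySQL included. MySQL stores data in tables made up of rows and columns. When two or more rows of a table hold the same values, those rows are considered duplicates.
This redundancy can arise for several reasons:
- The same row was inserted twice.
- Raw data was imported from an external source.
- The database application has a bug.
Whatever the cause, removing the redundancy matters for data accuracy, for reducing errors, and for database performance.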
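Finding Duplicate Values
Before deleting duplicate records, we first need to find out whether the table contains any. This can be done by combining:
- The GROUP BY clause
- The COUNT() function
Example
Let us first create a table named "CUSTOMERS" that will hold duplicate values: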
CREATE TABLE CUSTOMERS( ID int, NAME varchar(100) );
Insert some records into the CUSTOMERS table with the following INSERT query. Here, the name "John" appears three times:
INSERT INTO CUSTOMERS VALUES (1,'John'), (2,'Johnson'), (3,'John'), (4,'John');
The resulting CUSTOMERS table looks like this:
| id | name |
|---|---|
| 1 | John |
| 2 | Johnson |
| 3 | John |
| 4 | John |
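Now, we retrieve the duplicated records by combining the COUNT() function with the GROUP BY clause. The HAVING condition keeps only the names that occur more than once: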
SELECT NAME, COUNT(NAME) FROM CUSTOMERS GROUP BY NAME HAVING COUNT(NAME) > 1;
Output
The output is as follows:
| NAME | COUNT(NAME) |
|---|---|
| John | 3 |
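The same check can be tried without a MySQL server. The sketch below is only an illustration: it assumes Python's built-in sqlite3 module as a stand-in for MySQL (a portability assumption, not part of this tutorial's setup). The GROUP BY ... HAVING query itself is unchanged.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL so the
# sketch runs anywhere; the duplicate-finding query is the same SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID int, NAME varchar(100))")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?)",
    [(1, "John"), (2, "Johnson"), (3, "John"), (4, "John")],
)

# Group rows by NAME and keep only the groups with more than one row.
duplicates = conn.execute(
    "SELECT NAME, COUNT(NAME) FROM CUSTOMERS "
    "GROUP BY NAME HAVING COUNT(NAME) > 1"
).fetchall()

print(duplicates)
conn.close()
```

Each tuple in the result pairs a duplicated name with the number of rows that carry it.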
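Deleting Duplicate Records
To delete duplicate records from a table, we use the DELETE command. It can remove duplicates in two ways:
- Using DELETE... JOIN
- Using the ROW_NUMBER() function
Using DELETE... JOIN
To delete duplicate records with DELETE... JOIN, we inner-join the table with itself. This approach suits rows that are duplicates without being identical in every column.
For example, suppose customer details are repeated across records while the serial number keeps incrementing. The records are duplicates even though their IDs differ.
Example
In the following query, we take the CUSTOMERS table created earlier and delete its duplicate records with DELETE... JOIN: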
DELETE t1 FROM CUSTOMERS t1 INNER JOIN CUSTOMERS t2 WHERE t1.id < t2.id AND t1.name = t2.name;
Output
The output is as follows:
Query OK, 2 rows affected (0.01 sec)
Verification
We can verify that the duplicate records were deleted with the following SELECT statement:
SELECT * FROM CUSTOMERS;
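As the resulting table shows, the query removed the duplicates and kept one row per distinct name. Because the join condition is t1.id < t2.id, the row that survives in each group is the one with the highest ID: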
| ID | NAME |
|---|---|
| 2 | Johnson |
| 4 | John |
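To see why the rows with IDs 1 and 3 are the ones removed, the self-join condition can be traced by hand. The following is a plain-Python sketch of the matching logic only, not of how MySQL executes the statement; the variable names are illustrative.

```python
# Rows of the CUSTOMERS table from the example, as (id, name) pairs.
rows = [(1, "John"), (2, "Johnson"), (3, "John"), (4, "John")]

# A row t1 matches the join condition when some other row t2 has the same
# name and a larger id; every matching t1 is deleted.
deleted_ids = sorted({
    t1_id
    for t1_id, t1_name in rows
    for t2_id, t2_name in rows
    if t1_id < t2_id and t1_name == t2_name
})

# Rows that never matched as t1 survive: the highest id of each name.
remaining = [row for row in rows if row[0] not in deleted_ids]

print(deleted_ids)
print(remaining)
```

Reversing the comparison to t1.id > t2.id would instead keep the lowest ID in each group.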
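Using the ROW_NUMBER() Function
The ROW_NUMBER() function in MySQL assigns a sequential number, starting from 1, to each row of a query's result set. It is a window function, available from MySQL 8.0 onward.
With this function we can detect duplicate rows and then remove them with a DELETE statement.
Example
Here, we apply ROW_NUMBER() to the CUSTOMERS table, recreated with its original four rows so that the NAME column again contains duplicates. The following query partitions the rows by NAME and numbers the rows inside each partition. Ordering each partition by id makes the numbering deterministic, so the row with the lowest ID in each group gets row number 1:
SELECT id, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS row_num FROM CUSTOMERS;
The output is as follows: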
| id | row_num |
|---|---|
| 1 | 1 |
| 3 | 2 |
| 4 | 3 |
| 2 | 1 |
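Now, delete the duplicate rows (those whose row number is greater than 1) with the following statement. The inner query is wrapped in a derived table (temp_table) because MySQL does not allow a subquery to read directly from the table being deleted from:
DELETE FROM CUSTOMERS WHERE id IN( SELECT id FROM (SELECT id, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS row_num FROM CUSTOMERS) AS temp_table WHERE row_num>1 );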
We get the following output:
Query OK, 2 rows affected (0.00 sec)
To verify that the duplicate records were deleted, use the following SELECT query:
SELECT * FROM CUSTOMERS;
The result is as follows:
| ID | NAME |
|---|---|
| 1 | John |
| 2 | Johnson |
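The numbering step can also be traced outside the database. The sketch below is a plain-Python illustration of what ROW_NUMBER() computes per partition. It assumes the rows inside each NAME partition are ordered by id (an assumption made so that the numbering is deterministic), and its variable names are illustrative.

```python
from itertools import groupby

# Rows of the CUSTOMERS table from the example, as (id, name) pairs.
rows = [(1, "John"), (2, "Johnson"), (3, "John"), (4, "John")]

# Sort by name, then id, so each name forms one contiguous partition with
# its rows in id order (the assumed ordering inside each partition).
ordered = sorted(rows, key=lambda row: (row[1], row[0]))

numbered = []  # (id, name, row_num) triples
for _, partition in groupby(ordered, key=lambda row: row[1]):
    for row_num, (row_id, name) in enumerate(partition, start=1):
        numbered.append((row_id, name, row_num))

# Rows numbered above 1 are the duplicates that the DELETE would remove.
ids_to_delete = [row_id for row_id, _, row_num in numbered if row_num > 1]
remaining = sorted(row for row in rows if row[0] not in ids_to_delete)

print(numbered)
print(ids_to_delete)
print(remaining)
```

Under that ordering, the row with the lowest ID in each group gets row number 1 and is the one kept, which is the opposite choice from the DELETE... JOIN example.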
Deleting Duplicate Records Using a Client Program
We can also delete duplicate records from a client program.
Syntax
To delete duplicate records from a PHP program, run the DELETE statement with the self inner join through the **mysqli** function **query()**, as follows:
$sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
$mysqli->query($sql);
To delete duplicate records from a JavaScript program, run the same DELETE statement through the **query()** function of the **mysql2** library, as follows:
sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
con.query(sql);
To delete duplicate records from a Java program, run the same DELETE statement through the **JDBC** function **execute()**, as follows:
String sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
statement.execute(sql);
To delete duplicate records from a Python program, run the same DELETE statement through the **execute()** function of **MySQL Connector/Python**, as follows:
delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"
cursorObj.execute(delete_query)
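Example
The complete programs follow, in the same order as above (PHP, Node.js, Java, Python), each followed by its output: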
<?php
$dbhost = 'localhost';
$dbuser = 'root';
$dbpass = 'password';
$db = 'TUTORIALS';
$mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db);
if ($mysqli->connect_errno) {
    printf("Connect failed: %s\n", $mysqli->connect_error);
    exit();
}
//printf("Connected successfully.\n");
// let's create a table
$sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100))";
if ($mysqli->query($sql)) {
    printf("DuplicateDeleteDemo table created successfully...!\n");
}
// now let's insert some duplicate records
$sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John')";
if ($mysqli->query($sql)) {
    printf("First record inserted successfully...!\n");
}
$sql = "INSERT INTO DuplicateDeleteDemo VALUES(2,'Johnson')";
if ($mysqli->query($sql)) {
    printf("Second record inserted successfully...!\n");
}
$sql = "INSERT INTO DuplicateDeleteDemo VALUES(3,'John')";
if ($mysqli->query($sql)) {
    printf("Third records inserted successfully...!\n");
}
$sql = "INSERT INTO DuplicateDeleteDemo VALUES(4,'John')";
if ($mysqli->query($sql)) {
    printf("Fourth record inserted successfully...!\n");
}
// display the table records
$sql = "SELECT * FROM DuplicateDeleteDemo";
if ($result = $mysqli->query($sql)) {
    printf("Table records(before deleting): \n");
    while ($row = mysqli_fetch_array($result)) {
        printf("ID: %d, NAME %s",
            $row['ID'],
            $row['NAME']);
        printf("\n");
    }
}
// now let's count the duplicate records
$sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1";
if ($result = $mysqli->query($sql)) {
    printf("Duplicate records: \n");
    while ($row = mysqli_fetch_array($result)) {
        print_r($row);
    }
}
// let's delete the duplicate records with a self join
$sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
if ($mysqli->query($sql)) {
    printf("Duplicate records deleted successfully...!\n");
}
// display the table records again to verify
$sql = "SELECT ID, NAME FROM DuplicateDeleteDemo";
if ($result = $mysqli->query($sql)) {
    printf("Table records after deleting: \n");
    while ($row = mysqli_fetch_row($result)) {
        print_r($row);
    }
}
// report the error of the most recent statement, if any
if ($mysqli->error) {
    printf("Error message: %s\n", $mysqli->error);
}
$mysqli->close();
Output
The output obtained is as follows:
DuplicateDeleteDemo table created successfully...!
First record inserted successfully...!
Second record inserted successfully...!
Third records inserted successfully...!
Fourth record inserted successfully...!
Table records(before deleting):
ID: 1, NAME John
ID: 2, NAME Johnson
ID: 3, NAME John
ID: 4, NAME John
Duplicate records:
Array
(
[0] => John
[NAME] => John
[1] => 3
[COUNT(NAME)] => 3
)
Duplicate records deleted successfully...!
Table records after deleting:
Array
(
[0] => 2
[1] => Johnson
)
Array
(
[0] => 4
[1] => John
)
var mysql = require('mysql2');
var con = mysql.createConnection({
    host: "localhost",
    user: "root",
    password: "password"
});
// Connecting to MySQL
con.connect(function (err) {
    if (err) throw err;
    console.log("Connected!");
    console.log("--------------------------");
    // Create the database (if it does not exist yet) and select it
    var sql = "CREATE DATABASE IF NOT EXISTS TUTORIALS";
    con.query(sql);
    sql = "USE TUTORIALS";
    con.query(sql);
    // Create the table and insert rows, including duplicates
    sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100));";
    con.query(sql);
    sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John'),(2,'Johnson'),(3,'John'),(4,'John');";
    con.query(sql);
    sql = "SELECT * FROM DuplicateDeleteDemo;";
    con.query(sql, function (err, result) {
        if (err) throw err;
        console.log("**Records of DuplicateDeleteDemo Table:**");
        console.log(result);
        console.log("--------------------------");
    });
    // Fetching records that are duplicated in the table
    sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1;";
    con.query(sql, function (err, result) {
        if (err) throw err;
        console.log("**Records that are duplicated in the table:**");
        console.log(result);
        console.log("--------------------------");
    });
    // Deleting the duplicates with a self join
    sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
    con.query(sql);
    sql = "SELECT * FROM DuplicateDeleteDemo;";
    con.query(sql, function (err, result) {
        if (err) throw err;
        console.log("**Records after deleting Duplicates:**");
        console.log(result);
        // Close the connection so the script can exit
        con.end();
    });
});
Output
The output obtained is as follows:
Connected!
--------------------------
**Records of DuplicateDeleteDemo Table:**
[
{ ID: 1, NAME: 'John' },
{ ID: 2, NAME: 'Johnson' },
{ ID: 3, NAME: 'John' },
{ ID: 4, NAME: 'John' }
]
--------------------------
**Records that are duplicated in the table:**
[ { NAME: 'John', 'COUNT(NAME)': 3 } ]
--------------------------
**Records after deleting Duplicates:**
[ { ID: 2, NAME: 'Johnson' }, { ID: 4, NAME: 'John' } ]
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
public class DeleteDuplicates {
    public static void main(String[] args) {
        String url = "jdbc:mysql://:3306/TUTORIALS";
        String user = "root";
        String password = "password";
        ResultSet rs;
        try {
            Class.forName("com.mysql.cj.jdbc.Driver");
            Connection con = DriverManager.getConnection(url, user, password);
            Statement st = con.createStatement();
            //System.out.println("Database connected successfully...!");
            String sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100))";
            st.execute(sql);
            System.out.println("Table DuplicateDeleteDemo created successfully...!");
            // let's insert some records into it, including duplicates
            String sql1 = "INSERT INTO DuplicateDeleteDemo VALUES (1,'John'), (2,'Johnson'), (3,'John'), (4,'John')";
            st.execute(sql1);
            System.out.println("Records inserted successfully....!");
            // print the table records
            String sql2 = "SELECT * FROM DuplicateDeleteDemo";
            rs = st.executeQuery(sql2);
            System.out.println("Table records(before deleting the duplicate rcords): ");
            while (rs.next()) {
                String id = rs.getString("id");
                String name = rs.getString("name");
                System.out.println("Id: " + id + ", Name: " + name);
            }
            // let's delete the duplicate records using DELETE... JOIN
            String sql3 = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
            st.execute(sql3);
            System.out.println("Duplicate records deleted successfully....!");
            // print the table records again to verify
            String sql4 = "SELECT * FROM DuplicateDeleteDemo";
            rs = st.executeQuery(sql4);
            System.out.println("Table records(after deleting the duplicate rcords): ");
            while (rs.next()) {
                String id = rs.getString("id");
                String name = rs.getString("name");
                System.out.println("Id: " + id + ", Name: " + name);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Output
The output obtained is as follows:
Table DuplicateDeleteDemo created successfully...!
Records inserted successfully....!
Table records(before deleting the duplicate rcords): 
Id: 1, Name: John
Id: 2, Name: Johnson
Id: 3, Name: John
Id: 4, Name: John
Duplicate records deleted successfully....!
Table records(after deleting the duplicate rcords): 
Id: 2, Name: Johnson
Id: 4, Name: John
import mysql.connector
# Establishing the connection
connection = mysql.connector.connect(
host='localhost',
user='root',
password='password',
database='tut'
)
# Creating a cursor object
cursorObj = connection.cursor()
# Creating the table 'DuplicateDeleteDemo'
create_table_query = '''CREATE TABLE DuplicateDeleteDemo(ID int, NAME varchar(100))'''
cursorObj.execute(create_table_query)
print("Table 'DuplicateDeleteDemo' is created successfully!")
# Inserting records into 'DuplicateDeleteDemo' table
sql = "INSERT INTO DuplicateDeleteDemo (ID, NAME) VALUES (%s, %s);"
values = [(1, 'John'), (2, 'Johnson'), (3, 'John'), (4, 'John')]
cursorObj.executemany(sql, values)
print("Values inserted successfully")
# Display table
display_table = "SELECT * FROM DuplicateDeleteDemo;"
cursorObj.execute(display_table)
# Printing the table 'DuplicateDeleteDemo'
results = cursorObj.fetchall()
print("\nDuplicateDeleteDemo Table:")
for result in results:
    print(result)
# Retrieve the duplicate records
duplicate_records_query = """
SELECT NAME,
COUNT(NAME)
FROM DuplicateDeleteDemo
GROUP BY NAME
HAVING COUNT(NAME) > 1;
"""
cursorObj.execute(duplicate_records_query)
dup_rec = cursorObj.fetchall()
print("\nDuplicate records:")
for record in dup_rec:
    print(record)
# Delete duplicate records
delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"
cursorObj.execute(delete_query)
# Commit so the inserts and the delete are saved (autocommit is off by default)
connection.commit()
print("Duplicate records deleted successfully")
# Verification
display_table_after_delete = "SELECT * FROM DuplicateDeleteDemo;"
cursorObj.execute(display_table_after_delete)
results_after_delete = cursorObj.fetchall()
print("\nDuplicateDeleteDemo Table (After Delete):")
for result in results_after_delete:
    print(result)
# Closing the cursor and connection
cursorObj.close()
connection.close()
Output
The output obtained is as follows:
Table 'DuplicateDeleteDemo' is created successfully!
Values inserted successfully
DuplicateDeleteDemo Table:
(1, 'John')
(2, 'Johnson')
(3, 'John')
(4, 'John')
Duplicate records:
('John', 3)
Duplicate records deleted successfully
DuplicateDeleteDemo Table (After Delete):
(2, 'Johnson')
(4, 'John')