- MySQL 基础
- MySQL - 首页
- MySQL - 简介
- MySQL - 特性
- MySQL - 版本
- MySQL - 变量
- MySQL - 安装
- MySQL - 管理
- MySQL - PHP 语法
- MySQL - Node.js 语法
- MySQL - Java 语法
- MySQL - Python 语法
- MySQL - 连接
- MySQL - Workbench
- MySQL 数据库
- MySQL - 创建数据库
- MySQL - 删除数据库
- MySQL - 选择数据库
- MySQL - 显示数据库
- MySQL - 复制数据库
- MySQL - 数据库导出
- MySQL - 数据库导入
- MySQL - 数据库信息
- MySQL 用户
- MySQL - 创建用户
- MySQL - 删除用户
- MySQL - 显示用户
- MySQL - 修改密码
- MySQL - 授予权限
- MySQL - 显示权限
- MySQL - 收回权限
- MySQL - 锁定用户账户
- MySQL - 解锁用户账户
- MySQL 表
- MySQL - 创建表
- MySQL - 显示表
- MySQL - 修改表
- MySQL - 重命名表
- MySQL - 克隆表
- MySQL - 清空表
- MySQL - 临时表
- MySQL - 修复表
- MySQL - 描述表
- MySQL - 添加/删除列
- MySQL - 显示列
- MySQL - 重命名列
- MySQL - 表锁
- MySQL - 删除表
- MySQL - 派生表
- MySQL 查询
- MySQL - 查询
- MySQL - 约束
- MySQL - INSERT 查询
- MySQL - SELECT 查询
- MySQL - UPDATE 查询
- MySQL - DELETE 查询
- MySQL - REPLACE 查询
- MySQL - INSERT IGNORE
- MySQL - INSERT ON DUPLICATE KEY UPDATE
- MySQL - INSERT INTO SELECT
- MySQL 运算符和子句
- MySQL - WHERE 子句
- MySQL - LIMIT 子句
- MySQL - DISTINCT 子句
- MySQL - ORDER BY 子句
- MySQL - GROUP BY 子句
- MySQL - HAVING 子句
- MySQL - AND 运算符
- MySQL - OR 运算符
- MySQL - LIKE 运算符
- MySQL - IN 运算符
- MySQL - ANY 运算符
- MySQL - EXISTS 运算符
- MySQL - NOT 运算符
- MySQL - NOT EQUAL 运算符
- MySQL - IS NULL 运算符
- MySQL - IS NOT NULL 运算符
- MySQL - BETWEEN 运算符
- MySQL - UNION 运算符
- MySQL - UNION vs UNION ALL
- MySQL - MINUS 运算符
- MySQL - INTERSECT 运算符
- MySQL - INTERVAL 运算符
- MySQL 连接
- MySQL - 使用连接
- MySQL - INNER JOIN
- MySQL - LEFT JOIN
- MySQL - RIGHT JOIN
- MySQL - CROSS JOIN
- MySQL - FULL JOIN
- MySQL - 自连接
- MySQL - DELETE JOIN
- MySQL - UPDATE JOIN
- MySQL - UNION vs JOIN
- MySQL 触发器
- MySQL - 触发器
- MySQL - 创建触发器
- MySQL - 显示触发器
- MySQL - 删除触发器
- MySQL - BEFORE INSERT 触发器
- MySQL - AFTER INSERT 触发器
- MySQL - BEFORE UPDATE 触发器
- MySQL - AFTER UPDATE 触发器
- MySQL - BEFORE DELETE 触发器
- MySQL - AFTER DELETE 触发器
- MySQL 数据类型
- MySQL - 数据类型
- MySQL - VARCHAR
- MySQL - BOOLEAN
- MySQL - ENUM
- MySQL - DECIMAL
- MySQL - INT
- MySQL - FLOAT
- MySQL - BIT
- MySQL - TINYINT
- MySQL - BLOB
- MySQL - SET
- MySQL 正则表达式
- MySQL - 正则表达式
- MySQL - RLIKE 运算符
- MySQL - NOT LIKE 运算符
- MySQL - NOT REGEXP 运算符
- MySQL - regexp_instr() 函数
- MySQL - regexp_like() 函数
- MySQL - regexp_replace() 函数
- MySQL - regexp_substr() 函数
- MySQL 函数 & 运算符
- MySQL - 日期和时间函数
- MySQL - 算术运算符
- MySQL - 数值函数
- MySQL - 字符串函数
- MySQL - 聚合函数
- MySQL 其他概念
- MySQL - NULL 值
- MySQL - 事务
- MySQL - 使用序列
- MySQL - 处理重复项
- MySQL - SQL 注入
- MySQL - 子查询
- MySQL - 注释
- MySQL - 检查约束
- MySQL - 存储引擎
- MySQL - 将表导出到 CSV 文件
- MySQL - 将 CSV 文件导入数据库
- MySQL - UUID
- MySQL - 公共表表达式
- MySQL - ON DELETE CASCADE
- MySQL - Upsert
- MySQL - 水平分区
- MySQL - 垂直分区
- MySQL - 游标
- MySQL - 存储函数
- MySQL - SIGNAL
- MySQL - RESIGNAL
- MySQL - 字符集
- MySQL - 校对规则
- MySQL - 通配符
- MySQL - 别名
- MySQL - ROLLUP
- MySQL - 今日日期
- MySQL - 字面量
- MySQL - 存储过程
- MySQL - EXPLAIN
- MySQL - JSON
- MySQL - 标准差
- MySQL - 查找重复记录
- MySQL - 删除重复记录
- MySQL - 选择随机记录
- MySQL - SHOW PROCESSLIST
- MySQL - 更改列类型
- MySQL - 重置自动递增
- MySQL - Coalesce() 函数
- MySQL 有用资源
- MySQL - 有用函数
- MySQL - 语句参考
- MySQL - 快速指南
- MySQL - 有用资源
- MySQL - 讨论
MySQL - 删除重复记录
MySQL 删除重复记录
数据库(包括 MySQL)中的重复记录非常常见。MySQL 数据库以包含行和列的表的形式存储数据。现在,当数据库表中的两行或多行具有相同的值时,该记录被认为是重复的。
这种冗余可能由于各种原因而发生:
- 该行可能被插入两次。
- 从外部来源导入原始数据时。
- 数据库应用程序中可能存在错误。
无论原因是什么,删除这种冗余对于提高数据准确性、减少错误或提高数据库性能效率都非常重要。
查找重复值
在删除重复记录之前,我们必须找出它们是否存在于表中。可以使用以下方法:
GROUP BY 子句
COUNT() 方法
示例
让我们首先创建一个名为“CUSTOMERS”的表,其中包含重复值:
CREATE TABLE CUSTOMERS( ID int, NAME varchar(100) );
使用以下 INSERT 查询,将一些记录插入到“CUSTOMERS”表中。在这里,我们添加了“John”作为重复记录 3 次:
INSERT INTO CUSTOMERS VALUES (1,'John'), (2,'Johnson'), (3,'John'), (4,'John');
获得的 CUSTOMERS 表如下所示:
id | name |
---|---|
1 | John |
2 | Johnson |
3 | John |
4 | John |
现在,我们使用 COUNT() 方法和 GROUP BY 子句检索表中重复的记录,如下面的查询所示:
SELECT NAME, COUNT(NAME) FROM CUSTOMERS GROUP BY NAME HAVING COUNT(NAME) > 1;
输出
获得的输出如下所示:
NAME | COUNT(NAME) |
---|---|
John | 3 |
删除重复记录
要从数据库表中删除重复记录,我们可以使用 DELETE 命令。但是,此 DELETE 命令可以使用两种方法从表中删除重复项:
使用 DELETE... JOIN
使用 ROW_NUMBER() 函数
使用 DELETE... JOIN
为了使用 DELETE... JOIN 命令从表中删除重复记录,我们对其自身执行内部连接。这适用于并非完全相同的案例。
例如,假设客户记录中存在客户详细信息的重复,但序列号不断递增。在这里,即使 ID 不相同,记录也是重复的。
示例
在下面的查询中,我们使用前面创建的 CUSTOMERS 表来使用 DELETE... JOIN 命令删除重复记录:
DELETE t1 FROM CUSTOMERS t1 INNER JOIN CUSTOMERS t2 WHERE t1.id < t2.id AND t1.name = t2.name;
输出
获得的输出如下所示:
Query OK, 2 rows affected (0.01 sec)
验证
我们可以使用以下 SELECT 语句验证是否已删除重复记录:
SELECT * FROM CUSTOMERS;
我们可以从获得的表中看到,该查询删除了重复项,并在表中保留了不同的记录:
ID | NAME |
---|---|
2 | Johnson |
4 | John |
使用 ROW_NUMBER() 函数
MySQL 中的 ROW_NUMBER() 函数用于为从查询获得的结果集中的每一行分配一个从 1 开始的顺序号。
使用此函数,MySQL 允许您检测重复行,可以使用 DELETE 语句将其删除。
示例
在这里,我们将 ROW_NUMBER() 函数应用于在“NAME”列中具有重复值的 CUSTOMERS 表。我们将使用以下查询基于“NAME”列在分区内分配行号:
SELECT id, ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) AS row_num FROM CUSTOMERS;
获得的输出如下所示:
id | row_num |
---|---|
1 | 1 |
3 | 2 |
4 | 3 |
2 | 1 |
现在,使用以下语句删除重复行(行号大于 1 的行):
DELETE FROM CUSTOMERS WHERE id IN( SELECT id FROM (SELECT id, ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) AS row_num FROM CUSTOMERS) AS temp_table WHERE row_num>1 );
我们得到如下所示的输出:
Query OK, 2 rows affected (0.00 sec)
要验证是否已删除重复记录,请使用以下 SELECT 查询:
SELECT * FROM CUSTOMERS;
产生的结果如下所示:
ID | NAME |
---|---|
1 | John |
2 | Johnson |
使用客户端程序删除重复记录
我们还可以使用客户端程序删除重复记录。
语法
要通过PHP程序删除重复记录,需要使用**mysqli**函数**query()**执行包含“DELETE”命令的内连接,如下所示:
$sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"; $mysqli->query($sql);
要通过JavaScript程序删除重复记录,需要使用**mysql2**库的**query()**函数执行包含“DELETE”命令的内连接,如下所示:
sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"; con.query(sql)
要通过Java程序删除重复记录,需要使用**JDBC**函数**execute()**执行包含“DELETE”命令的内连接,如下所示:
String sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"; statement.execute(sql);
要通过Python程序删除重复记录,需要使用**MySQL Connector/Python**的**execute()**函数执行包含“DELETE”命令的内连接,如下所示:
delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name" cursorObj.execute(delete_query)
示例
以下是程序示例:
$dbhost = 'localhost'; $dbuser = 'root'; $dbpass = 'password'; $db = 'TUTORIALS'; $mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db); if ($mysqli->connect_errno) { printf("Connect failed: %s
", $mysqli->connect_error); exit(); } //printf('Connected successfully.
'); //let's create a table $sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100))"; if($mysqli->query($sql)){ printf("DuplicateDeleteDemo table created successfully...!\n"); } //now lets insert some duplicate records; $sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John')"; if($mysqli->query($sql)){ printf("First record inserted successfully...!\n"); } $sql = "INSERT INTO DuplicateDeleteDemo VALUES(2,'Johnson')"; if($mysqli->query($sql)){ printf("Second record inserted successfully...!\n"); } $sql = "INSERT INTO DuplicateDeleteDemo VALUES(3,'John')"; if($mysqli->query($sql)){ printf("Third records inserted successfully...!\n"); } $sql = "INSERT INTO DuplicateDeleteDemo VALUES(4,'John')"; if($mysqli->query($sql)){ printf("Fourth record inserted successfully...!\n"); } //display the table records $sql = "SELECT * FROM DuplicateDeleteDemo"; if($result = $mysqli->query($sql)){ printf("Table records(before deleting): \n"); while($row = mysqli_fetch_array($result)){ printf("ID: %d, NAME %s", $row['ID'], $row['NAME']); printf("\n"); } } //now lets count duplicate records $sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1"; if($result = $mysqli->query($sql)){ printf("Duplicate records: \n"); while($row = mysqli_fetch_array($result)){ print_r($row); } } //lets delete dupliacte records $sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"; if($mysqli->query($sql)){ printf("Duplicate records deleted successfully...!\n"); } $sql = "SELECT ID, NAME FROM DuplicateDeleteDemo"; if($result = $mysqli->query($sql)){ printf("Table records after deleting: \n"); while($row = mysqli_fetch_row($result)){ print_r($row); } } if($mysqli->error){ printf("Error message: ", $mysqli->error); } $mysqli->close();
输出
获得的输出结果如下所示:
DuplicateDeleteDemo table created successfully...! First record inserted successfully...! Second record inserted successfully...! Third records inserted successfully...! Fourth record inserted successfully...! Table records(before deleting): ID: 1, NAME John ID: 2, NAME Johnson ID: 3, NAME John ID: 4, NAME John Duplicate records: Array ( [0] => John [NAME] => John [1] => 3 [COUNT(NAME)] => 3 ) Duplicate records deleted successfully...! Table records after deleting: Array ( [0] => 2 [1] => Johnson ) Array ( [0] => 4 [1] => John )
var mysql = require('mysql2'); var con = mysql.createConnection({ host: "localhost", user: "root", password: "Nr5a0204@123" }); // Connecting to MySQL con.connect(function (err) { if (err) throw err; console.log("Connected!"); console.log("--------------------------"); // Create a new database sql = "Create Database TUTORIALS"; con.query(sql); sql = "USE TUTORIALS"; con.query(sql); sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100));" con.query(sql); sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John'),(2,'Johnson'),(3,'John'),(4,'John');" con.query(sql); sql = "SELECT * FROM DuplicateDeleteDemo;" con.query(sql, function(err, result){ if (err) throw err console.log("**Records of DuplicateDeleteDemo Table:**"); console.log(result); console.log("--------------------------"); }); //Fetching records that are duplicated in the table sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1;" con.query(sql, function(err, result){ if (err) throw err console.log("**Records that are duplicated in the table:**"); console.log(result); console.log("--------------------------"); }); sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"; con.query(sql); sql = "SELECT * FROM DuplicateDeleteDemo;" con.query(sql, function(err, result){ if (err) throw err console.log("**Records after deleting Duplicates:**"); console.log(result); }); });
输出
获得的输出结果如下所示:
Connected! -------------------------- **Records of DuplicateDeleteDemo Table:** [ { ID: 1, NAME: 'John' }, { ID: 2, NAME: 'Johnson' }, { ID: 3, NAME: 'John' }, { ID: 4, NAME: 'John' } ] -------------------------- **Records that are duplicated in the table:** [ { NAME: 'John', 'COUNT(NAME)': 3 } ] -------------------------- **Records after deleting Duplicates:** [ { ID: 2, NAME: 'Johnson' }, { ID: 4, NAME: 'John' } ]
import java.sql.Connection; import java.sql.DriverManager; import java.sql.ResultSet; import java.sql.Statement; public class DeleteDuplicates { public static void main(String[] args) { String url = "jdbc:mysql://127.0.0.1:3306/TUTORIALS"; String user = "root"; String password = "password"; ResultSet rs; try { Class.forName("com.mysql.cj.jdbc.Driver"); Connection con = DriverManager.getConnection(url, user, password); Statement st = con.createStatement(); //System.out.println("Database connected successfully...!"); String sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100))"; st.execute(sql); System.out.println("Table DuplicateDeleteDemo created successfully...!"); //let's insert some records into it... String sql1 = "INSERT INTO DuplicateDeleteDemo VALUES (1,'John'), (2,'Johnson'), (3,'John'), (4,'John')"; st.execute(sql1); System.out.println("Records inserted successfully....!"); //print table records String sql2 = "SELECT * FROM DuplicateDeleteDemo"; rs = st.executeQuery(sql2); System.out.println("Table records(before deleting the duplicate rcords): "); while(rs.next()) { String id = rs.getString("id"); String name = rs.getString("name"); System.out.println("Id: " + id + ", Name: " + name); } //let delete duplicate records using delete join String sql3 = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"; st.execute(sql3); System.out.println("Duplicate records deleted successfully....!"); String sql4 = "SELECT * FROM DuplicateDeleteDemo"; rs = st.executeQuery(sql4); System.out.println("Table records(after deleting the duplicate rcords): "); while(rs.next()) { String id = rs.getString("id"); String name = rs.getString("name"); System.out.println("Id: " + id + ", Name: " + name); } }catch(Exception e) { e.printStackTrace(); } } }
输出
获得的输出结果如下所示:
Table DuplicateDeleteDemo created successfully...! Records inserted successfully....! Table records(before deleting the duplicate rcords): Id: 1, Name: John Id: 2, Name: Johnson Id: 3, Name: John Id: 4, Name: John Duplicate records deleted successfully....! Table records(after deleting the duplicate rcords): Id: 2, Name: Johnson Id: 4, Name: John
import mysql.connector # Establishing the connection connection = mysql.connector.connect( host='localhost', user='root', password='password', database='tut' ) # Creating a cursor object cursorObj = connection.cursor() # Creating the table 'DuplicateDeleteDemo' create_table_query = '''CREATE TABLE DuplicateDeleteDemo(ID int, NAME varchar(100))''' cursorObj.execute(create_table_query) print("Table 'DuplicateDeleteDemo' is created successfully!") # Inserting records into 'DuplicateDeleteDemo' table sql = "INSERT INTO DuplicateDeleteDemo (ID, NAME) VALUES (%s, %s);" values = [(1, 'John'), (2, 'Johnson'), (3, 'John'), (4, 'John')] cursorObj.executemany(sql, values) print("Values inserted successfully") # Display table display_table = "SELECT * FROM DuplicateDeleteDemo;" cursorObj.execute(display_table) # Printing the table 'DuplicateDeleteDemo' results = cursorObj.fetchall() print("\nDuplicateDeleteDemo Table:") for result in results: print(result) # Retrieve the duplicate records duplicate_records_query = """ SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1; """ cursorObj.execute(duplicate_records_query) dup_rec = cursorObj.fetchall() print("\nDuplicate records:") for record in dup_rec: print(record) # Delete duplicate records delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name" cursorObj.execute(delete_query) print("Duplicate records deleted successfully") # Verification display_table_after_delete = "SELECT * FROM DuplicateDeleteDemo;" cursorObj.execute(display_table_after_delete) results_after_delete = cursorObj.fetchall() print("\nDuplicateDeleteDemo Table (After Delete):") for result in results_after_delete: print(result) # Closing the cursor and connection cursorObj.close() connection.close()
输出
获得的输出结果如下所示:
Table 'DuplicateDeleteDemo' is created successfully! Values inserted successfully DuplicateDeleteDemo Table: (1, 'John') (2, 'Johnson') (3, 'John') (4, 'John') Duplicate records: ('John', 3) Duplicate records deleted successfully DuplicateDeleteDemo Table (After Delete): (2, 'Johnson') (4, 'John')