NumPy - Finding Unique Rows

Finding Unique Rows in a NumPy Array

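A NumPy array can hold many rows of data, and some of those rows may be duplicates. Finding the unique rows means reducing the array to its distinct rows, so that each row's content appears exactly once in the result no matter how many times it occurred in the input.

In NumPy, we can use the unique() function for this purpose.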

Using the unique() Function

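The np.unique() function is commonly used to find the unique elements of an array. When used with the axis parameter, it can find unique rows. The syntax is as follows: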

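numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)

Where:

ar - The input array.
axis - The axis along which to find unique values. Set it to 0 for rows. With the default of None, the array is flattened and its unique elements are returned.
return_index - If True, also return the index of the first occurrence of each unique value in the input.
return_inverse - If True, also return the indices into the unique array that reconstruct the input.
return_counts - If True, also return how many times each unique value occurs in the input.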
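
As a minimal sketch of how the optional flags combine, the call below requests all three extra outputs at once. The variable names and sample data are illustrative assumptions. With every flag set, np.unique() returns a tuple in the order: unique values, first-occurrence indices, inverse indices, counts.

```python
import numpy as np

# Illustrative data: rows 0 and 2 are identical
data = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [1, 2, 3],
])

# Request all optional outputs in a single call
rows, first_idx, inverse, counts = np.unique(
    data, axis=0, return_index=True, return_inverse=True, return_counts=True
)

print(rows)                 # the distinct rows, in sorted order
print(first_idx)            # index of the first occurrence of each distinct row
print(inverse.reshape(-1))  # position of every input row within `rows`
print(counts)               # number of occurrences of each distinct row
```

Only the flags set to True add an element to the returned tuple, so the unpacking on the left-hand side has to match the flags you pass.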

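Example: Finding Unique Elements in a 1D Array

The simplest use of np.unique() is finding the unique elements of a one-dimensional array. The result is returned in sorted order: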

import numpy as np

# Define a 1D array with duplicate values
array = np.array([1, 2, 2, 3, 4, 4, 5])

# Find unique elements
unique_elements = np.unique(array)

print("Unique Elements:\n", unique_elements)

The output is as follows:

Unique Elements:
 [1 2 3 4 5]

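Example: Unique Rows in a 2D Array

In the following example, we use the unique() function with axis=0 to retrieve the unique rows of a 2D array, dropping any duplicate rows: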

import numpy as np

# Define an array with duplicate rows
array = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [1, 2, 3],
    [7, 8, 9]
])

# Find unique rows
unique_rows = np.unique(array, axis=0)

print("Unique Rows:\n", unique_rows)

This produces the following result:

Unique Rows:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
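
To make the role of the axis parameter concrete, here is a small sketch that runs np.unique() on one array with axis left at None, set to 0, and set to 1. The array and variable names are assumptions chosen so that both a row and a column repeat.

```python
import numpy as np

# Illustrative data with one repeated row and one repeated column
grid = np.array([
    [1, 1, 2],
    [3, 3, 4],
    [1, 1, 2],
])

unique_values = np.unique(grid)        # axis=None: the array is flattened first
unique_rows = np.unique(grid, axis=0)  # compare whole rows
unique_cols = np.unique(grid, axis=1)  # compare whole columns

print("Unique values:", unique_values)
print("Unique rows shape:", unique_rows.shape)
print("Unique columns shape:", unique_cols.shape)
```

Omitting axis is a common mistake when rows are wanted: the call still succeeds, but it returns the distinct scalar values instead of the distinct rows.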

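Finding Unique Rows with Their Indices

By setting the return_index parameter of unique() to True, we also get, for each unique row, the index of its first occurrence in the original array.

Example

In this example, we use the unique() function to find the unique rows together with their indices: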

import numpy as np

# Define an array with duplicate rows
array = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [1, 2, 3],
    [7, 8, 9]
])

# Find unique rows and their indices
unique_rows, indices = np.unique(array, axis=0, return_index=True)

print("Unique Rows:\n", unique_rows)
print("Indices of Unique Rows:\n", indices)

The output of the above code is as follows:

Unique Rows:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Indices of Unique Rows:
 [0 1 3]
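
Because np.unique() returns rows in sorted order, the first-occurrence indices are also a handy way to recover the unique rows in the order they first appear in the input. The sketch below assumes a sample array whose rows are deliberately not sorted. The names records and ordered_rows are illustrative.

```python
import numpy as np

# Illustrative data: the first row sorts last
records = np.array([
    [7, 8, 9],
    [1, 2, 3],
    [7, 8, 9],
    [4, 5, 6],
])

# sorted_rows is in lexicographic order; first_idx points into `records`
sorted_rows, first_idx = np.unique(records, axis=0, return_index=True)

# Sorting the first-occurrence indices restores the input order
ordered_rows = records[np.sort(first_idx)]

print("Sorted unique rows:\n", sorted_rows)
print("Unique rows in first-seen order:\n", ordered_rows)
```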

Reconstructing the Original Array

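If you need to rebuild the original array from its unique rows, set the return_inverse parameter of np.unique() to True. The function then also returns the inverse indices, which give, for every row of the original array, the position of that row in the unique array. Indexing the unique rows with the inverse indices maps the unique values back to the original data.

Example

In this example, we use the unique() function to obtain the unique rows of a NumPy array together with the inverse indices. We then index the unique rows with those indices to rebuild the array, duplicates included, and check that it matches the original. The inverse indices are flattened with reshape(-1) because NumPy 2.0.0 returned them with an extra dimension when axis is given:

import numpy as np

# Define an array with duplicate rows
array = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [1, 2, 3],
    [7, 8, 9]
])

# Find unique rows and the inverse indices
unique_rows, inverse = np.unique(array, axis=0, return_inverse=True)

# Reconstruct the original array from the unique rows
# (reshape(-1) keeps the indices 1D across NumPy versions)
reconstructed_array = unique_rows[inverse.reshape(-1)]

print("Reconstructed Array:\n", reconstructed_array)
print("Matches original:", np.array_equal(reconstructed_array, array))

The output obtained is as shown below:

Reconstructed Array:
 [[1 2 3]
 [4 5 6]
 [1 2 3]
 [7 8 9]]
Matches original: True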
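
The inverse indices returned with return_inverse=True can also serve as group labels, since every original row is tagged with the position of its unique row. As a hedged sketch, the code below sums a hypothetical per-row measurement (the values array is an assumption, not part of the examples above) for each distinct row using np.bincount().

```python
import numpy as np

keys = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [1, 2, 3],
    [7, 8, 9],
])
# Hypothetical measurement attached to each row of `keys`
values = np.array([10.0, 20.0, 30.0, 40.0])

unique_keys, inverse = np.unique(keys, axis=0, return_inverse=True)

# Flatten so the labels are 1D regardless of NumPy version
group_ids = inverse.reshape(-1)

# Sum the values of all rows that share the same unique row
totals = np.bincount(group_ids, weights=values)

print("Group labels:", group_ids)
print("Total per unique row:", totals)
```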

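Counting Unique Rows

Besides finding the unique rows, you may also want to count how many times each of them occurs in the array. In NumPy, you can do this by setting the return_counts parameter of unique() to True.

This is especially useful for two-dimensional data in which each row represents a record or an observation.

Example

In the following example, we use the unique() function to retrieve the unique rows along with the number of times each one occurs in the original array: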

import numpy as np

# Define an array with duplicate rows
array = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [1, 2, 3],
    [7, 8, 9]
])

# Find unique rows and their counts
unique_rows, counts = np.unique(array, axis=0, return_counts=True)

print("Unique Rows:\n", unique_rows)
print("Counts of Each Row:\n", counts)

After executing the above code, we get the following output:

Unique Rows:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Counts of Each Row:
 [2 1 1]
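
The counts also make it possible to separate rows that occur exactly once from rows that are duplicated, which np.unique() alone does not do. The following is a minimal sketch using boolean masks on the counts. The names once_only and repeated are illustrative.

```python
import numpy as np

table = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [1, 2, 3],
    [7, 8, 9],
])

distinct_rows, counts = np.unique(table, axis=0, return_counts=True)

# Rows that appear exactly once in `table`
once_only = distinct_rows[counts == 1]

# Rows that appear more than once in `table`
repeated = distinct_rows[counts > 1]

print("Rows occurring once:\n", once_only)
print("Rows occurring more than once:\n", repeated)
```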

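Multidimensional Arrays

For arrays with more than two dimensions, np.unique() with axis=0 compares the sub-arrays along the first axis, so for a 3D array it returns the unique 2D blocks rather than unique rows. To find unique rows, first reshape the array to 2D so that each row lies along axis 0. To get the unique scalar values across all dimensions, leave axis at its default of None.

Example

In the following example, we flatten a 3D array to 2D and then use the unique() function to find the unique rows: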

import numpy as np

# Define a 3D array
array = np.array([
    [[1, 2], [3, 4]],
    [[1, 2], [5, 6]],
    [[1, 2], [3, 4]]
])

# Flatten the 3D array to 2D for uniqueness check
array_2d = array.reshape(-1, array.shape[-1])

# Find unique rows in the flattened array
unique_rows = np.unique(array_2d, axis=0)

print("Unique Rows in 3D Array:\n", unique_rows)

The result is as follows:

Unique Rows in 3D Array:
 [[1 2]
 [3 4]
 [5 6]]
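
For comparison, here is a sketch of what np.unique() returns when it is applied to a 3D array directly, without reshaping. With axis=0 it deduplicates whole 2D blocks, and with the default axis=None it returns the distinct scalar values. The name cube is an illustrative assumption.

```python
import numpy as np

cube = np.array([
    [[1, 2], [3, 4]],
    [[1, 2], [5, 6]],
    [[1, 2], [3, 4]],
])

# axis=0 compares the 2x2 blocks along the first axis
unique_blocks = np.unique(cube, axis=0)

# axis=None flattens the array and returns distinct scalars
unique_values = np.unique(cube)

print("Unique blocks shape:", unique_blocks.shape)
print("Unique values:", unique_values)
```

Which form to use depends on what counts as one item in your data: a scalar, a row, or a whole block.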
