comm Command in Linux
comm is a Linux command that is used to compare two sorted files line by line. The comm command produces a three-column output. The first column contains lines unique to the first file, the second column contains lines unique to the second file, while the third column contains lines common to both files.
Table of Contents
Here is a comprehensive guide to the options available with the comm command −
- How to Install comm Command in Linux?
- Syntax of comm Command
- comm Command Options
- Examples of comm Command in Linux
How to Install comm Command in Linux?
The comm command is part of the coreutils package, an essential package that comes preinstalled on virtually every Linux distribution. You can confirm that the comm command is available on your system by running the following command −
comm --version
Alternatively, use the following command to print the location of the comm executable −
which comm
If you encounter an issue while working with the comm command, you can reinstall it using your distribution's package manager.
Linux systems such as Ubuntu and Debian, which use the APT package manager, can reinstall the coreutils package with the following command −
sudo apt install --reinstall coreutils
Other distributions such as CentOS and Fedora, which use the yum package manager, can reinstall the coreutils package with the following command −
sudo yum reinstall coreutils
Syntax of comm Command
The basic syntax of the comm command on Linux is given below −
comm [options] filename_1 filename_2
Run the comm command with the desired options, followed by the two file names, as shown in the syntax above. If you run the comm command without specifying the file names, you will encounter a "comm: missing operand" error.
comm Command Options
Using comm without options is quite simple. However, if you want to customize the behavior of the command, you can use the options listed in the following table −
Options | Description |
---|---|
-1 | Suppress the first column (these are lines unique to the first file). |
-2 | Suppress the second column (these are lines unique to the second file). |
-3 | Suppress the third column (these are lines common to both files). |
--check-order | Check that the input is correctly sorted, even if all input lines are pairable. |
--nocheck-order | Do not check the input sorting. |
--output-delimiter=STR | Separate columns with the string STR instead of the default tab character. |
--total | Output total number of lines in each column. |
--help | Display a help message and exit. |
--version | Output version information and exit. |
Examples of comm Command in Linux
Before working through the examples, set up a test environment. Create two files, name them as you like, and add different words and numbers to each file, one entry per line. Make sure that each file is sorted and that some entries appear in both files.
As an example, we have created two files named file1.txt and file2.txt, and the contents of the two files are shown below −
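If you want to reproduce a similar setup yourself, the sketch below is one way to do it. The fruit names and the scratch directory are assumptions chosen for illustration, not necessarily the contents used in this tutorial. Any two sorted files with partly overlapping lines would work −

```shell
# Work in a scratch directory so that no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample contents: one entry per line, already in sorted order.
printf '%s\n' apple banana cherry > file1.txt
printf '%s\n' banana cherry date > file2.txt

# sort -c exits with status 0 only when the file is already sorted.
sort -c file1.txt && sort -c file2.txt && echo "both files are sorted"
```

Here banana and cherry appear in both files, while apple and date each appear in only one.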
Compare Two Files
The basic use of the comm command on Linux is to compare two files. To do this, run the command without any options, followed by the names of the two files you want to compare. For example −
comm file1.txt file2.txt
The output will have three columns −
- The first column contains names only present in file1.txt.
- The second column contains names only present in file2.txt.
- The third column contains names common to both files.
The output of the above command is provided below −
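As a rough illustration of the three-column layout, the sketch below runs the comparison on two small files with assumed contents (the fruit names are hypothetical sample data). Columns are distinguished by leading tabs: no tab for the first column, one tab for the second, and two tabs for the third −

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample data, already sorted.
printf '%s\n' apple banana cherry > file1.txt
printf '%s\n' banana cherry date > file2.txt

# Capture and print the three-column comparison.
result=$(comm file1.txt file2.txt)
printf '%s\n' "$result"
```

With this sample data, apple is printed without indentation (only in file1.txt), date is indented by one tab (only in file2.txt), and banana and cherry are indented by two tabs (common to both files).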
Hide Columns
You can customize the output by hiding specific columns using options with the comm command. These options include -1, -2 and -3.
For example, if you want to hide the first column, you can use the below-given command −
comm -1 file1.txt file2.txt
To hide the second column, you can use the -2 option −
comm -2 file1.txt file2.txt
To hide the third column, you can use the -3 option −
comm -3 file1.txt file2.txt
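The column options can also be combined so that only one column remains, which is a common way to extract just the shared or just the unique lines. The sketch below assumes small sample files with hypothetical fruit names; the variable names are chosen here for illustration −

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample data, already sorted.
printf '%s\n' apple banana cherry > file1.txt
printf '%s\n' banana cherry date > file2.txt

# Hide columns 1 and 2: only lines common to both files remain.
common=$(comm -12 file1.txt file2.txt)

# Hide columns 2 and 3: only lines unique to file1.txt remain.
only_first=$(comm -23 file1.txt file2.txt)

# Hide columns 1 and 3: only lines unique to file2.txt remain.
only_second=$(comm -13 file1.txt file2.txt)

printf 'common:\n%s\n' "$common"
printf 'only in file1.txt: %s\n' "$only_first"
printf 'only in file2.txt: %s\n' "$only_second"
```

Because the hidden columns also drop their leading tabs, the remaining lines are printed without indentation.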
Compare Unsorted Files
If your files are not sorted, you will encounter errors like “comm: file 1 is not in sorted order” or “comm: file 2 is not in sorted order”.
You can check whether the files are sorted by running the comm command with the --check-order option −
comm --check-order file1.txt file3.txt
If you want to force the comm command to print the result irrespective of the sorted order, you can use the command with the --nocheck-order option. For example −
comm --nocheck-order file1.txt file3.txt
This prints the comparison even though the input is not sorted. However, the result generated with the --nocheck-order option is not reliable. As you can see from the above output, the fruits apple and banana are common to both files, yet the command reports only apple as a common line.
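A more dependable approach is usually to sort the input before comparing it. The sketch below assumes hypothetical file contents and a hypothetical output name, file3.sorted.txt. It first shows that --check-order fails on unsorted input, then sorts the file and compares again −

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample data: file1.txt is sorted, file3.txt is deliberately unsorted.
printf '%s\n' apple banana cherry > file1.txt
printf '%s\n' banana apple date > file3.txt

# With --check-order, comm exits with a non-zero status on unsorted input.
status=0
comm --check-order file1.txt file3.txt > /dev/null 2>&1 || status=$?
echo "exit status with unsorted input: $status"

# Sort into a new file first, then list the lines common to both files.
sort file3.txt > file3.sorted.txt
common=$(comm -12 file1.txt file3.sorted.txt)
printf '%s\n' "$common"
```

After sorting, both apple and banana are reported as common lines for this sample data.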
Show Line Counts
You can also use the comm command with the --total option to find out the total number of lines in each column. For example −
comm --total file1.txt file2.txt
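To illustrate, the sketch below runs --total on small sample files with assumed contents and extracts the summary, which GNU comm appends as the last line of the output. The --total option needs a reasonably recent GNU coreutils release −

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample data, already sorted.
printf '%s\n' apple banana cherry > file1.txt
printf '%s\n' banana cherry date > file2.txt

# The last output line holds the per-column counts followed by the word "total".
result=$(comm --total file1.txt file2.txt)
summary=$(printf '%s\n' "$result" | tail -n 1)
printf '%s\n' "$summary"
```

For this sample data the summary line reads 1, 1, 2 and total, separated by tabs: one line unique to each file and two lines in common.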
Change the Default Separator
By default, the comm command uses a tab character as the separator between columns in its output. However, you can specify a different separator using the --output-delimiter=STR option by replacing STR with the desired separator.
In the following example, we use * as the separator. The asterisk is quoted so that the shell does not treat it as a wildcard −
comm --output-delimiter='*' file1.txt file2.txt
This produces an output where the columns are separated by asterisks. Lines unique to file1.txt are shown without an asterisk, lines unique to file2.txt are prefixed with one asterisk, and lines common to both files are prefixed with two asterisks.
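A minimal sketch of this behavior, using assumed sample files with hypothetical fruit names −

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample data, already sorted.
printf '%s\n' apple banana cherry > file1.txt
printf '%s\n' banana cherry date > file2.txt

# Quote the delimiter so the shell passes a literal asterisk to comm.
result=$(comm --output-delimiter='*' file1.txt file2.txt)
printf '%s\n' "$result"
```

With this data, the command prints apple, \*\*banana, \*\*cherry, and \*date, each on its own line.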
That’s how you can use the comm command to compare two sorted files on your Linux system.
Conclusion
comm is a powerful Linux command that helps you in comparing two sorted files. It provides a clear breakdown of unique lines in each file and common lines between them.
Before using the comm command, make sure that your files are sorted so that you get accurate results.
In this tutorial, we have covered the syntax, various options, and practical examples to help you grasp its usage. Follow them to effectively utilize the comm command for comparing sorted files in your Linux system.