dos2unix Command in Linux



The Linux dos2unix command converts text files from DOS format to UNIX format. In a DOS text file, each line ends with two control characters: a carriage return (\r) followed by a line feed (\n).

In classic Mac OS (the versions before OS X), a single carriage return \r marks the end of a line in text files. UNIX, however, uses only the line feed \n.

When a DOS text file is opened in UNIX, the extra carriage return can cause compatibility issues. Programs or scripts that process the text file might not work as expected. To avoid such issues in UNIX or UNIX-like operating systems, the DOS file format is converted to UNIX using the dos2unix utility.
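
As a small illustration of the kind of breakage described above, the following sketch reads a value from a DOS-style file and compares it with the expected string. The temporary file and the variable names are made up for this example, and only standard shell tools are used.

```shell
# Create a temporary one-line file with a DOS line ending (\r\n).
tmp=$(mktemp)
printf 'yes\r\n' > "$tmp"

# read strips the trailing \n but keeps the \r as part of the value.
read -r answer < "$tmp"

if [ "$answer" = "yes" ]; then
    result="match"
else
    result="no match"
fi

echo "Comparison result: $result"
echo "Length of value: ${#answer}"

rm -f "$tmp"
```

The comparison fails because the value is "yes" followed by a carriage return, which makes it four characters long rather than three.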


Prerequisites for Using the dos2unix Command

The dos2unix command line utility may not be installed on Linux distributions by default. To install it, see the instructions given below −

To install dos2unix on Ubuntu, Debian, and Debian-based distributions −

sudo apt install dos2unix

On CentOS, RHEL, and other RPM-based Linux distributions (newer releases also provide dnf in place of yum) −

sudo yum install dos2unix

For Arch Linux and distributions based on Arch −

sudo pacman -S dos2unix

To verify the installation, check the version −

dos2unix --version
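
In a script, it can be useful to check for the utility before calling it. A minimal sketch using the standard command -v builtin (the status variable exists only for this illustration):

```shell
# command -v prints the path of the program and succeeds only if it exists.
if command -v dos2unix >/dev/null 2>&1; then
    status="installed"
else
    status="missing"
fi

echo "dos2unix is $status"
```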

Syntax of dos2unix Command

The general syntax to use the dos2unix command is as follows −

dos2unix [options] [file]

The [options] field is used to specify the conversion modes and output behavior. The [file] field is used to specify the file to be converted.

dos2unix Command Options

The options for the dos2unix command are listed in the table below −

Short option        Long option                 Description
(none)              --allow-chown               Allows the file ownership to change in old file mode
-b                  --keep-bom                  Keeps the byte order mark (BOM)
-c mode             --convmode mode             Sets the conversion mode: ascii, 7bit, iso, or mac
-i file             --info file                 Displays information about the file
-h                  --help                      Displays help for the command
-k                  --keepdate                  Keeps the original timestamp on the output file
-m                  --add-bom                   Adds a byte order mark (UTF-8 by default)
-n infile outfile   --newfile infile outfile    Converts infile and writes the output to outfile (wildcards should not be used)
(none)              --no-allow-chown            Does not allow the file ownership to change (default)
-o file             --oldfile file              Converts the file and overwrites it with the output (default mode; wildcards can be used)
-q                  --quiet                     Suppresses all warnings and messages
-r                  --remove-bom                Removes the byte order mark (default)
-u                  --keep-utf16                Keeps the original UTF-16 encoding of the input file
-ul                 --assume-utf16le            Assumes the input file is UTF-16LE encoded
-ub                 --assume-utf16be            Assumes the input file is UTF-16BE encoded
-v                  --verbose                   Prints detailed output
-V                  --version                   Prints version information

Understanding the DOS File Format

Before proceeding with conversion, let’s understand the DOS file format. The DOS file format is primarily used in the Windows operating system. The line ending of the DOS file has both \r and \n.

Let’s analyze a DOS file on Linux using the od command. The name od stands for octal dump; the command displays file contents in various formats. To display printable characters and backslash escapes, use the -c option with od −

od -c file.txt

The output shows both \r and \n at the end of each line. To convert these line endings to the UNIX format, the dos2unix command is used.
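
To try this without an existing DOS file, a sample can be generated with printf. The sketch below uses a throwaway directory and an arbitrary file name, prints the byte dump, and then counts the carriage returns:

```shell
# Create a sample file with two DOS-style lines.
dir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$dir/file.txt"

# Show every byte; each line ends with \r followed by \n.
od -c "$dir/file.txt"

# Count carriage returns: delete everything except \r, then count bytes.
cr_count=$(tr -cd '\r' < "$dir/file.txt" | wc -c | tr -d ' ')
echo "Carriage returns found: $cr_count"

rm -rf "$dir"
```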

Examples of dos2unix Command in Linux

This section covers the following examples of using the dos2unix command −

  • Converting a DOS File to a UNIX File Format
  • Converting Multiple DOS Files to UNIX File Format
  • Preserving the Original Timestamp
  • Converting in Different Conversion Modes
  • Converting to a New File
  • Converting a DOS File to a Unix File Quietly
  • Converting a DOS File while Keeping the Byte Order Mark (BOM)

Converting a DOS File to a UNIX File Format

To convert a DOS file to UNIX format, execute the dos2unix command with the file name. For example, to convert the file.txt, run −

dos2unix file.txt

To verify the conversion, inspect the file again with od -c file.txt; the \r characters should no longer appear.
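
The convert-and-verify steps can be sketched as a short script. It works on a temporary sample file and skips the conversion when dos2unix is not installed; the variable names are illustrative only.

```shell
dir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$dir/file.txt"

if command -v dos2unix >/dev/null 2>&1; then
    # Convert in place, then dump the bytes to confirm only \n remains.
    dos2unix "$dir/file.txt"
    od -c "$dir/file.txt"
    cr_after=$(tr -cd '\r' < "$dir/file.txt" | wc -c | tr -d ' ')
else
    cr_after="skipped"
fi

echo "Carriage returns after conversion: $cr_after"
rm -rf "$dir"
```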


Converting Multiple DOS Files to UNIX File Format

To convert multiple files to UNIX format, mention the file names separated by a space −

dos2unix file1.txt file2.txt file3.txt

Use a wildcard to convert all the files in a directory −

dos2unix /path/*
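
A wildcard such as /path/* only matches entries directly inside that directory. To reach files in subdirectories as well, one common approach is to combine dos2unix with find. The sketch below assumes the files of interest end in .txt and builds a throwaway directory tree for the demonstration:

```shell
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'a\r\n' > "$dir/top.txt"
printf 'b\r\n' > "$dir/sub/nested.txt"

if command -v dos2unix >/dev/null 2>&1; then
    # Pass every matching regular file to dos2unix, batching the arguments.
    find "$dir" -type f -name '*.txt' -exec dos2unix {} +
    remaining=$(cat "$dir/top.txt" "$dir/sub/nested.txt" | tr -cd '\r' | wc -c | tr -d ' ')
else
    remaining="skipped"
fi

echo "Carriage returns remaining: $remaining"
rm -rf "$dir"
```

Restricting the match with -type f and -name keeps directories and unrelated files out of the argument list.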

Preserving the Original Timestamp

To keep the timestamp of the original file, the -k or --keepdate option is used −

dos2unix -k file.txt
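
One way to confirm that the timestamp survives is to compare the converted file with a reference file carrying the same modification time. This is only a sketch: the date, the file names, and the kept variable are arbitrary, and the -nt test is assumed to be available (it is in bash and most other shells).

```shell
dir=$(mktemp -d)
printf 'one\r\n' > "$dir/file.txt"

# Give the file an old modification time (1 January 2020, 00:00).
touch -t 202001010000 "$dir/file.txt"
# Create a reference file with the same timestamp for later comparison.
touch -r "$dir/file.txt" "$dir/ref"

if command -v dos2unix >/dev/null 2>&1; then
    dos2unix -k "$dir/file.txt"
    # If the timestamp was kept, neither file is newer than the other.
    if [ "$dir/file.txt" -nt "$dir/ref" ] || [ "$dir/ref" -nt "$dir/file.txt" ]; then
        kept="no"
    else
        kept="yes"
    fi
else
    kept="skipped"
fi

echo "Timestamp kept: $kept"
rm -rf "$dir"
```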

Converting in Different Conversion Modes

The -c or --convmode option specifies the conversion mode for the dos2unix command. The command supports the following modes −

  • ascii
  • 7bit
  • iso
  • mac

To convert the file in ASCII, use the following command −

dos2unix -c ascii file.txt

Note that ascii is the default mode, so it does not need to be specified explicitly. In this mode, only the line breaks are converted.

To convert the file in 7bit, use the following command −

dos2unix -c 7bit file.txt

The above command converts the file to 7-bit ASCII. Every non-ASCII byte (with a value from 128 to 255) is converted to a space, so non-ASCII characters such as å, ħ, or ü are lost from the file.

To convert to ISO, use the following command −

dos2unix -c iso file.txt

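In iso mode, characters are converted between a DOS character set (code page) and the ISO-8859-1 (Latin-1) character set used on UNIX. The supported code pages are CP437, CP850, CP860, CP863, CP865, and Windows CP1252. Special characters are retained where an equivalent exists; characters without an ISO-8859-1 equivalent are converted to a dot.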

To convert the line endings of classic Mac OS files to the UNIX format, use the following command −

dos2unix -c mac file.txt

In mac mode, each lone carriage return \r is converted to a line feed \n, while DOS line breaks (\r\n) are left unchanged.

Each mode also accepts multiple file names or wildcards, so several files can be converted at once −

dos2unix -c ascii file1.txt file2.txt
dos2unix -c 7bit file1.txt file2.txt
dos2unix -c iso file1.txt file2.txt
dos2unix -c mac file1.txt file2.txt
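
According to the dos2unix manual, mac mode turns lone carriage returns into line feeds and leaves DOS line breaks untouched. The following sketch checks this on two temporary sample files; the file and variable names are chosen for illustration.

```shell
dir=$(mktemp -d)
# One file with classic Mac line endings (\r only), one with DOS endings.
printf 'one\rtwo\r' > "$dir/mac.txt"
printf 'one\r\ntwo\r\n' > "$dir/dos.txt"

if command -v dos2unix >/dev/null 2>&1; then
    dos2unix -c mac "$dir/mac.txt" "$dir/dos.txt"
    # Count the carriage returns left in each file after conversion.
    mac_cr=$(tr -cd '\r' < "$dir/mac.txt" | wc -c | tr -d ' ')
    dos_cr=$(tr -cd '\r' < "$dir/dos.txt" | wc -c | tr -d ' ')
else
    mac_cr="skipped"
    dos_cr="skipped"
fi

echo "mac.txt carriage returns: $mac_cr"
echo "dos.txt carriage returns: $dos_cr"
rm -rf "$dir"
```

The Mac-style file should end up with no carriage returns, while the DOS-style file should keep both of its \r\n pairs.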

Converting to a New File

The converted output can be saved in a new file using the -n option. For example, use the following command to convert ofile.txt to UNIX format and save its output to nfile.txt.

dos2unix -n ofile.txt nfile.txt

The ofile.txt signifies the old file, while nfile.txt signifies the new file.

To convert in a specific mode, use the following command −

dos2unix -n -c 7bit ofile.txt nfile.txt

To convert multiple files to new files, use the following command −

dos2unix -n ofile1.txt nfile1.txt ofile2.txt nfile2.txt

To keep the timestamp in the new file, use the following command −

dos2unix -n -k ofile.txt nfile.txt
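
The practical benefit of new file mode is that the original stays untouched. A small sketch with temporary files (the names mirror the ones used above; the counting variables are illustrative) makes this visible:

```shell
dir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$dir/ofile.txt"

if command -v dos2unix >/dev/null 2>&1; then
    # Write the converted text to nfile.txt and leave ofile.txt as it is.
    dos2unix -n "$dir/ofile.txt" "$dir/nfile.txt"
    old_cr=$(tr -cd '\r' < "$dir/ofile.txt" | wc -c | tr -d ' ')
    new_cr=$(tr -cd '\r' < "$dir/nfile.txt" | wc -c | tr -d ' ')
else
    old_cr="skipped"
    new_cr="skipped"
fi

echo "Carriage returns in ofile.txt: $old_cr"
echo "Carriage returns in nfile.txt: $new_cr"
rm -rf "$dir"
```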

Converting a DOS File to a Unix File Quietly

To suppress all warnings and messages during conversion, the -q option is used −

dos2unix -q file.txt

Converting a DOS File while Keeping the Byte Order Mark (BOM)

The byte order mark (BOM) is a short byte sequence at the start of a file that indicates its Unicode encoding and byte order. Encodings such as UTF-8, UTF-16, and UTF-32 each use a different BOM −

Encoding       Hex Representation   Decimal Representation
UTF-8          ef bb bf             239 187 191
UTF-16 (BE)    fe ff                254 255
UTF-16 (LE)    ff fe                255 254
UTF-32 (BE)    00 00 fe ff          0 0 254 255
UTF-32 (LE)    ff fe 00 00          255 254 0 0
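
Whether a file starts with a UTF-8 BOM can be checked with od alone. The sketch below writes a sample file that begins with the UTF-8 BOM and inspects its first three bytes; the file name and variables are arbitrary.

```shell
dir=$(mktemp -d)
# Write the UTF-8 BOM (octal 357 273 277 = hex ef bb bf) followed by text.
printf '\357\273\277hello\r\n' > "$dir/bom.txt"

# Dump the first three bytes as hex and strip the spacing that od adds.
first_bytes=$(od -An -tx1 -N3 "$dir/bom.txt" | tr -d ' \n')

if [ "$first_bytes" = "efbbbf" ]; then
    bom="UTF-8 BOM"
else
    bom="no UTF-8 BOM"
fi

echo "Detected: $bom"
rm -rf "$dir"
```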

To keep the byte order mark of the input file in the output, use the -b option −

dos2unix -b file.txt

To keep the original UTF-16 encoding of the input file (along with its BOM) instead of converting it to UTF-8, use the following command −

dos2unix -u file.txt

To add the BOM, use the -m option −

dos2unix -m file.txt

After this command, the file starts with the UTF-8 BOM bytes ef bb bf.

To remove the BOM, use the -r option −

dos2unix -r file.txt
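
Adding and removing the BOM can be observed by dumping the first three bytes of the file after each step. This sketch uses a temporary sample file and skips the conversion when dos2unix is not installed; the variable names are illustrative.

```shell
dir=$(mktemp -d)
printf 'hello\r\n' > "$dir/file.txt"

if command -v dos2unix >/dev/null 2>&1; then
    # Convert and add a UTF-8 BOM, then record the first three bytes.
    dos2unix -m "$dir/file.txt"
    with_bom=$(od -An -tx1 -N3 "$dir/file.txt" | tr -d ' \n')

    # Convert again with the BOM removed, then record the bytes once more.
    dos2unix -r "$dir/file.txt"
    without_bom=$(od -An -tx1 -N3 "$dir/file.txt" | tr -d ' \n')
else
    with_bom="skipped"
    without_bom="skipped"
fi

echo "First bytes after -m: $with_bom"
echo "First bytes after -r: $without_bom"
rm -rf "$dir"
```

After -m the file should begin with ef bb bf; after -r it should begin directly with the bytes of the text.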

Conclusion

The dos2unix command is a handy Linux command-line utility for converting DOS-style files to UNIX style. DOS is the default text file format of the Windows operating system, in which each line ends with \r\n, while UNIX uses only \n. To ensure compatibility on Unix-like operating systems, DOS files are converted to UNIX files.

In this tutorial, we explained in detail the dos2unix command, its installation, syntax, options, and usage through various examples.
