uniq (identify and selectively filter out repeated lines in a file)

Posted by 嚯嚯 on 2020-07-18 03:14

On Unix-like operating systems, the uniq command reports or filters out repeated lines in a file. This document covers the GNU/Linux version of uniq.

1 uniq system environment

2 uniq description

3 uniq syntax

4 uniq examples

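uniq system environment

Linux

uniq description

uniq filters adjacent matching lines from the input file INPUT and writes the filtered data to the output file OUTPUT.

If INPUT is not specified, uniq reads from standard input.

If OUTPUT is not specified, uniq writes to standard output.

If no options are specified, each run of matching lines is merged into its first occurrence.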
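The two calling forms described above can be sketched as follows. This is a minimal illustration, not part of the original article: the file names input.txt and output.txt and the sample data are hypothetical. Note that "apple" survives twice in the result, because its second run is not adjacent to the first.

```shell
# Hypothetical sample data: "apple" appears in two separate runs.
workdir=$(mktemp -d)
printf 'apple\napple\nbanana\napple\n' > "$workdir/input.txt"

# Form 1: read from standard input, write to standard output.
from_stdin=$(uniq < "$workdir/input.txt")

# Form 2: name both INPUT and OUTPUT as arguments.
uniq "$workdir/input.txt" "$workdir/output.txt"
from_file=$(cat "$workdir/output.txt")

# Both forms give the same three lines: apple, banana, apple.
printf '%s\n' "$from_stdin"

rm -r "$workdir"
```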

uniq syntax

uniq [OPTION]... [INPUT [OUTPUT]]

Options

-c, --count

Prefix lines with a number representing how many times they occurred.
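A common use of -c is a frequency count. The sketch below assumes a made-up word list; because uniq only counts adjacent lines, the input is passed through sort first, and a second sort -rn orders the result by descending count.

```shell
# Hypothetical word list, one word per line.
words='pear
apple
pear
fig
pear
apple'

# sort groups identical lines together so that uniq -c can count them;
# the final sort -rn puts the highest count first.
counts=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"

# The most frequent word; awk splits on whitespace, which drops the
# padding that uniq -c puts in front of each count.
top=$(printf '%s\n' "$counts" | awk 'NR == 1 { print $2 }')
printf 'most frequent: %s\n' "$top"
```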

-d, --repeated

Only print duplicated lines, one for each group.

-D, --all-repeated[=delimit-method]

Print all duplicate lines. delimit-method may be one of the following:

none	Do not delimit duplicate lines at all. This is the default.
prepend	Insert a blank line before each set of duplicated lines.
separate	Insert a blank line between each set of duplicated lines.

The -D option is the same as specifying --all-repeated=none.
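As an illustration of the difference between -D and --all-repeated=separate, here is a small sketch with invented input. It assumes GNU uniq, since these options are not available in every implementation.

```shell
# Hypothetical input: "a" and "c" are repeated, "b" is not.
input='a
a
b
c
c'

# -D prints every line of every repeated group, with no delimiter.
all_dups=$(printf '%s\n' "$input" | uniq -D)

# separate puts one blank line between consecutive groups.
grouped=$(printf '%s\n' "$input" | uniq --all-repeated=separate)

printf '%s\n' "$all_dups"
printf '%s\n' '---'
printf '%s\n' "$grouped"
```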

-f N, --skip-fields=N

Skip the first N fields of each line when determining uniqueness. A field is a run of non-blank characters, separated from its neighbors by blanks (spaces or tabs).

This option is useful, for instance, if your document's lines are numbered and you want to compare everything in the line except the line number. If the option -f 1 were specified, the adjacent lines

1 This is a line.
2 This is a line.

would be considered identical. If no -f option were specified, they would be considered unique.
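The numbered-line case can be tried directly. In this sketch the two lines from the description are fed to uniq with and without -f 1; the variable names are illustrative only.

```shell
# The two numbered lines from the description of -f.
numbered='1 This is a line.
2 This is a line.'

# With -f 1 the line number is skipped, so the lines match and
# only the first one is kept.
with_skip=$(printf '%s\n' "$numbered" | uniq -f 1)

# Without -f the numbers differ, so both lines are kept.
without_skip=$(printf '%s\n' "$numbered" | uniq)

printf '%s\n' "$with_skip"
```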

-i, --ignore-case

Normally, comparisons are case-sensitive. This option performs case-insensitive comparisons instead.
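A short sketch of -i with invented data: the three spellings of "apple" form one run when case is ignored, and uniq keeps the first line of that run as it was written.

```shell
# Hypothetical input that differs only in letter case.
mixed_case='Apple
apple
APPLE
banana'

# -i treats the first three lines as equal; the first one is kept.
folded=$(printf '%s\n' "$mixed_case" | uniq -i)

# Without -i all four lines are distinct.
exact=$(printf '%s\n' "$mixed_case" | uniq)

printf '%s\n' "$folded"
```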

-s N, --skip-chars=N

Avoid comparing the first N characters of each line when determining uniqueness. This is like the -f option, but it skips individual characters rather than fields.
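One way to picture -s is a file whose lines begin with a one-character status marker. The sketch below assumes such a format purely for illustration; with -s 1 the marker is ignored during comparison.

```shell
# Hypothetical lines with a one-character status prefix.
marked=$(printf '%s\n' '+widget' '-widget' '+gadget')

# -s 1 skips the first character, so "+widget" and "-widget" match
# and only the first of the two is kept.
ignoring_marker=$(printf '%s\n' "$marked" | uniq -s 1)

printf '%s\n' "$ignoring_marker"
```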

-u, --unique

Only print unique lines.

-z, --zero-terminated

Treat lines as ending with a zero byte (NUL) instead of a newline.
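The -z option is mostly useful for records that may themselves contain newlines, such as file names. The following sketch assumes GNU uniq and uses made-up records; tr converts the zero bytes back to newlines only so that the result can be displayed.

```shell
# Three zero-terminated records; the first two are identical.
# tr turns each terminating zero byte into a newline for display.
records=$(printf 'a b\0a b\0c\0' | uniq -z | tr '\0' '\n')

printf '%s\n' "$records"
```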

-w N, --check-chars=N

Compare no more than N characters in lines.
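To illustrate -w, the sketch below compares only the first five characters of each line. The input is invented; "apple pie" and "apple tart" share the prefix "apple" and are therefore treated as equal.

```shell
# Hypothetical input: the first two lines share a five-character prefix.
desserts='apple pie
apple tart
banana'

# -w 5 compares at most five characters per line.
by_prefix=$(printf '%s\n' "$desserts" | uniq -w 5)

printf '%s\n' "$by_prefix"
```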

--help

Display a help message and exit.

--version

Output version information and exit.

Notes

uniq does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use sort -u instead of uniq.
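The effect of adjacency is easy to check with a small, made-up input in which the two copies of "b" are separated by another line.

```shell
# Hypothetical input: the duplicate "b" lines are not adjacent.
unsorted='b
a
b'

# uniq alone leaves the input unchanged.
plain=$(printf '%s\n' "$unsorted" | uniq)

# Sorting first makes the duplicates adjacent, so uniq can merge them.
sorted_then_uniq=$(printf '%s\n' "$unsorted" | sort | uniq)

# sort -u gives the same result in one step.
sort_u=$(printf '%s\n' "$unsorted" | sort -u)

printf '%s\n' "$sorted_then_uniq"
```

sort -u is a replacement only for uniq with no options; counting with -c or selecting with -d or -u still needs the sort | uniq form.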

uniq examples

Let's say we have an eight-line text file, myfile.txt, which contains the following text (the fourth and seventh lines are blank):

This is a line.
This is a line.
This is a line.

This is also a line.
This is also a line.

This is also also a line.

...Here are several ways to run uniq on this file, and the output each one produces:

uniq myfile.txt

This is a line.

This is also a line.

This is also also a line.

uniq -c myfile.txt

3 This is a line.
1
2 This is also a line.
1
1 This is also also a line.

uniq -d myfile.txt

This is a line.
This is also a line.

uniq -u myfile.txt

Two blank lines (neither blank line has an identical neighbor, so each counts as unique), followed by:

This is also also a line.
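The examples can be reproduced with the script below. It is a sketch under one assumption: that the fourth and seventh lines of myfile.txt are blank, which is what the "eight-line" description and the two counts of 1 in the uniq -c output suggest. The temporary directory is used only to avoid overwriting a real file.

```shell
# Rebuild the sample file; the two '' arguments become blank lines.
workdir=$(mktemp -d)
sample="$workdir/myfile.txt"
printf '%s\n' \
  'This is a line.' \
  'This is a line.' \
  'This is a line.' \
  '' \
  'This is also a line.' \
  'This is also a line.' \
  '' \
  'This is also also a line.' > "$sample"

total_lines=$(wc -l < "$sample")            # lines in the file
collapsed_lines=$(uniq "$sample" | wc -l)   # lines left after merging runs
repeated=$(uniq -d "$sample")               # one line per repeated group
unique_lines=$(uniq -u "$sample" | wc -l)   # lines that were never repeated

# GNU uniq right-aligns the counts, so the real output is padded.
uniq -c "$sample"

rm -r "$workdir"
```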
