Text Processing with Regular Expressions: grep
grep
- I. Regular expressions
- II. What grep means
- III. Summary of grep usage
- IV. grep examples
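I. Regular expressions
A regular expression (RE) is a pattern that describes a whole class of strings. It is itself a string, made up of ordinary characters and metacharacters.
There are two main flavors:
- Basic regular expressions (BRE)
- Extended regular expressions (ERE)
Tools that use regular expressions generally process text one line at a time. With the help of a few special symbols, a regular expression lets you search for, delete, or replace specific strings.
Commands such as vim, grep, find, awk, and sed all support regular expressions.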
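II. What grep means
The name grep comes from the ed editor command g/re/p: globally search for a regular expression and print the matching lines.
Purpose: a text search tool. It checks each line of the target text against a user-supplied "pattern" (the filter condition) and prints the lines that match.
Pattern: a filter condition written with regular expression metacharacters and literal characters.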
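III. Summary of grep usage
1. Character matching
Symbol | Meaning |
---|---|
. | Matches any single character |
[ ] | Matches any single character in the listed set or range |
[^ ] | Matches any single character not in the listed set or range |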
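As a quick illustration of the three forms in the table, here is a small sketch. The sample words are made up for the demonstration and are not part of the test file used later.

```shell
# Made-up sample words, one per line.
input() { printf '%s\n' 'cat' 'cut' 'c9t' 'ct'; }

# "." needs exactly one character between c and t, so "ct" is excluded.
any=$(input | grep -c 'c.t')

# "[au]" accepts only a or u in that position.
inside=$(input | grep 'c[au]t')

# "[^au]" accepts any single character except a or u.
outside=$(input | grep 'c[^au]t')

echo "c.t matched $any lines"
echo "c[au]t matched:" $inside
echo "c[^au]t matched:" $outside
```

The patterns are wrapped in single quotes so that the shell passes the brackets to grep unchanged.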
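2. Repetition counts
- \?: matches the preceding character zero or one time, so the character is optional (a GNU extension in BRE).
- *: matches the preceding character zero or more times.
- .*: matches any character any number of times, that is, any string of any length.
- \+: matches the preceding character one or more times, so at least once (a GNU extension in BRE).
- \{n\}: matches the preceding character exactly n times.
- \{n,m\}: matches the preceding character at least n and at most m times.
- \{0,n\}: matches the preceding character at most n times.
- \{m,\}: matches the preceding character at least m times.
- \f: a form feed character. This is a character escape rather than a repetition operator, and GNU grep only recognizes it in Perl-compatible mode (grep -P).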
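The repetition operators are easiest to compare side by side. The sketch below assumes GNU grep (for \? and \+) and uses made-up sample strings; `match` is a hypothetical helper defined only for this example.

```shell
# Made-up sample strings: g, then zero to three o's, then d.
input() { printf '%s\n' 'gd' 'god' 'good' 'goood'; }

# -x makes the pattern match the whole line; paste joins the hits on one line.
match() { input | grep -x "$1" | paste -sd ' ' -; }

optional=$(match 'go\?d')      # zero or one o (GNU extension)
star=$(match 'go*d')           # zero or more o
plus=$(match 'go\+d')          # one or more o (GNU extension)
exact=$(match 'go\{2\}d')      # exactly two o
range=$(match 'go\{1,2\}d')    # one or two o

echo "optional: $optional"
echo "star:     $star"
echo "plus:     $plus"
echo "exact:    $exact"
echo "range:    $range"
```

Using -x here is a deliberate choice: without it, a pattern such as go*d would only need to match somewhere inside the line, which hides the difference between the operators.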
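3. Position anchors
- ^: anchors the match at the start of the line; written at the far left of the pattern.
- $: anchors the match at the end of the line; written at the far right of the pattern.
- ^$: matches an empty line, that is, a line with no characters at all.
- ^pattern$: pattern must match the entire line.
- \< or \b: anchors the match at the start of a word; written on the left of the word pattern.
- \> or \b: anchors the match at the end of a word; written on the right of the word pattern.
- \<pattern\>: matches pattern as a complete word.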
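A short sketch of how the anchors narrow a match, assuming GNU grep for the \< and \> word anchors. The input lines are invented for the example; the last one is an empty line.

```shell
# Invented sample lines; the final '' produces an empty line.
input() { printf '%s\n' 'day' 'daylight' 'one day more' 'today' ''; }

starts=$(input | grep -c '^day')         # lines that begin with "day"
ends=$(input | grep -c 'day$')           # lines that end with "day"
whole_line=$(input | grep -c '^day$')    # lines that are exactly "day"
whole_word=$(input | grep -c '\<day\>')  # lines containing "day" as a word
empty=$(input | grep -c '^$')            # empty lines

echo "start=$starts end=$ends line=$whole_line word=$whole_word empty=$empty"
```

Note that "daylight" and "today" contain the letters day but not the word day, so the word anchors skip them.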
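4. Other special characters and POSIX character classes
- \w: matches a letter, digit, or underscore; equivalent to [A-Za-z0-9_].
- \W: matches any character that is not a letter, digit, or underscore; equivalent to [^A-Za-z0-9_].
- \n, \r, \t: newline, carriage return, and tab. In GNU grep these escapes are only recognized in Perl-compatible mode (grep -P); plain BRE and ERE do not define them.
- \s: matches any whitespace character.
- \S: matches any non-whitespace character.
- [[:alnum:]]: letters and digits, no symbols (depending on the locale, this can include non-ASCII letters such as Chinese characters).
- [[:alpha:]]: letters (depending on the locale, this can include Chinese characters).
- [[:digit:]]: the digits 0-9.
- [[:graph:]]: visible characters (excludes the space and the tab).
- [[:print:]]: printable characters (visible characters plus the space, but not the tab).
- [[:lower:]]: lowercase letters.
- [[:upper:]]: uppercase letters.
- [[:cntrl:]]: control characters.
- [[:punct:]]: punctuation characters (including common symbols).
- [[:space:]]: all whitespace characters (space, tab, newline, carriage return, vertical tab, form feed).
- [[:xdigit:]]: hexadecimal digits (0-9, a-f, A-F).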
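To see a few of these classes in action, the sketch below counts matching lines in a small invented input. It assumes GNU grep (for \w and \+) and pins the locale to C so that the classes cover ASCII only; `count` is a hypothetical helper for this example.

```shell
# Invented sample lines: lowercase, uppercase, digits, underscore, spaces, symbols.
input() { printf '%s\n' 'abc' 'ABC' '123' 'a_b' '   ' '!?'; }

# LC_ALL=C pins the classes to ASCII so the counts do not depend on the locale.
count() { input | LC_ALL=C grep -c "$1"; }

lower=$(count '[[:lower:]]')    # abc and a_b
digit=$(count '[[:digit:]]')    # 123
punct=$(count '[[:punct:]]')    # a_b (the underscore) and !?
space=$(count '[[:space:]]')    # the line of three spaces

# Whole-line matches show that \w covers the underscore and [[:alnum:]] does not.
alnum_lines=$(input | LC_ALL=C grep -xc '[[:alnum:]]\+')
word_lines=$(input | LC_ALL=C grep -xc '\w\+')

echo "lower=$lower digit=$digit punct=$punct space=$space"
echo "alnum-only lines=$alnum_lines, \\w-only lines=$word_lines"
```

The last two counts differ by one because the line a_b consists entirely of \w characters but contains an underscore, which [[:alnum:]] rejects.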
IV. grep examples
1. Create the test file
[root@localhost ~]# vim test.txt
good good study day day up
10086
10086+1
I remember that day she got married
(a line containing three spaces, followed by a line containing three tabs)
Good afternoon everyone
you are a good man
Study hard for your future
How are you?
yuwenkedaibiao
I'm fine,thanks
go TO bed
are you ok?
It's been a long day without you my friend
You jump ,I jump!
one day ,'your girl will go '. Just because you have noting.
"you cried to me,'fairy tales are deceptive'"
\\\\\\\\\\
//
()()()()()
++++++++
"how do you do"
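If you would rather not type the file by hand in vim, the sketch below writes the same content from the shell. It rests on two assumptions drawn from the outputs shown later: the whitespace note stands for one line of three spaces followed by one line of three tabs, and the last three items are separate lines. The scratch directory is only there to avoid overwriting an existing test.txt.

```shell
# Work in a scratch directory so an existing test.txt is not overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

{
  # The quoted 'EOF' keeps backslashes and quotes literal.
  cat <<'EOF'
good good study day day up
10086
10086+1
I remember that day she got married
EOF
  printf '   \n'       # assumed: a line of three spaces
  printf '\t\t\t\n'    # assumed: a line of three tabs
  cat <<'EOF'
Good afternoon everyone
you are a good man
Study hard for your future
How are you?
yuwenkedaibiao
I'm fine,thanks
go TO bed
are you ok?
It's been a long day without you my friend
You jump ,I jump!
one day ,'your girl will go '. Just because you have noting.
"you cried to me,'fairy tales are deceptive'"
\\\\\\\\\\
//
()()()()()
++++++++
"how do you do"
EOF
} > test.txt

# An empty pattern matches every line, so this prints the total line count.
grep -c '' test.txt
```

With this layout, the [[:graph:]] and [[:print:]] line counts come out the same as the ones reported in example (18).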
2. Usage examples
(1) Find lines containing "thanks".
[root@localhost ~]# grep "thanks" test.txt
I'm fine,thanks
(2) Find lines that begin with "Good".
[root@localhost ~]# grep "^Good" test.txt
Good afternoon everyone
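(3) Find lines containing an o followed by zero or more d characters. Because the d is optional, this matches every line that contains an o.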
[root@localhost ~]# grep "od*" test.txt
good good study day day up
I remember that day she got married
Good afternoon everyone
you are a good man
Study hard for your future
How are you?
yuwenkedaibiao
go TO bed
are you ok?
It's been a long day without you my friend
You jump ,I jump!
one day ,'your girl will go '. Just because you have noting.
"you cried to me,'fairy tales are deceptive'"
"how do you do"
(4) Find lines that begin with y or Y.
[root@localhost ~]# grep "^[yY].*" test.txt
you are a good man
yuwenkedaibiao
You jump ,I jump!
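(5) Find lines containing at least two single quotes, such as a quoted phrase. Lines with only one apostrophe do not match.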
[root@localhost ~]# grep ".*'.*'.*" test.txt
one day ,'your girl will go '. Just because you have noting.
"you cried to me,'fairy tales are deceptive'"
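(6) Find lines that meet all of the following conditions:
The first character of the line is Y or y
The second character is o, or is absent
After that, any characters may follow, zero or more times (.*)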
[root@localhost ~]# grep "^[yY]o\?.*" test.txt
you are a good man
yuwenkedaibiao
You jump ,I jump!
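Since both o\? and .* are allowed to match nothing, the pattern in (6) selects the same lines as the shorter ^[yY]. The sketch below checks this on four lines copied from the test file; it assumes GNU grep for \?.

```shell
# Four lines taken from the test file.
input() {
  printf '%s\n' 'you are a good man' 'yuwenkedaibiao' \
                'You jump ,I jump!' 'How are you?'
}

long=$(input | grep '^[yY]o\?.*')   # the pattern from example (6)
short=$(input | grep '^[yY]')       # only the anchored first character

if [ "$long" = "$short" ]; then
  echo "both patterns select the same lines"
fi
```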
(7) Match lines that begin with an uppercase letter.
[root@localhost ~]# grep "^[[:upper:]].*" test.txt
I remember that day she got married
Good afternoon everyone
Study hard for your future
How are you?
I'm fine,thanks
It's been a long day without you my friend
You jump ,I jump!
(8) Match lines that contain a digit.
[root@localhost ~]# grep "[[:digit:]]" test.txt
10086
10086+1
(9) Find lines that contain day as a whole word.
[root@localhost ~]# grep "\<day\>" test.txt
good good study day day up
I remember that day she got married
It's been a long day without you my friend
one day ,'your girl will go '. Just because you have noting.
(10) Find the line containing "are you ok".
[root@localhost ~]# grep "are you ok" test.txt
are you ok?
(11) Find lines that begin with g followed by two or more o characters.
[root@localhost ~]# grep "^go\{2,\}" test.txt
good good study day day up
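(12) Find lines with at least one o between g and d. Both commands below use basic regular expression syntax (\{1,\} and \+); with extended regular expressions (grep -E) the same patterns are written go{1,}d and go+d.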
[root@localhost ~]# grep "go\{1,\}d" test.txt
good good study day day up
you are a good man
[root@localhost ~]# grep "go\+d" test.txt
good good study day day up
you are a good man
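For comparison, here is a sketch of the same search in extended syntax. It assumes GNU grep for the BRE \+ form and runs on four lines copied from the test file.

```shell
# Four lines taken from the test file.
input() {
  printf '%s\n' 'good good study day day up' 'Good afternoon everyone' \
                'you are a good man' 'go TO bed'
}

bre=$(input | grep 'go\+d')               # BRE: + needs a backslash (GNU)
ere_plus=$(input | grep -E 'go+d')        # ERE: bare +
ere_interval=$(input | grep -E 'go{1,}d') # ERE: bare interval braces

printf '%s\n' "$ere_plus"
```

All three commands select the same two lines. "Good afternoon everyone" is skipped because the match is case-sensitive, and "go TO bed" because no d follows the o.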
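(13) Find the ++++++++ line. An unescaped + is a literal character in basic regular expressions, so no backslash is needed.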
[root@localhost ~]# grep "++++" test.txt
++++++++
(14) Find lines containing \ or /.
[root@localhost ~]# grep "[\/]" test.txt
\\\\\\\\\\
//
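The backslash behaves differently inside and outside a bracket expression, which the following sketch tries to make visible. The three input lines are invented, and the patterns use single quotes so the shell passes every backslash to grep untouched.

```shell
# Invented sample lines: one with a backslash, one with a slash, one with neither.
input() { printf '%s\n' 'a\b' 'a/b' 'ab'; }

# Inside [ ] the backslash is an ordinary character, so this means "\ or /".
either=$(input | grep -c '[\/]')

# Outside brackets a literal backslash has to be escaped as \\.
backslash=$(input | grep -c '\\')

# The slash is never special in grep.
slash=$(input | grep -c '/')

echo "either=$either backslash=$backslash slash=$slash"
```

The double-quoted "[\/]" used in (14) reaches grep as the same pattern, because the shell leaves a backslash alone when it precedes a slash.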
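(15) Find lines containing a pair of double quotes (note: inside a double-quoted shell string, each double quote must be escaped with a backslash).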
[root@localhost ~]# grep ".*\".*\".*" test.txt
"you cried to me,'fairy tales are deceptive'"
"how do you do"
(16) Find lines that have single quotes inside double quotes.
[root@localhost ~]# grep ".*\".*'.*'.*\".*" test.txt
"you cried to me,'fairy tales are deceptive'"
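(17) Find lines containing whitespace characters (note: whitespace here means characters such as spaces and tabs; an empty line contains no characters and does not match). The two whitespace-only lines also match, but they print as blank-looking lines.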
[root@localhost ~]# grep "[[:space:]]" test.txt
good good study day day up
I remember that day she got married
Good afternoon everyone
you are a good man
Study hard for your future
How are you?
I'm fine,thanks
go TO bed
are you ok?
It's been a long day without you my friend
You jump ,I jump!
one day ,'your girl will go '. Just because you have noting.
"you cried to me,'fairy tales are deceptive'"
"how do you do"
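(18) Count the lines that contain non-whitespace characters.
Note:
[[:graph:]] matches visible characters only, so it excludes spaces and tabs; a line made up only of spaces or only of tabs, or an empty line, is not counted.
[[:print:]] matches visible characters plus the space, so a line of only spaces is counted, while a line of only tabs or an empty line is not.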
[root@localhost ~]# grep -c "[[:graph:]]" test.txt
21
[root@localhost ~]# grep -c "[[:print:]]" test.txt
22
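The one-line difference between the two counts comes from the line that holds only spaces. The sketch below reproduces the effect on a four-line invented input: text, three spaces, three tabs, and an empty line.

```shell
# Invented input: "text", three spaces, three tabs, and an empty line.
input() { printf 'text\n   \n\t\t\t\n\n'; }

total=$(input | grep -c '')              # every line
graph=$(input | grep -c '[[:graph:]]')   # lines with a visible character
print=$(input | grep -c '[[:print:]]')   # visible characters or a space
space=$(input | grep -c '[[:space:]]')   # lines with a space or a tab

echo "total=$total graph=$graph print=$print space=$space"
```

Only the spaces-only line separates the [[:graph:]] and [[:print:]] counts; the tabs-only line and the empty line are counted by neither class.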
(19) Find lines containing punctuation characters.
[root@localhost ~]# grep "[[:punct:]]" test.txt
10086+1
How are you?
I'm fine,thanks
are you ok?
It's been a long day without you my friend
You jump ,I jump!
one day ,'your girl will go '. Just because you have noting.
"you cried to me,'fairy tales are deceptive'"
\\\\\\\\\\
//
()()()()()
++++++++
"how do you do"
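(20) Find lines that begin with y, end with o, and have at least one character in between, then show them together with the two lines above and the two lines below.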
[root@localhost ~]# grep "^y.\+o$" test.txt
yuwenkedaibiao
[root@localhost ~]# grep -C 2 "^y.\+o$" test.txt
Study hard for your future
How are you?
yuwenkedaibiao
I'm fine,thanks
go TO bed
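The -C option has two one-sided relatives, -A (lines after) and -B (lines before). These options are common to GNU and BSD grep but are not part of POSIX. A small sketch on invented input:

```shell
# Invented input: five short lines.
input() { printf '%s\n' one two three four five; }

after=$(input | grep -A 1 'three' | paste -sd ' ' -)    # match plus 1 line after
before=$(input | grep -B 1 'three' | paste -sd ' ' -)   # match plus 1 line before
around=$(input | grep -C 1 'three' | paste -sd ' ' -)   # 1 line on each side

echo "after:  $after"
echo "before: $before"
echo "around: $around"
```

The paste step only joins the output onto one line for display; it is not needed in normal use.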