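gsub() can delete, add to, replace, and split text, and it works on a single string as well as on a vector of strings.
Usage: gsub("target pattern", "replacement string", object)
In gsub(), every operation is carried out by replacing the target pattern with the replacement string. Setting the replacement to "" deletes the target, setting it to the target followed by extra text adds content, and replacing or splitting works the same way.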
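The deletion, addition, and vector cases described above can be sketched as follows. This is a minimal illustration: the variable names (`deleted`, `added`, `words`, `underscored`) and the sample strings are assumptions made for the example, not part of the original text.

```r
text <- "AbcdEfgh . Ijkl MNM"

# Deletion: replace the target with an empty string.
deleted <- gsub("Efg", "", text)
print(deleted)      # "Abcdh . Ijkl MNM"

# Addition: capture the target with ( ), write it back with \\1, then append.
added <- gsub("(Efg)", "\\1XYZ", text)
print(added)        # "AbcdEfgXYZh . Ijkl MNM"

# A character vector is processed element by element.
words <- c("apple pie", "banana split")
underscored <- gsub(" ", "_", words)
print(underscored)  # "apple_pie" "banana_split"
```

Using a capture group for the addition avoids retyping the target inside the replacement string.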
> text <- "AbcdEfgh . Ijkl MNM"
> gsub("Efg", "AAA", text) # replace Efg with AAA; matching is case-sensitive
[1] "AbcdAAAh . Ijkl MNM"
Any character can be matched, including spaces, tabs, and newlines.
> gsub(" I", "i", text) # the space is part of the pattern
[1] "AbcdEfgh .ijkl MNM"
Every occurrence of the pattern is replaced, so one call performs a batch substitution.
> gsub("M", "N", text)
[1] "AbcdEfgh . Ijkl NNN"
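Two related behaviours may be worth seeing next to the examples above: sub() replaces only the first match, and the ignore.case argument of gsub() turns off case sensitivity. The sketch below reuses the same `text`; the result variable names are hypothetical.

```r
text <- "AbcdEfgh . Ijkl MNM"

# sub() stops after the first match; gsub() replaces every match.
first_only <- sub("M", "N", text)
print(first_only)   # "AbcdEfgh . Ijkl NNM"

# By default the match is case-sensitive, so "efg" finds nothing.
unchanged <- gsub("efg", "AAA", text)
print(unchanged)    # "AbcdEfgh . Ijkl MNM"

# ignore.case = TRUE lets "efg" match "Efg".
no_case <- gsub("efg", "AAA", text, ignore.case = TRUE)
print(no_case)      # "AbcdAAAh . Ijkl MNM"
```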
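Beyond that, gsub() supports regular expressions for other kinds of batch operations.
> gsub("^.* ", "a", text) # replace everything from the start through the last space with a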
[1] "aMNM"
> gsub("^.* I(j).*$", "\\1", text) # keep only the captured j
[1] "j"
> gsub(" .*$", "b", text) # replace everything from the first space to the end with b
[1] "AbcdEfghb"
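> gsub("\\.", "\\+", text) # . and + are regex metacharacters; prefix them with \\ to match them literally (in the replacement, a plain "+" also works)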
[1] "AbcdEfgh + Ijkl MNM"
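To show why the escape matters, the sketch below compares an escaped pattern, an unescaped one, and the fixed = TRUE argument of gsub(), which treats the pattern as a literal string. The result variable names are illustrative only.

```r
text <- "AbcdEfgh . Ijkl MNM"

# Escaped: only the literal period is replaced.
escaped <- gsub("\\.", "+", text)
print(escaped)     # "AbcdEfgh + Ijkl MNM"

# Unescaped: "." matches every character, so the whole string becomes "+" signs.
unescaped <- gsub(".", "+", text)
print(unescaped)

# fixed = TRUE disables regex interpretation, so no escaping is needed.
literal <- gsub(".", "+", text, fixed = TRUE)
print(literal)     # "AbcdEfgh + Ijkl MNM"
```

When the target contains no regex logic at all, fixed = TRUE is often the simpler choice because nothing needs escaping.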
Backslashes are shown doubled, as they are written inside an R string literal. The POSIX classes at the end of the table must be placed inside a bracket expression, e.g. [[:alnum:]].

Syntax        Description
\\d           Digit: 0, 1, 2 ... 9
\\D           Non-digit
\\s           Whitespace character
\\S           Non-whitespace character
\\w           Word character (letter, digit, or underscore)
\\W           Non-word character
\\t           Tab
\\n           Newline
^             Beginning of the string
$             End of the string
\\            Escapes a special character, e.g. "\\\\" matches a literal \ and "\\+" matches a literal +
|             Alternation, e.g. "(e|d)n" matches "en" and "dn"
.             Any single character (with perl = TRUE, any character except a newline)
[ab]          a or b
[^ab]         Any character except a and b
[0-9]         Any digit
[A-Z]         Any uppercase letter A to Z
[a-z]         Any lowercase letter a to z
[A-Za-z]      Any uppercase or lowercase letter ([A-z] would also match the characters that sit between Z and a in ASCII order, such as [ and _)
i+            i one or more times
i*            i zero or more times
i?            i zero or one time
i{n}          i exactly n times in a row
i{n1,n2}      i between n1 and n2 times in a row
i{n1,n2}?     Non-greedy match: as few repetitions as possible
i{n,}         i at least n times
[:alnum:]     Alphanumeric characters: [:alpha:] and [:digit:]
[:alpha:]     Alphabetic characters: [:lower:] and [:upper:]
[:blank:]     Blank characters, e.g. space, tab
[:cntrl:]     Control characters
[:digit:]     Digits: 0 1 2 3 4 5 6 7 8 9
[:graph:]     Graphical characters: [:alnum:] and [:punct:]
[:lower:]     Lower-case letters in the current locale
[:print:]     Printable characters: [:alnum:], [:punct:] and space
[:punct:]     Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]     Space characters: tab, newline, vertical tab, form feed, carriage return, space
[:upper:]     Upper-case letters in the current locale
[:xdigit:]    Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
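A few of the constructs in the table can be tried out as in the sketch below. The sample strings `x` and `y` and the result variable names are made up for illustration. Note that the POSIX class is written inside an extra pair of brackets.

```r
x <- "Order #42: 3 apples, 15 pears!"

# \\d+ matches each run of one or more digits.
digits_masked <- gsub("\\d+", "N", x)
print(digits_masked)  # "Order #N: N apples, N pears!"

# A POSIX class goes inside a bracket expression: [[:punct:]].
no_punct <- gsub("[[:punct:]]", "", x)
print(no_punct)       # "Order 42 3 apples 15 pears"

# Greedy vs. non-greedy quantifiers on a small tag-like string.
y <- "<a><b>"
greedy <- gsub("<.+>", "X", y)   # .+ runs to the last ">"
lazy   <- gsub("<.+?>", "X", y)  # .+? stops at the first ">"
print(greedy)         # "X"
print(lazy)           # "XX"
```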