The AWK Programming Language | AWK 编程语言
source link: http://blog.tangzhixiong.com/post-0046-awk.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
The AWK Programming Language
The AWK Programming Language | AWK 编程语言
虽然全书没有看完(作者也说看第一章就够了),但还是要底气很足地说这本书真的很好。作者信手拈来各种好的例子。
An Awk Tutorial
Get Started
Hello World Example: awk '$3 > 0 {print $1, $2 * $3}' data.txt
The Structure Of AWK Program
awk ' program ' infiles
awk ' program ' # get input from command line
awk -f progfile <optional list of input files>
- Format:
pattern { action }
Simple Output
What AWK Can?
Printing Fields, Selecting Input, And Transforming Data
- Printing Every Line
{ print }
{ print $0 }
- Printing Certain Fields
{ print $1 }
- Computing and Printing
{ print $2, $3 * $4 }
NF
, the Number of Fields{ print NF, $1, $NF }
: print number of fields and first and last fields of each input line
NR
, the Number of Lines Read so far{ print NR, $1 }
- Putting Text in the output
{ print "Total cost of the lunch is ", $1, $2 * $3 }
Fancier Output
printf ("format", value1, value2, ...)
%s
,%f
,%.2f
for formats\n
for newline%-s
, left justfied
Sorting
awk '{ printf("%6.2f %s\n", $2 * $3, $0) }' emp.data | sort
Selection
- Selection By Comparison:
$2 > 8
- Selection By Computation:
$2 * $3 > 40
- Selection By Text Content:
$2 == "Susan"
- Combination of Patterns
$2 >= 20 || $3 <= 50
!($2 < 20 && $3 > 50)
Data Validation
NF I= 3 { print $0, "number of fields is not equal to 3" } $2 < 3.35 { print $0, "rate is below minimum wage" } $2 > 10 { print $0, "rate exceeds $10 per hour" } $3 < 0 { print $0, "negative hours worked" } $3 > 60 { print $0, "too many hours worked" }
- BEGIN and END
BEGIN { something you want to do at begin } { process }
print ""
: print empty lineprint
: print input line
Computing with AWK
Counting
{ counter = counter + 1 }
END { print ("there are %d lines.\n", counter) }
Computing Sums and Average
Handing Text
String Concatenation
{ names = names $1 " " }
END { print names }
Printing the Last Input Line
{ last = $0 }
END { print last }
Built-in Functions
{ print $1, length($1) }
Counting Lines, Words
{ nc = nc + length($0) + 1
nw = nw + NF }
END { print NR "lines", nw, "words", nc, "chars" }
Control-flow statements
- If-Else Statement
- While Statement
- For Statement
Arrays
Print Input In Reverse Order By Line
{ line[NR] = $0 }
END { i = NR
while (i > 0) {
print line[i]
i = i - 1
}
}
A Handful Of Useful “One-Liners”
END { print NR }
: print Line NumNR == 10 { print }
: print specific line{ print $NF }
: print last field{ field = $NF } END { print field }
: last field of last line{ nf = nf + NF } END { print "there are ", nf, " fields in this file" }
Lines that contain “apple”
/apple/ { lines = lines + 1 } END { print lines }
largest field and the line contains it
$1 > max { max = $1; maxline = $0 } END { print max, maxline }
NF > 0
: more than one fieldlength($0) > 80
: lines that longer than 80 characters{ print NF, $0 }
{ print $2, $1 }
{ tmp = $1; $1 = $2; $2 = tmp; print }
{ $1 = NR; print }
: print Line Num, then the line{ $2 = ""; print }
: erase the second fieldReverse fields
{ for (i = 1; i <= NF; i = i + 1) printf ("%s", $i) printf('\n') }
WHAT’S NEXT
- keep reading
- keep writing scripts
- 入门到这就够了。这是本书的第一章。后面具体介绍了 awk 各个细部。可以瞟一眼,有个概念,用的时候再查。
THE AWK LANGUAGE
USSR 8649 275 Asia
Canada 3852 25 North America
China 3705 1032 Asia
USA 3615 237 North America
Brazil 3286 134 South America
India 1267 746 Asia
Mexico 762 78 North America
France 211 55 Europe
Japan 144 120 Asia
Germany 96 61 Europe
England 94 56 Europe
program format: one line or multi lines breaks with “"
- patterns
{ statements }
{awk}BEGIN { statements }
END { statements }
expression { statements }
/ regular expression / { statements }
pattern1, pattern2 { statements }
- Actions
print
,printf()
if
,if else
,for
,do while
,break
,continue
,next
,exit
- variables
- built-in variables:
$0
,$1
, …,$NF
- field variables:
FS
,OFS
(output field separator),ORS
(output record separator)
- built-in variables:
string as regular expression
BEGIN { FS = "\t"} { pop[$3] += $4 } END { for (name in pop) print name, pop[name]}
split("/123/23/55", str, "/")
:arr["1"], arr["2"], arr["3"]
{ print $1, $3 > ($3 > 40? "file-a.txt", "file-b.txt") }
{ print > $1 }
BEGIN { FS = "[{\n\t}]:;" }
BEGIN { FS = "[ ]" }
: a single blankawk -F';:[\t]\n' '{ print }' file-in.txt
RS
(record separator)- The getline function
getline <"file"
getline x <"file"
while (getline <"file")
: dangerouswhile (getline <"file" > 0)
: safe
- ARGV:
ARGC
,ARGV
Making A Shell Command From An Awk Program
DATA PROCESSING
histgram:
{ data[int($1/10)]++ }
END { for (i=0; i<10; i++)
printf("%s", rep(data[i], "*"))
}
function rep(n, s, t)
{
while (n-- > 0)
{
t = t s
}
return t
}
convert mmddyy to yymmdd: { $1 = substr($1,5,2) substr($1,1,2) substr($1,3,2) print $1 }
reports and databases
processing words
little languages
experiments with algorithms
epilog
Related Issues not found
Please contact @district10 to initialize the comment
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK