File name: Awk - A Tutorial and Introduction - by Bruce Barnett.pdf
  Category: Other
  Development tool:
  File size: 2 MB
  Downloads: 0
  Upload date: 2019-08-31
  Provider: wangql*******
 Description: Awk is an extremely versatile programming language for working on files. We'll teach you just enough to understand the examples in this page, plus a smidgen. The examples given below have the extensions of the executing script as part of the filename. Once you download it and make it executable, you ...

There are three variations of AWK:

    AWK  - the original from AT&T
    NAWK - a newer, improved version from AT&T
    GAWK - the Free Software Foundation's version

Originally, I didn't plan to discuss NAWK, but several UNIX vendors have replaced AWK with NAWK, and there are several incompatibilities between the two. It would be cruel of me to not warn you about the differences, so I will highlight those when I come to them. It is important to know that all of AWK's features are in NAWK and GAWK. Most, if not all, of NAWK's features are in GAWK. NAWK ships as part of Solaris. GAWK does not. However, many sites on the Internet have the sources freely available. If you use Linux, you have GAWK. But in general, assume that I am talking about the classic AWK unless otherwise noted.

Why is AWK so important? It is an excellent filter and report writer. Many UNIX utilities generate rows and columns of information. AWK is an excellent tool for processing these rows and columns, and is easier to use than most conventional programming languages. It can be considered to be a pseudo-C interpreter, as it understands the same arithmetic operators as C. AWK also has string manipulation functions, so it can search for particular strings and modify the output. AWK also has associative arrays, which are incredibly useful, and are a feature most computing languages lack. Associative arrays can make a complex problem a trivial exercise.

I won't exhaustively cover AWK. That is, I will cover the essential parts, and avoid the many variants of AWK. It might be too confusing to discuss three different versions of AWK. I won't cover the GNU version of AWK called "gawk." Similarly, I will not discuss the new AT&T AWK called "nawk." The new AWK comes on the Sun system, and you may find it superior to the old AWK in many ways. In particular, it has better diagnostics, and won't print out the infamous "bailing out near line ..." message the original AWK is prone to do. Instead, "nawk" prints out the line it didn't understand, and highlights the bad parts with arrows. GAWK does this as well, and this really helps a lot. If you find yourself needing a feature that is very difficult or impossible to do in AWK, I suggest you either use NAWK or GAWK, or convert your AWK script into PERL using the "a2p" conversion program which comes with PERL. PERL is a marvelous language, and I use it all the time, but I do not plan to cover PERL in these tutorials. Having made my intention clear, I can continue with a clear conscience.

Many UNIX utilities have strange names. AWK is one of those utilities. It is not an abbreviation for awkward. In fact, it is an elegant and simple language. The word "AWK" is derived from the initials of the language's three developers: A. Aho, B. W. Kernighan and P. Weinberger.

Basic Structure

The essential organization of an AWK program follows the form:

    pattern { action }

The pattern specifies when the action is performed. Like most UNIX utilities, AWK is line oriented. That is, the pattern specifies a test that is performed with each line read as input. If the condition is true, then the action is taken. The default pattern is something that matches every line. This is the blank or null pattern. Two other important patterns are specified by the keywords "BEGIN" and "END." As you might expect, these two words specify actions to be taken before any lines are read, and after the last line is read. The AWK program below:

    BEGIN { print "START" }
          { print }
    END   { print "STOP" }

adds one line before and one line after the input file.
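The BEGIN, null-pattern and END actions described above can be tried without creating a script file. What follows is only a minimal sketch, assuming a POSIX shell and some AWK variant on the PATH; the two input lines are invented for illustration.

```shell
#!/bin/sh
# Sample input is invented for this sketch; any text would do.
result=$(printf 'first line\nsecond line\n' | awk '
BEGIN { print "START" }   # runs once, before any input is read
      { print }           # empty pattern: runs for every input line
END   { print "STOP" }    # runs once, after the last line is read
')
printf '%s\n' "$result"
```

The input lines pass through unchanged, with one extra line in front of them and one after them.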
This isn't very useful, but with a simple change, we can make this into a typical AWK program:

    BEGIN { print "File\tOwner" }
          { print $8, "\t", $3 }
    END   { print " - DONE -" }

I'll improve the script in the next sections, but we'll call it "FileOwner." But let's not put it into a script or file yet. I will cover that part in a bit. Hang on and follow with me so you get the flavor of AWK.

The characters "\t" indicate a tab character, so the output lines up on even boundaries. The "$8" and "$3" have a meaning similar to a shell script. Instead of the eighth and third argument, they mean the eighth and third field of the input line. You can think of a field as a column, and the action you specify operates on each line or row read in.

There are two differences between AWK and a shell processing the characters within double quotes. AWK understands special characters that follow the "\" character, like "t". The Bourne and C UNIX shells do not. Also, unlike the shell (and PERL), AWK does not evaluate variables within strings. To explain, the second line could not be written like this:

    { print "$8\t$3" }

That example would print "$8 $3." Inside the quotes, the dollar sign is not a special character. Outside, it corresponds to a field.

What do I mean by the third and eighth field? Consider the Solaris "/usr/bin/ls -l" command, which has eight columns of information. The System V version (similar to the Linux version), "/usr/5bin/ls -l," has 9 columns. The third column is the owner, and the eighth (or ninth) column is the name of the file. This AWK program can be used to process the output of the "ls -l" command, printing out the filename, then the owner, for each file. I'll show you how.

Update: On a Linux system, change "$8" to "$9".

One more point about the use of a dollar sign. In scripting languages like Perl and the various shells, a dollar sign means the word following is the name of the variable. AWK is different. The dollar sign means that we are referring to a field or column in the current line.
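To make the field numbering concrete, here is a small sketch that runs the field-printing action on two invented lines laid out like nine-column "ls -l" output, so the file name is field 9 as in the Linux note above. The sample lines, the owner name and the variable names are assumptions made for illustration; only standard AWK behaviour is used.

```shell
#!/bin/sh
# Two invented lines in the nine-column "ls -l" layout (file name in field 9).
listing='-rw-r--r-- 1 barnett staff 1024 Jan  1 12:00 a.file
-rw-r--r-- 1 barnett staff 2048 Jan  2 12:00 another.file'

# Outside quotes, $9 and $3 select fields of the current line.
fields=$(printf '%s\n' "$listing" | awk '{ print $9, "\t", $3 }')

# Inside an AWK string the dollar sign is an ordinary character.
literal=$(printf '%s\n' "$listing" | awk '{ print "$9 $3" }')

printf '%s\n' "$fields" "$literal"
```

The first command prints the file name and owner of each line; the second prints the text "$9 $3" twice, because nothing inside the double quotes is treated as a field.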
When switching between Perl and AWK you must remember that "$" has a different meaning. So the following piece of code prints two "fields" to standard out. The first field printed is the number "5", the second is the fifth field (or column) on the input line.

    BEGIN { x=5 }
          { print x, $x }

Executing an AWK script

So let's start writing our first AWK script. There are a couple of ways to do this. Assuming the first script is called "FileOwner," the invocation would be

    ls -l | FileOwner

This might generate the following if there were only two files in the current directory:

    File            Owner
    a.file          barnett
    another.file    barnett
     - DONE -

There are two problems with this script. Both problems are easy to fix, but I'll hold off on this until I cover the basics.

The script itself can be written in many ways. The C shell version would look like this:

    #!/bin/csh -f
    # Linux users have to change $8 to $9
    awk '\
    BEGIN { print "File\tOwner" } \
          { print $8, "\t", $3 } \
    END   { print " - DONE -" } \
    '

As you can see in the above script, each line of the AWK script must have a backslash if it is not the last line of the script. This is necessary as the C shell doesn't, by default, allow strings to be longer than a line. I have a long list of complaints about using the C shell. See "Top Ten reasons not to use the C shell."

The Bourne shell (as do most shells) allows quoted strings to span several lines:

    #!/bin/sh
    # Linux users have to change $8 to $9
    awk '
    BEGIN { print "File\tOwner" }
          { print $8, "\t", $3 }
    END   { print " - DONE -" }
    '

The third form is to store the commands in a file, and execute

    awk -f filename

Since AWK is also an interpreter, you can save yourself a step and make the file executable by adding one line at the beginning of the file:

    #!/bin/awk -f
    BEGIN { print "File\tOwner" }
          { print $8, "\t", $3 }
    END   { print " - DONE -" }

Change the permission with the chmod command (i.e. "chmod +x awk_example1.awk"), and the script becomes a new command. Notice the "-f" option following "#!/bin/awk" above, which is also used in the third format where you use AWK to execute the file directly, i.e. "awk -f filename". The "-f" option specifies the AWK file containing the instructions. As you can see, AWK considers lines that start with a "#" to be a comment, just like the shell. To be precise, anything from the "#" to the end of the line is a comment (unless it's inside an AWK string). However, I always comment my AWK scripts with the "#" at the start of the line, for reasons I'll discuss later.

Which format should you use? I prefer the last format when possible. It's shorter and simpler. It's also easier to debug problems. If you need to use a shell, and want to avoid using too many files, you can combine them as we did in the first and second example.

Which shell to use with AWK?

The format of AWK is not free-form. You cannot put new line breaks just anywhere. They must go in particular locations. To be precise, in the original AWK you can insert a new line character after the curly braces, and at the end of a command, but not elsewhere. If you wanted to break a long line into two lines at any other place, you had to use a backslash:

    #!/bin/awk -f
    BEGIN { print "File\tOwner" }
          { print $8, "\t", \
            $3 }
    END   { print " - DONE -" }

The Bourne shell version would be

    #!/bin/sh
    awk '
    BEGIN { print "File\tOwner" }
          { print $8, "\t", \
            $3 }
    END   { print " - DONE -" }
    '

while the C shell would be

    #!/bin/csh -f
    awk '\
    BEGIN { print "File\tOwner" } \
          { print $8, "\t", \\
            $3 } \
    END   { print " - DONE -" } \
    '

As you can see, this demonstrates how awkward the C shell is when enclosing an AWK script. Not only are backslashes needed for every line, some lines need two. (Note: this is true when using old AWK, e.g. on Solaris, because the print statement had to be on one line. Newer AWKs are more flexible about where new lines can be added.) Many people will warn you about the C shell. Some of the problems are subtle, and you may never see them.
Try to include an AWK or sed script within a C shell script, and the backslashes will drive you crazy. This is what convinced me to learn the Bourne shell years ago, when I was starting out. I strongly recommend you use the Bourne shell for any AWK or sed script. If you don't use the Bourne shell, then you should learn it. As a minimum, learn how to set variables, which by some strange coincidence is the subject of the next section.

Dynamic Variables

Since you can make a script an AWK executable by mentioning "#!/bin/awk -f" on the first line, including an AWK script inside a shell script isn't needed unless you want to either eliminate the need for an extra file, or pass a variable to the insides of an AWK script. Since this is a common problem, now is as good a time as any to explain the technique. I'll do this by showing a simple AWK program that will only print one column. NOTE: there will be a bug in the first version. The number of the column will be specified by the first argument. The first version of the program, which we will call "Column," looks like this:

    #!/bin/sh
    # NOTE - this script does not work!
    column=$1
    awk '{print $column}'

A suggested use is:

    ls -l | Column 3

This would print the third column from the ls command, which would be the owner of the file. You can change this into a utility that counts how many files are owned by each user by adding

    ls -l | Column 3 | sort | uniq -c | sort -nr

(the first sort groups identical owners together, since uniq only counts adjacent duplicate lines).

Only one problem: the script doesn't work. The value of the "column" variable is not seen by AWK. Change "awk" to "echo" to check. You need to turn off the quoting when the variable is seen. This can be done by ending the quoting, and restarting it after the variable:

    #!/bin/sh
    column=$1
    awk '{print $'$column'}'

This is a very important concept, and throws experienced programmers a curve ball. In many computer languages, a string has a start quote, an end quote, and the contents in between.
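The difference between the broken and the working Column script can be checked with a short sketch. The function names broken_column and print_column are hypothetical, chosen for this illustration, and the double quotes around the shell variable are an addition not present in the script above; they only guard against word splitting.

```shell
#!/bin/sh
# broken_column and print_column are hypothetical names for this sketch.

# Broken: the single quotes hide $column from the shell, so AWK sees its own
# unset variable "column", which is 0, and $0 is the whole line.
broken_column() {
    column=$1
    awk '{ print $column }'
}

# Working: the quoting ends before the shell variable and restarts after it,
# so AWK receives the program { print $2 } when the argument is 2.
print_column() {
    column=$1
    awk '{ print $'"$column"' }'
}

broken=$(printf 'a b c\n' | broken_column 2)
fixed=$(printf 'a b c\n' | print_column 2)
printf '%s\n' "$broken" "$fixed"
```

With the input "a b c", the broken version prints the whole line while the working version prints only the second column.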
If you want to include a special character inside the quote, you must prevent the character from having the typical meaning. In the C language, this is done by putting a backslash before the character. In other languages, there is a special combination of characters to do this. In the C and Bourne shell, the quote is just a switch. It turns the interpretation mode on or off. There is really no such concept as "start of string" and "end of string." The quotes toggle a switch inside the interpreter. The quote character is not passed on to the application. This is why there are two pairs of quotes above. Notice there are two dollar signs. The first one is quoted, and is seen by AWK. The second one is not quoted, so the shell evaluates the variable, and replaces "$column" by the value. If you don't understand, either change "awk" to "echo," or change the first line to read "#!/bin/sh -x."

Some improvements are needed, however. The Bourne shell has a mechanism to provide a value for a variable if the value isn't set, or is set and the value is an empty string. This is done by using the format:

    ${variable:-defaultvalue}

This is shown below, where the default column will be one:

    #!/bin/sh
    column=${1:-1}
    awk '{print $'$column'}'

We can save a line by combining these two steps:

    #!/bin/sh
    awk '{print $'${1:-1}'}'

It is hard to read, but it is compact. There is one other method that can be used. If you execute an AWK command and include on the command line

    variable=value

this variable will be set when the AWK script starts. An example of this use would be:

    #!/bin/sh
    awk '{print $c}' c=${1:-1}

This last variation does not have the problems with quoting the previous example had. You should master the earlier example, however, because you can use it with any script or command. The second method is special to AWK. Modern AWKs have other options as well. See the comp.unix.shell FAQ.

The Essential Syntax of AWK

Earlier I discussed ways to start an AWK script. This section will discuss the various grammatical elements of AWK.

Arithmetic Expressions

There are several arithmetic operators, similar to C. These are the binary operators, which operate on two variables:

    AWK Table 1: Binary Operators
    -------------------------------------
    Operator   Type         Meaning
    -------------------------------------
    +          Arithmetic   Addition
    -          Arithmetic   Subtraction
    *          Arithmetic   Multiplication
    /          Arithmetic   Division
    %          Arithmetic   Modulo
    <space>    String       Concatenation
    -------------------------------------

Using variables with the value of "7" and "3," AWK returns the following results for each operator when using the print command:

    ---------------------
    Expression   Result
    ---------------------
    7+3          10
    7-3          4
    7*3          21
    7/3          2.33333
    7%3          1
    7 3          73
    ---------------------

There are a few points to make. The modulus operator finds the remainder after an integer divide. The print command outputs a floating point number on the divide, but an integer for the rest. The string concatenate operator is confusing, since it isn't even visible. Place a space between two variables and the strings are concatenated together. This also shows that numbers are converted automatically into strings when needed. Unlike C, AWK doesn't have "types" of variables. There is one type only, and it can be a string or number. The conversion rules are simple. A number can easily be converted into a string. When a string needs to be converted into a number, AWK will do so. The string "123" will be converted into the number 123. However, the string "123X" will be converted into the number 0. (NAWK will behave differently, and converts the string into the integer 123, which is found at the beginning of the string.)

Unary arithmetic operators

The "+" and "-" operators can be used before variables and numbers. If X equals 4, then the statement

    print -X;

will print "-4."

The Autoincrement and Autodecrement Operators

AWK also supports the "++" and "--" operators of C. Both increment or decrement the variables by one. The operator can only be used with a single variable, and can be before or after the variable.
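The results listed for the binary operators with the values 7 and 3, and the unary minus, can be reproduced with a short sketch like the one below. It assumes a POSIX shell and an AWK whose default number output format is the standard "%.6g"; the variable names x and y are chosen for illustration.

```shell
#!/bin/sh
result=$(awk 'BEGIN {
    x = 7; y = 3
    print x + y    # addition
    print x - y    # subtraction
    print x * y    # multiplication
    print x / y    # division, printed as a floating point number
    print x % y    # modulo
    print x y      # concatenation: no visible operator, just a space
    print -x       # unary minus
}')
printf '%s\n' "$result"
```

Everything runs inside a BEGIN action, so no input file is needed.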
The prefix form modifies the value, and then uses the result, while the postfix form gets the value of the variable, and afterwards modifies the variable. As an example, if X has the value of 3, then the AWK statement

    print x++, " ", ++x;

would print the numbers 3 and 5. These operators are also assignment operators, and can be used by themselves on a line.

Assignment Operators

Variables can be assigned new values with the assignment operators. You know about "++" and "--." The other assignment statement is simply:

    variable = arithmetic_expression

Certain operators have precedence over others; parentheses can be used to control grouping. The statement

    x=1+2*3 4;

is the same as

    x = (1 + (2 * 3)) "4";

Both print out "74." Notice spaces can be added for readability. AWK, like C, has special assignment operators, which combine a calculation with an assignment. Instead of saying

    x=x+2;

you can more concisely say:

    x+=2;

The complete list follows:

    AWK Table 2: Assignment Operators
    -----------------------------------------
    Operator   Meaning
    -----------------------------------------
    +=         Add result to variable
    -=         Subtract result from variable
    *=         Multiply variable by result
    /=         Divide variable by result
    %=         Apply modulo to variable
    -----------------------------------------

Conditional expressions

The second type of expression in AWK is the conditional expression. This is used for certain tests, like the if or while. Boolean conditions evaluate to true or false. In AWK, there is a definite difference between a boolean condition and an arithmetic expression. You cannot convert a boolean condition to an integer or string. You can, however, use an arithmetic expression as a conditional expression. A value of 0 is false, while anything else is true. Undefined variables have the value of 0. Unlike AWK, NAWK lets you use booleans as integers.

Arithmetic values can also be converted into boolean conditions by using relational operators:

    AWK Table 3: Relational Operators
    ----------------------------------------
    Operator   Meaning
    ----------------------------------------
    ==         Is equal
    !=         Is not equal to
    >          Is greater than
    >=         Is greater than or equal to
    <          Is less than
    <=         Is less than or equal to
    ----------------------------------------

These operators are the same as the C operators. They can be used to compare numbers or strings. With respect to strings, lower case letters are greater than upper case letters.

Regular Expressions

Two operators are used to compare strings to regular expressions:

    AWK Table 4: Regular Expression Operators
    ----------------------------
    Operator   Meaning
    ----------------------------
    ~          Matches
    !~         Doesn't match
    ----------------------------

The order in this case is particular. The regular expression must be enclosed by slashes, and comes after the operator. AWK supports extended regular expressions, so the following are examples of valid tests:

    word !~ /START/
    lawrence_welk ~ /(one|two|three)/
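The two regular expression operators can be seen at work by using each test as the pattern of a pattern/action pair. This is a sketch only: the three input lines and the output labels are invented, and the tests are applied to $0 (the whole line) and $1 (the first field) instead of the variables named in the examples above.

```shell
#!/bin/sh
# The three input lines are invented for this sketch.
result=$(printf 'START here\none fish\nfour birds\n' | awk '
$0 !~ /START/          { print "no START: " $0 }
$1 ~ /(one|two|three)/ { print "number word: " $1 }
')
printf '%s\n' "$result"
```

Each line is checked against both patterns, so a line such as "one fish" triggers both actions, while "START here" triggers neither.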