The SELECT statement in the SAS DATA step

 2 years ago
source link: https://blogs.sas.com/content/iml/2016/06/20/select-when-sas-data-step.html

Every beginning SAS programmer learns the simple IF-THEN/ELSE statement for conditional processing in the SAS DATA step. The basic IF-THEN/ELSE statement handles two cases: if a condition is true, the program does one thing; otherwise, the program does something else.

Of course, you can handle more cases by using multiple ELSE IF statements. I have seen SAS programs that contain dozens of ELSE clauses. Sometimes a long sequence of IF-THEN/ELSE statements is necessary, such as when you are testing complex logical conditions.

[Figure: Flow chart diagram for a switch statement (SELECT statement)]

An alternative control statement in SAS is the SELECT-WHEN statement. The SELECT-WHEN statement (sometimes simply called the SELECT statement) enables you to conditionally execute statements based on the value of a single categorical variable. Usually the variable can have three or more valid values that you want to handle.

The following example uses the Sashelp.Heart data set, which contains data for 5,209 patients in a medical study. The Smoking_Status variable is a categorical variable that encodes the average number of cigarettes that each patient smokes per day. The following DATA step view recodes the character levels as numbers, which is sometimes the easiest way to force the levels of a nominal variable to appear in a natural order during a SAS analysis.

/* example of using the SELECT statement */
data Heart / view=Heart;
set sashelp.heart;
select (Smoking_Status);
   when ('Non-smoker')        Smoking_Cat=1;
   when ('Light (1-5)')       Smoking_Cat=2;
   when ('Moderate (6-15)')   Smoking_Cat=3;
   when ('Heavy (16-25)')     Smoking_Cat=4;
   when ('Very Heavy (> 25)') Smoking_Cat=5;
   otherwise                  Smoking_Cat=.;
end;
run;

The SELECT-WHEN statement is easy to read. You specify the name of a variable on the SELECT statement. You then list a sequence of WHEN statements. Each WHEN statement specifies a particular value for the variable. If the variable has that value, the program conditionally executes a statement, which in this example assigns a value to the Smoking_Cat variable.
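
For comparison, here is a sketch of the same recoding written as an IF-THEN/ELSE chain. The view name HeartIf is made up for this illustration; the logic is intended to match the SELECT-WHEN example above.

```sas
/* the same recoding written as an IF-THEN/ELSE chain (for comparison) */
/* HeartIf is a hypothetical name chosen for this sketch               */
data HeartIf / view=HeartIf;
set sashelp.heart;
if      Smoking_Status='Non-smoker'        then Smoking_Cat=1;
else if Smoking_Status='Light (1-5)'       then Smoking_Cat=2;
else if Smoking_Status='Moderate (6-15)'   then Smoking_Cat=3;
else if Smoking_Status='Heavy (16-25)'     then Smoking_Cat=4;
else if Smoking_Status='Very Heavy (> 25)' then Smoking_Cat=5;
else                                            Smoking_Cat=.;
run;
```

The variable name is repeated in every clause, which is the repetition that the SELECT-WHEN form avoids.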

Notice that you can use the OTHERWISE keyword to handle missing values, invalid data, or default actions.
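
One way to check the recoding, including how the OTHERWISE clause treats missing values, is to cross-tabulate the original and recoded variables. The following sketch assumes that the Heart view from the earlier DATA step has been defined.

```sas
/* list each Smoking_Status level next to its Smoking_Cat code;  */
/* the MISSING option keeps missing values in the table          */
proc freq data=Heart;
   tables Smoking_Status*Smoking_Cat / list missing;
run;
```

If you omit the OTHERWISE statement and an observation matches none of the WHEN clauses, SAS reports an error and stops the DATA step, so including OTHERWISE is usually the safer choice.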

You can also combine categories in a WHEN statement. For example, in a statistical analysis you might want to combine the 'Heavy' and 'Very Heavy' categories into a single group. In the WHEN statement you can specify multiple values in a comma-separated list:

   /* combine the 'Heavy' and 'Very Heavy' categories */
   when ('Heavy (16-25)', 'Very Heavy (> 25)') Smoking_Cat=4;

If the WHEN condition is true, the program will execute one statement. This is the same rule that the IF-THEN statement follows. To execute more than one statement, use a DO-END block, which groups statements together:

   when ('Non-smoker') do;       /* execute multiple statements */
      Smoking_Cat=1;
      IsSmoker = 0;
   end;
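
The two fragments above can be combined in one DATA step. The following is a minimal sketch: the view name Heart3 is made up, and the IsSmoker values for the smoking categories (and the missing value in the OTHERWISE branch) are assumptions chosen for illustration.

```sas
/* combined categories plus DO-END blocks in one SELECT group */
/* Heart3 is a hypothetical name chosen for this sketch       */
data Heart3 / view=Heart3;
set sashelp.heart;
select (Smoking_Status);
   when ('Non-smoker') do;
      Smoking_Cat=1;  IsSmoker=0;
   end;
   when ('Light (1-5)') do;
      Smoking_Cat=2;  IsSmoker=1;
   end;
   when ('Moderate (6-15)') do;
      Smoking_Cat=3;  IsSmoker=1;
   end;
   /* 'Heavy' and 'Very Heavy' share one category */
   when ('Heavy (16-25)', 'Very Heavy (> 25)') do;
      Smoking_Cat=4;  IsSmoker=1;
   end;
   otherwise do;      /* missing or unexpected values */
      Smoking_Cat=.;  IsSmoker=.;
   end;
end;
run;
```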

I use the SELECT-WHEN statement as a "table lookup" when a program needs to branch according to the value of a single categorical variable that has three or more valid values. The basic SELECT-WHEN statement is not as flexible as the IF-THEN/ELSE statement, but, when applicable, it results in very clean and easy-to-read programs.

Other languages have similar branching statements. The SQL language supports a CASE-WHEN expression. The C/C++ and Java/JavaScript languages support a switch-case statement. Whereas the SELECT-WHEN statement in SAS executes only the statement for the matching WHEN clause, the switch-case statement implements fallthrough, so C programmers often use the break statement to exit the switch block.
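
Because PROC SQL is available in SAS, the SQL construct can be tried on the same data. The following is a sketch of the recoding as a CASE expression; the table name HeartSQL is made up for this illustration.

```sas
/* the same recoding as a CASE expression in PROC SQL    */
/* HeartSQL is a hypothetical name chosen for this sketch */
proc sql;
   create table HeartSQL as
   select *,
          case Smoking_Status
             when 'Non-smoker'        then 1
             when 'Light (1-5)'       then 2
             when 'Moderate (6-15)'   then 3
             when 'Heavy (16-25)'     then 4
             when 'Very Heavy (> 25)' then 5
             else .
          end as Smoking_Cat
   from sashelp.heart;
quit;
```

Like SELECT-WHEN, the CASE expression returns the result for the first matching WHEN clause, and there is no fallthrough.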

Some languages do not support a special switch statement, but instead require that you use IF-THEN/ELSE statements. Python (prior to version 3.10, which introduced the match statement) and the SAS/IML language fall into this category.

There is an alternative syntax for the SELECT-WHEN statement that does not specify an expression in the SELECT statement. Instead, you specify logical conditions in the WHEN statements. This alternative syntax is essentially equivalent to an IF-THEN/ELSE statement, so which syntax you use is a matter of personal preference. Personally, I use SELECT-WHEN for branching on a known set of discrete values, and I use the IF-THEN/ELSE statement to handle more complex situations.
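
The following is a minimal sketch of that expression-free form. It assumes that the numeric Smoking variable in Sashelp.Heart records cigarettes per day, and it uses cutoffs taken from the labels of the Smoking_Status levels. The view name Heart2 is made up for this illustration.

```sas
/* SELECT without an expression: each WHEN holds a logical condition. */
/* Conditions are tested in order; the first true one is executed.    */
/* Heart2 is a hypothetical name chosen for this sketch               */
data Heart2 / view=Heart2;
set sashelp.heart;
select;
   when (missing(Smoking)) Smoking_Cat=.;  /* test first: missing < any number */
   when (Smoking = 0)      Smoking_Cat=1;
   when (Smoking <= 5)     Smoking_Cat=2;
   when (Smoking <= 15)    Smoking_Cat=3;
   when (Smoking <= 25)    Smoking_Cat=4;
   otherwise               Smoking_Cat=5;  /* more than 25 per day */
end;
run;
```

The order of the WHEN statements matters in this form. A missing numeric value compares as less than any number in SAS, so the missing-value test is placed first.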
