SAS Tips & Tricks

Table of Contents

Introduction

This document is a collection of notes detailing solutions to various problems I have encountered when working with SAS. These were developed using SAS 9.2+ on Windows 7.

There is no real order to the sections.

Enhanced Editor modification tricks

General modifications

These are miscellaneous modifications not substantial enough to warrant separate sections, yet nonetheless useful:

  1. Change the default ERROR color from dull red to bright red1.
  2. Tools > Options... > Enhanced Editor Options
    • Tab size: 2 spaces
    • Check "Insert spaces for tabs"
    • Check "Replace tabs with spaces on file open"
    • Check "Show line numbers"2
    • Check "Show section brackets in margin"

Change the default size of the Editor, Log, and Explorer Windows

SAS does not have a documented way to change these. The following trick, however, may be used to set the windows how you like them. First, select the Editor window. Then goto Tools > Options > Preferences. Check the box for "Save settings on exit". Arrange your windows how you want them and exit SAS. When SAS reopens, the windows will have retained their arrangement. Go back to the menu and uncheck the "Save settings on exit." This will prevent the arrangement from being reset.

Open the Enhanced Editor maximized

In the AUTOEXEC.SAS file, place

dm 'awsmaximize on';

Open the WORK library by default in the Enhanced Editor

In the AUTOEXEC.SAS, place

dm 'dmsexp; expand libraries; expand work;';

Keyboard macro tricks

Abbreviation Macros

An abbreviation can be combined with a macro. This means that you can assign a macro to a key sequence rather than a key 'chord'. To do this, create your abbreviation. Then go to the macro menu and edit it. Do not assign it a key. The abbreviation will be called as usual, but will also execute the automated portion.

This is different than just creating a macro which is not assigned keys. For instance, creating a macro called "test" which inserts "Hello, World!" will not be prompted to execute when you type 'test' in the editor.

Abbreviation Macro Example: Insert Comment (//)

In the Enhanced Editor, it is possible to combine abbreviations with macros. An abbreviation is typically understood to be a sequence of characters that initiates a prompt which, when selected by pressing TAB or ENTER, replaces the character sequence with a predefined string. A macro, by contrast, is an automated process. Through the Keyboard Macro dialog, it is only possible to assign a macro to a key "chord" such as Ctrl+Shift+P. However, if you define an abbreviation, it appears in the Keyboard Macro dialog as a macro. This allows you to assign an automation to to a key sequence.

To illustrate, let us create an automated abbreviation for writing comments. Block comments can be inserted in two ways by default. Neither is particularly convenient:

Option 1: you can manually type '/', '*', '<your comment>', '*', and finally '/'. The '*' requires a shift, the '/' does not. It's easy to mis-type this.

Option 2, you can press Ctrl+/ to comment out the line. On a blank line, this inserts only the string '/**/'. This requires you to then manually move the cursor into the correct position.

Let's automate this:

  • Goto Tools > Add Abbreviation
  • For the abbreviation, put '//'
  • For the Text to insert for abbreviation, put '/**/'
  • Press OK
  • Goto Tools > Keyboard Macros > Macros...
  • Select //
  • Press Edit
  • From the "Categories" drop down, select "Navigation"
  • From the list, select "Move cursor left"
  • Press the >> symbol twice
  • Press OK
  • Close the Keyboard Macros dialog.

Now, when you type '//' you may press either TAB or ENTER. The block comment symbols '/**/' will be inserted, but your cursor will be conveniently placed between the asterisks.

Keyboard Macros

Suggested bindings:

Keyboard macro Key binding
Duplicate Line Ctrl + Shift + D
Goto Debug Toggle Alt + Shift + U
Goto Import Alt + I
Goto Last Macro Definition Alt + L
Goto Main Alt + M
Goto Main Last Edit Point Alt + Shift + M
Goto Mark Ctrl + M
Goto Next Macro Alt + N
Goto Next Macro Last Line Alt + Shift + N
Goto next marked line Ctrl + '
Goto Previous Macro Alt + P
Goto Previous Macro Last Line Alt + Shift + P
Goto Previous Marked Line Ctrl + ;
Goto TestSuite Alt + T
Header Box and Move Down Alt + Ctrl + .
Header Box Close Ctrl + .
Header Box Realign Alt + .
Insert Line Above Current Line Alt + Ctrl + O
Insert Macro Call Alt + Y
Insert Macro Declaration Ctrl + Shift + M
Insert Macro Execution Statement Ctrl + Shift + J
Kill Current Line Ctrl + K
Move current line down Alt + Down
Move current line up Alt + Up
Select All Macro Definitions Alt + Shift + L
Select between mark and point Ctrl + Shift + Space
Select Current Macro Name Ctrl + Shift + S
Select Current Word Alt + S
Select folding section Ctrl + Shift + Enter
Select line Ctrl + Enter
Set mark Ctrl + Shift

Keyboard macro definitions

Duplicate Line

  1. Move cursor to beggining of line
  2. Repeat line

Find Selection Backward

  1. Reset search
  2. Set search string to "%let debug"
  3. Set search to search code
  4. Set search to case insensitive
  5. Set search direction to forward
  6. Find next occurence using the current settings
  7. Move cursor to next word boundary
  8. Move cursor to next word boundary

Find Selection Forward

  1. Reset search
  2. Set the search string to current selection
  3. Set search to search code
  4. Set search to search string literals
  5. Set search to include comments
  6. Set search to match substrings
  7. Set search to case insensitive
  8. Set search direction to forward
  9. Find next occurrence using the current settings

Goto Debug Toggle

  1. Reset search
  2. Set search string to "%let debug"
  3. Set search to search code
  4. Set search to case insensitive
  5. Set search direction to forward
  6. Find next occurrence using the current settings
  7. Move cursor to next word boundary
  8. Move cursor to next word boundary

Goto Last Macro Definition

  1. Move cursor to end of document
  2. Reset search
  3. Set search to search code
  4. Set search to case insensitive
  5. Set search direction to backward
  6. Set search string to "%mend;"
  7. Find next occurrence using the current settings
  8. Move cursor to end of line
  9. Move cursor up
  10. Reset search
  11. Set search to search code
  12. Set search direction to forward
  13. Set search string to "%macro"
  14. Find next occurrence using the current settings
  15. Move cursor to beginning of line

Goto Import

  1. Reset search
  2. Set search to search code
  3. Set search to case insensitive
  4. Set search direction to forward
  5. Set search string to "%macro Import"
  6. Find next occurrence using the current settings
  7. Move cursor to beginning of line

Goto Main

  1. Reset search
  2. Set search to search code
  3. Set search to case insensitive
  4. Set search direction to forward
  5. Set search string to "%macro Main()"
  6. Find next occurrence using the current settings
  7. Move cursor to beginning of line

Goto Main Last Edit Point

  1. Reset search
  2. Set search to search code
  3. Set search to case insensitive
  4. Set search direction to forward
  5. Set search string to "%macro Main()"
  6. Find next occurrence using the current settings
  7. Move cursor to beginning of line
  8. Reset search
  9. Set search to search code
  10. Set search to case insensitive
  11. Set search direction to forward
  12. Set search string to "%mend;"
  13. Find next occurrence using the current settings
  14. Move cursor to beginning of line
  15. Reset search
  16. Set search to search code
  17. Set search to case insensitive
  18. Set search direction to backward
  19. Set search string to ";"
  20. Find next occurrence using the current settings
  21. Move cursor to end of line

Goto Mark

  1. Go to mark

Goto Next Macro

  1. Move cursor to end of line
  2. Reset search
  3. Set search to search code
  4. Set search direction to forward
  5. Set search to case insensitive
  6. Set search string to "%macro"
  7. Find next occurrence using the current settings
  8. Move cursor to beginning of line
  9. Move cursor to next word boundary
  10. Move cursor to next word boundary
  11. Move cursor to next word boundary

Goto Next Macro Last Line

  1. Move cursor to end of line
  2. Reset search
  3. Set search to search code
  4. Set search direction to forward
  5. Set search to case insensitive
  6. Set search string to "%macro"
  7. Find next occurrence using the current settings
  8. Move cursor to beginning of line
  9. Reset search
  10. Set search direction to forward
  11. Set search to case insensitive
  12. Set search string to "%mend;"
  13. Find next occurrence using the current settings
  14. Move cursor to beginning of line
  15. Reset search
  16. Set search direction to backward
  17. Set search to case insensitive
  18. Set search to search code
  19. Set search string to ";"
  20. Find next occurrence using the current settings
  21. Move cursor to end of line

Goto next marked line

  1. Move cursor left
  2. Move cursor right
  3. Go to the next marked line

Goto Previous Macro

  1. Move cursor up
  2. Reset search
  3. Set search to search code
  4. Set search string to "%mend;"
  5. Set search direction to forward
  6. Find next occurrence using the current settings
  7. Reset search
  8. Set search to search code
  9. Set search direction to forward
  10. Set search string to "%macro "
  11. Find previous occurrence using the current settings
  12. Find previous occurrence using the current settings
  13. Move cursor to beginning of line
  14. Move cursor to next word boundary
  15. Move cursor to next word boundary
  16. Move cursor to next word boundary

Goto Previous Macro Last Line

  1. Move cursor up
  2. Reset search
  3. Set search to search code
  4. Set search string to "%mend;"
  5. Set search directiont to forward
  6. Find next occurrence using the current settings
  7. Reset search
  8. Set search to search code
  9. Set search direction to forward
  10. Set search string to "%macro"
  11. Find previous occurrence using the current settings
  12. Find previous occurrence using the current settings
  13. Move cursor to beginning of line
  14. Reset search
  15. Set search direction to forward
  16. Set search to case insensitive
  17. Set search to search code
  18. Set search string to "%mend;"
  19. Find next occurrence using the current settings
  20. Move cursor to beginning of line
  21. Reset search
  22. Set search direction to backward
  23. Set search to case insensitive
  24. Set search to search code
  25. Set search string to ";"
  26. Find next occurrence using the current settings
  27. Move cursor to end of line
  28. Move cursor down

Goto Previous Marked Line

  1. Move cursor left
  2. Move cursor right
  3. Go to the previous marked line

Goto TestSuite

  1. Reset search
  2. Set search to search code
  3. Set search to case insensitive
  4. Set search direction to forward
  5. Set search string to "%macro TestSuite("
  6. Find next occurrence using the current settings
  7. Move cursor to beginning of line

Header Box and Move Down

  1. Move cursor to beginning of line
  2. Move cursor to beginning of line
  3. Insert the string "/* " at the current position
  4. Move cursor to end of line
  5. Inser the string " " at the current position
  6. Move cursor up
  7. Move cursor to end of line
  8. Insert carriage return
  9. Insert the string " " at the current position
  10. Move cursor to end of line
  11. Move cursor down
  12. Move cursor left
  13. Move cursor left
  14. Insert the string "*/" at the current position
  15. Move cursor up
  16. Delete line
  17. Move cursor to beginning of line
  18. Extend selection to end of line
  19. Remove trailing whitespace
  20. Move cursor left
  21. Move cursor to end of line
  22. Move cursor down
  23. Move cursor to end of line

Header Box Close

  1. Move cursor to end of line
  2. Insert the string " " at the current position
  3. Move cursor up
  4. Move cursor to end of line
  5. Insert carriage return
  6. Insert the string " " at the current position
  7. Move cursor to end of line
  8. Move cursor down
  9. Move cursor left
  10. Move cursor left
  11. Insert the string "/*" at the current position
  12. Move cursor up
  13. Delete line
  14. Move cursor to beginning of line
  15. Extend selection to end of line
  16. Remove trailing whitespace
  17. Move cursor left
  18. Move cursor to end of line

Header Box Close and move down

  1. Move cursor to end of line
  2. Insert the string " " at the current position
  3. Move cursor up
  4. Move cursor to end of line
  5. Insert carriage return
  6. Insert the string " " at the current position
  7. Move cursor to end of line
  8. Move cursor down
  9. Move cursor left
  10. Move cursor left
  11. Insert the string "/*" at the current position
  12. Move cursor up
  13. Delete line
  14. Move cursor to beginning of line
  15. Extend selection to end of line
  16. Remove trailing whitespace
  17. Move cursor left
  18. Move cursor to end of line
  19. Move cursor down
  20. Move cursor to end of line

Header Box Realign

  1. Move cursor to end of line
  2. Move cursor to previous word boundary
  3. Delete to next word boundary
  4. Insert the string "
  5. Move cursor up
  6. Move cursor to end of line
  7. Insert carriage return
  8. Insert the string "
  9. Move cursor to end of line
  10. Move cursor down
  11. Move cursor left
  12. Move cursor left
  13. Insert the string "*/" at the current position
  14. Move cursor up
  15. Delete line
  16. Move cursor to beginning of line
  17. Extend selection to end of line
  18. Remove trailing whitespace
  19. Move cursor left
  20. Move cursor to end of line
  21. Move cursor to previous word boundary

Insert header Comment Ending

  1. Insert the string " " at the current position
  2. Move cursor up
  3. Move cursor to end of line
  4. Insert carriage return
  5. Insert the string " " at the current position
  6. Move cursor to end of line
  7. Move cursor down
  8. Move cursor left
  9. Move cursor left
  10. Insert the string "*/" at the current position
  11. Move cursor up
  12. Delete line
  13. Move cursor to beginning of line
  14. Extend selection to end of line
  15. Remove trailing whitespace
  16. Move cursor left
  17. Move cursor to end of line
  18. Move cursor to previous word boundary

Insert Line Above Current Line

  1. Mark current position
  2. Move cursor to beginning of line
  3. Insert carriage return
  4. Go to mark

Insert line below current line

  1. Mark current position
  2. Move cursor to end of line
  3. Insert carriage return
  4. Go to mark

Insert Macro Call

  1. Insert the string "%" at the current position
  2. Paste from clipboard
  3. Move cursor to end of line
  4. Insert the string "();" at the current position
  5. Move cursor left
  6. Move cursor left

Insert Macro Declaration

  1. Insert the string "%macro ();" at the current position
  2. Insert carriage return
  3. Insert carriage return
  4. Insert carriage return
  5. Insert carriage return
  6. Insert the string "%mend;" at the current position
  7. Move cursor up
  8. Move cursor up
  9. Move cursor up
  10. Move cursor up
  11. Move cursor to end of line
  12. Move cursor left
  13. Move cursor left
  14. Move cursor left

Insert Macro Execution Statement (new)

  1. Move cursor to end of line
  2. Insert carriage return
  3. Insert the string "%put NOTE: [MACRO] Executing: " at the current position
  4. Move cursor up
  5. Move cursor to beginning of line
  6. Move cursor to next word boundary
  7. Move cursor to next word boundary
  8. Move cursor to next word boundary
  9. Extend selection to end of line
  10. Copy selection
  11. Move cursor to end of line
  12. Move cursor down
  13. Move cursor to end of line
  14. Paste from clipboard
  15. Move cursor left
  16. Move cursor left

Kill Current Line

  1. Unmark the current line
  2. Delete line

Move current line down

  1. Move cursor to end of line
  2. Extend selection to beginning of line
  3. Cut selection
  4. Delete line
  5. Move cursor to end of line
  6. Insert carriage return
  7. Paste from clipboard

Move current line up

  1. Move cursor to end of line
  2. Extend selection to beginning of line
  3. Cut selection
  4. Delete line
  5. Move cursor up
  6. Move cursor up
  7. Move cursor to end of line
  8. Insert carriage return
  9. Paste from clipboard

Move selection up

  1. Cut selection
  2. Delete line
  3. Move cursor up
  4. Move cursor up
  5. Move cursor to end of line
  6. Insert carriage return
  7. Paste from clipboard

Select All Macro Definitions

  1. Move cursor to end of document
  2. Reset search
  3. Set search to search code
  4. Set search to case insensitive
  5. Set search direction to backward
  6. Set search string to "%mend;"
  7. Find next occurrence using the current settings
  8. Move cursor to end of line
  9. Extend selection to beginning of document

Select between mark and point

  1. Extend selection from mark

Select Current Macro Name

  1. Move cursor to end of line
  2. Reset search
  3. Set search to case insensitive
  4. Set search to search code
  5. Set search direction to backward
  6. Set search string to "%macro"
  7. Find next occurrence using the current settings
  8. Move cursor right
  9. Move cursor to next word boundary
  10. Extend the selection to the next word boundary

Select Current Word

  1. Move cursor to next word start
  2. Move cursor to next word start
  3. Move cursor to previous word start
  4. Move cursor left
  5. Move cursor left
  6. Move cursor left
  7. Move cursor to next word boundary
  8. Extend the selection to previous word start

Select folding section

  1. Move cursor to beginning of line
  2. Reset search
  3. Set search direction to forward
  4. Set search to case insensitive
  5. Set search to ignore text in comments
  6. Set search to search code
  7. Turn on regular expression mode for searching.
  8. Set search string to \<[rqm][ue][ni][dt]*\>;
  9. Find next occurrence using the current settings
  10. Move cursor right
  11. Mark current position

Select line

  1. Move cursor to beginning of line
  2. Extend selection to end of line

Set mark

  1. Mark current position

KEYS menu

There exist several 'layers' of hotkeys in the enhanced editor. These are the Display Manager (DM) hotkeys, keyboard macros, and abbreviations.

The DM keys have two separate menus which are accessed via either the SAS command line or by issuing dm 'KEYS';. A separate menu exists for the Editor and for a Viewtable.

Key Definition
F3 zoom off; log; wpgm; clear log; end; log; wpgm;
F4 zoom off; log; wpgm; clear log; end;
F7 log; wpgm; submit '%ListMacroDefinitions();'
F9 %sysexec(start explorer.exe "%sysfunc(pathname(word))" && exit);
F11 zoom
F12 next viewtable:libref.dataset; end;
SHF F7 odsresult; clear; dmsexp; expand libraries; expand work; log; wpgm;
SHF F8 output; winclose; dmsexp; expand libraries; expand work; log; wpgm;
CTL F11 prevwin
CTL F12 next
CTL H dlgreplace
CTL I backward 5
CTL J dlgfind
CTL K forward 5
CTL L right
CTL M bottom
CTL Q log; wpgm;
CTL R left
CTL T zoom off; log; wpgm; clear log; end; log; wpgm;
CTL U top
CTL W dmsexp; uplevel; expand libraries; expand work;
CTL Y next viewtable: libref.dataset;
RMB wpopup

KEYS <DMKEYS>

Key Definition Operation
F1 help Access SAS Help
F2    
F3 zoom off;log;wpgm;clear log;end;log;wpgm; Disable zoom, clear log, submit code and return user to a log/editor arrangement. Used when output is not important.
F4 zoom off;log;wpgm;clear log;end; Disable zoom, clear log, submit code, but show output.
F5 log; Activate log
F6    
F7 log;wpgm;submit '%ListMacroDefinitions();' Display macro structre in log3
F8    
F9 %sysexec(start exlorer.exe "%sysfunc(pathname(work))" && exit); Open the temporary WORK directory in Windows Explorer
F11 zoom Toggle zoom
F12 next viewtable:libref.dataset;end; Close a viewtable. Press repeatedly to close all open viewtables.
SHF F1    
SHF F2    
SHF F6    
SHF F7 odsresults;clear;dmsexp;expand libraries;expand work;log;wpgm; Clear results window and return to log/editor/work library arrangement
SHF F8 ouput;winclose;dmsexp;expand libraries;expand work;log;wpgm; Clear output window and return to log/editor/work library arrangement
SHF F10    
SHF F11 prevwind navigate to previous window
SHF F12 next navigate to next window
   
ALT F1 wdockviewresize resize the explorer window. Useful when working with a single monitor.
ALT F2 wdockviewminimize Minimize explorer window. Useful when working with a single monitor.
ALT F3 wdockviewrestore Restore explorer window to previous size. Useful when working with a single monitor
   
CTL H dlgreplace Open replace dialog window
   
CTL Q log;wpgm; Toggle log/editor arrangement
CTL T zoom off;log;wpgm;clear log;end;log;wpgm; Same as F3, sumbit code and return to log/editor arrangement
   
CTL W dmsexp;uplevel;uplevel;expand libraries;expand work; navigate to work library, used in conjunction with CTL Q.

When to use macro variables

Define any program parameter which may change, such as an input or output directory, using a macro variable. Avoid naming macro variables the same as filerefs, librefs, and data set variables.

dirData "C:\data\directory\path";

libname inData  "&dirData.";
libname outData "&dirData.";

Macro return value trick

If a macro ends with a variable call, that value is essentially returned. SAS macros are simply text replacement. Therefore, if the last statement is a macro variable resolution, then the resolved value is what remains after the macro has executed. This means that macros which return values cannot be called in open code. They must be part of some other statement, such as %put.

%macro Greetings(amIFriendly);

  %local returnValue;

  %if &amIFriendly. = yes %then %let returnValue = Hello, world!;
  %else %let returnValue = Bah, humbug...;

  &returnValue.
%mend;

%put %Greetings(yes);

Tricks to simulate a real array

SAS does not have arrays. To imitate arrays, one can use macro lists. Doing this allows for an intuitive approach to data processing.

Delimit lists using a pipe. While it is possible to use spaces or commas, SAS has extreme difficulty with such delimiters. The hoops one must jump through make them impractical choices. Pipes work well because they are uncommon in data, are visually appealing, and provide a (relatively) simple syntax.

Macro lists are 1-based.

Well-formed list

A well-formed list is delimited by pipes (|). Each of the following lists have three elements.

%let list1 = e1|e2|e3;
%let list2 = e1|e2|;
%let list3 = |e2|e3;
%let list4 = ||;

Create vertically defined macro list

It is often helpful to present a list vertically rather than horizontally. This may be because the list changes frequently or the list is longer than 80 characters. However, since SAS macro variables are just text, any whitespace used in aligning the elements of the list is also stored in the macro variable.

Consider the following list. Were it not given vertically, then it would extend beyond the acceptable line length.

%let verticalList =
  element1
  element2
  element3
  element4
  element5
  element6
  element7
  element8
  element9
  element10
  element11
  element12
  element13
  element14
  element15
  element16
  element17
  element18
  element19
  element20
;

Note that the whitespace is stored along with the elements:

%put &verticalList;
element1   element2   element3   element4   element5   element6   element7   element8
element9   element10   element11   element12   element13   element14   element15   element16
element17   element18   element19   element20

One approach to eliminate the unnecessary whitespace is to use the default autocall macro %CMPRES.

%let cleanList = %cmpres(&verticalList);

This gives:

%put &cleanList;
element1 element2 element3 element4 element5 element6 element7 element8 element9 element10
element11 element12 element13 element14 element15 element16 element17 element18 element19
element20

However, using the TRANWRD and COMPBL functions, it is possible to do the above in a single step.

%let listVariables = %sysfunc(tranwrd(%sysfunc(compbl(
  body_weight_mg_f
  body_weight_mg_m
  body_length_mm_f
  body_length_mm_m
  gsi_f
  gsi_m
)), %str( ), |));

Get length of a list

To get the length of a list, use the following macro4.

%macro ListLength(list);
  %if %sysevalf(%superq(list)=, boolean) %then %let count = 0;
  %else %let count = %eval(%sysfunc(countc(&list, '|')) + 1);
  &count
%mend;

Note that the conditional is simply the raw form of the %IsEmpty() macro.

To call the macro, the list must be resolved when passed (i.e. be called with an ampersand). For example,

%put %ListLength(&listTest);

Note how this is different from how accessing a list item is done.

If for some reason you have to use a space delimited list, the length can be found by using a SAS Name Literal.

%put %ListLength(&listTest, delimiter = ' 'n);

To view a list

To view a list, use the &= construct:

%put &=list;

Access an item in a list by index

Macro list indexes start at 1. To access an element, use the following5. Note that because %SUPERQ() is used the macro name is provided; the list name should not be resolved. (i.e. use listTest rather than &listTest.).

For example, to get the 7th element of a pipe-delimited list, one would call the following.

%let seventhElement = %sysfunc(scan(%superq(listTest), 7, |, m));

Note that this approach will return a blank for an empty macro list. That is, a macro variable may be empty, but a macro list is never truly empty, despite the convention we have adopted for counting the number of elements in a list.

This can be generalized and wrapped in a macro as follows.

%macro ListElement(list, n, delimiter=|);
  %let nthElement = %sysfunc(scan(%superq(&list.), &n., &delimiter., m));
  &nthElement
%mend;

Note that the list name should NOT be resolved in this call (i.e. there should be no ampersand.) For example,

%put %ListElement(listTest, 2);

Loop through a list

To loop through a list, one must use %ListLength and manually parse the text list6.

%local i;

%do i = 1 %to %ListLength(&list.);
%let element = %scan(&list, &i, '|');
%put NOTE: [MACRO] Processing element: &element;

  /*Put repetitive junk here*/

  %end;

Get index given element

The following macro parses a macro list looking for a particular element value. If found, the index number is returned. Otherwise, 0 is returned.

%macro GetElementIndex(list, elementToMatch);

  %local  index found returnValue currentElement;
  %let    found = 0;

  %do index = 1 %to %ListLength(&list);
      %let currentElement = %scan(&list, &index, |);

      %if &currentElement. = &elementToMatch. %then %do;
        %let found = 1;
        %goto break;
        %end;
    %end;
    %break:

  %if &found. = 1 %then %let returnValue = &index.
  %else                 %let returnValue = &found.;

  &returnValue.
%mend;

Use different sessions for different programs

Do not execute different programs in the same session. A corollary of this is: at no point during a program should any data sets be deleted. 7

Use an autocall library

Overview

Put generally, an autocall library is a directory containing macro definitions which are compiled on the fly at run time. Each macro definition is contained in its own file of the same name. If a macro is not defined in the current session, SAS will traverse a hierarchy of libraries until a macro definition corresponding to the call is found. SAS supplies several such libraries by default. For example, the %TRIM() and %LOWCASE() functions are autocall macros. Users may add their own directories and macros to the autocall hierarchy.

Why not use %include?

The primary advantage of using autocall libraries is the ability to compile macros on the fly without having to write out an explicit definition or import. This makes them well suited for utility functions.

Instead of an autocall library, one could create a single SAS file (such as functions.sas) containing all the macros, options, and various open code statements typically used and then run a single %include statement at the beginning of each program.

This is a problem for several reasons:

  1. It is difficult to maintain. To update a macro, one must open the SAS file, locate the macro, and then change it as opposed to simply swapping out individual files in a directory.
  2. It is difficult to tell what programs the 'functions.sas' code contains. With an autocall library, one can look at the directory and see what macros are included. SAS requires that each autocall macro definition file have precisely the same name as the macro being called, ensuring a one-to-one correspondence.
  3. Using a single file begs the question, 'Is a different functions.sas file created for each program or should the functions.sas file contain all possible macros, even those the program doesn't need?' If the former, how does one differentiate the various functions.sas files? How is the code contained within them tracked? If not all functions.sas files are identical, then when a specific macro is needed, locating the corresponding functions.sas is difficult. The latter case is clearly inefficient. In addition to the extra processing, the extra macros are stored in each SAS session, obscuring which macros are actually being used.

There are other reasons why autocall libraries are better suited to managing utilities than %include which will not be belabored here.

How to implement a personal Autocall library

The literature detailing the behavior and implementation of autocall libraries leaves much to be desired when contrasted with the relative ease of actually setting one up. Before explaining how to set up a personal autocall directory, a few words must be said about how the autocall facility functions.

SAS gives preference to previously compiled macros. If a macro has already been compiled in the current session, that definition will be the one used when the macro call is executed. When a macro is called and a definition has not yet been compiled in the current program, SAS searches a hierarchy of directories for a SAS program of the same name as the macro.

A list of autocall libraries available to a session can be found in the SASAUTOS system variable. To view the contents of this variable, issue the following statement.

%put %sysfunc(pathname(sasautos));

The file paths listed in SASAUTOS are defined in a config file. The config file which defines the SASAUTOS system variable can be found using PROC OPTIONS.

proc options option = config;
run;

The default directory containing the config file which defines SASAUTOS on Windows is:

C:\Program Files\SASHome\SASFoundation\9.4\nls\en\

It is likely that you will need admin rights to edit/save the sasv9.cfg file. Assuming you have these, ignore the warning at the top and proceed with confidence to the -SET SASAUTOS definition. (If you're still concerned, create a copy of the config file off to the side. Should anything go wrong which you can't fix, replace the invalidated config file with the copy to restore SAS to its original state.) Append the SASAUTOS definition with your file path, making sure to enclose it in double quotes. Save the file, reopen SAS, and run the pathname call again and see that your personal autocall library is now included in the autocall search path.

-SET SASAUTOS (

    "!SASROOT\core\sasmacro"
    "!SASROOT\aacomp\sasmacro"
    "!SASROOT\accelmva\sasmacro"
    "!SASROOT\assist\sasmacro"
    "!SASROOT\dmscore\sasmacro"
    "!SASROOT\ets\sasmacro"
    "!SASROOT\gis\sasmacro"
    "!SASROOT\graph\sasmacro"
    "!SASROOT\hps\sasmacro"
    "!SASROOT\iml\sasmacro"
    "!SASROOT\or\sasmacro"
    "!SASROOT\qc\sasmacro"
    "!SASROOT\stat\sasmacro"
    "C:\Your\Personal\Autocall\Directory\Path"
    )

That's it! Now, just add .sas files of the exact same name as the macro to the directory to be able to compile macros on the fly!

SAS configuration files

A word should be said about SAS configuration files. SAS allows for extreme flexibility in configuring individual users across groups of SAS installations. At initialization, SAS traverses a hierarchy of directories looking for different config files. Each subsequent file has the ability to modify or supersede the settings defined in the previous configuration. The default implementation on Windows is somewhat bizarre. Two config files are created, one in each of the following directories:

C:\Program Files\SASHome\SASFoundation\9.4
C:\Program Files\SASHome\SASFoundation\9.4\nls\en

According to the search hierarchy, the last config file to be executed is the one called, if any, from a SAS command line. By default in Windows, SAS does this via the Start Menu. In the Start Menu shortcut for SAS, a -config command is issued in the target call. The default option points to the sasv9.cfg file in the first directory listed above. That config file then points to the second config file.

If the configuration files have not been previously modified, there should be no deviation in setting up a personal autocall library from the description given here. In the event your system has a different set up, or you would like more information, check the Files Used by SAS documentation for an explanation of how configuration files are used.

AUTOEXEC.SAS

Also in the Files Used by SAS is information about the AUTOEXEC.sas file. This is a file which is used to execute code during the SAS initialization.

Create a file named AUTOEXEC.SAS in the same directory as SAS.EXE. Any code in this program will be executed during initialization. See Section Enhanced Editor modification tricks for suggested AUTOEXEC.SAS code.

If running programs in batch mode, an error message may appear saying,

This window is unavailable in line-mode.

The cause is noted in SAS Problem Note 44705. It is caused by submitting code in batch which is only available in interactive mode. For example, a display manager (DM) command.

Unfortunately, there is no workaround for this on Windows as the SYSENV automatic variable is always "FORE".

How to migrate a SAS Setup

WARNING: Migrate a SAS Setup is untested!

The DMKEYS and VT hotkeys are stored in the sasuser.profile file. This can be found at %put %sysfunc(pathname(sasuser)). Copy that into the new installation's corresponding directory.

Export/Import winreg.exe

Computer\HKEY_CURRENT_USER\Software\SAS Institute Inc.\Enhanced Editor\9.4\
Editor Customizations\User Defined Keyboard Mappings

Have a standard folder hierarchy

Use a simple folder hierarchy. Define all file paths at the start of the program. No file paths should change beyond this point. 8

%let dirBasePath  = C:\this is\the base\path;
%let dirData      = &dirBasePath.\data;
%let dirOutput    = &dirData.;

%let xlFile       = DataToBeReadIn.xlsx;

%include 'C:\SAS Utilities\ExcelOpen.sas' ;
%include 'C:\SAS Utilities\ExcelClose.sas' ;
%include 'C:\SAS Utilities\EstablishDDELink.sas' ;

Automatically clear the Log

It is recommended to clear the log on every submission of code. This may be done automatically via the KEYS menu with the command clear log;9.

Use version control

Version control systems are things like Git, Mercurial, or Fossil.

Do not track changes to programs within the program10.

That is, do not include statements such as these within the program:

13Feb07         "macro called" message added

25Feb07         Remove progname >32 error reporting to stderr

19Jul07         Add "lisfile" variable

30Jul07         Header tidy

Have standard headers

Project Program

/****************************************************************************/
/*  Program Name  :                                                         */
/*  Purpose       :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        :                                                         */
/*  Last Update   :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Input(s)      :                                                         */
/*  Output(s)     :                                                         */
/*  Ext. Macro(s) :                                                         */
/*  Structure     :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note(s)       :                                                         */
/*                                                                          */
/*  Development   :                                                         */
/*                                                                          */
/****************************************************************************/
  1. Program Name

    This should be the same as the program's file name. The program name should include a version number. While including the name here is redundant, and therefore requires extra work, it is a matter of completeness. Code is not always viewed as a file. For instance, in verification, it may be viewed as a print out. Including the program name within the program code helps make sure this information is available to the reader.

    The name of the current program can easily be obtained by submitting:

    %put %sysget(SAS_EXECFILENAME);

  2. Purpose

    The purpose of a program should be indicated in the name of the file/program. However, due to character limitations and an interest in brevity, more description may be needed. This is the place to provide such details.

  3. Author

    Take pride in your work. (aka. Who to come after if things go wrong.)

  4. Last Update

    This is a light form of version control.

  5. Input(s)

    What outside data sources are required for execution.

  6. Output(s)

    Anything the program writes to file.

  7. Macro(s)

    This is a list of external macros required by the program. Macros defined within the program should not be listed here. Instead, they should be listed under structure.

  8. Structure

    List all macros defined within the program here. Because macros should have descriptive names, this list serves as a kind of 'table of contents.' It also helps explain the logical execution of the program.

    The macro 'ListMacroDefinitions' has been created to assist with this process. It can be placed in an AUTOCALL library and its execution bound to a hotkey.

  9. Note(s)

    Use this section to describe general information a user may need to know when running the program. This may be differences from previous versions, explanation of terminology, or whether the program is part of a sequence.

  10. Development

    Use this section to describe detailed information regarding the development of the program. This is a home for all the nit-picky details involved in programming that should be kept track of.

Utility Program

/****************************************************************************/
/*  Program Name  :                                                         */
/*  Purpose       :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        :                                                         */
/*  Last Update   :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Input(s)      :                                                         */
/*  Output(s)     :                                                         */
/*  Usage         :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Example(s)    :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Template      :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note(s)       :                                                         */
/*                                                                          */
/*  Development   :                                                         */
/*                                                                          */
/****************************************************************************/
  1. Program Name

    This should be the same as the program's file name. The program name should include a version number. While including the name here is redundant, and therefore requires extra work, it is a matter of completeness. Code is not always viewed as a file. For instance, in verification, it may be viewed as a print out. Including the program name within the program code helps make sure this information is available to the reader.

    The name of the current program can easily be obtained by submitting:

    %put %sysget(SAS_EXECFILENAME);

  2. Purpose

    The purpose of a program should be indicated in the name of the file/program. However, due to character limitations and an interest in brevity, more description may be needed. This is the place to provide such details.

  3. Author

    Take pride in your work. (aka. Who to come after if things go wrong.)

  4. Last Update

    This is a light form of version control.

  5. Inputs

    What outside data sources are required for execution.

  6. Outputs

    Anything the program writes to file.

  7. Usage

    Provide instructions on how to use the utility.

  8. Example

    Provide examples of how to use the utility. It is recommended to provide many examples covering as many use cases as possible.

  9. Template

    Provide well-formatted and blank macro call. This allows the user to quickly copy the macro call using Alt+LMB.

  10. Note(s)

    Include caveats to usage here. For instance, are there certain situations the user should be aware of? Are there known issues and workarounds?

  11. Development

    Include the nit-gritty details here. For example, if a non-intuitive approach was implemented, explain that here. It may also be used to note TODO items.

Create a debug toggle

All programs should have a 'debug' mode and a 'production' mode11. Such an implementation serves two purposes. First, it facilities development by preventing unnecessary execution (and thus delays). Second, it provides protection against unwanted output/overwriting of files, thus aiding validation.

To implement a debug mode requires two elements. First, a variable which is used to toggle interactions with the outside system. Second, a mechanism to trigger interactions based on the toggle.

For example,

/*The toggle variable*/
%let debug = true;

/*The interaction mechanism*/
%macro Skip(toggle, statement);
  %if not (&toggle = true or &toggle = 1) %then
    %sysfunc(dequote(&statement.));
%mend;

/*Example of execution*/
%Skip(&debug., '%put Hello, world!');

To prevent unexpected overwrites, save all programs with 'debug' mode toggled TRUE.

Use separate data for separate analyses

Try to create separate data sets for different analyses whenever possible12.

Always try to be sorted

All data should be sorted in a logical way at all times 13.

Avoid sorts whenever possible14.

Try to work from memory

Work from memory. Access libraries only at the beginning and end of a program15.

Do not process permanent data sets in anyway during program execution. The only time a program should touch a permanent file is when it is either reading or writing16.

/*BAD (alters a permanent data file)*/

    proc sort data = inData.mortality;
      by group;
    run;

/*GOOD: (does not alter permanent data)*/

    data raw_mortality;
      set inData.mortality;
    run;

    proc sort data = raw_mortality out = mortality;
      by group;
    run;

Use unique data set names

Create a unique name for each data set. That is, do not overwrite data sets. Create only a single data set per data step17.

Call data sets with DATA=

Data sets should always be explicitly called using the DATA= option.

Use a default length

Implement a default length unless not doing so is absolutely mandatory. This can be done by creating macro variable which is substituted into length definitions. Typically, this will only need to be done for character data. The default numeric length of 8 is often sufficient for most applications. The good way to implement this is as follows18.

%let defLength = 100;

data sample1;
  length
    character $ &deflength.
    numeric   8
  ;
  character = "Stuff";
  numeric   = 6.28;
run;

proc sql noprint;
  create table sample2 (
      character char(&defLength.)
    , numeric   num
  );
  insert into sample2
    values('Stuff', 6.28)
  ;
  select *
  from sample2
  ;
quit;

proc sql noprint;
  create table sample3 as
  select
      *
    , 'More stuff'  as character2 length = &defLength.
    , 22 / 7        as pie
  from sample2
  ;
quit;

Don't mix macro and data step statements

Avoid mixing macro and non-macro control structures19.

%let sex = M;

/*Cannot execute the data step without executing the
  entire macro. Also, the syntax is overly complex.
  Short-circuit logic is likely not required.*/
%macro Bad();
%put NOTE: [MACRO] Executing: Bad();

  data bad;
    set sashelp.class;

    %if &sex = M %then %do;
      if age >= 18 then registered = 1;
    %end;
  run;
%mend;

%Bad();

%macro Good();
%put NOTE: [MACRO] Executing: Good();

  data good;
    set sashelp.class;

    if "&sex." = 'M' and age >= 18 then registered = 1;
  run;
%mend;

%Good();

Trick for handing system options

SAS does not process all statements sequentially. This can sometimes cause problems when trying to conditionally set system options. One situation which presents problems is in controlling the log output on a utility macro.

To control system options sequentially, first define a %SetSystemOptions macro20:

%macro SetSystemOptions(opt1, opt2, opt3);
  options &opt1. &opt2. &opt3.;
%mend;

At the start of the program, capture the current setting for the options that will change.

%let originalNOTES = %sysfunc(getoption(notes));

Then, whenever the system options need to be changed, the %SetSytemOptions macro can be called:

%if %IsEmpty(fileName) %then %do;
  %put ERROR: [MACRO] FILENAME argument is blank.;
  %SetSystemOptions(&originalNOTES.);
  %abort cancel;
  %end;

Terminate execution with macros

Whenever possible, terminate program execution using macro statements. Macro statements make the source code more concise, as well as print less to the log. Before terminating execution print to the log why execution is being terminated. See Handling System Options.

%macro Example(input);
%put NOTE: [MACRO] Executing: Example(input=&input);
  %if &input. = '' %then %do;
    %put ERROR: [MACRO] Invalid input.;
    %abort cancel;
    %end;
%mend;
data _null_;
  do i = 1 to 43;
    if i > 42 then do;
      put 'ERROR: [DEV] Life is now meaningless.';
      abort cancel nolist;
      end;
  end;
run;

How to use comments

Comments should be used only for description purposes and be used only when absolutely necessary. Logical documentation should be handled primarily through macro and variable names. That is, comments should be used to explain 'why' when the reason is not obvious21.

Code should never be commented out. Remove all unused code! If there are conditions under which the code applies, hard code that logic. If requirements keep changing, fix why requirements keep changing. If requirements cannot be known until runtime, code a hard stop which prevents execution from proceeding until the necessary information is obtained22.

Good (explains why statement exists):

  /*Rows 277 and 282 of the input data contain typos*/
  if _N_ in (276 282) and raw_name_of_treatment_group = '19 \302\265g/L' then
    raw_name_of_treatment_group = '18 \302\265g/L';

Bad (superfluous):

data clean;
  set raw;

  *Delete blank rows;
  if %IsMissingAllVariables() then delete;
run;

Sample situations where comments are helpful:

  1. A non-conventional solution was implemented.
  2. An esoteric piece of code was used.

For example,

/*...*/

/*Check that summaryYM has form 20yyMmm*/
else if (not prxmatch("/20\d{2}M\d{2}/", summaryYM)) then do;

/*...*/

How to avoid iteration errors

When writing macros (especially utilities), avoid using macro loop structures if the utility could be called from within another macro loop. This is because a macro is merely text replacement. If the outer loop uses the same iterator variable name as the inner loop, then the two iterator variables are actually the same variable after substitution. This will cause the "inner" iterator variable to reassign the "outer" iterator.

%macro UtilityUsingLoop();
  %do i = 1 %to 3;
    %put Inner loop iterator: &i;
  %end;
%mend;

%macro CallUtilityInALoop();
  %do i = 1 %to 5;
    %put Outer loop iterator (prior): &i;
    %UtilityUsingLoop();
    %put Outer loop iterator (post): &i;
  %end;
%mend;

%CallUtilityInALoop();

%macro Expansion();
  %do i = 1 %to 5;
    %put Outer loop iterator (prior): &i;

%do i = 1 %to 3;
  %put Inner loop iterator: &i;
%end;

    %put Outer loop iterator (post): &i;
  %end;
%mend;

%Expansion();

Both of these produce an output of:

Outer loop iterator (prior): 1
Inner loop iterator: 1
Inner loop iterator: 2
Inner loop iterator: 3
Outer loop iterator (post): 4
Outer loop iterator (prior): 5
Inner loop iterator: 1
Inner loop iterator: 2
Inner loop iterator: 3
Outer loop iterator (post): 4
Outer loop iterator (prior): 5
Inner loop iterator: 1
Inner loop iterator: 2
Inner loop iterator: 3
Outer loop iterator (post): 4
Outer loop iterator (prior): 5
Inner loop iterator: 1
Inner loop iterator: 2
Inner loop iterator: 3
Outer loop iterator (post): 4
...

We can see that the "inner" iterator always resets the "outer" iterator to 4, causing the outer loop to then execute the inner loop ad infinum.

One way to avoid this is to place the loop inside a _null_ data step. Unfortunately, since a data step is executed, extra effort is required to generate the same log output. Here notes are suppressed.

%macro UtilityUsingNullLoop();
  data _null_;
    do i = 1 to 3;
      put 'Inner loop iterator: ' i;
    end;
  run;
%mend;

%macro CallUtilityInALoop();
  %do i = 1 %to 5;
    %put Outer loop iterator (prior): &i;
    %UtilityUsingNullLoop();
    %put Outer loop iterator (post): &i;
  %end;
%mend;

options nonotes;
%CallUtilityInALoop();
options notes;

This generates the following output in the log:

Outer loop iterator (prior): 1
Inner loop iterator: 1
Inner loop iterator: 2
Inner loop iterator: 3
Outer loop iterator (post): 1
Outer loop iterator (prior): 2
Inner loop iterator: 1
Inner loop iterator: 2
Inner loop iterator: 3
Outer loop iterator (post): 2
Outer loop iterator (prior): 3
Inner loop iterator: 1
Inner loop iterator: 2
Inner loop iterator: 3
Outer loop iterator (post): 3
Outer loop iterator (prior): 4
Inner loop iterator: 1
Inner loop iterator: 2
Inner loop iterator: 3
Outer loop iterator (post): 4
Outer loop iterator (prior): 5
Inner loop iterator: 1
Inner loop iterator: 2
Inner loop iterator: 3
Outer loop iterator (post): 5

Another way to avoid these kinds of errors is to define the iterators using %local.

Use a program template

The following template is suggested as an abbreviation. See keyboard macros. For header descriptions, see Have standard headers.

/****************************************************************************/
/*  Program Name  :                                                         */
/*  Purpose       :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        :
/*  Last Update   :                                               */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Input(s)      :                                                         */
/*  Output(s)     :                                                         */
/*  Ext. Macro(s) :                                                         */
/*  Structure     :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note(s)       :                                                         */
/*                                                                          */
/*  Development   :                                                         */
/*                                                                          */
/****************************************************************************/

********************************************************************
** User Specific Parameters
********************************************************************;
%let debug        = true;

%let dirBasePath  = ;
%let dirData      = ;
%let dirOutput    = ;

********************************************************************
** Development Parameters
********************************************************************;
options nosource dsoptions = note2err;

libname inData  "&dirData.";
libname outData "&dirOutput.";

%let defLength = 100;

********************************************************************
*** Utilities
********************************************************************
%macro Skip(toggle, statement);
  %if not (&toggle = true or &toggle = 1) %then
    %sysfunc(dequote(&statement.));
%mend;

********************************************************************
** Main Definition
********************************************************************;
%macro Main();
%put NOTE: [MACRO] Executing: Main();

%mend;


**********************************
*** Subroutines
**********************************;


********************************************************************
** Main Call
********************************************************************;

%Main();

%put NOTE: [DEV] Process complete.;
options source;

Check data immediately

Data checks should happen as immediately as relevant. Failed checks should terminate the program and force the user to address the issue. If checks are occurring at the end of a program, this demonstrates a lack of confidence/understanding in how the program executes. Such a program should be rewritten so that there is confidence in it. While it would be hubris for a programmer to believe they have accounted for every possible error, all known possibilities should be hard coded. These checks should communicate what is being checked to the log.

Avoid formats

Explicitly managing formats is often unnecessary. Formats are meta data. They do not affect the underlying data value. Formats are a layer of obscurity and introduce inconsistencies. For example, a variable may appear to be rounded/abbreviated due to formatting whereas non-formatted data does not. Formatting is concerned with how data is presented and such decisions are often arbitrary. To determine whether an error exists requires foreknowledge of the formatting or time spent investigating. When formatting is used only as required, these inconsistencies are minimized.

By the Single Responsibility Principle, all formatting can be delegated to a single step at the instant it is required. Output considerations should be handled as late as possible in the program. Since output occurs at the end of a process, formats should be handled on the back end, during that process. Doing so provides a logical separation of ideas.

Clearly, formats have good uses. Dates and times are certainly easier to work with when they are formatted. And technically, SAS automatically applies a "BEST" type by default. This is often sufficient enough for us to dismiss formats altogether until forced into a situation is better managed with an extra layer of obscurity.

Avoid labels

Labels are not forward facing. While it is possible in some instances to make them forward facing, this is not true across all interfaces. For instance, variable labels can be made the default face for data set views. However, a data set label cannot be made to face by default within the explorer. Determining the label in the explorer requires specifically looking for it; either opening a viewtable or looking at its properties. This lack of consistency makes labels a bad choice for naming data sets.

Just as with formats, labels are meta-data. Specifically, they are 'descriptive meta data'. Such information is for discovery and identification. Because labels do not perform this task well, they should not be used. Furthermore, a variable name performs this same duty and performs it well. Labels introduce an unnecessary layer of obscurity.

A further argument against labels is that they are session dependent. If a data set is exported, the label is not always retained on import. Well chosen names often perform quite well.

Avoid RETAIN statements

RETAIN statements should be used only to retain values across data separate data step iterations. Do not use it to rearrange variables. The reordering of variables is a side-effect and is best handled by PROC SQL or a LENGTH statement. The behavior of retain is not always immediately obvious. Because of this, and it's ability to silently change data, its use should be avoided unless absolutely necessary.

Avoid negative language

Refrain from using negative language. Use KEEP instead of DROP. Favor in to not in. Try to match equality rather than inequality23.

Get a list of unique data set values

There are several ways to obtain a list of unique values of a data set variable. The one which appears to work best with a list-based approach uses the INTO : statement of PROC SQL.

/*Find what leagues exist in the 'baseball' dataset*/
proc sql noprint;
  select distinct league
  into : listUnique separated by '|'
  from sashelp.baseball
  ;
quit;

%put &=listUnique;

It is also possible to create multiple lists from a single PROC SQL statement.

proc sql noprint;
  select distinct
      group
    , count(distinct replicate)
  into
      :listGroups           separated by '|'
    , :listReplicateCounts  separated by '|'
  from f1_table_data
  group by group
  ;
quit;
%put ERROR: [CHECK] &=listGroups &=listReplicateCounts;

Note that sometimes it may be helpful to use the TRIMMED option on the INTO statement. This removes all leading and trailing white space.

Count number of observations per grouping

The following example counts the number of replicates per generation/treatment-group:

proc sql noprint;
  create table uni as
  select distinct
      generation
    , treatment
    , group
    , count(distinct replicate)
  from raw_growth_and_ssc
  group by
      generation
    , treatment
    , group
  order by
      generation
    , group
  ;
quit;

Print a list of currently defined macros to the log

Create a .sas file titled ListMacroDefinitions.sas containing the following code. Place this program in an autocall library (See section No description for this link). The program can then be bound to a hotkey by entering KEYS into the command line and entering the following key definition.

log; wpgm; submit '%ListMacroDefinitions();'

What this does is clear the log, return control to the editor window, and then execute the %ListMacroDefinitions() macro. The macro prints to the log the names of all macros defined in the current editor window, provided the program has been saved to disk. Note that spaces can be included at the beginning of macro names so that indentation can be used to help organize the code.

/****************************************************************************/
/*  Program Name  : ListMacroDefinitions                                    */
/*  Purpose       : List all macro names as defined in the current program  */
/*                  in the log.                                             */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        : Matt Trzcinski                                          */
/*  Last Update   : 2017/04/03                                              */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note(s)       : Current file must be saved before executing macro. The  */
/*                  program reads in that saved file and parses it for      */
/*                  macro definitions.                                      */
/*                                                                          */
/*                  Macro name is considered to be the set of characters    */
/*                  between the string '%macro ' and the left parenthesis.  */
/*                  This allows for indentation of macro names within the   */
/*                  list by inserting spaces at the beginning of the macro  */
/*                  name.  For example, %macro   ThirdLevelMacro();         */
/*                  Support is included for only 10 macro levels.           */
/*                                                                          */
/*                  It is advised to place this macro inside an AUTOCALL    */
/*                  library so that it may be bound to a shortcut via the   */
/*                  KEYS menu.  AUTOCALL libraries can be found by issuing  */
/*                          %put %sysfunc(pathname(sasautos));              */
/*                  Once placed in an AUTOCALL library, issue:              */
/*                                    dm 'KEYS';                            */
/*                  For a hotkey, assign the following code:                */
/*                        submit '%ListMacroDefinitions();'                 */
/*                  Press Ctrl + s to save the keybinding.                  */
/*                                                                          */
/*  Development   : Create error handling for case when file is not saved.  */
/*                                                                          */
/*                  It appears impossible to delete a macro from within     */
/*                  that same macro.  As such, ListMacroDefinitions is      */
/*                  not deleted after it is compiled. It may be possible to */
/*                  delete by implementing a shorter macro name and         */
/*                  explicitly calling PROC CATALOG from KEYS.  However,    */
/*                  this approach presents problems when trying to mask     */
/*                  log notes.  KEYS commands are restricted in character   */
/*                  length.                                                 */
/*                                                                          */
/****************************************************************************/

%macro ListMacroDefinitions();

  option nonotes;

  dm 'clear log;';

  /*Assign full path of current program to temporary macro
    variable.*/
  proc sql noprint;
    select distinct xpath
    into : _path trimmed
    from dictionary.extfiles
    where xpath contains "%sysget(SAS_EXECFILENAME)"
    ;
  quit;

  /*Read program in line-by-line and parse for
    macro definitions.  To call this step using
    KEYS requires the file path to be hardcoded.
    SAS cannot dynamically resolve the path. When
    the program is compiled, the path is resolved
    before being saved.*/
  data _null_ / pgm=work.list_macros;
    infile "&_path" dlmstr = '``?`?`' lrecl = 32767;
    length
      line      $ 32767
      macroName $ 42 /*ValidV7 macro name is 32 + 10 possible indentation levels*/
    ;
    input line;

    /*Accommodate leading spaces in macro names. Note that
      the position of the cursor after the space in
      '%macro ' is 8*/
    if line =: '%macro' then do;
      lengthFromCol1ToLeftParenthesis = index(line,'(');
      lengthMacroName                 = lengthFromCol1ToLeftParenthesis - 8;
      macroName                       = substr(line, 8, lengthMacroName);
      lengthStrippedMacroName         = length(strip(macroName));
      numberOfLeadingSpaces           = lengthMacroName - lengthStrippedMacroName;

      put +numberOfLeadingSpaces macroName; /*put strips leading spaces from variable*/
    end;
  run;

  /*Run list_macros program*/
  data pgm=work.list_macros;
  run;

  proc delete data = list_macros (memtype = program);
  run;

  option notes;

%mend;

For example, the following program will generate the following log.

%macro TopLevel();
  %put Hello;
%mend;

%macro  SecondLevel();
  %put World;
%mend;

%macro   ThirdLevel();
  %put Goodbye;
%mend;

%macro  SecondLevelAgain();
  %put Cruel;
%mend;

%macro  SecondLevelYetAgain();
  %put World;
%mend;
TopLevel
  SecondLevel
  ThirdLevel
  SecondLevelAgain
  SecondLevelYetAgain

Print a list of dataset variables to the log

There are several ways to list the variables in a dataset, such as PROC CONTENTS. However, not only does PROC CONTENTS require a varnum, the values aren't easily copied and focus needs to change to either the output or results window. A better solution is the following macro. Not only will it list data set variables in a format convenient for copying, it also has the option to number them.

/****************************************************************************/
/*  Program Name  : ListDatasetVariables                                    */
/*  Purpose       : Print dataset variables names to log as a vertical list */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        : Matt Trzcinski                                          */
/*  Last Update   : 2017/06/22                                              */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Input(s)      : data set name                                           */
/*  Output(s)     : Log message                                             */
/*  Usage         : dataset= Data whose variables are to be listed.         */
/*                   number= (optional) Number each variable. Default is    */
/*                   false.                                                 */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Example(s)    : %ListDatasetVariables(sashelp.class);                   */
/*                                                                          */
/*                    Output                                                */
/*                    ------                                                */
/*                      Dataset: SASHELP.CLASS                              */
/*                                                                          */
/*                      Name                                                */
/*                      Sex                                                 */
/*                      Age                                                 */
/*                      Height                                              */
/*                      Weight                                              */
/*                                                                          */
/*                 %ListDatasetVariables(sashelp.class, number=true);       */
/*                                                                          */
/*                    Output                                                */
/*                    ------                                                */
/*                      Dataset: SASHELP.CLASS                              */
/*                                                                          */
/*                      1: Name                                             */
/*                      2: Sex                                              */
/*                      3: Age                                              */
/*                      4: Height                                           */
/*                      5: Weight                                           */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note(s)       :                                                         */
/*                                                                          */
/*  Development   : Possibly adapted from:                                  */
/*                  http://www2.sas.com/proceedings/forum2008/104-2008.pdf  */
/*                                                                          */
/****************************************************************************/

%macro IsEmpty(macroVariable);
  %sysevalf(%superq(&macroVariable)=, boolean)
%mend;

%macro SetSystemOptions(opt1, opt2, opt3);
  options &opt1. &opt2. &opt3.;
%mend;

%macro ListDatasetVariables(dataset, number=false) / minoperator mindelimiter = ',';

  %let originalNOTES  = %sysfunc(getoption(notes));

**********************************
*** Validate inputs
**********************************;
  %if %IsEmpty(dataset) %then %do;
      %put ERROR: [MACRO] DATASET argument is blank.;
      %SetSystemOptions(&originalNOTES.);
      %abort cancel;
      %end;

  %if not ( %upcase(&number.) in (TRUE, FALSE) ) %then %do;
      %put ERROR: [MACRO] Invalid value [&number.] for NUMBER.;
      %SetSystemOptions(&originalNOTES.);
      %abort cancel;
      %end;

  %if not %sysfunc(exist(&dataset.)) %then %do;
      %put ERROR: [MACRO] Dataset %upcase(&dataset.) could not be found.;
      %SetSystemOptions(&originalNOTES.);
      %abort cancel;
      %end;

**********************************
*** List variables
**********************************;
  %put;
  %put Dataset: %UPCASE(&dataset.);
  %put;
  %let dataset = %sysfunc(open(&dataset, i));
  %let numberOfVariables = %sysfunc(attrn(&dataset, nvars));
  %do i = 1 %to &numberOfVariables.;
    %if &number = true %then %put &i: %sysfunc(varname(&dataset, &i));
    %else %put %sysfunc(varname(&dataset, &i));
  %end;
  %let rc = %sysfunc(close(&dataset));

**********************************
*** Housekeeping
**********************************;
  %SetSystemOptions(&originalNOTES.);

%mend;

Get a list of datasets in the WORK library

The table DICTIONARY.TABLES contains all the datasets for each library. This can be used with PROC SQL to create a list of data sets in the WORK directory. Such a list can be useful for exporting, etc.

Let us look at what data sets are available in the DICTIONARY.TABLES table.

proc sql noprint;
  create table _test as
  select memname
  from dictionary.tables
  ;
quit;

This can be modified so that rather than putting the data set names into a data set, they are instead put into a macro list.

proc sql noprint;
  select memname
  into : datasetList separated by '|'
  from dictionary.tables
  where libname = 'SASHELP'
    and memname contains 'QTR'
  ;
quit;

%put &=datasetList;

Note that LIBNAME and MEMNAME must be in all caps.

EXAMPLE: Using data set list to export data

EXAMPLE: Exporting print ready data sets

With a list data sets in the current directory, along with the %CreateDatasetListing macro, print ready datasets can easily be created.

%macro ExportDatasetListings();
%put NOTE: [MACRO] Executing: ExportDatasetListings();

  /*Get list of datasets that need to be printed*/
  proc sql noprint;
    select memname
    into : listDatasets separated by '|'
    from dictionary.tables
    where libname = 'WORK'
      and memname contains 'LAGDA391'
    ;
  quit;

  %do d = 1 %to %ListLength(&listDatasets.);
  %let dataset = %ListElement(listDatasets, &d);
  %put NOTE: [MACRO] Processing dataset: %upcase(&dataset.);


    %CreateDatasetListing(title=&studyID., dataset=&dataset., dirOut=&dirOutput.);

  %end;

%mend;

%ExportDatasetListings();

Export a subset of WORK library datasets

This technique allows for quick and easy exporting. Data sets can be named with various prefixes such as 'clean_' or 'raw_' so that they may be distinguished by type.

%macro ListLength(list);
  %if %sysevalf(%superq(list)=, boolean) %then %let count = 0;
  %else %let count = %eval(%sysfunc(countc(&list, '|')) + 1);
  &count
%mend;

%macro ListElement(list, n);
  %let nthElement = %sysfunc(scan(%superq(&list.), &n., |, m));
  &nthElement
%mend;

%macro  ExportDatasets();
%put NOTE: [MACRO] Executing: ExportDatasets();

  proc sql noprint;
    select memname
    into : datasetList separated by '|'
    from dictionary.tables
    where libname = 'WORK'
      and memname not contains 'RAW'
    ;
  quit;

  %do d = 1 %to %ListLength(&datasetList.);
  %let dataset = %ListElement(datasetList, &d);
  %put NOTE: [MACRO] Exporting dataset= &dataset.;

    data outData.%lowcase(&dataset.);
      set &dataset.;
    run;
  %end;
%mend;

Structure code for a plot

When creating a plot, output should be directed to the RESULTS window only when DEBUG=TRUE. To do this requires the following.

Within the "Development Specific Parameters" section, place the following code24.

%let device   = png;
%let goptions =
      rotate  = landscape
      csymbol = black
      xpixels = 1294
      ypixels = 800
;

/*%Skip is undefined at this point in the program*/
data _null_;
  ods listing;
  ods html path = "%sysfunc(pathname(WORK))";

  if lowcase("&debug.") = 'true' then
    call execute("
      ods listing close;
      ods results;
      goptions
        reset   = all
        device  = &device.
        &goptions.
      ;
    ");
  else
    call execute("
      ods html close;
      goptions
        reset   = all
        device  = &device.
        gsfname = outPlot
        gsfmode = replace
        &goptions.
      ;
    ");
run;

Then, to actually plot the data, use:

%Skip(&debug., 'filename outPlot "&dirOutput.\&figureName..&device.";');

proc gplot data = node_data;
  plot y_value * x_value = symbol_definition /
    legend = legend1
    haxis = axis1
    vaxis = axis2
  ;
run;
quit;

%Skip(&debug., 'filename outPlot clear;');

See Dynamically Naming Plots and Tables to see how to name plots according to the analysis variables.

Create a print ready dataset listing

The following code is to be used in a verification process. It outputs a given data set in a print ready .rtf file. Columns are fit to the page, pages are numbered with X of Y, and appropriate titles are automatically generated.

/****************************************************************************/
/*  Program Name  : CreateDatasetListing                                    */
/*  Purpose       : Create well formatted listing for a given data set      */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        : Matt Trzcinski                                          */
/*  Last Update   : 2017/06/21                                              */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Input(s)      :                                                         */
/*                  title       = (Optional) Title (i.e. study name)        */
/*                  dataset     = Dataset to create listing for.            */
/*                  dirOut      = (Optional) Output path. Default is        */
/*                                C:\Users\&SYSUSERID\Documents             */
/*                  orientation = (Optional) Portrait or landscape. Default */
/*                                is portrait.                              */
/*                                                                          */
/*  Output(s)     : title - DATASET.rtf or DATASET.rtf                      */
/*                                                                          */
/*  Usage         : %CreateDatasetListing(                                  */
/*                            title=                                        */
/*                    ,     dataset=                                        */
/*                    ,      dirOut=                                        */
/*                    , orientation=                                        */
/*                    ,       style=                                        */
/*                  );                                                      */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Example(s)    :                                                         */
/*                                                                          */
/*                  %CreateDatasetListing(Study 123456                      */
/*                                      , sashelp.us_data                   */
/*                                      , "C:\Users\&SYSUSERID\Documents"   */
/*                   );                                                     */
/*                                                                          */
/*                                                                          */
/*                  %CreateDatasetListing(    dataset= sashelp.baseball     */
/*                                      , orientation= portrait             */
/*                  );                                                      */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note(s)       : Accepts both quoted and unquoted arguments for          */
/*                  title and dirOut.                                       */
/*                                                                          */
/*  Development   : Create custom style template in a future version.       */
/*                  Currently, the MONOSPACE template is being used.        */
/*                                                                          */
/*                  Revise with %SetSytemOptions controlling system options.*/
/*                  Technically, this also clears title1, title2, and       */
/*                  footnote1.  Titles/footnotes are stored in               */
/*                  dictionary.titles. Something similar to                 */
/*                  %SetSystemOptions could be set up to preserve titles    */
/*                                                                          */
/****************************************************************************/

%macro CreateDatasetListing(title=
                         , dataset=
                         , dirOut='C:\Users\&SYSUSERID\Documents'
                         , orientation=PORTRAIT
                         , style=monospace
       ) / minoperator ;
%put NOTE: [MACRO] Executing: CreateDatasetListing(title=&title, dataset=&dataset, dirOut=&dirOut, orientation=&orientation);
%put;

  %let startTime = %sysfunc(datetime());

  /*Validate inputs*/
  %if %sysevalf(%superq(dataset)=, boolean) %then %do;
    data _null_;
      put 'ERROR: [MACRO] Missing parameter DATASET.';
      abort cancel;
    run;
  %end;

  %if not %sysfunc(exist(&dataset)) %then %do;
    data _null_;
      put "ERROR: [MACRO] Dataset %upcase(&dataset.) does not exist.";
      abort cancel;
    run;
  %end;

  %if not(%upcase(&orientation) in (LANDSCAPE PORTRAIT)) %then %do;
    data _null_;
      put "ERROR: [MACRO] Invalid orientation: &orientation.";
      abort cancel;
    run;
  %end;

  filename dirCheck "&dirOut.";
  %if not %sysfunc(fexist(dirCheck)) %then %do;
    data _null_;
      put "ERROR: [MACRO] Invalid directory &dirOut";
      option nonotes;
      filename dirCheck clear;
      option notes;
      abort cancel;
    run;
  %end;
  %else %do;
    option nonotes;
    filename dirCheck clear;
    option notes;
  %end;

  /*Remove quotes from arguments*/
  %if not %sysevalf(%superq(title)=, boolean) %then %do;
    %let title  = %sysfunc(dequote(&title));
    %end;
  %let dirOut = %sysfunc(dequote(&dirOut));

  /*Set output file name depending on whether
    title is given*/
  %if not %sysevalf(%superq(title)=, boolean) %then %do;
    %let fileName = &title - %upcase(&dataset);
    %end;
  %else %let fileName = %upcase(&dataset.);

  /*Save current SAS session options*/
  %let original_number      = %sysfunc(getoption(number));
  %let original_date        = %sysfunc(getoption(date));
  %let original_label       = %sysfunc(getoption(label));
  %let original_center      = %sysfunc(getoption(center));
  %let original_orientation = %sysfunc(getoption(orientation));

  /*Set options for good looking output*/
  options
    nonumber
    nodate
    nolabel
    center
    orientation = &orientation.
  ;

  /*Suppress listing*/
  ods listing close;
  /*Set output to rtf*/
  ods rtf file = "&dirOut.\&fileName..rtf" style=&style.;
  /*Prevent output file from automatically opening*/
  ods noresults;
  options nonotes;

  %if not %sysevalf(%superq(title)=, boolean) %then %do;
    title "&title.";
  %end;
  title2    "Dataset: &dataset.";
  footnote  'Page (*ESC*){thispage} of (*ESC*){lastpage}';

    proc print
      heading=horizontal
      split = '_' /*If variables have _ separating words, then this splits into separate words*/
      data = &dataset.
      ;
    run;

  /*Restore system settings*/
  title;
  title2;
  footnote;

  ods rtf close;
  ods results;
  ods listing;
  options notes;

  options
    &original_number.
    &original_date.
    &original_label.
    &original_center.
    orientation = &original_orientation.
  ;

  /*Time keeping*/
  %put NOTE: [MACRO] CREATEDATASETLISTING macro used (Total process time):;

  %let duration = %sysfunc(putn(%sysevalf(%sysfunc(datetime())-&startTime.), time12.3));
  %if %sysfunc(minute("&duration."t)) > 0 %then %do;
    %put NO%str(TE-)         real time            %substr(&duration., 3, 8);
  %end;
  %else %do;
    %put NO%str(TE-)         real time            %substr(&duration., 6, 5) seconds;
  %end;
%mend;

Dynamically name plots and tables

When creating tables and figures, all items subject to change should be stored in a macro variable defined at the top of the program. This single access point allows for quick modification and helps keep code within the appropriate line length.

The task becomes non-trivial, however, when the label, itself a macro variable, contains macro variables which are as-of-yet undefined in the program. For example, consider plots created by looping through a list of analysis variables. When the general form of the figure name is defined at the top of the program, the particular variables corresponding to each plot will be undefined.

To remove this warning, the %NRSTR and %UNQUOTE must be used. The former prevents macros from resolving during macro compilation (i.e. "masks" them). The latter allows the macro to resolve during macro execution (i.e. by "unmasking").

Here is a basic example which illustrates the concepts.

/*Top of program*/
%let figureName = Analysis figure for &analysisVariable..;

/* ... */

/*Later in program*/
%let analysisVariable = FOO;
%put &figureName.;

This generates the warning:

WARNING: Apparent symbolic reference ANALYSISVARIABLE not resolved.

To remove the error, we must first mask using %NRSTR and unmask using %UNQUOTE.

/*Top of program*/
%let figureName = Analysis figure for %nrstr(&analysisVariable.).;

/* ... */

/*Later in program*/
%let analysisVariable = FOO;
%put %unquote(&figureName.);

In practice, the technique will look similar to the following.

%let figureName = This figure has an %nrstr(%upcase(&analysisVariable.)) in the middle of the name.;

/* ... */

%Skip(&debug., 'filename outPlot "&dirOutput.\&studyID. - Figure &figureNumber. - %unquote(&figureName.).&device.";');

Remove outliers programmatically

A common method of removing outliers from an analysis involves using repeated if statements and copying/pasting the corresponding values in.

%macro RemoveOutliers();
%put NOTE: [MACRO] Executing: RemoveOutliers();

  data observations_less_outliers;
    set observations;

    if treatment = '10.9' and replicate = 'D' and day = 21 and hll_mm = 7.9   then do;
      hll_mm = .;
      put 'NOTE: [MACRO] Outlier removed. HLL_MM value set to missing at row: ' _N_;
      end;

    if treatment = '0.0'  and replicate = 'B' and day = 21 and bw_g   = 2.277 then do;
      bw_g = .;
      put 'NOTE: [MACRO] Outlier removed. BW_G   value set to missing at row: ' _N_;
      end;
  run;
%mend;

While this approach works, it is ill advised. Errors in data entry would either remove the wrong points or not remove the points at all. The ability to check for errors is limited.

A 'hands-off' approach is better. In this approach, the outliers are provided as a data set, the result of the process determining outliers. Here, the outliers are read directly from the dataset. The corresponding if statements are automatically generated. This reduces the effect of typos. A typo would exist in the statements generating the if and likely stand out more.

The single quotes are used to indicate the if statement which is being created. The double quotes are used to surround character data.

%macro  ImportOutliers();
%put NOTE: [MACRO] Executing: ImportOutliers();

  data outliers;
    set inData.outliers_&studyID.;
  run;
%mend;

%macro  RemoveOutliers();
%put NOTE: [MACRO] Executing: RemoveOutliers();

  proc sql noprint;
    select
      cat('if group = '
        , strip(put(group, best32.))
        , ' and replicate = "'
        , strip(replicate)
        , '" and day = '
        , strip(put(day2, best32.))
        , ' and '
        , strip(parameter)
        , ' = '
        , strip(put(obsed, best32.))
        , ' then do; '
        , strip(parameter)
        , ' = . ;'
        , ' put "NOTE: [MACRO] Removed '
        , strip(parameter)
        , ' value at line: " _N_ ; end;'
      ) length = 32767
    into : listIfs separated by ' '
    from outliers
    ;
  quit;

  data observations_less_outliers;
    set observations;
    &listIfs.;
  run;
%mend;

The length = 32767 is required because the CAT function has a default buffer size of 200. If you concatenate more than this, SAS will truncate the IF statement. The max length of a macro variable is 32767. Setting the length to this avoids any possibility of truncation.

To verify that the correct statement was assigned, use the following %put statement:

%put [%quote(&listIfs)];

The %quote is necessary because the statement we created contains quotes and semi-colons. Without the %quote function, SAS will try to execute the code which the macro variable resolves to. The square brackets are in place so to verify that no leading or trailing spaces exist. Note that in the Enhanced Editor, using a square bracket immediately after %put eliminates the syntax highlighting. It is not an error, merely poor implementation of syntax highlighting by SAS.

How to import Excel data

Excel data is best handled through the DDE faculty.

The Case for DDE

Within the SAS community, the user is advised not to use DDE. Many of the reasons why are stated in the article Why your DDE programs won't work anymore. This guide contends that DDE is not only a valid solution, but is indeed the best. Consider the following.

First, the article "Why your DDE programs won't work anymore" is written by a Senior Manager of SAS Online Communities. I infer from this title that he is SAS's online presence. It is his job to push SAS products. This appears to be the ultimate purpose of the article: purchase a SAS product. He makes several valid points, but on examination they do not prove sufficient to rule out use of DDE.

There is no evidence that DDE is going to go away any time soon. While it is no longer supported, it appears that it will continue to be included with Windows. It is part of Windows 10 and the 2016 Windows Suite. It offers functionality that SAS doesn't. Furthermore, as anyone who has worked with Windows before knows, Windows retains all of its failed endeavors for backwards compatibility. Examples of this include WordBasic and VBScript.

The primary argument is that DDE will not work if SAS is being used from a centralized Unix server. This is a poor argument for two reasons. First, DDE only exists on Windows and OS/2. DDE does not exist on Unix systems. Furthermore, we are not using a Unix system. Second, the argument that programs will break if DDE disappears is also unfounded. Storing the import process in a separate function allows the mechanics of the process to change independently of the end result. If the function is to import Excel data via a "OpenExcel()" function, all that should be required is an update to how that particular function operates. The programs which use it should remain unaffected.

Interacting with the Windows Environment: Relevant SAS keywords

X Statement

Runs an operating system command or a Windows application from within a SAS session. Can be run outside of a DATA step. See X Command.

X Command

(Immediately executed, regardless) Command refers the the SAS command line, either in the shell or in the command line within the editor. Enters operating environment mode or enables you to submit a Windows command without ending your SAS session. The X command is a global SAS statement. Is exeecuted independently of conditionals. Similar to CALL SYSTEM, %SYSEXEC, and SYSTEM().

CALL SYSTEM Routine

(Data Step, executed when Data Step executes) Submits an operating system command or a Windows application for execution. The CALL SYSTEM routine is similar to the X command. However, the CALL SYSTEM routine is callable and can therefore be executed conditionally. Does not return a code. Similar to SYSTEM() function.

%SYSEXEC Statement

(Macro Language) Issues operating environment commands. The %SYSEXEC statement causes the operating environment to immediately execute the command that you specify and assigns any return code from the operating environment to the automatic macro variable SYSRC. That is, the statement cannot be used conditionally. Use the %SYSEXEC statement and the automatic macro variables SYSSCP and SYSSCPL to write portable macros that run under multiple operating environments.

WAITFOR Statement

Suspends execution of the current SAS session until the specified tasks finish executing.

SYSTASK Statement

(Data Step) Executes, lists, or terminates asynchronous tasks.

FILENAME Statement

Associates a SAS fileref with an external file or an output device, disassociates a fileref and external file, or lists attributes of external files.

INFILE Statement

Specifies an external file to read with an INPUT statement. It defines the options to be used with the INPUT statement.

Functions

FILENAME Function

Assigns or deassigns a fileref for an external file, directory, or output device.

DINFO Function

Returns information about a directory.

DOPEN Function

Opens a directory and returns a directory identifier value.

%SYSFUNC

Execute SAS functions or user-written functions.

%QSYSFUNC

Execute SAS functions or user-written functions.

FOPEN()

Port of the SCL function. Opens an external file and returns a file identifier value. FOPEN returns a 0 if the file could not be opened. It seems that "open" in this context refers to the I/O stream between a file and SAS and not opening the application which accesses the file.

FCLOSE()

Closes an external file, directory, or directory member. FCLOSE returns a 0 if the operation was successful and (not 0) a non-zero value if it was not successful. If you open a file within a DATA step, it is closed automatically when the DATA step ends.

SYSTEM()

Issues an operating environment command during a SAS session and returns the system return code. Similar to X command, CALL SYSTEM, and %SYSEXEC. May be run conditionally, but requires more overhead than X and %SYSEXEC.

System Options

XCMD System Option

Specifies that the X command is valid in the current SAS session. The default is XCMD.

XWAIT System Option

Specifies whether you have to type EXIT at the DOS prompt before the DOS shell closes. The default is XWAIT. That is, the shell automatically closes with NOXWAIT.

XSYNC System Option

Controls whether an X command or statement executes synchronously or asynchronously. Synchronous means that control is not returned to SAS until the command has completed. The default is XSYNC.

XMIN System Option

Specifies to open the application specified in the X command in a minimized state or in the default active state. Default NOXMIN.

INFILE Options: Non-Delimiter Options

NOTAB

This option exists only on Windows. It is an option for the FILE, FILENAME, INFILE, and INCLUDE statements, as well as %INCLUDE. It is only valid for use with DDE. There is no separate documentation page for it. The documentation that does exist is vague: "This option enables you to use nontab character delimiters between variables." It is unclear whether this needs to be specified when using an option like DLM= to define an alternative delimiter.

MISSOVER

SAS INPUT default is FLOWOVER which continues to the next input data record if it does not find a value in the current input. That is, it reads values sequentially. MISSOVER forces SAS to conform to an input structure. MISSOVER assigns missing values when values are not present when expected.

INFILE Options: Delimiter Options

DELIMITER= (DLM=)

Specifies alternate delimeter. The default is a blank space. It seems that sometimes this is (incorrectly) referred to as a tab. Note that each character in this list is separately considered a delimiter, as with an OR operator. For example, dlm = '!#' says that data is delimited by either a bang (!) or by a hash (#).

DLMSTR=

Specify a character string to read as a delimiter. Note that this defines a group of characters which together delimit data. For example, if dlmstr = '!#', then the sequence '!#' must appear in order for data to be separated.

DLMOPT=

Toggle case-insensitive or removal of trailing blanks.

DSD (delimiter-sensitive-data)

Toggles delimiters inside quotation marks to be treated as character data.

Special Characters

Tab - '09'x Carriage return - '0D'x Linefeed - '0A'x

Understanding DDE

DDE stands for Dynamic Data Exchange. It is a method of interprocess communication under Windows and OS/2. While the scope of DDE extends beyond SAS, Word, and Excel, these three programs are the only ones which concern us.

A DDE link establishes a client-server relationship between the two applications. Although such relationships are usually two sided, with SAS it is one-sided. SAS can only send messages; it cannot receive them. That is, SAS can direct Word or Excel, but these applications cannot influence the behavior of SAS.

"DDE uses a client/server relationship to enable a client application to request information from a server application. SAS is always the client. In this role, SAS requests data from server applications, sends data to server applications, or sends commands to server applications." - SAS Companion for Windows, 4ed

It is not clear, but it seems that the DDE process operates through a shared file between two applications. SAS manages DDE through the FILENAME statement. Two DDE filename statements exist; the system link (1) and the workbook range link (2). These naming conventions are the author's.

/*System link*/
filename ddeSYS dde 'excel|system';

/*Workbook range link*/
filename ddeRNG dde "Excel|&dataDirectory.[&fileName.]Mortality (Day0-7)!C1:C5"

A system link is used to issue macro commands (WordBasic or VBA).

"DDE server applications support certain commands that you can issue by using a DDE link to control the application. To use these commands, use the special topic name SYSTEM in the DDE triplet and leave the item name blank. You can then use the INPUT statement for input from an application and the PUT statement to issue commands to the server application." - Controlling Another Application Using DDE

A workbook range link directly accesses a specific worksheet or word table/document.

How to merge cells in a Word table

By virtue of the Single Responsibility Principle, merging should occur separately from the creation of the table. It so happens that SAS cannot, for all practical purposes, merge cells within a table25.

The table which is printed should be identical to the dataset on which it is based26. Once this has been achieved, some mechanism needs to handle the table creation by simply transforming the dataset into a table. Experience has shown PROC REPORT to be ably suited to this task27. Once the table has been written to file, a script may be called which then modifies the table28.

Sample Program

%macro  GenerateTable();
%put NOTE: [MACRO] Executing: GenerateTable();

  %OutputTable();
  %Skip(&debug., '%FormatExternalWordTable();');

%mend;

%macro   OutputTable();
%put NOTE: [MACRO] Executing: OutputTable();

  %Skip(&debug., '
    ods noresults;
    ods rtf file="&dirOutput.\&rtfName..rtf";
  ');

  title "&title.";
  proc report data = table_data
              style(header)=[
                background    = white
                rules         = none
                verticalalign = bottom
              ]
              SPLIT='00'x
              ;
    columns
      treatment
      replicate_id
      ('Nuptial Tubercle Score'
        tubercle_score_n
        mean_tubercle_score_replicate
        mean_tubercle_score_treatment
        median_tubercle_score_treatment
        sem_tubercle_score_treatment
        cv_tubercle_score_treatment
      )
    ;

    define treatment                        / center 'Treatment (*ESC*)n (\302\265g/L)';
    define replicate_id                     / center 'Replicate';

    define tubercle_score_n                 / center 'N';
    define mean_tubercle_score_replicate    / center 'Replicate (*ESC*)n Mean';
    define mean_tubercle_score_treatment    / center 'Mean';
    define median_tubercle_score_treatment  / center 'Median';
    define sem_tubercle_score_treatment     / center 'SEM';
    define cv_tubercle_score_treatment      / center 'CV';
  run;
  title;

  %Skip(&debug., '
    ods rtf close;
    ods results;
  ');

%mend;

%macro   FormatExternalWordTable();
%put NOTE: [MACRO] Executing: FormatExternalWordTable();

  %WordOpen("&dirOutput.\&rtfName..rtf");

  %MergeColumns();

  %WordClose(
      save = yes
    , name = &rtfName.
    , path = &dirData.
  );

%mend;

%macro    MergeColumns();
%put NOTE: [MACRO] Executing: _MergeColumns();

  %let scriptFile = MergeColumns;
  %let temp       = %sysfunc(getoption(WORK));

  data _null_;
    file "&temp.\&scriptFile..vbs";
    put 'myCols = Array(1, 5, 6, 7, 8)';
    put 'myRows = Array(3, 7, 11, 15, 19)';
    put 'Set oWord  = GetObject(, "Word.Application")';
    put 'Set oDoc   = oWord.ActiveDocument';
    put 'Set oTable = oDoc.Tables(1)';
    put 'Set fso = CreateObject ("Scripting.FileSystemObject")';
    put 'Set stdout = fso.GetStandardStream (1)';
    put 'stdout.WriteLine "Document hidden for editing. Will return shortly."';
    put 'oWord.Visible = False';
    put 'for i = 0 to uBound(myCols)';
    put '  for j = 0 to uBound(myRows)';
    put '    Set rCells = oTable.Cell(myRows(j) + 1, myCols(i)).Range';
    put '    rCells.End = oTable.Cell(myRows(j) + 3, myCols(i)).Range.End';
    put '    rCells.Select';
    put '    rCells.Delete';
    put '    oTable.Cell(myRows(j), myCols(i)).Merge oTable.Cell(myRows(j) + 3, myCols(i))';
    put '    oTable.Cell(myRows(j), myCols(i)).Range.ParagraphFormat.Alignment = 1';
    put '    oTable.Cell(myRows(j), myCols(i)).Range.Cells.VerticalAlignment = 1';
    put '  next';
    put 'next';
    put 'oWord.Visible = True';
  run;

  options noxwait;
  %sysexec(start /wait cscript "&temp.\&scriptFile..vbs" //nologo);
  options xwait;
%mend;

Sample Program: Automatically determine merge regions

%macro  GenerateTables();
%put NOTE: [MACRO] Executing: GenerateTables();

  %do i = 1 %to %ListLength(&listTables.);
    %let listItem         = %scan(&listTables, &i, '|');

    %let generation       = %scan(&listItem.,  1, '_');
    %let tableNum         = %scan(&listItem.,  2, '_');

    %let fileName         = %unquote(&tableName.);
    %let tableDataSubset  = &generation._table_data;

    data &tableDataSubset.;
      set table_data;
      where generation  = "&generation.";
    run;

    %OutputTable(&tableDataSubset., &fileName.);
    %Skip(&debug., '%FormatExternalWordTable(&generation., &tableDataSubset., &fileName.);');

    %end;
%mend;

%macro   OutputTable(tableDataSubset, fileName);
%put NOTE: [MACRO] Executing: OutputTable(&tableDataSubset.);

  %Skip(&debug., '
    ods noresults;
    ods rtf file="&dirOutput.\&fileName..rtf";
  ');

  title "&fileName.";
  proc report data = &tableDataSubset.
              style(header)=[
                background    = white
                rules         = none
                verticalalign = bottom
              ]
              spanrows
              SPLIT='00'x
              ;
    columns
      treatment

      replicate
      replicate_n
      replicate_mean

      treatment_mean
      treatment_median
      treatment_std
      treatment_sem
      treatment_cv
      treatment_min
      treatment_q1
      treatment_q3
      treatment_max
    ;

    define treatment        / center style(column)={cellwidth=1in}    'Treatment (*ESC*)n (mg/L)';

    define replicate        / center style(column)={cellwidth=0.5in}  'ID';
    define replicate_n      / center style(column)={cellwidth=0.5in}  'N';
    define replicate_mean   / center style(column)={cellwidth=0.5in}  'Mean';

    define treatment_mean   / center style(column)={cellwidth=0.5in}  'Mean';
    define treatment_median / center style(column)={cellwidth=0.75in} 'Median';
    define treatment_std    / center style(column)={cellwidth=0.5in}  'Std (*ESC*)n Dev';
    define treatment_sem    / center style(column)={cellwidth=0.5in}  'SEM';
    define treatment_cv     / center style(column)={cellwidth=0.5in}  'CV (*ESC*)n (%)';
    define treatment_min    / center style(column)={cellwidth=0.5in}  'Min';
    define treatment_q1     / center style(column)={cellwidth=0.5in}  'Q1';
    define treatment_q3     / center style(column)={cellwidth=0.5in}  'Q3';
    define treatment_max    / center style(column)={cellwidth=0.5in}  'Max';
  run;
  title;

  %Skip(&debug., '
    ods rtf close;
    ods results;
  ');
%mend;

%macro   FormatExternalWordTable(generation, tableDataSubset, fileName);
%put NOTE: [MACRO] Executing: FormatExternalWordTable(generation=&generation, tableDataSubset=&tableDataSubset, fileName=&fileName.);

  %WordOpen("&dirOutput.\&fileName..rtf");

  %MergeCells(&generation., &tableDataSubset.);

  %WordClose(
      save    = yes
    , name    = &fileName.
    , path    = &dirOutput.
    , format  = rtf
  );

%mend;

%macro    MergeCells(generation, tableDataSubset);
%put NOTE: [MACRO] Executing: MergeCells(generation=&generation, tableDataSubset=&tableDataSubset);

  %let scriptFileName = MergeCells_&generation.;
  %let dirSessionTemp = %sysfunc(getoption(WORK));
  %let scriptPath     = &dirSessionTemp.\&scriptFileName..vbs;

  %GetReplicateCounts(&tableDataSubset.);
  %WriteScriptToFile(&generation., &scriptPath.);
  %RunScript(&scriptPath.);

%mend;

%macro     GetReplicateCounts(tableDataSubset);
%put NOTE: [MACRO] Executing: GetReplicateCounts(tableDataSubset=&tableDataSubset);

/*The index of myRows corresponds to each group.  The index
  of listReplicateCounts also corresponds to group.  Therefore,
  the same iterator (i) is used to merge across replicates for
  each group.*/
  proc sql noprint;
    select  count(replicate)
    into :listReplicateCounts  separated by ','
    from &tableDataSubset.
    group by group
    ;
  quit;

  %DefineVBScriptArray();

%mend;

%macro      DefineVBScriptArray();
%put NOTE: [MACRO] Executing: DefineVBScriptArray();

  %global vbArray;
  %let vbArray = Array(&listReplicateCounts);
  %put NOTE: [MACRO] &=vbArray;
%mend;

%macro     WriteScriptToFile(generation, scriptPath);
%put NOTE: [MACRO] Executing: WriteScriptToFile(generation=&generation, scriptPath=&scriptPath, vbArray=%quote(&vbArray));

  /*Script functions properly on first call.  On subsequent calls
    it seems to fail on 'Set oWord = GetObject(, "Word.Application")'.
    It is not clear why this is.  The script runs successfully when
    run independently of SAS. Word appears to be properly opened/
    closed via the corresponding macros.  Objects should not exist
    beyond the lifetime of a given script.  Trying to force garbage
    collection has no effect.  A delay of 200 ms appears sufficient
    for Windows to sort out whatever connection issues need to be
    resolved.*/
  data _null_;
    file "&scriptPath.";
    put 'WScript.Sleep 200';
    put 'Set oWord  = GetObject(, "Word.Application")';
    put 'Set oDoc   = oWord.ActiveDocument';
    put 'Set oTable = oDoc.Tables(1)';
    put 'Set fso    = CreateObject ("Scripting.FileSystemObject")';
    put 'Set stdout = fso.GetStandardStream (1)';
    put 'stdout.WriteLine "Document hidden for editing. Will return shortly."';
    put 'oWord.Visible = False';
    put ' ';
    put "' Insert 'Replicate' into header";
    put 'oTable.Rows.Add(oTable.Rows(1))';
    put 'myCols = Array(1, 5, 6, 7, 8, 9, 10, 11, 12, 13)';
    put 'for i = 0 to uBound(myCols)';
    put '  oTable.Cell(1, myCols(i)).Merge oTable.Cell(2, myCols(i))';
    put 'next';
    put 'oTable.Cell(1, 2).Merge oTable.Cell(1, 4)';
    put 'oTable.Cell(1, 2).Range.InsertAfter "Replicate"';
    put ' ';
    put "' Merge table cells";
    put 'myCols         = Array(1, 5, 6, 7, 8, 9, 10, 11, 12, 13)';
    put "replicateCount = &vbArray.";
    put ' ';
    put "' Number of groups and replicates often vary";
    put "' Automatically determine group boundaries";
    put "' uBound(replicateCount) = # of groups";
    put 'ReDim myRows(uBound(replicateCount))';
    put 'myRows(0) = 3';
    put 'for group = 1 to uBound(myRows)';
    put '  myRows(group) = myRows(group - 1) + replicateCount(group - 1)';
    put 'next';
    put ' ';
    put 'for i = 0 to uBound(myRows)';
    put '  for j = 0 to uBound(myCols)';
    put '   Set rCells = oTable.Cell(myRows(i) + 1, myCols(j)).Range';
    put '      Set rCells = oTable.Cell(myRows(i) + 1, myCols(j)).Range';
    put '      rCells.End = oTable.Cell(myRows(i) + replicateCount(i) - 1, myCols(j)).Range.End';
    put '      rCells.Select';
    put '      rCells.Delete';
    put '      oTable.Cell(myRows(i), myCols(j)).Merge oTable.Cell(myRows(i) + replicateCount(i) - 1, myCols(j))';
    put '    oTable.Cell(myRows(i), myCols(j)).Range.ParagraphFormat.Alignment = 1';
    put '    oTable.Cell(myRows(i), myCols(j)).Range.Cells.VerticalAlignment = 1';
    put '  next';
    put 'next';
    put 'oWord.Visible = True';
  run;
%mend;

%macro     RunScript(scriptPath);
%put NOTE: [MACRO] Executing: RunScript(&scriptPath.);

  options noxwait;
  %sysexec(start /wait cscript "&scriptPath." //nologo);
  options xwait;
%mend;

WordOpen

/****************************************************************************/
/*  Program Name  : WordOpen                                                */
/*  Purpose       : Marcro utility to open Word application or specific     */
/*                  Word file. Terminates if process takes longer than      */
/*                  10 seconds.                                             */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        : Matt Trzcinski                                          */
/*  Last Update   : 2017/05/06                                              */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Input(s)      : None or Word file (.doc, .docm, .docx, .docx, .dot,     */
/*                  .dotm, .dotx, .htm, .html, .htm, .html, .mht, .mhtml,   */
/*                  .odt, .pdf, .rtf, .txt, .wps, .xml, .xml, .xps)         */
/*  Output(s)     : None                                                    */
/*  Usage         : To open an Word file, pass the full file path into      */
/*                  WordOpen(). Arguments may be either quoted or           */
/*                  unquoted. Arguments are not case sensitive. To open     */
/*                  the Word application, leave argument blank. A single    */
/*                  Word session is forced to avoid conflicts.              */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Example       : Open a Word file:                                       */
/*                                                                          */
/*                    %WordOpen(C:\filepath\test.docx);                     */
/*                                                                          */
/*                  Open the Word application:                              */
/*                                                                          */
/*                    %WordOpen();                                          */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note          : Program checks if the Word application has opened by    */
/*                  attempting to establish a Winword|System DDE link.      */
/*                  WordOpen() may run longer than 10 seconds in the case   */
/*                  where the Word application itself has not been fully    */
/*                  loaded due to OS delays.  In terms of DATA step         */
/*                  diagnostics, WordOpen() times out according to the      */
/*                  'cpu time' and not the 'real time'. If the process      */
/*                  hangs on a data step, try closing all Mircosoft         */
/*                  applications before running this macro.                 */
/*                                                                          */
/*                  Exercise caution when opening a document from a network */
/*                  drive. Word must open before loading the document.      */
/*                  Since WordOpen() checks for whether the Word            */
/*                  application has loaded, program execution may continue  */
/*                  before the document has been opened due to network      */
/*                  delays. If a document DDE link is attempted prior to    */
/*                  the document having loaded an error will occur.         */
/*                                                                          */
/****************************************************************************/

%macro IsEmpty(macroVariable);
  %sysevalf(%superq(&macroVariable)=, boolean)
%mend;

%macro SetSystemOptions(opt1, opt2, opt3);
  options &opt1. &opt2. &opt3.;
%mend;

%macro WordOpen(inFile) / minoperator mindelimiter = ',';
%put NOTE: [&SYSMACRONAME] Executing: WordOpen(inFile=&inFile);
%if 0 %then %nrstr(%mend);

  %let macroStart    = %sysfunc(datetime());

  %let originalNOTES = %sysfunc(getoption(notes));
  %let originalXWAIT = %sysfunc(getoption(xwait));
  %let originalXSYNC = %sysfunc(getoption(xsync));

  options nonotes noxsync noxwait;

********************************************************************
** Verify Arguments
********************************************************************;
  %if %IsEmpty(inFile) %then %let filePath = WINWORD;
  %else %let filePath = %sysfunc(dequote(&inFile.));

  %let fileExtension = %substr(&filePath., %sysfunc(findc(&filePath., ., B)) + 1);

  %if not (%upcase(&fileExtension.) in (DOC, DOCM, DOCX, DOCX,
                                       DOT, DOTM, DOTX,
                                       HTM, HTML, HTM, HTML,
                                       MHT, MHTML, ODT, PDF,
                                       RTF, TXT, WPS,
                                       XML, XML, XPS, WINWORD))%then %do;
    %put ERROR: [&SYSMACRONAME] Invalid input. Not a valid Word file.;
    %SetSystemOptions(&originalNOTES., &originalXWAIT., &originalXSYNC.);
    %abort cancel;
    %end;

********************************************************************
** Attempt to open Word/Test whether Word has opened
********************************************************************;
  /*filePath must be quoted in case path contains space.
    Double quotes are used on "WINWORD" to match those
    surrounding the resolved &filePath. The switch /q
    suppresses the splash screen*/
  %if "&filePath." = "WINWORD" %then %do;
    %let cmdStatement = start winword /q;
    %let openMessage  = Word application;
    %end;
  %else %do;
    %let cmdStatement = start winword /q "&filePath.";
    %let openMessage  = &filePath.;
    %end;

  filename wordDDE dde 'winword|system';

  %let linkConnection = %sysfunc(fopen(wordDDE, S));

  %if &linkConnection > 0 %then %do;
    %put ERROR: [&SYSMACRONAME] Word already open.;
    %SetSystemOptions(&originalNOTES., &originalXWAIT., &originalXSYNC.);
    %abort cancel;
    %end;
  %else %do;
    %SetSystemOptions(notes);
    %put NOTE: [&SYSMACRONAME] Opening &openMessage....;
    %SetSystemOptions(nonotes);
    %sysexec(&cmdStatement.);

    /*Run until either Word opens (linkConnection > 0)
      or until 10 seconds have passed.*/
    %let stopTime = %sysevalf(%sysfunc(datetime()) + 10);

    %do %until (&linkConnection. > 0);
      %if (%sysfunc(datetime()) >= &stopTime.) %then %do;
       %put ERROR: [&SYSMACRONAME] Operation timed out.;
       %sysexec(taskkill /f /im winword.exe);
       %SetSystemOptions(&originalNOTES., &originalXWAIT., &originalXSYNC.);
       %abort cancel;
       %end;

      %let linkConnection = %sysfunc(fopen(wordDDE, S));
    %end;
  %end;

  %let closeReturnCode = %sysfunc(fclose(&linkConnection.));

********************************************************************
** Housekeeping
********************************************************************;
  filename wordDDE clear;
  %SetSystemOptions(notes, &originalXWAIT., &originalXSYNC.);

  %put NOTE: [&SYSMACRONAME] Succesfully opened &openMessage..;
  %put NOTE: [&SYSMACRONAME] WORDOPEN macro used (Total process time):;

  %let duration = %sysfunc(putn(%sysevalf(%sysfunc(datetime())-&macroStart.), time12.3));
  %if %sysfunc(minute("&duration."t)) > 0 %then %do;
    %put NO%str(TE-)         real time            %substr(&duration., 3, 8);
    %end;
  %else %do;
    %put NO%str(TE-)         real time            %substr(&duration., 6, 5) seconds;
    %end;

  %SetSystemOptions(&originalNotes.);

%mend;

WordClose

/****************************************************************************/
/*  Program Name  : WordClose                                               */
/*  Purpose       : Macro utility to close Word application with option to  */
/*                  save.                                                   */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        : Matt Trzcinski                                          */
/*  Last Update   : 2017/05/06                                              */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Input(s)      : None                                                    */
/*  Output(s)     : May save file as DOC, DOT, TXT, RTF, DOCX, or PDF.      */
/*  Usage         : Takes argument 'save' which must be either 'yes' or     */
/*                  'no'. Default path is the current working directory.    */
/*                  Path may be quoted or unquoted. Default name is         */
/*                  'untitled-document'.  Default format is .rtf. If path   */
/*                  already contains a document of the same name and type   */
/*                  name, the file will be over-written without warning.    */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Example(s)    : Close and save:                                         */
/*                                                                          */
/*                      %WordClose(                                         */
/*                          save  = yes                                     */
/*                        , path  = c:\temp                                 */
/*                        , name  = testname                                */
/*                        , format= docx                                    */
/*                      );                                                  */
/*                                                                          */
/*                  Close without saving:                                   */
/*                                                                          */
/*                      %WordClose(save=no);                                */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note(s)       : May turn off numlock. Sometimes process will hang.      */
/*                  If this happens try selecting the Word application.     */
/*                                                                          */
/*  Development   : When possible, preference is given to DDE. DDE requires */
/*                  use of WordBasic commands. Documentation for the        */
/*                  commands used in this program are found at the end of   */
/*                  this program.  To download the WordBasic help file,     */
/*                  goto                                                    */
/*                                                                          */
/*                    http://word.mvps.org/Downloads/Wrdbasic.exe           */
/*                                                                          */
/*                  WordBasic does not handle .docx or .pdf.  It also has a */
/*                  problem saving an edited file in the same format as the */
/*                  original. For instance, suppose SAS creates an rtf and  */
/*                  edits it. If WordBasic is used to save the changes as   */
/*                  an rtf, then the changes are not saved.  Because of     */
/*                  this, all saving is handled by VBScript.                */
/*                                                                          */
/*                  It is not possible with DDE or VBScript to call a Save  */
/*                  As dialog. For details, see                             */
/*                                                                          */
/*                    http://stackoverflow.com/questions/4386124/           */
/*                    how-can-i-use-the-common-save-as-dialog-from-vbscript */
/*                                                                          */
/*                  It is possible to write a C# script and compile it      */
/*                  using Powershell.  However, doing so requires use of    */
/*                  the 'iex' command.  Calls to Powershell 'iex' generate  */
/*                  a FireEye HX ALERT security notification from CIT the   */
/*                  following day.                                          */
/*                                                                          */
/* TODO           : Investigate why SAS sometimes hangs until Word is       */
/*                  selected.  It may have something to do with the call to */
/*                  GetObject function or .ActiveDocument method in the     */
/*                  VBScript.                                               */
/*                                                                          */
/****************************************************************************/

%macro IsEmpty(macroVariable);
  %sysevalf(%superq(&macroVariable)=, boolean)
%mend;

%macro SetSystemOptions(opt1, opt2, opt3);
  options &opt1. &opt2. &opt3.;
%mend;

%macro WordClose(save=, path=, name=, format=) / minoperator mindelimiter = ',';

  %let macroStart     = %sysfunc(datetime());
  %let originalNOTES  = %sysfunc(getoption(notes));
  %let originalXWAIT  = %sysfunc(getoption(xwait));

  options nonotes noxwait;

  filename pwd '.';
  %let originalPWD = %sysfunc(pathname(pwd));
  filename pwd clear;

********************************************************************
** Verify Word is open
********************************************************************;
  filename wordDDE dde 'winword|system';

  %let linkConnection = %sysfunc(fopen(wordDDE, S));

  %if (&linkConnection. <= 0) %then %do;
    %put ERROR: [&SYSMACRONAME] Word not open. Failed to close Word.;
    %SetSystemOptions(&originalNotes., &originalXWAIT.);
    %abort cancel;
    %end;

  %let closeReturnCode = %sysfunc(fclose(&linkConnection));

********************************************************************
** Verify arguments
********************************************************************;
  %if       %upcase(&save.) = NO and not %IsEmpty(path) %then
    %put WARNING: [&SYSMACRONAME] SAVE argument is 'NO' but PATH argument is defined.;
  %else %if %upcase(&save.) = NO and not %IsEmpty(name) %then
    %put WARNING: [&SYSMACRONAME] SAVE argument is 'NO' but NAME argument is defined.;
  %else %if %upcase(&save.) = NO and not %IsEmpty(format) %then
    %put WARNING: [&SYSMACRONAME] SAVE argument is 'NO' but FORMAT argument is defined.;

  %if %IsEmpty(save) %then %do;
    %let save = NO;
    %put WARNING: [&SYSMACRONAME] SAVE is blank. Defaulting to SAVE=NO.;
    %end;

  %if not (%upcase(&save.) in (YES, NO)) %then %do;
    %put ERROR: [&SYSMACRONAME] Invalid argument SAVE.  Must be yes/no.;
    %SetSystemOptions(&originalNotes., &originalXWAIT.);
    %abort cancel;
    %end;

  %if %upcase(&save.) = YES %then %do;

    %if not %IsEmpty(path) %then %do;
      %let dirExists = %sysfunc(fileexist(&path));
      %if (&dirExists. = 0) %then %do;
       %put ERROR: [&SYSMACRONAME] Invalid path.;
       %SetSystemOptions(&originalNotes., &originalXWAIT.);
       %abort cancel;
       %end;
      %let path = %sysfunc(dequote(&path));
      %end;
    %else %if %IsEmpty(path) %then %do;
      %let path = &originalPWD;
      %SetSystemOptions(notes);
      %put NOTE: [&SYSMACRONAME] No PATH given. PATH set to "&path".;
      %SetSystemOptions(nonotes);
      %end;

    %if %IsEmpty(name) %then %do;
      %let name = untitled-document;
      %put WARNING: [&SYSMACRONAME] No NAME given. NAME set to "&name".;
      %end;

    %if %IsEmpty(format) %then %do;
      %let format = rtf;
      %put WARNING: [&SYSMACRONAME] No FORMAT given. FORMAT set to "&format".;
      %end;

    %if not(%upcase(&format.) in (DOC, DOCX, DOT, TXT, PDF, RTF)) %then %do;
      %put ERROR: [&SYSMACRONAME] Invalid format. Must be 'doc', 'docx', 'dot', 'txt', 'pdf', or 'rtf'.;
      %SetSystemOptions(&originalNotes., &originalXWAIT.);
      %abort cancel;
      %end;
    %end;

  %SetSystemOptions(notes);
  %put NOTE: [&SYSMACRONAME] Executing: WordClose(save=&save, path=&path, name=&name, format=&format);
  %SetSystemOptions(nonotes);

********************************************************************
** Close Word
********************************************************************;
  /*Shell command 'taskkill /F /IM WINWORD.EXE' may cause
    a 'not closed properly' type response upon Word reopen.*/
  %if %upcase(&save.) = NO %then %do;
    data _null_;
      file wordDDE;
      put '[FileExit 2]';
    run;
  %end;
  %else %do;

    %if       %upcase(&format.) = DOC  %then %let formatNumber = 0;
    %else %if %upcase(&format.) = DOCX %then %let formatNumber = 16;
    %else %if %upcase(&format.) = DOT  %then %let formatNumber = 1;
    %else %if %upcase(&format.) = PDF  %then %let formatNumber = 17;
    %else %if %upcase(&format.) = RTF  %then %let formatNumber = 6;
    %else %if %upcase(&format.) = TXT  %then %let formatNumber = 2;

    %let scriptFile = SaveDocumentAs%upcase(&format);
    %let dirTemp    = %sysfunc(getoption(WORK));

    data _null_;
      file "&dirTemp.\&scriptFile..vbs";
      put 'Set fso = CreateObject ("Scripting.FileSystemObject")';
      put 'Set stdout = fso.GetStandardStream (1)';
      put "stdout.WriteLine ""Saving document as &path.\&name..&format....""";
      put 'Dim oWord';
      put 'Dim oDoc';
      put 'Set oWord = GetObject(, "Word.Application")';
      put 'Set oDoc = oWord.ActiveDocument';
      put "oDoc.SaveAs ""&path.\&name..&format."", &formatNumber.";
      put 'oDoc.Close(False)';
      put 'oWord.Application.Quit(False)';
    run;

    /*Set pwd to dirTemp to avoid UNC errors*/
    %let cmd  = cd "&dirTemp";
    %syscall system(cmd);

    %sysexec(start /wait cscript "&dirTemp.\&scriptFile..vbs" //nologo);

    %let cmd  = cd "&originalPWD";
    %syscall system(cmd);
    %end;

********************************************************************
** Housekeeping
********************************************************************;
  filename wordDDE clear;
  options notes;

  %if %upcase(&save.) = YES %then %do;
    %let savedAs = &path.\&name..&format.;
    %let successMessage = Successfully closed Word. File saved as &savedAs..;
    %end;
  %else %let successMessage = Successfully closed Word. File not saved.;

  %put NOTE: [&SYSMACRONAME] &successMessage;
  %put NOTE: [&SYSMACRONAME] WORDCLOSE macro used (Total process time):;

  %let duration = %sysfunc(putn(%sysevalf(%sysfunc(datetime())-&macroStart.), time12.3));
  %if %sysfunc(minute("&duration."t)) > 0 %then %do;
    %put NO%str(TE-)         real time            %substr(&duration., 3, 8);
    %end;
  %else %do;
    %put NO%str(TE-)         real time            %substr(&duration., 6, 5) seconds;
    %end;

  %SetSystemOptions(&originalNotes., &originalXWAIT.);

%mend;

Fill in missing values with previous value

  1. Prefix the variable to be filled in with 'raw_'.
  2. Explicitly list variables using length statement.
  3. Retain non-raw form of variable.
  4. If the raw variable has a value, set variable equal to that value.
data filled_in;
  length
    raw_variable 8
    variable     8
  ;
  set not_filled_in
    (rename = (
      variable = raw_variable
    ))
  ;

  retain variable;

  if not missing(raw_variable) then variable = raw_variable;

  drop raw_variable;
run;

Example

Often times data will have missing values which need to be filled in with the previous value. For example,

%let defLength = 100;

data have;
  length
    group         8
    treatment     $ &defLength.
    replicate     $ &defLength.
    count_female  8
    count_male    8
  ;
  input group treatment $ replicate $ count_female count_male;

datalines;
1 0.0 A 3 2
1 0.0 B 4 2
1 0.0 C 4 5
2 3.2 A 5 4
2 3.2 B 6 3
2 3.2 C 2 2
3 5.1 A 5 1
3 5.1 B 3 4
3 5.1 C 3 1
;
run;

proc transpose data = have out = transposed;
  copy treatment;
  by
    group
    replicate
  ;
run;

This produces the following output. Notice that every other value of treatment is missing.

Obs    group    replicate    treatment       _NAME_       COL1

  1      1          A           0.0       count_female      3
  2      1          A                     count_male        2
  3      1          B           0.0       count_female      4
  4      1          B                     count_male        2
  5      1          C           0.0       count_female      4
  6      1          C                     count_male        5
  7      2          A           3.2       count_female      5
  8      2          A                     count_male        4
  9      2          B           3.2       count_female      6
  10     2          B                     count_male        3
  11     2          C           3.2       count_female      2
  12     2          C                     count_male        2
  13     3          A           5.1       count_female      5
  14     3          A                     count_male        1
  15     3          B           5.1       count_female      3
  16     3          B                     count_male        4
  17     3          C           5.1       count_female      3
  18     3          C                     count_male        1

To fill in the values, rename treatment to raw_treatment. Then, RETAIN a new variable called treatment. At the end of the data step, the raw_treatment variable can be dropped, but it can also be included for verification purposes. The rest of the data step is there to present the variables in a meaningful way.

data want;
  length
    group         8
    replicate     $ &defLength.
    raw_treatment $ &defLength.
    treatment     $ &defLength.
    sex           $ &defLength.
    count         8
  ;
  set transposed
    (rename = (
      treatment = raw_treatment
      _NAME_    = sex
      COL1      = count
    ));

  retain treatment;

  if not missing(raw_treatment) then treatment = raw_treatment;

        if sex = 'count_female' then sex = 'F';
  else  if sex = 'count_male'   then sex = 'M';

run;

Including the raw_treatment column, the resulting data set looks like:

			      raw_
Obs    group    replicate    treatment    treatment    sex    count

  1      1          A           0.0          0.0        F       3
  2      1          A                        0.0        M       2
  3      1          B           0.0          0.0        F       4
  4      1          B                        0.0        M       2
  5      1          C           0.0          0.0        F       4
  6      1          C                        0.0        M       5
  7      2          A           3.2          3.2        F       5
  8      2          A                        3.2        M       4
  9      2          B           3.2          3.2        F       6
  10     2          B                        3.2        M       3
  11     2          C           3.2          3.2        F       2
  12     2          C                        3.2        M       2
  13     3          A           5.1          5.1        F       5
  14     3          A                        5.1        M       1
  15     3          B           5.1          5.1        F       3
  16     3          B                        5.1        M       4
  17     3          C           5.1          5.1        F       3
  18     3          C                        5.1        M       1

How to fill in 'levels'

SAS operates using 'levels' for various things such as plotting. It may be that some levels are lost during analysis. To fix a dataset so that it matches the study design, an effective approach is to create a complete study design manually and then perform an outer left-join between the complete study design and the dataset you wish to fill in. This approach also adapts well to ad-hoc modifications and hard-codings.

The following code is for a study with the following design:

  • 3 generations (F0, F1, and F2)
  • 6 possible treatments (5 for F0, 6 otherwise)
  • Varying numbers of replicates per generation/treatment
  • Two sexes per replicate
%macro   GetStudyDesign();
%put NOTE: [MACRO] Executing: GetStudyDesign();

  data study_design;
    length
      generation  $ &defLength.
      treatment   $ &defLength.
      group       8
      replicate   $ &defLength.
      gender      $ &defLength.
    ;

    array generations (3) $ &defLength. _temporary_ ('F0', 'F1', 'F2');
    array treatments  (6) $ &defLength. _temporary_ ('0.0', '0.077', '0.19', '0.48', '1.2', '3.0');
    array sexes       (2) $ &defLength. _temporary_ ('M', 'F');

    do _i = 1 to hbound(generations);
      generation = generations(_i);

      if generation = 'F0' then do;
        do _j = 1 to 6;
          treatment = treatments(_j);
          group     = _j;

          if treatments(_j) = '0.0' then _last_letter = 'L';
          else                           _last_letter = 'F';

          do _k = 1 to (rank(_last_letter) - rank('A') + 1);
            replicate = char(collate(rank('A')), _k);

            do _m = 1 to hbound(sexes);
              gender = sexes(_m);
              output;
              end;
            end;
          end;
        end;

      else if generation = 'F1' then do;
        do _j = 1 to 5;
          treatment = treatments(_j);
          group     = _j;

          if treatments(_j) = '0.0' then _last_letter = 'X';
          else                           _last_letter = 'L';

          do _k = 1 to (rank(_last_letter) - rank('A') + 1);
            replicate = char(collate(rank('A')), _k);

            do _m = 1 to hbound(sexes);
              gender = sexes(_m);
              output;
              end;
            end;
          end;
        end;

      else if generation = 'F2' then do;
        do _j = 1 to 5;
          treatment = treatments(_j);
          group     = _j;

          _last_letter = 'F';

          do _k = 1 to (rank(_last_letter) - rank('A') + 1);
            replicate = char(collate(rank('A')), _k);

            do _m = 1 to hbound(sexes);
              gender = sexes(_m);
              output;
              end;
            end;
          end;
        end;
      end;

    drop _: ;
  run;
%mend;

The left join would then look something like the following:

%macro   InsertMissingReplicateRows();
%put NOTE: [MACRO] Executing: InsertMissingReplicateRows();

  proc sql noprint;
    create table table_data_with_blanks as
    select
        "&endpoint." as endpoint
      , S.generation
      , S.treatment
      , S.group
      , S.replicate
      , C.replicate_n
      , C.replicate_mean
      , C.treatment_mean
      , C.treatment_median
      , C.treatment_std
      , C.treatment_sem
      , C.treatment_cv
      , C.treatment_min
      , C.treatment_q1
      , C.treatment_q3
      , C.treatment_max
    from study_design S left outer join combined_statistics C
    on    S.generation  = C.generation
      and S.group       = C.group
      and S.replicate   = C.replicate
    ;
  quit;
%mend;

Automatically assign symbol statements

Within GPLOT, a plot symbol call level29 corresponds to the "nth generated symbol statement" and NOT the nth symbol statement, as one might expect30. The consequence of this behavior is that symbols and colors will change should levels be absent from the plot data. This is a problem when (as is almost always the case) symbol definitions represent a particular level (usually a group).

To overcome this problem, the following macro automatically generates and issues symbol statements. It also creates a table which shows the correspondence between level and symbol.

Please note that this code is copied verbatim from a project and has yet to be generalized.

%macro DefineSymbolStatements(plotType);
%put NOTE: [MACRO] Executing: DefineSymbolStatements(plotType=&plotType);

  /*A plot symbol call level (plot y_value * x_value = symbol)
    corresponds to the nth generated symbol statement.
    Typically this is not a problem as the first symbol level
    corresponds to the first symbol statement.  However the
    correspondence is broken if the data is subset.  For
    instance, if symbol = 2 becomes the first level then
    it will correspond to the first SYMBOL statement (and
    consequently have a different color than desired/expected).

    To avoid this complication and the necessity to manually
    update each scenerio, the macro calculates what levels exist
    then issues a corresponding SYMBOL statement so that the nth
    symbol always corresponds to the nth generated symbol statement.*/

  goptions reset = symbol;

**********************************
*** Determine symbol level
**********************************;
  /*Data must be sorted to identify unique symbols.  Data step
    must be used instead of SQL because levels are incremented.*/
  option nonotes;
  proc sort data = &plotType._plot_4_sym out = _levels_pre_sort;
    by symbol;
  run;

  data _levels;
    set _levels_pre_sort;
    by symbol;

    if not first.symbol then delete;

    increment + 1;
    level = increment;

    keep symbol group level;
  run;
  option notes;

**********************************
*** Correspondence table
**********************************;
  proc sql noprint;
    create table symbols_&plotType. as
    select
       level
      , symbol
      , group
      , case
         when symbol in (1:15, 99, 100) then 'Line'
         else 'Censored Value'
       end as Type
      , symget(cats('group',group,'Color')) as color
      , symget(cats('group',group,'Line'))  as line_type
      , case
         /*Line*/
         when symbol in (1:15, 99, 100) then 'symbol'||
                                     strip(put(level, best.))||
                                     ' value=none color=&group'||
                                     strip(put(group, best.))||
                                     'Color. width=5.3 line=&group'||
                                     strip(put(group, best.))||
                                     'LineType. interpol=join;'
         /*Censored value*/
         else 'symbol'||
               strip(put(level, best.))||
               ' value=dot color=&group'||
               strip(put(group, best.))||
               'Color. height=1.8 interpol=join;'
       end as symbol_statement length = &defLength.
    from _levels
    order by symbol
    ;
  quit;

**********************************
*** Issue symbol statements
**********************************;
  data _null_;
    set symbols_&plotType.;
    call execute(symbol_statement);
  run;

**********************************
*** Remove temporary data sets
**********************************;
  option nonotes;
  proc delete data = _levels _levels_pre_sort (memtype = data);
  run;
  option notes;

  %put NOTE: [DEV] Data set %upcase(symbols_&plotType.) contains all symbol definitions.;
%mend;

How to check if a variable is empty

From the marvelous white paper, Is This Macro Parameter Blank?. Changed to the more meaningful, "IsEmpty".

%macro IsEmpty(macroVariable);
  %sysevalf(%superq(&macroVariable)=, boolean)
%mend;

Called as:

%if %IsEmpty(fileName) %then %do;
  %put ERROR: [MACRO] FILENAME argument is blank.;
  %SetSystemOptions(&originalNOTES.);
  %abort cancel;
  %end;

How to find datasets containing given variables

The following macros may be used to first determine a list of datasets to be analyzed. Then, iterate through the list of datasets and choose the first one which contains the variable of interest.

  %macro ListElement(list, n);
    %let nthElement = %sysfunc(scan(%superq(&list.), &n., |, m));
    &nthElement
  %mend;

%macro IsVariableInDataset(dataset, variable);
  %let dsid = %sysfunc(open(&dataset));
  %if %sysfunc(varnum(&dsid, &variable)) > 0 %then %let found = 1;
  %else %let found = 0;
  %let rc = %sysfunc(close(&dsid));
  &found
%mend;

%macro  GetAnalysisDatasetList();
%put NOTE: [MACRO] Executing: GetAnalysisDatasetList();

  %global listDatasets;

  proc sql noprint;
    select memname
    into : listDatasets separated by '|'
    from dictionary.tables
    where libname = 'WORK'
      and memname not contains 'RAW_'
      and memname not contains 'MEANS_'
      and not (substr(memname, 1, 1) = '_')
    ;
  quit;

  %put NOTE: [MACRO] &=listDatasets;
%mend;

%macro  GetDatasetContainingVariable(variable);
%put NOTE: [MACRO] Executing: GetDatasetContainingVariable(variable=&variable);

  %let currentDataset=;
  %let index = 1;

  %do %until (not %IsEmpty(currentDataset));
    %let dataset = %ListElement(listDatasets, &index.);

    %if %IsVariableInDataset(&dataset., &variable.) %then %do;
      %let currentDataset = &dataset;
      %put NOTE: [MACRO] Variable %upcase(&variable) was found in dataset %upcase(&dataset);
      %end;

    %let index = %eval(&index + 1);
  %end;
  &currentDataset
%mend;

How to split a single column into two based on value of an ID column

Suppose you have data of the form:

data have;
  input group replicate $ sex $ count;

datalines;
1 A F 3
1 A M 2
1 B F 4
1 B M 2
1 C F 4
1 C M 5
2 A F 5
2 A M 4
2 B F 6
2 B M 3
2 C F 2
2 C M 2
3 A F 5
3 A M 1
3 B F 3
3 B M 4
3 C F 3
3 C M 1
;
run;

To transfrom the count column into two separate columns, the following technique31 may be used.

proc sort data=have;
  by
    group
    replicate
  ;
run;

proc transpose data=have out=want(drop=_NAME_) prefix=count_;
  by
    group
    replicate
  ;
  id sex;
  var count;
run;

This produces the following dataset:

group    replicate    count_F    count_M

  1          A           3          2
  1          B           4          2
  1          C           4          5
  2          A           5          4
  2          B           6          3
  2          C           2          2
  3          A           5          1
  3          B           3          4
  3          C           3          1

Use PROC DELETE to delete datasets

How to remove all formatting from a dataset

data UNFORMATTED_TABLE;
  set FORMATTED_TABLE;
  format _all_;
  informat _all_;
run;

How to recover from open code recursion

Sometimes when working with complex macros, it will become apparent that things are not executing as they should. This is likely due to a missing semi-colon or parenthesis, etc. Such omissions may not be your fault. When resolving a macro/macro variable, SAS may generate an error before the necessary closing statement is submitted to the compiler. Given how useless many SAS error messages are, such things are easy to overlook. Or, you may have just made a silly mistake! Try running

*'; *"; *); */; %mend; run;

until the following message appears33.

ERROR: No matching %MACRO statement for this %MEND statement.

How to remove blank rows from a dataset

The following is a concise, efficient way to remove any observations which have missing values for all values.

%macro IsMissingAllVariables();
  compress(cats(of _all_),'.') = ' '
%mend;

Call %IsMissingAllVariables inside a data step.

data clean;
  set raw;

  if %IsMissingAllVariables() then delete;
run;

The macro derives from Joe on StackOverflow. Owing to its esoteric functioning, it is best wrapped in a descriptive macro.

How to create data set containing output from a PROC

Overview

SAS syntax was designed without uniformity in mind34. To overcome this in regards to dataset output, SAS implemented the Output Delivery System (ODS).

Each procedure corresponds to a set of tables which are calculated in the background. Sometimes this processing occurs automatically and other times it may be toggled by an option.

Procedure output is managed via a wrapper placed around the PROC block. To suppress output, use ods exclude all;. Note that for this to work, options such as noprint must NOT be present inside the procedure. To re-enable output, issue ods exclude none;.

The datasets created by a procedure which correspond to a particular part of the print output are given in documentation35. To find the appropriate documentation, press F1 and then search for "ods proc <proc-name>". This is usually sufficient to bring up the appropriate listing. Output datasets are selected prior to the PROC running via the ods output <table-name> = <data-set-name>; syntax.

Example

ods exclude all;
ods output ParameterEstimates = model_parameters;

proc logistic data = model_data;
  model survival_day_7 (event = '0') = log_10_transformed / link = probit;
  output
    out       = model_results (drop = _LEVEL_)
    predicted = predicted_percent_dead
    lower     = lower_cl
    upper     = upper_cl
  ;
run;
ods exclude none;

How to open SAS temporary session folder in Windows Explorer

Place the following in the KEYS menu:

%sysexec(start explorer.exe "%sysfunc(pathname(work))" && exit);

By default SAS uses XWAIT. This leaves the command prompt open and requires the user to issue the 'exit' command manually to close the shell. The alternative is NOXWAIT which automatically issues the 'exit' command. It is possible to automatically close the command prompt using option noxwait; %sysexec(start explorer.exe "%sysfunc(pathname(work))"); option xwait;. However, if one wants to bind it to a hotkey, then this is not possible. The KEYS menu has a length restriction on definitions. To circumvent this, we place && exit in the shell command. What this does is close the command prompt from within the shell. The && tells the command prompt that if the previous command was successful (namely, opening the temporary SAS session folder in Windows Explorer), then issue the following command (which happens to be exiting the command prompt).

How to open SAS "Current Folder" in Windows Explorer

There does not appear to be a direct way to open the SAS "Current folder" in Windows Explorer as there is with the SAS temporary session folder.

  1. Add an abbreviation by going to Tools > Add Abbreviation...
  2. In the "Abbreviation" field, put \pwd. For "Text to insert for abbreviation~, enter the following command:

    %sysexec(start explorer.exe "%GetPwd()" && exit);
    
  3. Navigate to Tools > Keyboard Macros > Macros.... Highlight \pwd and select "Edit". Assign the following commands to "Keyboard macro contents":

    - Move cursor to end of line
    - Insert carriage return
    - Insert the string "%sysexec(start explorer..."
    - Extend the selection to the beginning of line
    

Now, when the \pwd abbreviation is used, the call is placed on the next line and highlighted automatically. The abbreviation may be issued anywhere, even in the middle of a line of code, and will not disrupt what has already been written. To remove the line, create a keyboard macro for "Delete line". This will remove the line which created by \pwd and return the source code to its original state.

where VARIABLE vs. where not missing(VARIABLE)

WHERE selects observations from a SAS data set that meets a particular condition. In subsetting, it will keep all records which evaluate as TRUE.

According SAS documentation,

"In SAS, any numeric value other than 0 or missing is true, and a value of 0 or missing is false. Therefore, a numeric variable or expression can stand alone in a condition. If its value is a number other than 0 or missing, the condition is true; if its value is 0 or missing, the condition is false."

A few important observations must be made. First, the above statement applies only to numeric data within a conditional statement, in particular IF. Character data cannot stand alone in a condition36 (an IF statement). Using a stand alone character value in an IF statement generates an error.

Numeric  
Value Truthiness
. False
0 False
Other True

Second, it seems that SAS makes a distinction between IF and WHERE as conditionals. It may be that WHERE is not a 'true' conditional since it is a subsetting operator and doesn't necessarily perform branching. However, to confuse things even more, SAS provides the subsetting IF which performs a similar function to WHERE.

MISSING returns a numeric result: 1 for missing values, 0 otherwise. A value is considered missing if:

  • it is numeric and has value of ., _, .A, …, .Z.
  • it is character and is all blanks or has zero length.

How to run Git from SAS

This assumes you have Git installed and set up. Within your AUTOEXE.sas file, place the following:

%let gwd = C:\path\to\git\working\directory;
%put NOTE: [DEV] Set Git Working Directory to: &gwd.;

Next, open SAS and select the Editor window. Run the command dm 'keys'; to open the KEYS window. For a given hotkey (I prefer F6), enter:

%sysexec(cd &gwd. && start "" "C:\Program Files\Git\bin\sh.exe" && exit);

This works as follows. First, SAS will expand &gwd.. It then opens a Windows Command Prompt. The cd changes directories to whatever &gwd. resolved to. Git-bash then opens in the current directory (which was changed to &gwd.). This is because git-bash opens in whatever the current directory is by default. Finally, whenever git-bash closes, the exit command is given, closing the Windows Command Prompt session.

Elevate NOTEs to ERRORs

In SAS 9.4, it is possible to elevate the level of various notes to errors. At least two options exist for this.

  1. option varinitchk = error;

    This elevates "uninitialized variable" notes to errors (or warnings).

  2. option dsoptions = note2err

    This option is 'undocumented' and generates errors for the following notes:

19   Variable %*v is uninitialized.
97   Missing values were generated as a result of
98   Division by zero detected at %2q.
99   Mathematical operations could not be performed
108  Invalid numeric data, '%*s' , at %2q.
109  Invalid character data, %f , at %2q.
110  Invalid %sargument to function %b at %2q.
139  Argument to function %*s is not a known variable name:  %*v.
140  Argument to function %*s is not a valid variable name:  %*s.
205  Invalid argument(s) to the exponential operator "**" at %2q.
208  Invalid numeric data, %*s='%*s' , at %2q.
209  Invalid character data, %*s=%f , at %2q.
223  A number has become too large at %2q. %w%*s
224  A number has become too large during the compilation phase.
225  Division by zero detected during the compilation phase.
242  Invalid argument(s) to the exponential operator "**".
258  Invalid argument to function %*b at %2q.
259  Invalid first argument to function %*b at %2q.
260  Invalid second argument to function %*b at %2q.
261  Invalid third argument to function %*b at %2q.
262  Invalid fourth argument to function %*b at %2q.
267  Argument %d to function %*b at %2q is invalid.
356  The SUBSTR pseudo-variable function does not allow character
424  Character values have been converted to numeric
425  Numeric values have been converted to character
429  A number has become too large during the compilation phase,
430  Division by zero detected during the compilation phase,
484  Format %*b was not found or could not be loaded.
485  Informat %*b was not found or could not be loaded.

Source: SAS Note and this Stack Overflow post.

Some data flow models

The following two methods work well for creating SD tables:

  1. "Separate Variable Datasets" - The first method creates separate data sets with a unique analysis variable. For instance, the gsim data set will contain the gsim analysis variable. Analysis is driven by a list of analysis variables. The data set corresponding to each analysis variable is found (by searching available data sets). The corresponding dataset is then fed through the process.
  2. "Master Data Set" - The second method cleans the data and appends it a "Master Table". Here, unique analysis variables are not distinguished. For example, there may exist a single GSI variable. An encoded list of analysis variable and subsetting conditions are passed to a function which subsets the master data accordingly. The resulting data set is then passed through the process.

It seems that the second method is the more flexible as the entire process is controlled by the input list. Variables and conditions may be changed easily without the need to manually create the data subset.

"Master Table" Method

  1. Import data
  2. Clean data
    • Create variables
  3. Create table data
    • These are the data sets which are used to create the corresponding table

Code

/****************************************************************************/
/*  Program Name  :                                                         */
/*  Purpose       :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        :
/*  Last Update   :                                               */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Input(s)      :                                                         */
/*  Output(s)     :                                                         */
/*  Ext. Macro(s) :                                                         */
/*  Structure     :                                                         */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note(s)       :                                                         */
/*                                                                          */
/*  Development   :                                                         */
/*                                                                          */
/****************************************************************************/

********************************************************************
** User Specific Parameters
********************************************************************;
%let debug        = true;

%let dirBasePath  = ;
%let dirData      = ;
%let dirOutput    = ;

/*%include "\path\to\WordOpen.sas";*/
/*%include "\path\to\WordClose.sas";*/

********************************************************************
** Development Parameters
********************************************************************;
options nosource dsoptions = note2err;

%let studyID  = ;
%let fileName = &studyID - filename;
%let title1   = title1;

/*endpoint!quality1:condition1!quality2:condition2...
  Table number corresponds to index in list*/
%let listTables = %sysfunc(tranwrd(%sysfunc(compbl(
  var1!group:4
  var1!group:3!replicate_id:'D'
  var2!treatment:'0.1':replicate_id:'A'
)), %str( ), |));

libname inData  "&dirData.";
libname outData "&dirOutput.";

%let defLength = 100;

title "&title1.";
ods listing close;
options nodate nonumber orientation=portrait;

********************************************************************
** Macros
********************************************************************;
%macro Main();
%put NOTE: [MACRO] Executing: Main();

  %ImportData();
  %CreateMasterData();
  %CreateTableData();
  %CalculateMeans();

  %Skip(&debug., '%OpenRTFForWriting();');
  %OutputTables();
  %Skip(&debug., '%CloseRTFForWriting();');

%mend;

**********************************
*** Utilities
**********************************;
%macro GetListQualities(table);
  %let indexOfFirstBash = %index(&table., !);
  %if &indexOfFirstBash = 0 %then %let list=;
  %else %let list = %substr(&table., &indexOfFirstBash. + 1);
  &list.
%mend;

%macro IsEmpty(macroVariable);
  %sysevalf(%superq(&macroVariable)=, boolean)
%mend;

%macro ListLength(list, delimiter=|);
  %if %sysevalf(%superq(list)=, boolean) %then %let count = 0;
  %else %let count = %eval(%sysfunc(countc(&list., &delimiter.)) + 1);
  &count
%mend;

%macro ListElement(list, n, delimiter=|);
  %let nthElement = %sysfunc(scan(%superq(&list.), &n., &delimiter., m));
  &nthElement
%mend;

%macro OpenRTFForWriting();
%put NOTE: [MACRO] Executing: OpenRTFForWriting();
  %Skip(&debug., '
    ods noresults;
    ods rtf file="&dirOutput.\&fileName..rtf";
  ');
%mend;

%macro CloseRTFForWriting();
%put NOTE: [MACRO] Executing: CloseRTFForWriting();
  %Skip(&debug., '
    ods rtf close;
    ods results;
  ');
%mend;

%macro Skip(toggle, statement);
  %if not (&toggle = true or &toggle = 1) %then
    %sysfunc(dequote(&statement.));
%mend;

**********************************
*** Subroutines
**********************************;
%macro  ImportData();
%put NOTE: [MACRO] Executing: ImportData();

  %ImportVar1();
  %ImportVar2();

%mend;

%macro   ImportVar1();
%put NOTE: [MACRO] Executing: ImportVar1();

  data raw_var1;
    set inData.var1;
  run;
%mend;

%macro   ImportVar2();
%put NOTE: [MACRO] Executing: ImportVar2();

  data raw_var2;
    set inData.var2;
  run;
%mend;

%macro  CreateMasterData();
%put NOTE: [MACRO] Executing: CreateMasterData();

  %CleanData();

  data master_data;
    length dataset $ &defLength. ;
    set
      clean_var1  (in = inVar1)
      clean_var2  (in = inVar2)
    ;

         if inVar1        then dataset = 'clean_var1';
    else  if inVar2        then dataset = 'clean_var2';
  run;
%mend;

%macro   CleanData();
%put NOTE: [MACRO] Executing: CleanData();

  %CleanVar1();
  %CleanVar2();

%mend;

%macro    CleanVar1();
%put NOTE: [MACRO] Executing: CleanVar1();

  data clean_var1;
    set raw_var1;
    where not missing(var1);

    keep
      treatment
      group
      replicate_id
      var1
    ;
  run;
%mend;

%macro    CleanVar2();
%put NOTE: [MACRO] Executing: CleanVar2();

  data clean_var2;
    set raw_var2;
    where not missing(var2);

    keep
      treatment
      group
      replicate_id
      var2
    ;
  run;
%mend;

%macro  CreateTableData();
%put NOTE: [MACRO] Executing: CreateTableData();

/*listTable elements have form:

    endpoint!quality:condition!quality:condition ...

  Colon used instead of an equal sign. If an equal sign is used
  directly, when the macro variable is resolved and passed into
  a macro such as %ListLength, SAS sees the resolved macro
  variable as a macro parameter and throws an error.  To avoid
  the complexities of macro quoting, the colon is instead
  replaced via TRANWRD. Bash is used to delimit the endpoint
  from the qualities (instead of a slash since one of the
  conditions contains a slash).*/
  %local i j;

  %do i = 1 %to %ListLength(&listTables.);
  %put NOTE: [MACRO] Processing table: &i;
  %let table          = %scan(&listTables, &i., |);
  %let endpoint       = %ListElement(table, 1, delimiter=!);
  %let listQualities  = %GetListQualities(&table.);

  %let qualitiesToKeep=;
  %put NOTE: [MACRO] Processing table: &table.;
  %put NOTE: [MACRO] Endpoint: &endpoint.;

    %let subsetConditions=;
    %do j = 1 %to %ListLength(&listQualities., delimiter=!);
    %let raw_quality      = %scan(&listQualities., &j., !);
    %let quality_endpoint = %scan(&raw_quality., 1, :);
    %let quality          = %sysfunc(tranwrd(&raw_quality., :, =));

      %let subsetConditions = &subsetConditions. and &quality.;
      %let qualitiesToKeep  = &qualitiesToKeep. &quality_endpoint.;
      %end;

    %put NOTE: [MACRO] &=subsetConditions;
    %put NOTE: [MACRO] &=qualitiesToKeep;

    data table_&i.;
      set master_data;
      where not missing(&endpoint.)
       &subsetConditions.
      ;

      keep
       dataset
       treatment
       group
       replicate_id
       &endpoint.
       &qualitiesToKeep.
      ;
    run;

    %if %GetObsCount(table_&i.) = 0 %then %do;
      %abort cancel;
      %end;

    %end;
%mend;

%macro  CalculateMeans();
%put NOTE: [MACRO] Executing: CalculateMeans();

  %do i = 1 %to %ListLength(&listTables.);
  %let table    = %ListElement(listTables, &i.);
  %let endpoint = %ListElement(table, 1, delimiter=!);
      %CalculateTreatmentMeans(&i., &endpoint.);
      %CalculateOverallMeans(&i., &endpoint.);
    %end;

%mend;

%macro   CalculateTreatmentMeans(index, endpoint);
%put NOTE: [MACRO] Executing: CalculateTreatmentMeans(index=&index, endpoint=&endpoint);

  %CalculateReplicateMeans(&index., &endpoint.);

  proc means noprint data=means_repl_&index. ;
    by group;
    var mean_repl_&index.;
    output
      out     = means_treat_&index. (drop=_type_ _freq_)
      n       = n_treat_&index.
      mean    = mean_treat_&index.
      stddev  = std_treat_&index.
      stderr  = sem_treat_&index.
      cv      = cv_treat_&index.
    ;
  run;
%mend;

%macro    CalculateReplicateMeans(index, endpoint);
%put NOTE: [MACRO] Executing: CalculateReplicateMeans(index=&index, endpoint=&endpoint);

  proc means noprint data=table_&index.;
    by group replicate_id;
    var &endpoint.;
    output
      out   = means_repl_&index. (drop=_type_ _freq_)
      n     = n_repl_&index.
      mean  = mean_repl_&index.
    ;
  run;
%mend;

%macro   CalculateOverallMeans(index, endpoint);
%put NOTE: [MACRO] Executing: CalculateOverallMeans(index=&index, endpoint=&endpoint);

  proc means noprint data=table_&index.;
    by group;
    var &endpoint.;
    output
      out     = means_overall_&index. (drop=_type_ _freq_)
      n       = n_overall_&index.
      mean    = mean_overall_&index.
      stddev  = std_overall_&index.
      stderr  = sem_overall_&index.
      cv      = cv_overall_&index.
    ;
  run;
%mend;

%macro  OutputTables();
%put NOTE: [MACRO] Executing: OutputTables();

  %local i j;

  %do i = 1 %to %ListLength(&listTables.);
  %put NOTE: [MACRO] Processing table: &i;
  %let table          = %scan(&listTables, &i., |);
  %let endpoint       = %ListElement(table, 1, delimiter=!);
  %let listQualities  = %GetListQualities(&table.);

    %let subsetLabel=;
    %do j = 1 %to %ListLength(&listQualities., delimiter=!);
    %let raw_quality      = %scan(&listQualities., &j., !);
    %let quality          = %sysfunc(tranwrd(&raw_quality., :, =));

    %if &j. = 1 %then %let subsetLabel = &quality;
    %else %let subsetLabel = &subsetLabel. and &quality.;
    %end;

    %if not %IsEmpty(subsetLabel) %then %let title3 = where &subsetLabel.;
    %else %let title3=;

    %GenerateTreatmentTable(&i., &endpoint., %str(&title3.));
    %GenerateOverallTable(&i., &endpoint., %str(&title3.));

    %end;
%mend;

%macro   GenerateTreatmentTable(index, endpoint, title2);
%put NOTE: [MACRO] Executing: GenerateTreatmentTable(index=&index, endpoint=&endpoint, title2=&title2);

  title2 "Table &index.-A. Descriptive Stats for REPLICATE Means for %upcase(&endpoint.)";
  title3 "&title3. ";
  proc report data = means_treat_&index.
             style(header)=[
               background    = white
               rules         = none
               verticalalign = bottom
             ]
             SPLIT='00'x
             ;

    column
      group
      n_treat_&index.
      mean_treat_&index.
      std_treat_&index.
      sem_treat_&index.
      cv_treat_&index.
    ;

    define group              / display center 'group';
    define n_treat_&index.    / display center 'N';
    define mean_treat_&index. / display center 'MEAN';
    define std_treat_&index.  / display center 'STDDEV';
    define sem_treat_&index.  / display center 'SEM';
    define cv_treat_&index.   / display center 'CV';
  run;
  title2;
  title3;
%mend;

%macro   GenerateOverallTable(index, endpoint, label);
%put NOTE: [MACRO] Executing: GenerateOverallTable(index=&index, endpoint=&endpoint, label=&label);

  title2   "Table &index.-B. Descriptive Stats for ALL Means for %upcase(&endpoint.)";
  title3  "&title3. ";
  proc report data = means_overall_&index.
             style(header)=[
               background    = white
               rules         = none
               verticalalign = bottom
             ]
             SPLIT='00'x
             ;

    column
      group
      n_overall_&index.
      mean_overall_&index.
      std_overall_&index.
      sem_overall_&index.
      cv_overall_&index.
    ;

    define group                 / display center 'group';
    define n_overall_&index.     / display center 'N';
    define mean_overall_&index.  / display center 'MEAN';
    define std_overall_&index.   / display center 'STDDEV';
    define sem_overall_&index.   / display center 'SEM';
    define cv_overall_&index.    / display center 'CV';
  run;
  title2;
  title3;
%mend;

********************************************************************
** Main
********************************************************************;

%Main();

%put;
%put NOTE: [DEV] Process complete.;
options source;

How to convert data between 'wide' and 'long' formats

Definitions of 'wide' and 'long'

"Wide" data is a general term for data whose values are distributed among several different columns.

Var1 Var2
1 4
2 5

"Long" data is a general term for data which is differentiated by values of a 'type' column.

Type Value
Var1 1
Var2 4
Var1 2
Var2 5

Converting from 'wide' to 'long'

There are two approaches to converting formats: PROC TRANSPOSE and data step arrays. The syntax for PROC TRANSPOSE is such that it does not generalize simply. Refer to the documentation and fiddle with it until it works.

A general approach is characterized in the following:

data wide;
  input name $ var1 var2;
  datalines;
  John 1 4
  Jane 2 5
  ;
run;

data long;
  set wide;
  array vars var1 var2;
  do _i = 1 to dim(vars);
    name    = name;
    varname = vname(vars[_i]);
    value   = vars[_i];
    output;
    end;
  keep name varname value;
run;

How to do date comparisons

SAS allows for a direct comparison with what is called a 'date constant.' This is a date formatted as '01JAN2015'd. Only dates formatted in this way can be compared directly. Otherwise, a variable which is formatted/informatted needs to be used.

How to find the beginning and end of month

Finding the beginning of the month is easy. Finding the end of the month requires the INTNX function.

begin_date = mdy(dmonth, 01, year);
end_date   = intnx('month', mdy(dmonth, 01, year) , 0, 'end');

Convert DDMMMYYYY date to YYYYMMDD

SAS doesn't provide good native support for dates of the form YYYYMMDD. For instance, so called "date constants" can only be expressed as DDMMMYYYY. This is a stupid format for dates37 and converting to a more reasonable one is a non-trivial task.

%macro CovertDDMMMYYYYtoYYYYMMDD(ddmmmyyyy) / minoperator mindelimiter = ',';

  %let yyyy   = %sysfunc(year("&ddmmmyyyy."d));
  %let month  = %sysfunc(month("&ddmmmyyyy."d));
  %let dd     = %sysfunc(day("&ddmmmyyyy."d));

  %if &month. in (1,2,3,4,5,6,7,8,9) %then %let mm = 0&month.;
  %else                                    %let mm = &month.;

  %let yyyymmdd = &yyyy.&mm.&dd.;

  &yyyymmdd.
%mend;

How to convert exponential characters to numeric

If data is imported as character data and that data is in scientific notation, it is not intuitive how one converts it to numeric. Certainly, we know that it needs to use an informat of Ew.. But it's not always that simple. Here's an example.

8.17E-01
3.02E+03

1.35E+03
7.78E+01
NA
1.16E+05
2.18E+05

Footnotes:

1

I have had instances where changing the color via Tool > Options does not take effect. A way around this is to use the color command.

dm 'log; color error pink; wsave;';

To get a list of available colors and options, look up the "COLOR" command in the SAS help.

2

Facilitates use of the 'Go to line:' shortcut (Ctrl + G).

3

The following macro will print the macro structure to the log. The macro accesses the saved SAS file, so the program must be saved first.

/****************************************************************************/
/*  Program Name  : ListMacroDefinitions                                    */
/*  Purpose       : List all macro names as defined in the current program  */
/*                  in the log.                                             */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Author        : Matt Trzcinski                                          */
/*  Last Update   : 2017/04/03                                              */
/*                                                                          */
/*--------------------------------------------------------------------------*/
/*  Note          : Current file must be saved before excecuting macro. The */
/*                  program reads in that saved file and parses it for      */
/*                  macro definitions.                                      */
/*                                                                          */
/*                  Macro name is considered to be the set of characters    */
/*                  between the string '%macro ' and the left parenthesis.  */
/*                  This allows for indentation of macro names within the   */
/*                  list by inserting spaces at the beginnging of the macro */
/*                  name.  For example, %macro   ThirdLevelMacro();         */
/*                  Support is included for only 10 macro levels.           */
/*                                                                          */
/*                  It is advised to place this macro inside an AUTOCALL    */
/*                  library so that it may be bound to a shortcut via the   */
/*                  KEYS menu.  AUTOCALL libraries can be found by issuing  */
/*                          %put %sysfunc(pathname(sasautos));              */
/*                  Once placed in an AUTOCALL library, issue:              */
/*                                    dm 'KEYS';                            */
/*                  For a hotkey, assign the following code:                */
/*                        submit '%ListMacroDefinitions();'                 */
/*                  Press Ctrl + s to save the keybinding.                  */
/*                                                                          */
/*  Developer     : Create error handling for case when file is not saved.  */
/*                                                                          */
/*                  It appears impossible to delete a macro from within     */
/*                  that same macro.  As such, ListMacroDefinitions is      */
/*                  not deleted after it is compiled. It may be possible to */
/*                  delete by implementing a shorter macro name and         */
/*                  explicitly calling PROC CATALOG from KEYS.  However,    */
/*                  this approach presents problems when trying to mask     */
/*                  log notes.  KEYS commands are restricted in character   */
/*                  length.                                                 */
/*                                                                          */
/****************************************************************************/
%macro ListMacroDefinitions();

  option nonotes;

  dm 'clear log;';

  /*Assign full path of current program to temporary macro
    variable.*/
  proc sql noprint;
    select distinct xpath
    into : _path trimmed
    from dictionary.extfiles
    where xpath contains "%sysget(SAS_EXECFILENAME)"
    ;
  quit;

  /*Read program in line-by-line and parse for
    macro definitions.  To call this step using
    KEYS requires the file path to be hardcoded.
    SAS cannot dynamically resolve the path. When
    the program is compiled, the path is resolved
    before being saved.*/
  data _null_ / pgm=work.list_macros;
    infile "&_path" dlmstr = '``?`?' lrecl = 32767;
    length
      line      $ 32767
      macroName $ 42 /*ValidV7 macro name is 32 + 10 possible indentation levels*/
    ;
    input line;

    /*Accomodate leading spaces in macro names. Note that
      the position of the cursor after the space in
      '%macro ' is 8*/
    if line =: '%macro' then do;
      lengthFromCol1ToLeftParenthesis = index(line,'(');
      lengthMacroName                 = lengthFromCol1ToLeftParenthesis - 8;
      macroName                       = substr(line, 8, lengthMacroName);
      lengthStrippedMacroName         = length(strip(macroName));
      numberOfLeadingSpaces           = lengthMacroName - lengthStrippedMacroName;

      put +numberOfLeadingSpaces macroName; /*put strips leading spaces from variable*/
    end;
  run;

  /*Run list_macros program*/
  data pgm=work.list_macros;
  run;

  proc delete data = list_macros (memtype = program);
  run;

  option notes;

%mend;
4

In the past, the %ListLength was implemented as

%macro ListLength(list);
  %let count = %sysfunc(countw(&list, '|'));
  &count
%mend;

This approach does not count blank elements in a list. Note that the old approach counts differently than the new implementation. Consider the list list = e1|e2|.

5

The dataset scan function and the macro function %scan behave differently. The former must be used in order to account for empty list elements. Note that the list mechanism developed here is imperfect. For instance, an empty macro list has length 0. A macro list of the form | has two elements, each of which are blank. It is not possible to have a list containing one element whose value is a space.

It is certainly possible to access items of a list using %scan. However, while it will behave as expected with lists containing no blank entries, it will not behave expected with lists of the sort advised in the "Guide" section.

%scan(&listDatasets, &d, '|');
6

Technically speaking, it is likely faster to assign %ListLength outside of the loop (i.e. not process it on the fly). I recall seeing some SAS documentation advising this as possible performance optimization. However, processing in the manner advised here is typically sufficient. Performing the calculation on the fly helps make the code less cluttered.

7

Data sets should not be deleted for two reasons. The first is that, in general, deleting data sets makes a program less clear. If the data set did not contain information important to the execution of the program task, then it should not have been created. If the data set is important to the process, the data set itself is important for validation and developing an understanding of how the process works. The second reason is that each program should perform a single task. Every data set in that session should be relevant to the program task. Running different programs (i.e. executing two conceptually different tasks) within the same session violates, in a sense, the Single Responsibility Principle. At the very least, it is an unnecessary complication that is easily avoided by starting a new session for each program. If the practice of allowing different programs to run in the same session is allowed, then a whole range of practices need to be adopted to counteract the inevitable conflicts that will occur. Code needs to be added to clear the WORK library and all macro variables. Extra attention needs to be paid to what system options are set. Naming conventions become even more restrictive as there needs to not only be no conflicts within a single program, but across all or many. The author can see no advantage to running different programs within a single session beyond the immediate convenience of grokking how a different program functions (i.e if your current program is based on another and to answer a question about the other's function, you execute a portion of it in the current session). Aside from this edge case, running multiple programs concurrently within the same session should be avoided at all costs and all programs should be written assuming that the person running it adheres to a one-program-one-session mentality. Unfortunately, people do silly things and will disregard such practices. The debug mode should act as a defense against these people.

8

Often, a single folder for 'code' and for 'data' suffices. Although it makes conceptual sense to separate input data from output files, in practice, this tends to be impractical. Since the program rarely cares what is in a folder, the use of separate folders is a human convenience. However, having different folders for input, output, results, and figures is inconvenient. With careful naming of files, a single folder can be well organized. It also avoids needing to update multiple file paths in each program.

9

Doing so ensures that the log entry you are viewing is for the most recently submitted code.

10

Versioning control is best handled by a versioning control system. Including versioning in the program itself isn't of much utility and detracts from overall readability.

11

There exist two logically equivalent perspectives to take here. One is that of 'debug', the other of 'production'. The distinction is merely a logical inversion. Here, 'debug' is meant as 'does not generate a permanent output' whereas 'production' means 'generates a permanent output.' The understanding is that a completed program should produce a permanent output. This implies that a 'production' mode should be preferred; i.e. all completed programs produce something so that production is the default state. However, much of the time working with a program is either its development or its validation. To run a program typically takes seconds versus hours or days. Development and validation often desire no output be permanent. Furthermore, 'debug' is a concise, unambiguous term. 'production' is slightly longer and doesn't necessarily evoke thoughts of programming. Ultimately, the choice is arbitrary. The author has chosen the 'debug' persepective.

12

Doing so helps verification, as the data is isolated by where it is in the process. It also provides extra security against errors in subsetting. An error in subsetting could be silent to include more or less data than intended. Isolating the data helps make these decisions be explicit.

This is clearly not always going to be efficient for program execution times. In circumstances where process time is a consideration, a good strategy is to include an extra descriptor column.

13

Having data sorted in an intuitive way at all times ensures clarity and ease of verification.

14

This may seem to contradict the requirement that all data be sorted at all times. It does not. SAS provides a variety of ways to sort data and not all of them are comparable. Constant calls to PROC SORT break up the flow of a program and make it difficult to read. (The default number of lines in the Editor window is 19. The smallest PROC SORT is 3 lines, plus the 2 lines on either side. That is more than a quarter of the total available visible code when the log is displayed.)

Sorting is best handled on the front end, during the data's creation/cleaning. A properly sorted data set often retains its sorting throughout a program.

Give preference to the PROC SQL ORDER BY statement. While this technically violates the single responsibility principle, it makes intuitive sense and is a minor compromise to ensure consistency across data sets.

Use CLASS statements over BY statements. CLASS statements do not require sorting and often produce the same results.

Avoid PROC MEANS and PROC SUMMARY whenever possible. Use the SQL Summary Functions instead. PROC MEAN/SUMMARY remove the order from data whereas either PROC SQL doesn't or it can be maintained via the ORDER BY statement.

Be warey of the NOTSORTED option.

15

It may be necessary for large scale programs to write data to file and read it to file. However, given the company specific considerations, such a practice is not endorsed here. Issues of efficiency are not a major concern under these conditions. Having to manage multiple libraries beyond a single input and output is an unnecessary complication.

16

This should be obvious. Permanent data files are called that because they remain fixed. Processing has already been done on them. If they are not processed as they need to be, then update the program which generated it to process it correctly. If updating the originating program is not possible, then create a new data set with the needed characteristics. Other program's may rely on the permanent data set in its originial state.

17

Overwriting data sets makes validation extremely difficult because it makes the data time-dependent. It forces the person validating the program to step through the program one line at a time, assessing each line for accuracy in syntax and purpose. A data set may be correct at one time in the program but incorrect at another. This obscures errors. When data sets have unique names, a data set will not be overwritten. A data set becomes time-independent. Such a practice avoids the errors of time-dependent data sets.

Creating a single data set per data step obeys the Single Responsibility Principle. If the two data sets will forever be identical, then two data sets are not necessary. If it is conceivable that one of them may differ in the future, then the data step would be allowed to change for more than a single reason. That is, the data sets would begin to follow two separate paths. Two separate paths require two separate steps. Put differently, if they are conceptually identical, duplication is extraneous. If they are conceptually different, then separation of ideas is required for clarity of presentation.

18

It may be tempting to define defLength as

%let defLength = $ 100. ;

This does not work, however, with SQL. The $ and the period are both not syntactically correct for PROC SQL.

Implementing a default length may affect performance. However, under the consideration of small programs/moderate data sets, this is not an issue. Implementing a default length greater than 8 has many advantages.

  1. Makes all variables between data sets compatible.
  2. Avoids truncation.
  3. Prevents notes regarding possible truncation, thus supporting the 'clean' log policy.
  4. Facilitates the easy arrangement of variables within data sets.

A length of 100 is suggested. This value is typically long enough to avoid truncation. It is also short enough to make viewing datasets easy. (Too large a value often makes it necessary to scroll horizontally when viewing data sets.)

19

Ideally, macros should act as wrappers to a conceptual step; it should be possible to exectute the code contained within a macro independently of the macro. This is not always possible in practice, as with open code looping. However, in cases where it is possible, debugging a macro becomes trivial, as the macro does exactly what the data step/proc it contains does. The macro is there to provide organization and documentation. In cases where execution independent of the macro is not possible, as with looping, it may be necessary to call the macro. Debugging in this case then requires setting the necessary macro variables and executing a test case. Entangling macro statements and non-macro statements eliminates these capabilities. It makes debugging more difficult and obscures clarity.

20

It is also possible to define a single macro variable which is then used to store multiple options:

%macro SetSystemOptions(options);
  options &options.;
%mend;
21

When the code is constructed in the manner advised in this guide, another form of documentation arises naturally, namely, "asking the program." Every program should be created with a DEBUG mode. This means that the user is free to experiment without fear that execution will alter anything (such as output files) irrevocably. The ability to freely execute the program means that aside from looking at the source code for a particular macro, the user may also execute it. For example, suppose the user wants to check if outliers are being removed from a data set. To check, she may run %CreateGSIData() which is contained in Main(). This in turn runs %CreateMaleGSIData() and %CreateFemaleGSI() data. When each of these executes, it can be seen that the former reads observations from WORK.LESS_OUTLIERS_MALE_GONAD and the latter from WORK.RAW_FEMALE_GONAD. This provides several pieces of useful information. First, that the GSI data is contained in a male form and a female form. Second, that the male GSI is bring created without outliers. The user may then proceed accordingly.

22

Unused/commented code can be a major source of clutter. Furthermore, if it hangs around long enough, it becomes a source of confusion.

If you are concerned about needing code at an undetermined later time, this is good reason to adopt some form of versioning. What used to apply but no longer does is essentially the definition of a previous version.

23

Clearly, this is not always possible or practical. However, given the option, using the positive often makes for more direct code. It prevents the reader from having to do a set subtraction in their head. The positive puts the elements in front of them. A KEEP statement tells the reader what to expect in the output. A DROP statement may require actually creating the data set to see what is in it.

Using positives has the additional benefit of scaling well. If an incoming data set changes, using a KEEP statement will preserve the expected output data set's structure. If the new variable is needed later in the program, an error will be raised, alerting the devloper to update the KEEP statement. Consider if a DROP were used instead. The output data set would contain the new variable. If not mandatory to the output data set, the new variable would be clutter. It would require not only adding the new variable to the input data set, but also updating the DROP statement. The clutter would occur silently.

24

Note that to keep the overall program organized, this needs to be placed above the utility macros. As such, these options need to be expressed in a data _null_ step. This resets the default ODS destinations and then, depending on how the debug is toggled, adjusts the output destinations.

Also note that SAS has a 262 character limit for quoted strings before it issues a warning.

WARNING: The quoted string currently being processed has become more
than 262 characters long.  You might have unbalanced quotation marks.

Using the single entry &goptions. approach may trigger this. If a warning occurs, then the NOQUOTELENMAX system option can be used to suppress it. Be sure to renable QUOTELENMAX before the program proceeds.

Using ods html path = "%sysfunc(pathname(WORK))"; appears to be the only way to reset the ODS HTML destination after it has been closed. The default behavior appears to set the current working directory when the HTML destination is reopened. This causes a seemingly unavoidably delay on the first exectuion.

25

PROC REPORT can merge the first column of a table. However, to the author's knowledge, one cannot merge cells in the middle of a table. If it is possible, the author doubts that its implementation will meet the clarity criteria required by this guide.

26

Again, the Single Responsibility Principle. The mechanism which outputs the table should do just that: output the table. It should not handle any calculation or transformation.

27

It does a surprisingly good job of creating headers and the syntax is straight forward when used as a mere printing mechanism.

28

This is necessary because SAS does not have the native capability to format tables in a flexible manner. In the Windows environment, in which this guide is based, VBScript is the only viable option. The obvious choice would be .NET. Unfortunately, .NET is a compiled language. In order to compile a script from SAS, Powershell must be leveraged through system commands. The 'iex' command allows for a .NET script to be compiled and executed on the fly. However, any use of the 'iex' command is flagged by IT, generating an email and a potential investigation. To use VB6 would require first compiling the script into an EXE before it could run. It also appears that VB6 is not natively supported on Windows and requires a separate install. Another suggestion may be to use VBA. VBA does not exist for Word. Instead, an ancient variant called WordBasic is used. The commands to handle merging and alignment do not exist in the language (or do not perform the necessary functions). Furthermore, to use WordBasic or VBA would require the code exist within the document's environment. This would require a template document that could not be created from scratch using SAS. Therefore, the only alternative left is VBScript. Fortunately, this is supported and, although somewhat disparate, the documentation exists or is close enough to VB6 to bludgeon a solution.

If Microsoft removes the functionality required for this approach, try COM objects. See https://stackoverflow.com/a/59144071/5065796 to get started. Specifically, read the relevant chapters of Python Programming on Win32. This is the best explanation I have found of the COM.

29

By "symbol call level", I mean the unique values of the 'symbol' variable within a plot y_value * x_value = symbol statement.

30

Can anyone provide me with a reason why this is? What advantage does using the 'nth generated symbol' present over a 1:1 correspondance with the nth symbol statement? I assume there is a cogent explanation, but have been unable to deduce it myself and have been unable to find one.

31

This is unfortunately the best technique that has been developed to the author's knowledge. It is from the stackoverflow question of the same name.

32

PROC DELETE is available in earlier versions of SAS, but was 'undocumented'. It has been (quietly) formally introduced in SAS 9.4. It tends to be significantly faster than PROC DATASETS.

34

For example, there exist at least 4 different ways to create output data sets from within a procedure.

/*Example 1*/
proc sort data = sashelp.baseball out = baseball_sorted;
  by
    league
    division
  ;
run;

/*Example 2*/
proc means noprint data = baseball_sorted;
  by
    league
    division
  ;
  var nHits;
  output
    out = baseball_avg_hits (drop = _TYPE_ _FREQ_)
    mean = mean_hits
  ;
run;

/*Example 3*/
ods exclude all;
ods output
  statistics  = baseball_statistics
  equality    = baseball_ftest
;
proc ttest data = baseball_sorted;
  class league;
  var nHits;
run;
ods exclude none;

/*Example 4: PROC ANOVA's OUTSTAT= option.*/

More discussion can be found in the StackOverflow question: Understanding SAS output data sets.

35

SAS and its community are big fans of ODS trace. In practice, I find it more informative and helpful to simply look at the documentation directly. Not only does it provide greater detail and an exhaustive list, it is also searchable via Ctrl + F. This is helpful if you're not quite sure what you're looking for.

36

Consider the following:

data input;
  length var $ 10;

  var = 'non-empty';
run;

data test;
  set input;

  if var then put 'True!';
run;

This generates the following error:

NOTE: Character values have been converted to numeric values at the places given by:
    (Line):(Column).
    340:6
NOTE: Invalid numeric data, var='non-empty' , at line 340 column 6.
var=non-empty _ERROR_=1 _N_=1
2018-10-17

Powered by peut-publier

©2020 Matt Trzcinski