/*---------------------------------------------------------------------------------- Mixed_Long_logit:A SAS macro for fitting mixed logistic regression model using PROC GENMOD and PROC NLMIXED. DESCRIPTION: The macro logic is an iterative process that runs PROC GENMOD and then feeds the needed BETA values as input into PROC NLMIXED that is then subsequently run. The PROCs are run alternatively until the input independent variables are exhausted. For example if the input variables were "age race income" the sequence would be as follows. PROC GENMOD using age PROC NLMIXED using age PROC GENMOD using age & race PROC NLMIXED using age & race PROC GENMOD using age & race & income PROC NLMIXED using age & race & income Finish In the initial housekeeping of the macro the first step is to determine the number of independent variables. This is accomplished using the macro named WORDS. This macro came from the SAS technical support website at the address http://support.sas.com/faq/016/FAQ01617.html. Following this task the macro enters into the main processing loop. The initial iteration is unique in that PROC NLMIXED requires only BETA0 and BETA1 (from PROC GENMOD) and no input data set for the parms statement. Thus the macro GETBETAINIT is only run a single time. On subsequent iterations the parms statement is constructed by obtaining a BETA from PROC GENMOD and also using the output data set (MACROLOGIT_OUT) from the previous iteration of PROC NLMIXED (using macro NLMPROC_LOGIT). SYNTAX: %Mixed_Long_logit (_depvar, _indepvars, _dummyvars, _sdest, _libref, indata=) Where the arguments are as follows: _depvar Required parameter. Specifies the dependent (outcome) variable for the model. _indepvars Required parameter. Specifies the independent variable(s) for the model. _dummyvars Optional parameter that is used for dummy variables that can be specified to distinguish when multiple variables need to be added into a model at the same time. When the program was first constructed the program was designed to add one new variable at a time as noted above with the age,race, income example. However in the case of dummy variables (ex.CANCER, CHF) and a reference group (COPD) we want the variables to be processed at the same time rather than sequentially. If we designate CANCER as a dummy variable in the argument _dummyvars then the program knows to skip CANCER but then run the next model with CANCER and CHF entering in together. So the input syntax would be: (ql2_r, age race income cancer chf, cancer, 2, outlib, indata=inlib.vertical) **Please note that cancer is specified first, and then chf in the _indepvars argument. If you had four categories for a variable with three dummies that you wanted to include as well (ex ms_01 ms_02 ms_03 ms_04)and ms_01 was the reference group the syntax would be: (ql2_r, age race ms_02 ms_03 ms_04 income cancer chf, ms_02 ms_03 cancer, 2, outlib, indata=inlib.vertical) _sdest Optional parameter that can be used to set the initial estimate of the standard deviation of the random effect for the intercept. When this parameter is left blank the program uses the SAS default of one for the intial estmate. _libref Optional parameter that can be used for the output libname. If included, the output from the NLMIXED procedure (ODS data set parameterestimates plus a couple of created variableswill be saved in the specified library. The output data set is primarily made up of variables from the ODS data set named ParameterEstimates with the addition of the created variables named variable_name, OR_calculation, L95CI and U95CI. The name of this data set is MACROLOGIT_OUT&i where &i starts at one for the first model and is incremented by one for each additional model.Remember to include a LIBNAME statement for this library in your program. If left blank the output will be written to the temporary WORK library. Do not include a period (.) with libref argument. For example: Correct Argument: outdata Incorrect Argument: outdata. indata= Required parameter. Specifies the input data set. Remember to include the library reference along with the data set name. For example: inlib.vertical_covariate ***** IMPORTANT LIMITATIONS TO BE AWARE OF, PLEASE READ ****** # 1: Please note that SAS version 8.2 truncates variable names at 20 characters in the ODS data set ParmeterEstimates and hence this macro is unable to run properly for variable names greater than 20 characters. To work around this limitation it is recommended that you rename the variable name to something that is 20 characters or less. For example healthcare_utilization could be renamed to healthcare_util. Also if you are using interaction terms in your model it is important that the combination of variable names do not exceed 20 characters. For example while the variables ethnicity and marital_status are under the 20 character limit individually, entered as an interaction term, ethnicity*marital_status, the limit is exceeded. If marital_status is shortened to marital then ethnicity*marital works fine. This limitation will not be applicable for SAS 9.1 since the column width restriction in the ParameterEstimates data set is being removed. #2: Please note that when using interaction terms, the naming of the interaction term must correspond to the order in which the individual variables were introduced into the model. For example if you want the independent variables in the model to be: race gender race*gender then you must specify the _indepvars argument to be: race gender race*gender. If instead you specify it as: race gender gender*race then the macro will not function properly. This limitation is due to a first in first out algorithm SAS uses to store data in tables for making calculations. If this recommendation is not followed then the macro is unable to properly match the interaction term and the macro will not run properly. ----------------------------------------------------------------------------------*/ %macro Mixed_Long_logit (_depvar, _indepvars, _dummyvars, _sdest, _libref, indata=); /* Initial Housekeeping */ %let indepvarlast = ; *Holds the last indep. variable added to indepvarstr; %let indepvarstr = ; *Builds string of ind. vars one variable at a time; %let indepvarctr = 0; *Number of ind. variables calculated by Words macro; %let dummy_ctr = 0; %let dummy_flag = 0; %let dummy_holder = ; %let dummy_beta = ; %let etastr = ; *Create syntax for eta statement in PROC nlmixed; %let parmstr = ; *Create syntax for parm statement in PROC nlmixed; %let estimatestr = ; *Create syntax for estimate statement in PROC nlmixed; %let betainit = ; *Holds value of beta0 from initial PROC genmod; %let betalast = ; *Holds value of beta(i) from PROC genmod GEEEmpPEst; %let indepvarctr = %words(&_indepvars); *Determine # of independent variables; %let outlib = ; *Holds value of the output library name. Can be blank; %let sigma_value = ; *Holds value for sigma for the initial proc genmod; /* Determine if the output data sets should be permanently saved and add period to libref */ %if &_libref ne %then %do; %let _period =.; %let outlib = &_libref&_period; %end; /* Determine if the starting value for sigmau for the initial proc genmod iteration should be let as the default of 1 or change to an input value. */ %if &_sdest ne %then %do; %let _sigmatext = %str(sigmau = ); %let sigma_value = &_sigmatext&_sdest; %end; /* Main processing Loop. The number of independent variables determines the number of times it will execute. In each iteration proc genmod will be executed then the macro NLMPROC_LOGIT. */ %let i=1; %do %while (%qscan(&_indepvars,&i,%str( )) ne ); *Checks until no more indep.vars; *Indepvarstr grows by one var each iteration; %let indepvarstr = &indepvarstr %QSCAN(&_indepvars,&i,%STR( )); %let indepvarlast = %QSCAN(&_indepvars,&i,%STR( )) ; %let q = 1; %do %while (%qscan(&_dummyvars, &q,%str( )) ne ); %if %qlowcase(&indepvarlast) = %qscan(%qlowcase(&_dummyvars),&q,%str( )) %then %do; %let dummy_ctr = %eval(&dummy_ctr + 1); %let dummy_flag = 1; %let dummy_holder = &dummy_holder %qscan(&_dummyvars,&q,%str( )) %str( ); %end; %let q = %eval(&q + 1); %end; %if &dummy_flag ne 1 %then %do; %procgenmod_logit; %if &dummy_ctr = 0 %then %do; %if &i=1 %then %do; *Initial case unique. Beta0,beta1 come from genmod; %getbetainit; *Call macro getbetainit only when i=1; *Use GBIresult from getbetainit to set the value for beta0; %let betainit = %str(beta0 = ) %sysfunc(trim(&GBIresult)); %getbetalast; *Use GBLresult from getbetalast to set the value for beta1; %let betalast = %str(beta1 = ) %sysfunc(trim(&GBLresult)); %let parmstr = &betainit &betalast &sigma_value; *Input for NLMPROC_LOGIT; %end; %else %if &i>1 %then %do; %getbetalast; *Establish value of beta&i using output from genmod; %let betalast = %str(beta&i = ) %sysfunc(trim(&GBLresult)); *Beta&i comes from gm_out plus data set from previous nlmixed; %let parmstr = &betalast / data=&outlib%str(MACROLOGIT_OUT)%eval(&i-1); %end; *Etastr is used as input for macro NLMPROC_LOGIT; %let etastr = &etastr %str(+ beta&i*)%QSCAN(&_indepvars,&i,%STR( )); %NLMPROC_LOGIT; %end; %else %do; *Enter in this branch where previous variable was a dummy variable; *Following calculation will yield one if &dummy_holder contains the initial independent variable(s) in &indepvarstr; %let initcheck = %eval(&i - &dummy_ctr); %if &initcheck = 1 %then %do; %getbetainit ( ); %let betainit = %str(beta0 = ) %sysfunc(trim(&GBIresult)); %let betalast = ; %let k = 1; %do %while (%qscan(&dummy_holder, &k,%str( )) ne ); %let dummy_beta = %QSCAN(&dummy_holder,&k,%STR( )); %getbetadummy(); %let betalast = &betalast %str(beta)%eval(&i - &dummy_ctr)%str( = ) %sysfunc(trim(&GBDresult)); %let dummy_ctr = %eval(&dummy_ctr - 1); %let k = %eval(&k + 1); %end; %getbetalast(); %let betalast = &betalast %str(beta&i = ) %sysfunc(trim(&GBLresult)); %let parmstr = &betainit &betalast &sigma_value; %end; %if &initcheck ne 1 %then %do; %let betalast = ; %let k = 1; %do %while (%qscan(&dummy_holder, &k,%str( )) ne ); %let dummy_beta = %QSCAN(&dummy_holder,&k,%STR( )); %getbetadummy(); %let betalast = &betalast %str(beta)%eval(&i - &dummy_ctr)%str( = ) %sysfunc(trim(&GBDresult)); *Decrease by 1 until no more variables left in dummy_holder; %let dummy_ctr = %eval(&dummy_ctr - 1); %let k = %eval(&k + 1); %end; %getbetalast(); %let betalast = &betalast %str(beta&i = ) %sysfunc(trim(&GBLresult)); *Reset &dummy_ctr to value of the # dummy vars. in &dummy_holder; %let dummy_ctr = %words(&dummy_holder); %let parmstr = &betalast / data=&outlib%str(MACROLOGIT_OUT)%eval(&i-%eval(&dummy_ctr+1)); %end; %let etastr = &etastr %str(+ beta&i*)%QSCAN(&_indepvars,&i,%STR( )); %NLMPROC_LOGIT; %let dummy_holder = ; %let dummy_ctr = 0; %end; %end; *Following true when last indep. var. read is a dummy var. found in &_dummyvars; %if &dummy_flag eq 1 %then %do; %let etastr = &etastr %str(+ beta&i*)%QSCAN(&_indepvars,&i,%STR( )); %end; %let dummy_flag = 0; %let i=%eval(&i + 1); *Increment counter i; %end; %mend Mixed_Long_logit; %MACRO procgenmod_logit; Proc GENMOD data=&indata descending; class id; model &_depvar = &indepvarstr / DIST=BIN LINK=LOGIT; repeated subject=id/type=cs corrw; Title1 " Logistic Regression Model for outcome variable: &_depvar "; Title2 " Independent variables: &indepvarstr"; Title3 " Input data set: &indata"; Title4 " Created on: &sysdate at: &systime"; ods output GEEEmpPEst=gm_out&i; run; %MEND procgenmod_logit; %macro NLMPROC_LOGIT (); proc nlmixed data=&indata qpoints=10; Title1 " Mixed Effect Logistic Regression Model for outcome variable: &_depvar "; Title2 " Indep. vars: &indepvarstr"; Title3 " Input data set: &indata"; Title4 " Created on: &sysdate at: &systime"; parms &parmstr; eta = (beta0 + u) &etastr; expeta = exp(eta); p = (expeta/(1 + expeta)); model &_depvar ~ binomial (1,p); random u ~ normal(0, sigmau*sigmau) subject=id; %put 'Parameter Estimates = ' &outlib%str(MACROLOGIT_OUT)&i; ods output ParameterEstimates=&outlib%unquote(MACROLOGIT_OUT)&i; run; data &outlib%str(MACROLOGIT_OUT)&i; set &outlib%str(MACROLOGIT_OUT)&i; attrib L95CI length=3 format=5.2; attrib U95CI length=3 format=5.2; attrib OR_Calculation length=3 format=5.2; L95CI = exp(lower); U95CI = exp(upper); OR_Calculation=exp(estimate); if parameter eq 'sigmau' then do; or_calculation = .; L95CI = .; U95CI = .; end; run; data &outlib%str(MACROLOGIT_OUT)&i; set &outlib%str(MACROLOGIT_OUT)&i; attrib variable_name length=$32 label='Name of variable'; if substr(parameter,1,4) = 'beta' then do; variable_ctr = input(substr(parameter,5,2),3.); if variable_ctr eq 0 then variable_name = 'intercept'; else variable_name = scan("&_indepvars",variable_ctr,' '); end; else if parameter = 'sigmau' then variable_name = 'sigmau'; proc sort; by variable_ctr; proc print data=&outlib%str(MACROLOGIT_OUT)&i noobs label; var variable_name parameter estimate L95CI U95CI OR_Calculation; run; %mend NLMPROC_LOGIT; /* The WORDS macro counts the number of words in a string and returns that value. I did not write the macro but instead found it on the SAS website in the FAQ's for macros. The URL is http://support.sas.com/faq/016/FAQ01617.html In order for it to work you assign the result of the macro to a new variable like the following: %let newvar = %words (this is a sample); The value of &newvar would be 4; */ %MACRO WORDS(string); %Local Count Word; %let Count=1; %let Word=%qscan(&string,&Count,%str( )); %do %while(&Word ne); %let Count=%eval(&Count+1); %let Word=%qscan(&string,&Count,%str( )); %end; %eval(&Count-1) %MEND WORDS; /* GETBETAINIT is used to obtain beta0 from the initial run of proc gen mod */ %MACRO GETBETAINIT (GBIDATA=gm_out&I); /*Need Global statement so call symput stores value on global table Otherwise value is just stored on local table */ %global GBIresult; DATA _NULL_; SET &GBIDATA; If parm='Intercept' then do; *Remember call symput makes its value available at end of datastep; call symput('GBIresult',round(estimate,.0001)); end; run; %MEND GETBETAINIT; %MACRO GETBETADUMMY (GBDDATA=gm_out&I); %global GBDresult; DATA _NULL_; SET &GBDDATA; If lowcase(parm)="%qlowcase(&dummy_beta)" then do; call symput('GBDresult',round(estimate,.0001)); end; run; %MEND GETBETADUMMY; %MACRO GETBETALAST (GBLDATA=gm_out&I); %global GBLresult; DATA _NULL_; SET &GBLDATA; If lowcase(parm)="%qlowcase(&indepvarlast)" then do; call symput('GBLresult',round(estimate,.0001)); end; run; %MEND GETBETALAST;