Introduction to SAS

                                              Ista Zahn

                                        Harvard MIT Data Center


                                         March 15th, 2013



                                       The Institute
                                       for Quantitative Social Science
                                       at Harvard University



Ista Zahn (Harvard MIT Data Center)         Introduction to SAS    March 15th, 2013   1 / 38
Outline


    1   Introduction

    2   Data Import and Export

    3   Descriptive Statistics

    4   Variable and Value Labels

    5   Data Manipulation

    6   Wrap-up




Introduction




 Documents for Today




          Log in
                 USERNAME: dataclass
                 PASSWORD: dataclass
          Find class materials on the scratch drive
          Includes data, presentation slides, exercises
          Copy materials to your desktop!






 Workshop Description




          This is an INTRODUCTION to SAS – assumes no knowledge of SAS!
          Not appropriate for people already well familiar with SAS
          Learning Objectives:
                 Import and export data in a variety of formats
                 Create and use variable and value labels
                 Perform basic data manipulation
                 Compute descriptive statistics
                 Wrap-up






 Why SAS?




          If you know SAS, it is likely you will not need any other software
          packages
          Among the most powerful statistical software packages available
          Great user community: macros, websites, etc.
          Free online documentation:
          http://support.sas.com/documentation/93/index.html






 SAS history
   SAS was first developed in the 1970s. The world was very different then!
   (Image source: http://en.wikipedia.org/wiki/Moore’s_law)






 The SAS Environment



          Five basic SAS windows:
                 Results
                 Explorer
                 Editor
                 Log
                 Output
          Set up your SAS window so it fits your own preferences
          Viewing options (Windows only):
                 Window > Tile Vertically






 SAS Basics



          SAS has point-and-click interfaces, but most users write
          command-driven SAS programs
          Command syntax is not case sensitive, but file names may be
          Commands can run across multiple lines (but you can’t split words
          across lines)
          End every statement with a semicolon, and every DATA or PROC step
          with RUN;
          Comments can be written as:
                 /* comment comment comment */ OR
                 * comment comment comment;
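
   A minimal sketch of these basics. It assumes the sashelp.class sample table
   that ships with SAS is available; that table is not part of the workshop
   materials.

```sas
/* Keywords are not case sensitive, and one statement may span several lines */
proc print
    DATA = sashelp.class (obs = 3);
    var name age;   * a statement-style comment ends at the semicolon;
RUN;
```

   Mixing upper and lower case here is only for illustration; choosing one
   style and sticking to it makes programs easier to read.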




Data Import and Export




 Working With SAS Data


          SAS data sets are stored in files on your hard drive, usually with the
          extension “.sas7bdat”
          Under Unix/Linux SAS data file names must be all lowercase–best to
          observe this restriction on Windows as well
          In SAS there is no concept of “opening” a data set–Instead we use
          libraries to point to directories on the hard drive that contain SAS data
          sets
          Unless you specify otherwise, SAS copies your data temporarily in a
          library called “work”
          The work library is temporary, so
                 use the work library for temporary data sets
                 create and use your own library for any data you wish to save
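
   The difference between temporary and saved data sets can be sketched as
   follows. The library name mylib, the folder path, and the data set names are
   assumptions for illustration; sashelp.class is a sample table shipped with
   SAS.

```sas
/* hypothetical folder: point this at a directory you can write to */
LIBNAME mylib "C:\Users\dataclass\Desktop\SASintro\SASintroLib";

/* one-level name: shorthand for work.class_tmp, gone when SAS closes */
DATA class_tmp;
    SET sashelp.class;
RUN;

/* two-level name: written to the mylib folder and kept between sessions */
DATA mylib.class_saved;
    SET sashelp.class;
RUN;
```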






 Libraries and Datasets
   Tell SAS where our data is on the hard drive by creating a library. Here the
   library is named introSAS and points to the dataSets folder

         /* initialize a SAS library in the dataSets folder */
         LIBNAME introSAS "C:\Users\dataclass\Desktop\SASintro\dataSets";
         /* list sas datasets in the introSAS library / dataSets folder */
         PROC DATASETS library=introSAS;
         RUN;
         QUIT;
         /* initialize a SAS library in the SASintroLib folder */
         LIBNAME mylib "C:\Users\dataclass\Desktop\SASintro\SASintroLib";
         /* copy pubschool to the "work" library */
         DATA pubschool;
              SET "C:\Users\dataclass\Desktop\SASintro\dataSets\pubschool";
         RUN;



          The DATA statement saves our dataset as a new data set called
          “pubschool” in the “work” library
          The SET statement tells SAS which existing dataset to read


 What if my file is not a SAS file?
   In SAS “import” means “convert to SAS format, copy to library/folder”
       Import Stata files with dbms=dta

         /* import from stata format */
         PROC IMPORT out = pubschool
              DATAFILE = "C:\Users\dataclass\Desktop\SASintro\dataSets\pubschool.dta"
              DBMS = dta replace;
         RUN;


          Importing ASCII files with (e.g.) dbms=csv

         /* import csv file */
         PROC IMPORT out = pubschool
              DATAFILE = "C:\Users\dataclass\Desktop\SASintro\dataSets\pubschool.csv"
              DBMS = csv   replace;
              GETNAMES = yes;
              DATAROW = 2;
         RUN;




 Where is my data?

   You can “view” your data in a couple of ways:
          Proc contents

         /* list contents of pubschool data */
         PROC CONTENTS data = pubschool;
              TITLE "Public school contents";
         RUN;



          Data viewer
             1   Go to explorer
             2   Select your “work” library
             3   Click on your dataset (opens in SAS Universal Viewer)
          Your dataset is named “pubschool” in the library because that’s the
          name you gave it when you created it with the DATA step or PROC IMPORT
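
   A third way to look at the data is to print a few rows to the output
   window. This is a sketch using the sashelp.class sample table shipped with
   SAS (an assumption; substitute your own data set, such as pubschool).

```sas
/* print only the first five observations */
PROC PRINT data = sashelp.class (obs = 5);
    TITLE "First five rows";
RUN;
```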





 How do I get my data out of SAS?
   In SAS “export” means “convert to a non-SAS format”
          Exporting CSV files:


         /* export to .csv */
         PROC EXPORT data = introSAS.ntcs
              OUTFILE = "C:\Users\dataclass\Desktop\SASintro\SASintroLib\NeighCrime_NEW
              DBMS = csv;
         RUN;



          Exporting tab delimited files:


         /* export to tab delimited */
         PROC EXPORT data = introSAS.ntcs
              OUTFILE = "C:\Users\dataclass\Desktop\SASintro\SASintroLib\NeighCrime_NEW
              DBMS = tab;
         RUN;
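
   For reference, a complete export step might look like the sketch below. The
   output path and file name are hypothetical, and sashelp.class (a sample
   table shipped with SAS) stands in for the workshop data.

```sas
/* export a data set to a comma-separated file */
PROC EXPORT data = sashelp.class
     OUTFILE = "C:\Users\dataclass\Desktop\class_export.csv"  /* hypothetical path */
     DBMS = csv REPLACE;   /* REPLACE overwrites the file if it already exists */
RUN;
```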





 Exercise 1: Importing Data




      1   Create a library named “mylib” in the SASintroLib folder if it doesn’t
          already exist
      2   Import the Stata file, “ntcs.dta” to the mylib library
      3   Import the ASCII file, “ntcs.csv” to the mylib library
      4   Use “proc datasets” to list the datasets in the mylib library
      5   Use “proc contents” to review the data you imported in step 2




Descriptive Statistics




 Means, standard deviations, etc.
   Compute averages for q1 and q2 using proc means

         /* means of vars q1 and q2 */
         PROC MEANS data = pubschool;
              VAR q1 q2;
              TITLE "Public school means";
         RUN;


   Compute averages for q1 separately by timezone

         /* means separately by timezone */
         /* need to sort first */
         PROC SORT data = pubschool;
              by timezone;
         RUN;

         PROC MEANS data = pubschool;
              by timezone;
              VAR q1;
              TITLE "Public school means by timezone";
         RUN;
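
   An alternative that avoids the sort is the CLASS statement of proc means.
   The sketch below uses the sashelp.class sample table shipped with SAS (an
   assumption; with the workshop data the grouping variable would be timezone).

```sas
/* group means without sorting first */
PROC MEANS data = sashelp.class;
    CLASS sex;            /* grouping variable; no PROC SORT needed */
    VAR height weight;
    TITLE "Means by group using CLASS";
RUN;
```

   BY produces a separate table for each group, while CLASS reports all groups
   in a single table.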


 Frequency Tables

   Frequency tables for q3, q4, and q5 using proc freq

         /* counts of responses to q3, q4, and q5 */
         PROC FREQ data = pubschool;
              TABLE q3 q4 q5;
              TITLE "Public school frequencies";
         RUN;


   Frequency table for q3 by timezone

         /* counts by timezone */
         PROC FREQ data = pubschool;
              TABLE q3*timezone;
              TITLE "Public school frequencies";
         RUN;






 Correlation and Regression

   We’re interested in looking at the relationship between City Crime Rate
   (C_CRIMRT) and Percent of High School Grads in the City (C_HSGRAD)

         /* Scatterplot of relationship between high school
            graduation rate and crime rate */
         PROC GPLOT data = introSAS.ntcs;
              PLOT     C_HSGRAD * C_CRIMRT;
              TITLE "Percent of High School Graduates and Crime Rates";
         RUN;
         /* correlation between graduation and crime rates */
         PROC CORR data = introSAS.ntcs;
              VAR      C_HSGRAD C_CRIMRT;
              TITLE "Percent of High School Graduates and Crime Rates";
         RUN;
         /* Regression predicting crime rate */
         PROC REG data = introSAS.ntcs;
              MODEL C_CRIMRT = C_HSGRAD C_PERCAP C_POVRTY;
              TITLE "Percent of High School Graduates and Crime Rates";
         RUN;
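
   Histograms and scatterplots can also be drawn with proc sgplot. This is a
   sketch that assumes a SAS release where PROC SGPLOT is available (9.2 or
   later) and uses the sashelp.class sample table rather than the ntcs data.

```sas
/* histogram of a single variable */
PROC SGPLOT data = sashelp.class;
    HISTOGRAM height;
    TITLE "Distribution of height";
RUN;

/* scatterplot of two variables */
PROC SGPLOT data = sashelp.class;
    SCATTER x = height y = weight;
    TITLE "Weight by height";
RUN;
```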





 Exercise 2: Correlation and regression



   Use the ntcs data set
      1   Take a look around the ntcs dataset and identify an outcome you’d like
          to predict and a few variables (4-6) that you believe would serve as
          relevant predictor variables
      2   Run relevant descriptive statistics on your variables and look at
          histograms and scatterplots
      3   Test correlations leading up to ultimately testing a regression
      4   Run and interpret a regression using your selected variables




Variable and Value Labels




 Variable and Value Labels




          Variable labels refer to the titles associated with each variable
          Value labels refer to the titles you assign to the different levels (i.e.,
          values) of each variable
          EXAMPLE:
                 Variable name: Marital
                 Variable label: Marital status of participant
                 Value labels: 1 = Married, 2 = Separated, 3 = Divorced, 4 = Single, etc.






 Variable Labels


   Adding variable labels is a data step command:

         /* copy pubschool to pubschool2 and label resp and status */
         DATA pubschool2;
           SET pubschool;
           LABEL
              resp = "Participant Identifier"
              status = "Did participant complete survey?";
         RUN;
         /* Check output */
         PROC CONTENTS data = pubschool2;
            TITLE "pubschool2 contents";
         RUN;






 Creating Value Labels


   Creating value labels is a proc command
          Start by creating the label format

         /* create value label named q1label */
         PROC FORMAT;
          VALUE q1label
                1 = "best"
                2 = "top 5"
                3 = "top 10"
                4 = "top 20"
                5 = "Bottom 80"
                9 = "Don’t know";
         RUN;






 Using Value labels
   Now we can use this value scheme when creating tables, output, etc.

         /* display frequencies, using value labels */
         PROC FREQ data = pubschool;
               TABLES q1;
               FORMAT q1 q1label.;
         RUN;
         /* NOTE: There is a "." after q1label. This alerts SAS
          that you’re referring to a value scheme (format) */


 Saving Value Labels in your Dataset
   We can also save a value label in a data set

         /* add value label to SAS data set */
         PROC DATASETS library = work;
              MODIFY pubschool2;
              FORMAT q1 q1label.;
         RUN;
         QUIT;
         /* confirm that our formats were correctly applied: */
         PROC FREQ data = pubschool2;
              TABLES    q1 q3;
         RUN;
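
   Value labels can also cover ranges of values rather than single codes. The
   sketch below is illustrative only: the format name agegrp and the cut point
   are made up, and it uses the sashelp.class sample table shipped with SAS.

```sas
/* define a format that labels ranges of a numeric variable */
PROC FORMAT;
    VALUE agegrp
        LOW - 12  = "12 or younger"
        13 - HIGH = "13 or older";
RUN;

/* frequencies are tabulated by the formatted (grouped) values */
PROC FREQ data = sashelp.class;
    TABLES age;
    FORMAT age agegrp.;
RUN;
```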


 Dropping and Renaming Variables
   Keeping a subset of variables is simple. Use the “keep” or “drop” data set option:

         /* create new dataset "pubschoolKeep" with only q1-q5 */
         DATA   pubschoolKeep (keep = q1 q2 q3 q4 q5);
              SET pubschool;
         RUN;
         /* create new dataset pubschoolDrop, excluding q1-q5 */
         DATA   pubschoolDrop       (drop = q1 q2 q3 q4 q5);
              SET pubschool;
         RUN;


   Renaming variables is done in a data step: just put the rename option right
   after the data set name in your DATA statement

         /* change the name of q2 to q2newName */
         DATA pubschool2 (rename = (q2 = q2newName));
              SET pubschool2;
         RUN;
         /* View the dataset: */
         PROC CONTENTS data = pubschool2;
         RUN;


 SAS variable names




          Must be <= 32 characters in length
          Must start with a letter or underscore
          Can contain only numbers, letters or underscores
                 No special characters: @#$%^&*
          Can contain upper and lowercase letters




Data Manipulation




 Logic Statements Useful For Data Manipulation



            = (EQ) equal to
           ^= (NE) not equal to (also written ~=)
            > (GT) greater than
            < (LT) less than
           >= (GE) greater than or equal to
           <= (LE) less than or equal to
            & (AND) and
            | (OR) or
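
   The symbol and mnemonic forms are interchangeable. A small sketch, assuming
   the sashelp.class sample table shipped with SAS (the output data set names
   are made up):

```sas
/* symbol operators */
DATA teen_girls;
    SET sashelp.class;
    IF age >= 13 & sex = "F";
RUN;

/* the same subset written with mnemonic operators */
DATA teen_girls2;
    SET sashelp.class;
    IF age GE 13 AND sex EQ "F";
RUN;
```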






 Generate New Variables
   Done in a data step: write the new variable name, then an equals sign and
   the expression that defines it.

         DATA pubschool2;
              SET pubschool;
              /* create variable "myvar" equal to q1 */
              myvar = q1;
              /* create variable newvar2 equal to 1 */
              newvar2 = 1;
         RUN;


   Create new variable based on values of existing variable

         /* generate newvar3 based on q1 */
         DATA pubschool2;
              SET pubschool;
              if q1=1 then newvar3=1;
              else if q1=2 then newvar3=2;
              else if q1=3 then newvar3=3;
              else if q1=4 then newvar3=4;
              else newvar3=.;
         RUN;
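
   New variables can also be computed from arithmetic or logical expressions.
   This sketch uses the sashelp.class sample table shipped with SAS and assumes
   its height variable is recorded in inches; the new variable names and the
   cut point of 60 are made up.

```sas
DATA class2;
    SET sashelp.class;
    /* arithmetic expression: convert inches to centimeters */
    height_cm = height * 2.54;
    /* logical expression: evaluates to 1 when true and 0 when false */
    tall = (height > 60);
RUN;
```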


 Saving Subsets of Data
   Subsets are created within a data step
          Create a subset of data including only participants who had a child
          attending a public school


         /* keep only rows where q10 is 1 */
         DATA CurrentPublic;
              SET pubschool;
              if q10=1;
         RUN;



          Create a subset of data including participants in timezone 1 or 2


         /* keep only rows where timezone is 1 or 2 */
         DATA CurrentPublic;
              SET pubschool;
              if timezone=1 | timezone=2;
         RUN;
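
   The same kind of subset can be written with a WHERE statement and the IN
   operator. A sketch on the sashelp.class sample table shipped with SAS (the
   output data set name is made up):

```sas
/* keep only rows where age is 11 or 12 */
DATA young;
    SET sashelp.class;
    WHERE age IN (11, 12);
RUN;
```

   With the workshop data the equivalent condition would be
   WHERE timezone IN (1, 2);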




 Missing Values




          SAS’s symbol for a missing numeric value is “.”
          SAS treats “.” as smaller than any number
          Need to be aware of this when you are manipulating data!
                 What will happen when you use the < or <= operators?
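
   The pitfall can be seen in a small made-up example. The data set demo and
   the variables score, low_naive, and low_safe are hypothetical.

```sas
DATA demo;
    INPUT score;
    /* naive flag: a missing score counts as "less than 50" */
    low_naive = (score < 50);
    /* safer flag: stays missing when score is missing */
    IF MISSING(score) THEN low_safe = .;
    ELSE low_safe = (score < 50);
    DATALINES;
80
20
.
;
RUN;

PROC PRINT data = demo;
RUN;
```

   In the third row low_naive is 1 even though the score is unknown, while
   low_safe remains missing.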






 Exercise 3: Data Manipulation
   Use the ntcs.sas7bdat data set
      1   Attach variable labels using the following codebook: REGION =
          “Region in United States”, CITY = “Name of City and State”
      2   Create formats in SAS using the codebook below:
                 C_SOUTH: 1 = Southern City, 0=Non-Southern City
                 C_WEST: 1=Western City 0=Non-Western City
      3   Run “proc freq” on C_SOUTH and C_WEST and use formats from
          step 2
      4   Assign the formats for C_SOUTH and C_WEST permanently in your
          ntcs dataset
      5   Confirm with “proc freq” that your labels were correctly assigned
      6   Generate new variables based on the city-level crime variables
          “C_MURDRT” and “C_ROBBRT”. Choose values on which to
          dichotomize each variable and create new variables that have a score of
          “1” if the original variable is above that value, and a score of “0”
          otherwise.
      7   Confirm that your new variables were properly created using “proc freq”
Wrap-up




 Help us make this workshop better!




          Please take a moment to fill out a very short feedback form
          These workshops exist for you – tell us what you need!
          http://tinyurl.com/akyvzle






 Other Services Available



   Institute for Quantitative Social Science
          www.iq.harvard.edu
   Computer labs
          www.iq.harvard.edu/facilities
   Research Technology Consulting
          www.iq.harvard.edu/researchconsulting
   Training
          http://projects.iq.harvard.edu/rtc/filter_by/workshops






 Additional Resources


   How do I get SAS?
          Your Department IT
          HMDC labs
          RCE (Research Computing Environment)
          Buy it: educational or grad plan
   The RCE
          Research Computing Environment (RCE) service available to Harvard &
          MIT users
          www.iq.harvard.edu/research_computing
          Supplies persistent desktop environment accessible from any computer
          with an internet connection
          Includes SAS, Stata, R etc.



08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Dernier (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Introduction to SAS

  • 1. Introduction to SAS Ista Zahn Harvard MIT Data Center March 15th, 2013 The Institute for Quantitative Social Science at Harvard University Ista Zahn (Harvard MIT Data Center) Introduction to SAS March 15th, 2013 1 / 38
  • 2. Outline
      1 Introduction
      2 Data Import and Export
      3 Descriptive Statistics
      4 Variable and Value Labels
      5 Data Manipulation
      6 Wrap-up
  • 3. Topic: Introduction
  • 4. Introduction: Documents for Today
      Log in
          USERNAME: dataclass
          PASSWORD: dataclass
      Find class materials on the scratch drive
      Includes data, presentation slides, exercises
      Copy materials to your desktop!
  • 5. Introduction: Workshop Description
      This is an INTRODUCTION to SAS: it assumes no knowledge of SAS!
      Not appropriate for people already familiar with SAS
      Learning Objectives:
          Import and export data in a variety of formats
          Create and use variable and value labels
          Perform basic data manipulation
          Compute descriptive statistics
          Wrap-up
  • 6. Introduction: Why SAS?
      If you know SAS, it is likely you will not need any other software packages
      Among the most powerful statistical software packages available
      Great user community: macros, websites, etc.
      Free online documentation: http://support.sas.com/documentation/93/index.html
  • 7. Introduction: SAS history
      SAS was first developed in the 1970s. The world was a lot different then!
      (Image source: http://en.wikipedia.org/wiki/Moore’s_law)
  • 8. Introduction: The SAS Environment
      Five basic SAS windows:
          Results
          Explorer
          Editor
          Log
          Output
      Set up your SAS window so it fits your own preferences
      Viewing options (Windows only): Window > Tile Vertically
  • 9. Introduction: SAS Basics
      SAS has point-and-click interfaces, but most users write command-driven SAS programs
      Command syntax is not case sensitive, but file names may be
      Commands can run on multiple lines (but you can’t split words across lines)
      End every statement with a semicolon, and follow every DATA or PROC step with RUN;
      Comments can be written as:
          /* comment comment comment */
      OR
          * comment comment comment;
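The syntax rules on this slide can be seen together in a short program. This is a minimal sketch, not taken from the workshop materials: it assumes the sashelp.class sample table that ships with standard SAS installations is available.

```sas
/* Block comment: one statement may span several lines,
   as long as no word is split across a line break */
PROC PRINT
    data = sashelp.class   /* assumed sample table, not workshop data */
    ;
RUN;

* Statement comment: keywords are not case sensitive;
proc print DATA = sashelp.class;
run;
```

Both steps print the same table; only the layout and the letter case of the keywords differ.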
  • 10. Topic: Data Import and Export
  • 11. Data Import and Export: Working With SAS Data
      SAS data sets are stored in files on your hard drive, usually with the extension “.sas7bdat”
      Under Unix/Linux, SAS data file names must be all lowercase; it is best to observe this restriction on Windows as well
      In SAS there is no concept of “opening” a data set. Instead we use libraries to point to directories on the hard drive that contain SAS data sets
      Unless you specify otherwise, SAS stores your data temporarily in a library called “work”
      The work library is temporary, so:
          use the work library for temporary data sets
          create and use your own library for any data you wish to save
  • 12. Data Import and Export: Libraries and Datasets
      Tell SAS where our data is on our hard drive by creating a library. I’m going to name my library introSAS and point SAS to the dataSets folder

          /* initialize a SAS library in the dataSets folder */
          LIBNAME introSAS "C:\Users\dataclass\Desktop\SASintro\dataSets";
          RUN;
          /* list sas datasets in the introSAS library / dataSets folder */
          PROC DATASETS library=introSAS;
          RUN;
          /* initialize a SAS library in the SASintroLib folder */
          LIBNAME mylib "C:\Users\dataclass\Desktop\SASintro\SASintroLib";
          RUN;
          /* copy pubschool to the "work" library */
          DATA pubschool;
              SET "C:\Users\dataclass\Desktop\SASintro\dataSets\pubschool";
          RUN;

      The DATA command saves our dataset as a new file called “pubschool” in the “work” library
      The SET command tells SAS which existing dataset to read from
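Because the work library is temporary, a dataset you want to keep has to be written to a library of your own. The sketch below assumes that pubschool already exists in work and that the SASintroLib folder sits at the path shown; that path is an assumption about how the workshop machine was laid out.

```sas
/* point the mylib library at a permanent folder (path is an assumption) */
LIBNAME mylib "C:\Users\dataclass\Desktop\SASintro\SASintroLib";

/* a two-level name (library.dataset) writes pubschool.sas7bdat into that folder */
DATA mylib.pubschool;
    SET work.pubschool;
RUN;
```

A one-level name such as pubschool is shorthand for work.pubschool, which is why the examples on the slide land in the temporary library.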
  • 13. Data Import and Export: What if my file is not a SAS file?
      In SAS “import” means “convert to SAS format, copy to library/folder”
      Import Stata files with dbms=dta

          /* import from stata format */
          PROC IMPORT out = pubschool
              DATAFILE = "C:\Users\dataclass\Desktop\SASintro\dataSets\pubschool.dta"
              DBMS = dta replace;
          RUN;

      Import ASCII files with (e.g.) dbms=csv

          /* import csv file */
          PROC IMPORT out = pubschool
              DATAFILE = "C:\Users\dataclass\Desktop\SASintro\dataSets\pubschool.csv"
              DBMS = csv replace;
              GETNAMES = yes;
              DATAROW = 2;
          RUN;
  • 14. Data Import and Export: Where is my data?
      You can “view” your data in a couple of ways:
      Proc contents

          /* list contents of pubschool data */
          PROC CONTENTS data = pubschool;
              TITLE "Public school contents";
          RUN;

      Data viewer
          1 Go to explorer
          2 Select your “work” library
          3 Click on your dataset (opens in SAS Universal Viewer)
      Your dataset is named “pubschool” in the library because that’s the name you gave it when you created it
  • 15. Data Import and Export: How do I get my data out of SAS?
      In SAS “export” means “convert to a non-SAS format”
      Exporting CSV files:

          /* export to .csv */
          PROC EXPORT data = introSAS.ntcs
              OUTFILE = "C:\Users\dataclass\Desktop\SASintro\SASintroLib\NeighCrime_NEW
              DBMS = csv;
          RUN;

      Exporting tab delimited files:

          /* export to tab delimited */
          PROC EXPORT data = introSAS.ntcs
              OUTFILE = "C:\Users\dataclass\Desktop\SASintro\SASintroLib\NeighCrime_NEW
              DBMS = tab;
          RUN;
  • 16. Data Import and Export: Exercise 1: Importing Data
      1 Create a library named “mylib” in the SASintroLib folder if it doesn’t already exist
      2 Import the Stata file “ntcs.dta” to the mylib library
      3 Import the ASCII file “ntcs.csv” to the mylib library
      4 Use “proc datasets” to list the datasets in the mylib library
      5 Use “proc contents” to review the data you imported in step 2
  • 17. Topic: Descriptive Statistics
  • 18. Descriptive Statistics: Means, standard deviations, etc.
      Compute averages for q1 and q2 using proc means

          /* means of vars q1 and q2 */
          PROC MEANS data = pubschool;
              VAR q1 q2;
              TITLE "Public school means";
          RUN;

      Compute averages for q1 separately by timezone

          /* means separately by timezone */
          /* need to sort first */
          PROC SORT data = pubschool;
              by timezone;
          RUN;
          PROC MEANS data = pubschool;
              by timezone;
              VAR q1;
              TITLE "Public school means by timezone";
          RUN;
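The BY statement needs data sorted by the grouping variable. As an alternative that the slides do not use, PROC MEANS also accepts a CLASS statement, which groups without a prior sort. A minimal sketch, assuming the pubschool dataset from the earlier slides is in the work library:

```sas
/* group means with CLASS instead of BY: no PROC SORT step is required */
PROC MEANS data = pubschool;
    CLASS timezone;
    VAR q1;
    TITLE "Public school means by timezone (CLASS statement)";
RUN;
```

The statistics match the BY version; CLASS reports all groups in a single table, whereas BY prints a separate table for each timezone.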
  • 19. Descriptive Statistics: Frequency Tables
      Frequency tables for q3, q4, and q5 using proc freq

          /* counts of responses to q3, q4, and q5 */
          PROC FREQ data = pubschool;
              TABLE q3 q4 q5;
              TITLE "Public school frequencies";
          RUN;

      Frequency tables for q3 by timezone

          /* counts by timezone */
          PROC FREQ data = pubschool;
              TABLE q3*timezone;
              TITLE "Public school frequencies";
          RUN;
  • 20. Descriptive Statistics: Correlation and Regression
      We’re interested in looking at the relationship between City Crime Rate (C_CRIMRT) and Percent of High School Grads in the City (C_HSGRAD)

          /* Scatterplot of relationship between high school
             graduation rate and crime rate */
          PROC GPLOT data = introSAS.ntcs;
              PLOT C_HSGRAD * C_CRIMRT;
              TITLE "Percent of High School Graduates and Crime Rates";
          RUN;
          /* correlation between graduation and crime rates */
          PROC CORR data = introSAS.ntcs;
              VAR C_HSGRAD C_CRIMRT;
              TITLE "Percent of High School Graduates and Crime Rates";
          RUN;
          /* Regression predicting crime rate */
          PROC REG data = introSAS.ntcs;
              MODEL C_CRIMRT = C_HSGRAD C_PERCAP C_POVRTY;
              TITLE "Percent of High School Graduates and Crime Rates";
          RUN;
  • 21. Descriptive Statistics: Exercise 2: Correlation and regression
      Use the ntcs data set
      1 Take a look around the ntcs dataset and identify an outcome you’d like to predict and a few variables (4-6) that you believe would serve as relevant predictor variables
      2 Run relevant descriptive statistics on your variables and look at histograms and scatterplots
      3 Test correlations before moving on to a regression
      4 Run and interpret a regression using your selected variables
  • 22. Topic: Variable and Value Labels
  • 23. Variable and Value Labels: Variable and Value Labels
      Variable labels refer to the titles associated with each variable
      Value labels refer to the titles you assign to the different levels (i.e., values) of each variable
      EXAMPLE:
          Variable name: Marital
          Variable label: Marital status of participant
          Value labels: 1 = Married, 2 = Separated, 3 = Divorced, 4 = Single, etc.
  • 24. Variable and Value Labels: Variable Labels
      Adding variable labels is a data step command:

          /* copy pubschool to pubschool2 and label resp and status */
          DATA pubschool2;
              SET pubschool;
              LABEL resp = "Participant Identifier"
                    status = "Did participant complete survey?";
          RUN;
          /* Check output */
          PROC CONTENTS data = pubschool2;
              TITLE "pubschool2 contents";
          RUN;
  • 25. Variable and Value Labels: Creating Value Labels
      Creating value labels is a proc command
      Start by creating the label format

          /* create value label named q1label */
          PROC FORMAT;
              VALUE q1label 1 = "best"
                            2 = "top 5"
                            3 = "top 10"
                            4 = "top 20"
                            5 = "Bottom 80"
                            9 = "Don’t know";
          RUN;
  • 26. Variable and Value Labels: Using Value Labels
      Now we can use this value scheme when creating tables, output, etc.

          /* display frequencies, using value labels */
          PROC FREQ data = pubschool;
              TABLES q1;
              FORMAT q1 q1label.;
          RUN;
          /* NOTE: There is a "." after q1label.
             This alerts SAS that you’re referring to a value scheme */

      Saving Value Labels in your Dataset
      We can also save a value label in a data set:

          /* add value label to SAS data set */
          PROC DATASETS library = work;
              MODIFY pubschool2;
              format q1 q1label.;
          RUN;
          /* confirm that our formats were correctly applied: */
          PROC FREQ data = pubschool2;
              TABLES q1 q3;
          RUN;
  • 27. Variable and Value Labels: Dropping and Renaming Variables
      Keeping a subset of variables is simple: use the “drop” or “keep” command:

          /* create new dataset "pubschoolKeep" with only q1-q5 */
          DATA pubschoolKeep (keep = q1 q2 q3 q4 q5);
              SET pubschool;
          RUN;
          /* create new dataset pubschoolDrop, excluding q1-q5 */
          DATA pubschoolDrop (drop = q1 q2 q3 q4 q5);
              SET pubschool;
          RUN;

      Renaming variables is done in a data step: just put the rename syntax right after your data command

          /* change the name of q2 to q2newName */
          DATA pubschool2 (rename = (q2 = q2newName));
              SET pubschool2;
          RUN;
          /* View the dataset: */
          PROC CONTENTS data = pubschool2;
          RUN;
  • 28. Variable and Value Labels: SAS variable names
      Must be <= 32 characters in length
      Must start with a letter or underscore
      Can contain only numbers, letters or underscores
      No special characters: @#$%^&*
      Can contain upper and lowercase letters
  • 29. Topic: Data Manipulation
  • 30. Data Manipulation: Logic Statements Useful For Data Manipulation
      =   (EQ)   equal to
      ^=  (NE)   not equal to
      >   (GT)   greater than
      <   (LT)   less than
      >=  (GE)   greater than or equal to
      <=  (LE)   less than or equal to
      &   (AND)  and
      |   (OR)   or
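Each operator can be written as a symbol or as its mnemonic, and the two forms can be mixed in one expression. The sketch below assumes the pubschool dataset and its q1 and timezone variables from the earlier slides; the output name opdemo is hypothetical.

```sas
/* keep rows where q1 is between 1 and 4 and timezone is not 2 */
DATA opdemo;                       /* opdemo is a hypothetical name */
    SET pubschool;
    /* symbols (>=, <=, &) and mnemonics (AND, NE) are interchangeable */
    if (q1 >= 1 & q1 <= 4) AND timezone NE 2;
RUN;
```

Parentheses are not strictly required here, but they make the grouping of the conditions explicit.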
  • 31. Data Manipulation: Generate New Variables
      A data step command: write the new variable name, an equals sign, and the expression that defines it.

          DATA pubschool2;
              SET pubschool;
              /* create variable "myvar" equal to q1 */
              myvar = q1;
              /* create variable newvar2 equal to 1 */
              newvar2 = 1;
          RUN;

      Create new variable based on values of existing variable

          /* generate newvar3 based on q1 */
          DATA pubschool2;
              SET pubschool;
              if q1=1 then newvar3=1;
              else if q1=2 then newvar3=2;
              else if q1=3 then newvar3=3;
              else if q1=4 then newvar3=4;
              else newvar3=.;
          RUN;
  • 32. Data Manipulation: Saving Subsets of Data
      Subsets are created within a data step
      Create a subset of data including only participants who had a child attending a public school

          /* keep only rows where q10 is 1 */
          DATA CurrentPublic;
              SET pubschool;
              if q10=1;
          RUN;

      Create a subset of data including participants in timezone 1 or 2

          /* keep only rows where timezone is 1 or 2 */
          DATA CurrentPublic;
              SET pubschool;
              if timezone=1 | timezone=2;
          RUN;
  • 33. Data Manipulation: Missing Values
      SAS’s symbol for a missing numeric value is “.”
      In comparisons, SAS treats “.” as smaller than any number
      Need to be aware of this when you are manipulating data!
      What will happen when you use the < or <= operators?
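To answer the question on this slide, here is a small self-contained sketch. The dataset missdemo and its variables are hypothetical and not part of the workshop data. It compares a naive "less than" flag with one that guards against missing values using the MISSING function.

```sas
DATA missdemo;
    INPUT score;
    /* naive flag: a missing score also satisfies score < 3 */
    low_naive = (score < 3);
    /* guarded flag: stays missing when score is missing */
    if missing(score) then low_safe = .;
    else low_safe = (score < 3);
    DATALINES;
1
.
5
;
RUN;

PROC PRINT data = missdemo;
RUN;
```

For the row with the missing score, low_naive is 1 while low_safe stays missing, which is why recodes that use < or <= usually need an explicit check for missing values.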
  • 34. Data Manipulation: Exercise 3: Data Manipulation
      Use the ntcs.sas7bdat
      1 Attach variable labels using the following codebook:
          REGION = “Region in United States”, CITY = “Name of City and State”
      2 Create formats in SAS using the codebook below:
          C_SOUTH: 1 = Southern City, 0 = Non-Southern City
          C_WEST: 1 = Western City, 0 = Non-Western City
      3 Run “proc freq” on C_SOUTH and C_WEST and use formats from step 2
      4 Assign the formats for C_SOUTH and C_WEST permanently in your ntcs dataset
      5 Confirm with “proc freq” that your labels were correctly assigned
      6 Generate new variables based on the city-level crime variables “C_MURDRT” and “C_ROBBRT”. Choose values on which to dichotomize each variable and create new variables that have a score of “1” if the original variable is above that value, and a score of “0” otherwise.
      7 Confirm that your new variables were properly created using “proc freq”
  • 35. Topic: Wrap-up
  • 36. Wrap-up: Help us make this workshop better!
      Please take a moment to fill out a very short feedback form
      These workshops exist for you: tell us what you need!
      http://tinyurl.com/akyvzle
  • 37. Wrap-up: Other Services Available
      Institute for Quantitative Social Science
          www.iq.harvard.edu
      Computer labs
          www.iq.harvard.edu/facilities
      Research Technology Consulting
          www.iq.harvard.edu/researchconsulting
      Training
          http://projects.iq.harvard.edu/rtc/filter_by/workshops
  • 38. Wrap-up: Additional Resources
      How do I get SAS?
          Your Department IT
          HMDC labs
          RCE (Research Computing Environment)
          Buy it: educational or grad plan
      The RCE
          Research Computing Environment (RCE) service available to Harvard & MIT users
              www.iq.harvard.edu/research_computing
          Supplies a persistent desktop environment accessible from any computer with an internet connection
          Includes SAS, Stata, R, etc.