2008 Summer North American Stata Users Group meeting
             Chicago, 24-25 July 2008



    Using SPSS files in Stata




               Sergiy Radyakin
               The World Bank
How to get the data in?
• Use Stata to manipulate the data and read it in
• Use the data producing application to export the data in the
  proper format that Stata can later import
• Use specialized conversion software to convert to a proper
  format
• Use another statistical package that supports both formats to
  convert the dataset
•   Write your own conversion program in/as:
    – Stata (slow, portable)
    – Mata (faster, portable)
    – Plugin (very fast, not portable, dependent on Stata’s bit-width)
    – Standalone (very fast, not portable, independent of Stata’s bit-width)

Which data formats does Stata
        support (as of v10)?

•   Stata native formats (use)
•   ASCII data, with or without dictionaries (insheet/infile/infix)
•   SAS XPORT format (fdause)
•   Data import via ODBC, provided that a
    required driver is installed and configured

• But, no SPSS support

When SPSS is available:
• SPSS v14 and later supports exporting data to Stata
  format
• The SPSS_to_Stata_00.sbs script by Alasdair Crockett is
  available for earlier releases; it requires both SPSS
  and Stata for the conversion
      Data Services Guides: SPSS_to_Stata Conversion Utility Guide
      http://www.data-archive.ac.uk/support/conversionguide.pdf

• This can be automated with an .ado wrapper similar
  to USESAS by Dan Blanchette, which requires SAS
  to be installed to import data to Stata
• These are not “true readers”, since they require
  SPSS or SAS to be installed (with license costs, etc.)
Specialized Conversion Software
• Stat/Transfer
   – http://www.stattransfer.com/
   – $295 (New unit, Windows)

• DBMS/Copy
   – http://www.dataflux.com/Product-Services/Products/dbms.asp
   – $495 (New individual, Windows)

• Both support command-line parameters for converting in
  batch mode and thus can be “wrapped” for use with
  Stata; see e.g. STCMD by Roger Newson

                                        (as of July 13, 2008)
USESPSS
• USESPSS is a new command for Stata to
  read in SPSS data (*.sav files)
• It is a “true reader”: it does not require any
  other software (other than the Windows OS)
• Free
• Implemented as a plugin, with portions of
  code (e.g. file decompression) written in
  assembler for performance optimization
• Note: SPSS format documentation has not been
  released, and only fragmented information is
  available on the Internet
USESPSS Features
• Reads *.sav files originating from both Windows and UNIX
  versions of SPSS (LoHi and HiLo byte orders)

• Supports compressed and non-compressed SPSS files

• Preserves variable and value labels

• Optimizes data storage types (2-pass)

• Supports long variable names

• Automatically renames variable names that Stata does not allow and
  resolves naming collisions
• Preserves the number of decimals in numeric formats
• Transfers date/time variables, but does not format them

USESPSS Syntax
usespss can be used like any other command: on the
command line, in users’ .do files, and in .ado programs:

  usespss [using] "filename.sav"
    [, clear
    saving("filename.dta")
    iff(condition)
    inn(condition)
    memory(memsize)
    lowmemory(memsize)]
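
A minimal usage sketch based on the syntax above. The file names mydata.sav and mydata.dta are hypothetical, and the slides do not spell out the exact behavior of saving(), iff() and inn(), so only clear and saving() are shown and their effect should be checked against the command's own help file.

```stata
* Assumes usespss is installed and mydata.sav (hypothetical name)
* is in the current working directory.

* Load the SPSS file, replacing any dataset currently in memory
usespss using "mydata.sav", clear

* Inspect the imported data with standard Stata commands
describe
summarize

* Same call, additionally requesting a Stata-format copy via saving()
usespss using "mydata.sav", clear saving("mydata.dta")
```

Because usespss behaves like an ordinary command, the same lines can be placed unchanged in a .do file.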
Memory Tradeoff
• Stata and plugins share the same address space
• As a consequence, plugins can read Stata’s data
  directly (if they know where it is located) and call
  Stata’s subroutines (if exposed).
• However, the more memory is allocated for Stata
  data, the less memory is available to the plugins,
  because the size of the address space is limited
  (typically 2GB on a 32-bit Windows system). In other
  words, plugins compete for memory between
  themselves and with Stata.

Memory Tradeoff
• Similarly to Stata, usespss attempts to load the whole
  data file into memory; this speeds up the 2-pass
  processing (1st pass – optimization of the storage
  types, 2nd pass – actual conversion)
• But when the user loads the SPSS data, any data in Stata’s
  memory is discarded, so Stata’s memory allocation can be
  temporarily decreased within usespss.ado
• It is important to do this when working with large files,
  otherwise the plugin will not be able to allocate
  enough memory to load the SPSS data file.

Memory Use
Consider the following code:

    set mem 800m
    usespss using "mydata.sav", lowmemory(10) memory(800)

[Figure: memory use over time within the address-space limit (e.g. 2GB).
Vertical axis: memory; horizontal axis: time. Regions shown: Stata code,
Stata data, plugin code, plugin data, free memory.]

Sequence of events shown in the figure:
1. usespss.ado starts
2. Any dataset in Stata’s memory is cleared
3. Stata memory is temporarily set to a low value (10m)
4. Stata memory is set to a higher value (800m)
5. usespss.ado ends
DESSPSS
• desspss is a new Stata command to describe
  the contents of an SPSS system *.sav file
• does not destroy the data in memory
• works much faster than
     usespss using filename.sav, saving(filename.dta)
     describe

  because no optimization/conversion is
  actually performed, but does not list the
  variable types (these are determined after
  optimization)
• saves all descriptive information in r()
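
A short sketch of how the stored results might be inspected. It assumes desspss is installed and uses artificial.sav, the file name from the example report; the names of the individual r() results are not given in the slides, so none are assumed here.

```stata
* Describe the SPSS file without touching the data in memory
desspss using "artificial.sav"

* List everything the command left behind in r();
* the result names themselves are not documented in the slides
return list
```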
DESSPSS Example Report
. desspss using artificial.sav

DESSPSS Report
==============
SPSS System file: artificial.sav
Created (date): 17-Jul- 8
Created (time): 22: 4: 0
SPSS product: SPSS-X SYSTEM FILE. SPSS 5.0 MS/Windows made by DBMS/COPY
File label (if present):
File size (as stored on disk): 382692 bytes
Data size: 381432 bytes
Data stored in compressed format
This file is likely to originate from a Windows platform (LoHi byte order)

Number of cases (observations): 10000
Number of variables: 10
Case size: 88 bytes
----------------------------------------------------------------------

Variables:

GENDER    MARRIED    B_YEAR   W_HOURS    CITY_COD
AGE       EMP_STAT   WAGE     FULLTIME   CITY_NAM
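
The figures in this report can be cross-checked with simple arithmetic. The sketch below assumes that "Case size" is the uncompressed size of one case and that "Data size" is the size of the data as stored (compressed); under that reading, 10000 cases of 88 bytes would occupy 880,000 bytes uncompressed, so the stored data is roughly 43% of that.

```stata
* Figures copied from the example report above
local cases    = 10000      // Number of cases (observations)
local casesize = 88         // Case size, bytes
local datasize = 381432     // Data size as reported, bytes

* Uncompressed size implied by cases x case size
display `cases' * `casesize'

* Stored data size as a share of the implied uncompressed size
display %5.3f `datasize' / (`cases' * `casesize')
```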


Demonstration:
• Embedded artificially created dataset in SPSS format: artificial.sav

  Clicking the icon opens the SPSS file in Stata if:
  1. usespss is installed in Stata, and
  2. the file association has been set:

          --------------------- beginning of sav_file.reg --------------------
          Windows Registry Editor Version 5.00
          [HKEY_CLASSES_ROOT\.sav]
          @="sav_auto_file"
          [HKEY_CLASSES_ROOT\sav_auto_file]
          @="SPSS Dataset"
          [HKEY_CLASSES_ROOT\sav_auto_file\shell]
          [HKEY_CLASSES_ROOT\sav_auto_file\shell\open]
          [HKEY_CLASSES_ROOT\sav_auto_file\shell\open\command]
          @="\"C:\\Stata10se\\wsestata.exe\" usespss \"%1\""
          ---------------------- end of sav_file.reg -------------------------

  Substitute the full path of your own Stata executable for the one shown.

• Questions?
