CHAPTER 1: PREPARING DATA
1. Types of Data
Three kinds of datasets are distinguished:
Cross-Sections (Unit of observation varies, time of observation fixed)
Observations   Year   Growth   Savings   Middle East
Afghanistan    2005   -5       0.3       0
:              2005    2       0.2       0
Lebanon        2005    1       0.1       1
:              2005    2       0.4       0
Zimbabwe       2005   -1       0.2       0
Time series (Unit of observation fixed, time of observation varies)
Observations   Year   Growth   Savings
Lebanon        2001   3.1      0.3
Lebanon        2002   2.8      0.2
Lebanon        2003   1.4      0.2
Lebanon        2004   2.1      0.3
Lebanon        2005   2.5      0.3
Panels (Unit of observation varies, time of observation varies)
Observations   Year   Growth   Savings   Middle East
Afghanistan    2000   -5       0.3       0
:              :       2       0.2       0
Afghanistan    2005    1       0.1       0
:              :       2       0.4       0
Lebanon        2000   -1       0.2       1
:              :       1       0.4       1
Lebanon        2005    1       0.3       1
:              :       3       0.2       0
Zimbabwe       2000    2       0.1       0
:              :      -1       0.05      0
Zimbabwe       2005   -5       0.01      0
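How the three types relate can be sketched in a few lines of code. The Python below is only an illustration (the course itself works in Excel). The growth figures are the rows visible in the panel table above, and the names `panel`, `cross_section` and `time_series` are hypothetical. Fixing the year of a panel gives a cross-section; fixing the unit gives a time series.

```python
# Panel: each record is (unit, year, growth), copied from the panel table above.
panel = [
    ("Afghanistan", 2000, -5), ("Afghanistan", 2005, 1),
    ("Lebanon", 2000, -1), ("Lebanon", 2005, 1),
    ("Zimbabwe", 2000, 2), ("Zimbabwe", 2005, -5),
]


def cross_section(records, year):
    """Fix the time of observation; the unit of observation varies."""
    return [(unit, growth) for unit, y, growth in records if y == year]


def time_series(records, unit):
    """Fix the unit of observation; the time of observation varies."""
    return [(year, growth) for u, year, growth in records if u == unit]


print(cross_section(panel, 2005))
print(time_series(panel, "Lebanon"))
```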
2. Preparing Datasets – Practical Tips
In practice, it is best to keep each dataset as an Excel file and to copy
it from there into a statistical analysis program.
In preparing your datasets, you often retrieve data in table form and
you need to stack your observations.
For example, what you download looks like this:
              2006   2007   2008
Afghanistan   a      b      c
:             :      :      :
Lebanon       d      e      f
:             :      :      :
Zimbabwe      g      h      i
but you need it like this:
Country       Year   Observation
Afghanistan   2006   a
Afghanistan   2007   b
Afghanistan   2008   c
:             :      :
Lebanon       2006   d
Lebanon       2007   e
Lebanon       2008   f
:             :      :
Zimbabwe      2006   g
Zimbabwe      2007   h
Zimbabwe      2008   i
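which means that you need to transpose your data first (so that the years
run down the rows and the countries across the columns) and then stack the
columns on top of one another.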
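The reshaping itself can be sketched outside Excel as well. The Python below is an illustration under stated assumptions, not part of the course material: the table is held as a dictionary with one list of values per country, the placeholder letters a to i are those of the example above, and the function name `stack` is hypothetical.

```python
# Downloaded layout: one row per country, one column per year.
years = [2006, 2007, 2008]
wide = {
    "Afghanistan": ["a", "b", "c"],
    "Lebanon": ["d", "e", "f"],
    "Zimbabwe": ["g", "h", "i"],
}


def stack(wide_table, years):
    """Return one (country, year, observation) row per cell, country by country."""
    long_rows = []
    for country, values in wide_table.items():
        for year, value in zip(years, values):
            long_rows.append((country, year, value))
    return long_rows


for row in stack(wide, years):
    print(*row)
```

Each printed line corresponds to one row of the stacked Country / Year / Observation layout.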
Dataset building needs a little bit of practice.
The following macro does the stacking for you in Excel.
“Stacking Macro” for Excel (Example: 45 rows, 220 columns)
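Sub SORTY()
'
' Macro1 Macro
' Macro recorded 3/19/2004 by mm53
'
' Keyboard Shortcut: Ctrl+d
'
' Repeatedly moves column B underneath column A and then deletes column B,
' so that the next column shifts left and becomes the new column B.
' Adjust 45 (rows per column) and 220 (number of passes) to your own data.
'
    Dim i As Long
    Dim x As Object
    For i = 1 To 220
        Range("B1:B45").Select
        Selection.Copy
        Set x = ActiveCell                  ' B1, the top of the copied block
        x.Offset(45 * i, -1).Select         ' column A, 45 * i rows further down
        ActiveSheet.Paste
        Application.CutCopyMode = False
        Columns("B:B").Select
        Selection.Delete Shift:=xlToLeft    ' xlToLeft with the letter l, not the digit 1
        Range("B1").Select
    Next i
End Sub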
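To see why the data must be transposed before the macro is run, it helps to trace what stacking column by column does. The Python below is only a sketch that mimics the macro's logic on the small a to i example. The function names `transpose` and `stack_columns` are hypothetical, and no Excel is involved.

```python
def transpose(grid):
    """Turn rows into columns (in Excel: Paste Special > Transpose)."""
    return [list(column) for column in zip(*grid)]


def stack_columns(grid):
    """Mimic the macro: append each further column below the first one."""
    columns = transpose(grid)       # work column by column
    stacked = list(columns[0])      # column A stays where it is
    for column in columns[1:]:      # each remaining column takes its turn as column B
        stacked.extend(column)
    return stacked


# Downloaded layout: one row per country, one column per year.
downloaded = [
    ["a", "b", "c"],   # Afghanistan 2006, 2007, 2008
    ["d", "e", "f"],   # Lebanon
    ["g", "h", "i"],   # Zimbabwe
]

print(stack_columns(downloaded))             # grouped by year
print(stack_columns(transpose(downloaded)))  # grouped by country
```

Stacking the downloaded table directly groups the observations by year. Only the transposed table gives the country-by-country order shown earlier.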
A note on different software programs
When preparing datasets for use in “gretl”, make sure that all columns
except the first contain numbers. Other programs, such as NCSS, are
more tolerant in this regard and will also read columns with strings.
This means that “gretl” requires a separate column for each regional
dummy, while NCSS can read different regions from a single column.
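Turning a single column of region names into one 0/1 column per region can be sketched as follows. This is an illustration only: the region labels "Middle East" and "Other" are assumptions chosen to match the Middle East dummy in the tables above, and the function name `region_dummies` is hypothetical.

```python
# One string column: the region of each country (labels are assumed).
regions = {
    "Afghanistan": "Other",
    "Lebanon": "Middle East",
    "Zimbabwe": "Other",
}


def region_dummies(region_by_country):
    """Return, for each country, one 0/1 dummy per distinct region label."""
    labels = sorted(set(region_by_country.values()))
    return {
        country: {label: int(region == label) for label in labels}
        for country, region in region_by_country.items()
    }


for country, dummies in region_dummies(regions).items():
    print(country, dummies)
```

Each region label becomes a numeric column of its own, which is the layout the note above asks for.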