6. Data Forms
Human communication
Includes language, images and sounds
Computers
Process and store all forms of data in binary format
Conversion to computer-usable representation using data formats
Define the different ways human data may be represented, stored and processed by a computer
8. Common Data Representations
Type of Data: Standard(s)
Alphanumeric: BCD, ASCII, EBCDIC, Unicode
Image (bitmapped): GIF (Graphics Interchange Format), TIF (Tagged Image File Format), PNG (Portable Network Graphics), JPEG
Image (object): PostScript, SWF (Macromedia Flash), SVG
Outline graphics and fonts: PostScript, TrueType
Sound: WAV, MP3, MIDI, WMA
Page description: PDF (Adobe Portable Document Format), HTML, XML
Video and Sound: QuickTime, AVI, MPEG-2, MPEG-4, RealVideo, WMV
9. Alphanumeric Data
Groups of data:
Characters: A, B, …, Z and a, b, …, z
Numbers/digits: 0 … 9
Punctuation: !, ;, :, ?, etc.
Special-purpose characters: $, @, #, *, …, &
Four coding systems/standards represent the above types:
BCD (Binary-Coded Decimal)
ASCII (American Standard Code for Information Interchange)
EBCDIC (Extended Binary Coded Decimal Interchange Code)
Unicode
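As a rough illustration of the simplest of these four schemes, the sketch below stores each decimal digit in its own 4-bit group, which is the basic idea behind BCD. The helper name `to_bcd` is hypothetical and chosen only for this example.

```python
DIGITS = "0123456789"


def to_bcd(number: str) -> str:
    """Encode a string of decimal digits as space-separated 4-bit BCD groups."""
    if not number or any(d not in DIGITS for d in number):
        raise ValueError("this sketch handles decimal digits only")
    # Each digit 0..9 becomes its own 4-bit binary value.
    return " ".join(format(int(d), "04b") for d in number)


print(to_bcd("709"))  # 0111 0000 1001
```

Note that BCD on its own covers only digits; letters and punctuation need one of the other three codes.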
14. ASCII Features
Developed by ANSI (American National Standards Institute)
Defined in ANSI document X3.4-1977
7-bit code
8th bit is unused (or used for a parity bit or to indicate “extended” character set)
2^7 = 128 different codes
Two general types of codes:
95 are “Printing” codes (displayable on a console)
33 are “Control” codes (control features of the console or communications channel)
Represents
Latin alphabet, Arabic numerals, standard punctuation characters
Extended 8-bit sets (e.g. Latin-1) add a small set of accents and other European special characters
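The counts and the parity-bit idea above can be checked with a short sketch. It assumes even parity, which is only one of the conventions in use, and the helper name `even_parity_byte` is hypothetical.

```python
def even_parity_byte(ch: str) -> int:
    """Return the 7-bit ASCII code of ch with an even-parity bit in the 8th bit."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    # Set the 8th bit only when the 7-bit code has an odd number of 1 bits,
    # so the full byte always carries an even number of 1 bits.
    parity = bin(code).count("1") % 2
    return code | (parity << 7)


# Control codes are 0..31 plus DEL (127); printing codes are 32..126.
control = [c for c in range(128) if c < 32 or c == 127]
printing = [c for c in range(128) if 32 <= c <= 126]

print(len(control), len(printing))            # 33 95
print(format(even_parity_byte("a"), "08b"))   # 11100001
print(format(even_parity_byte("A"), "08b"))   # 01000001
```

The space character is counted here among the 95 printing codes, which is the convention that makes the 95 + 33 = 128 split work.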
17. ASCII Table
e.g., ‘a’ = 1100001 (row 110, column 0001)
Rows give the 3 most significant bits; columns give the 4 least significant bits. SP is the space character.
     0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
000  NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  BS   HT   LF   VT   FF   CR   SO   SI
001  DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  CAN  EM   SUB  ESC  FS   GS   RS   US
010  SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
011  0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
100  @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
101  P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
110  `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
111  p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~    DEL
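Reading the table amounts to concatenating the 3-bit row label with the 4-bit column label. A minimal sketch of that lookup in both directions follows; the function names `ascii_code` and `table_position` are hypothetical.

```python
def ascii_code(row_bits: str, column_bits: str) -> int:
    """Combine a 3-bit row label and a 4-bit column label into a 7-bit code."""
    if len(row_bits) != 3 or len(column_bits) != 4:
        raise ValueError("expected a 3-bit row and a 4-bit column")
    # The row supplies the high 3 bits, the column the low 4 bits.
    return (int(row_bits, 2) << 4) | int(column_bits, 2)


def table_position(ch: str) -> tuple:
    """Return the (row, column) bit strings under which ch appears in the table."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    return format(code >> 4, "03b"), format(code & 0x0F, "04b")


print(chr(ascii_code("110", "0001")))  # a
print(table_position("A"))             # ('100', '0001')
```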
18. ASCII Table
95 printing codes: space (0100000) through ~ (1111110), i.e. rows 010 to 111 of the table except DEL
19. ASCII Table
33 control codes: rows 000 and 001 of the table (0000000 through 0011111), plus DEL (1111111)
28. EBCDIC
8-bit code
Character codes differ from ASCII:
Character: ASCII, EBCDIC
Space: 20₁₆, 40₁₆
A: 41₁₆, C1₁₆
b: 62₁₆, 82₁₆
Conversion software to/from ASCII available
Developed by IBM for mainframe computers
Rarely used today, common in archival data
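The conversion mentioned above can be sketched with Python's built-in codecs. The slide does not say which EBCDIC code page it has in mind, so this example assumes `cp037` (IBM code page 037); other EBCDIC variants assign some characters differently. The helper names are hypothetical.

```python
EBCDIC_CODEC = "cp037"  # assumption: IBM code page 037, one of several EBCDIC variants


def ascii_to_ebcdic(data: bytes) -> bytes:
    """Re-encode ASCII bytes as EBCDIC bytes (under the assumed code page)."""
    return data.decode("ascii").encode(EBCDIC_CODEC)


def ebcdic_to_ascii(data: bytes) -> bytes:
    """Re-encode EBCDIC bytes (under the assumed code page) as ASCII bytes."""
    return data.decode(EBCDIC_CODEC).encode("ascii")


# Compare the codes of the three characters listed above.
for ch in (" ", "A", "b"):
    ascii_byte = ch.encode("ascii")[0]
    ebcdic_byte = ascii_to_ebcdic(ch.encode("ascii"))[0]
    print(repr(ch), format(ascii_byte, "02X"), format(ebcdic_byte, "02X"))
# ' ' 20 40
# 'A' 41 C1
# 'b' 62 82
```

Archival EBCDIC data can contain characters with no ASCII equivalent, in which case `ebcdic_to_ascii` raises `UnicodeEncodeError`; decoding to a Python string and keeping it as Unicode avoids that.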
32. Unicode
A 16-bit form represents 2^16 = 65,536 characters
Extended ASCII (Latin-1) is a subset of Unicode
Values 0 to 255 in Unicode table
Multilingual: defines codes for
Nearly every character-based alphabet
Chinese, Japanese and Korean characters
Allows software modifications for local-language representations
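A short sketch can make the two claims above concrete: the first 256 Unicode values match Latin-1, and characters from several scripts each fit in a single 16-bit value. The sample characters and the helper name `utf16_hex` are choices made for this example, and UTF-16 (big-endian) stands in for "the 16-bit form".

```python
def utf16_hex(ch: str) -> str:
    """Return the big-endian UTF-16 encoding of ch as a hex string."""
    return ch.encode("utf-16-be").hex()


# Values 0..255 decode to the same characters under Latin-1 and Unicode.
latin1_text = bytes(range(256)).decode("latin-1")
print(all(ord(c) == i for i, c in enumerate(latin1_text)))  # True

# Latin letter, accented letter, Chinese, Japanese and Korean samples,
# written as escapes so the source stays plain ASCII.
samples = ["A", "\u00e9", "\u4e2d", "\u3042", "\ud55c"]
for ch in samples:
    print(ch, f"U+{ord(ch):04X}", utf16_hex(ch))
```

Each of these samples occupies exactly two bytes; characters outside the first 65,536 code points need a pair of 16-bit units in UTF-16.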