data-structures_unit-01.pdf

MR. JAYANAND KAMBLE
WELCOME TO JAK’S TUTORIAL
Prof. Jayanand Kamble

SUBJECT NAME: BTCOC303 DATA STRUCTURES
Weekly teaching hrs | Evaluation scheme | Credit
L   T   P           | CA   MSE   ESE    |
3   4   8           | 20   20    60     | 4

CONTENTS: UNIT 1 [6 HRS]
• Introduction: Data, Data types, Data structure,
• Abstract Data Type (ADT), representation of Information,
• characteristics of algorithm, program,
• analyzing programs.

INTRODUCTION: DATA
• Computer data is information processed or stored by a computer.
• This information may be in the form of text documents, images, audio clips, software
programs, or other types of data.
• Computer data may be processed by the computer's CPU and is stored in files and
folders on the computer's hard disk.

• Data can be defined as a representation of facts, concepts, or instructions in a formalized
manner suitable for communication, interpretation, or processing by humans or electronic
machines.
• Data is represented with the help of characters such as letters (A-Z, a-z), digits (0-9),
or special characters (+, -, /, *, <, >, = etc.).

• Information is organized or classified data, which has some meaningful values for the
receiver.
• Information is the processed data on which decisions and actions are based.

DATA TYPES
• A data type is a collection of objects and a set of operations that act on those objects.

DATA STRUCTURE
• A data structure is a way of storing and organizing data in a computer so that it can be
used efficiently and so that the most efficient algorithm can be applied to it.
• A data structure should be seen as a logical concept that must address two fundamental
concerns.
• First, how the data will be stored, and
• Second, what operations will be performed on it.

TYPES OF DATA STRUCTURES
• There are two types of data structures:
• Primitive data structure
• Non-primitive data structure

PRIMITIVE DATA STRUCTURE
• The primitive data structures are primitive data types.
• Types such as int, char, float, double, and pointer are primitive data structures; each
holds a single value.

NON-PRIMITIVE DATA STRUCTURE
• The non-primitive data structure is divided into two types:
• Linear data structure
• Non-linear data structure

LINEAR & NON-LINEAR DATA STRUCTURE
• The arrangement of data in a sequential manner is known as a linear data structure.
• The data structures used for this purpose are arrays, linked lists, stacks, and queues.
• In these data structures, each element is connected to only one other element in a
linear form.
• When one element can be connected to 'n' other elements, the structure is known as a
non-linear data structure.
• The best examples are trees and graphs.
• In this case, the elements are not arranged sequentially; they form hierarchical or network
relationships.
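The difference can be sketched in a few lines of Python. This is only an illustration: the names `linear`, `tree`, `successors_in_list` and `count_nodes` are made up for this example, and storing a tree as a dictionary of child lists is just one possible representation.

```python
# Linear: each element has at most one predecessor and one successor.
linear = [10, 20, 30, 40]

# Non-linear: one element may be connected to several others.
# Here a tree is stored as a mapping from each node to its children.
tree = {
    "root": ["a", "b"],
    "a": ["c", "d"],
    "b": [],
    "c": [],
    "d": [],
}


def successors_in_list(items, index):
    """Return the elements directly after position `index` (zero or one)."""
    return items[index + 1:index + 2]


def count_nodes(node):
    """Visit a node and, recursively, every node below it."""
    return 1 + sum(count_nodes(child) for child in tree[node])
```

In the list, an element never has more than one successor; in the tree, "root" and "a" each have two.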

DIFFERENCE BETWEEN LINEAR AND NON-LINEAR DATA STRUCTURES
Linear Data Structure | Non-Linear Data Structure
Every item is related to its previous and next item. | Every item may be attached to many other items.
Data is arranged in a linear sequence. | Data is not arranged in sequence.
Data items can be traversed in a single run. | Data items cannot be traversed in a single run.
E.g. array, stack, linked list, queue. | E.g. tree, graph.
Implementation is easy. | Implementation is difficult.

DATA STRUCTURES CAN ALSO BE CLASSIFIED AS:
• Static data structure:
• It is a type of data structure whose size is allocated at compile time.
• Therefore, the maximum size is fixed.
• Dynamic data structure:
• It is a type of data structure whose size is allocated at run time.
• Therefore, the maximum size is flexible.
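A rough Python sketch of the two behaviours follows. Python has no compile-time allocation, so the hypothetical `FixedArray` class below only imitates a static structure by refusing to grow beyond the capacity chosen at creation; the built-in list plays the role of the dynamic structure.

```python
class FixedArray:
    """Static-style structure: capacity is chosen once and never changes."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._used = 0

    def append(self, value):
        if self._used == len(self._slots):
            raise OverflowError("fixed array is full")
        self._slots[self._used] = value
        self._used += 1

    def __len__(self):
        return self._used


fixed = FixedArray(2)
fixed.append("a")
fixed.append("b")
# A third append would raise OverflowError: the maximum size is fixed.

dynamic = []  # Python lists grow at run time as needed
for value in "abc":
    dynamic.append(value)
```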

• A few areas in which data structures are used:
• Compiler Design,
• Operating System,
• Database Management System,
• Statistical analysis package,
• Numerical Analysis,
• Graphics,
• Artificial Intelligence,
• and Simulation.

• Data structures are used in the following data models: RDBMS, network data model, and
hierarchical data model.
• RDBMS uses the array data structure.
• Network data model uses graphs.
• Hierarchical data model uses trees.

ABSTRACT DATA TYPE (ADT)
• A data type is a classification of data that can be used in different computer programs.
• It specifies the kind of value (integer, float, etc.) and the storage it needs; for example,
an integer typically takes 4 bytes and a character takes 1 byte.
• An abstract data type is a special kind of data type whose behavior is defined by a set of
values and a set of operations.
• An abstract data type (ADT) is a data type that is organized in such a way that the
specification of the objects and the operations on the objects is separated from the
representation of the objects and the implementation of the operations.

• The word “abstract” is used because we can use these data types and perform different
operations on them, but how those operations work is hidden from the user.
• An ADT is built from primitive data types, but the logic of its operations is hidden.
• Some examples of ADTs are Stack, Queue, and List.
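As an illustration of this hiding, here is a minimal Python sketch of a stack ADT. The class name `Stack` and its method names are choices made for this example, not something the slides prescribe; the point is that callers use only the operations, while the underlying list could be swapped for another representation without affecting them.

```python
class Stack:
    """Stack ADT: clients use push/pop/peek/is_empty only."""

    def __init__(self):
        self._items = []  # hidden representation; could be a linked list instead

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def is_empty(self):
        return not self._items


s = Stack()
s.push(1)
s.push(2)
top = s.pop()  # 2: the most recently pushed item comes off first
```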

ADT NAT_NO
structure Natural_Number is
  objects: an ordered subrange of the integers starting at zero and ending
           at the maximum integer (INT_MAX) on the computer
  functions:
    for all x, y ∈ Nat_Number; TRUE, FALSE ∈ Boolean
    and where +, -, <, and == are the usual integer operations.
    Nat_No Zero( )          ::= 0
    Boolean Is_Zero(x)      ::= if (x) return FALSE
                                else return TRUE
    Nat_No Add(x, y)        ::= if ((x+y) <= INT_MAX) return x+y
                                else return INT_MAX
    Boolean Equal(x, y)     ::= if (x == y) return TRUE
                                else return FALSE
    Nat_No Successor(x)     ::= if (x == INT_MAX) return x
                                else return x+1
    Nat_No Subtract(x, y)   ::= if (x < y) return 0
                                else return x-y
end Natural_Number
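The specification above says what each operation returns but not how it is implemented. One possible implementation, sketched in Python, is shown below. The value chosen for `INT_MAX` is an assumption (the specification leaves it machine-dependent), and the lower-case function names are simply Python spellings of the operations listed.

```python
INT_MAX = 2**31 - 1  # assumed maximum; the specification leaves it machine-dependent


def zero():
    return 0


def is_zero(x):
    return x == 0


def add(x, y):
    # Saturate at INT_MAX instead of exceeding the allowed range.
    return x + y if x + y <= INT_MAX else INT_MAX


def equal(x, y):
    return x == y


def successor(x):
    return x if x == INT_MAX else x + 1


def subtract(x, y):
    # Natural numbers have no negatives, so the result bottoms out at 0.
    return 0 if x < y else x - y
```

For example, `subtract(3, 5)` gives 0 and `add(INT_MAX, 1)` stays at `INT_MAX`, exactly as the specification requires.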

SOME COMMON ADTS, WHICH HAVE PROVED
USEFUL IN A GREAT VARIETY OF APPLICATIONS, ARE
• Container
• Deque
• List
• Map
• Multimap
• Multiset
• Priority queue
• Queue
• Set
• Stack
• String
• Tree

• Each of these ADTs may be defined in many ways and variants, not necessarily equivalent.
• For example, a stack ADT may or may not have a count operation that tells how many
items have been pushed and not yet popped.
• This choice makes a difference not only for its clients but also for the implementation.

REPRESENTATION OF INFORMATION
• Information:
• Any type of knowledge that can be exchanged. In an exchange, it is represented by data.
• An example is a string of bits (the data) accompanied by a description of how to interpret a
string of bits as numbers representing temperature observations measured in degrees Celsius
• Data:
• A reinterpretable representation of information in a formalized manner suitable for
communication, interpretation, or processing

• Data Representation refers to the form in which data is stored, processed, and
transmitted.
• Devices such as smartphones, iPods, and computers store data in digital formats that can
be handled by electronic circuitry.
• Digitization is the process of converting information, such as text, numbers, photo, or
music, into digital data that can be manipulated by electronic devices.

• The 0s and 1s used to represent digital data are referred to as binary digits — from this
term we get the word bit that stands for binary digit.
• A bit is a 0 or 1 used in the digital representation of data.
• A digital file, usually referred to simply as a file, is a named collection of data that exists on
a storage medium, such as a hard disk, CD, DVD, or flash drive.

REPRESENTING NUMBERS
• Numeric data consists of numbers that can be used in arithmetic operations.
• Digital devices represent numeric data using the binary number system, also called base 2.
• The binary number system only has two digits: 0 and 1.
• No numeral like 2 exists in the system, so the number “two” is represented in binary as 10
(pronounced “one zero”).
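The conversion to base 2 can be sketched with repeated division by two. The helper name `to_binary` is made up for this example; Python's built-in `int(text, 2)` is used to convert back.

```python
def to_binary(n):
    """Convert a non-negative integer to its base-2 digit string."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # remainder is the next lowest-order bit
        n //= 2
    return "".join(reversed(digits))


print(to_binary(2))    # 10
print(to_binary(13))   # 1101
print(int("1101", 2))  # 13, converting back with the built-in parser
```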

REPRESENTING TEXT
• Character data is composed of letters, symbols, and numerals that are not used in
calculations.
• Examples of character data include your name, address, and hair color.
• Character data is commonly referred to as “text.”
• Digital devices employ several types of codes to represent character data, including ASCII,
Unicode, and their variants.
• ASCII (American Standard Code for Information Interchange, pronounced “ASKee”) requires
seven bits for each character.
• The ASCII code for an uppercase A is 1000001.

• Extended ASCII is a superset of ASCII that uses eight bits for each character.
• For example, Extended ASCII represents the uppercase letter A as 01000001.
• Using eight bits instead of seven bits allows Extended ASCII to provide codes for 256
characters.
• Unicode (pronounced “YOU ni code”) originally used sixteen bits, providing codes for about
65,000 characters; the current standard has room for more than one million code points.
• This is a bonus for representing the alphabets of multiple languages.
• UTF-8 is a variable-length coding scheme that uses one byte (eight bits) for ASCII characters
and two to four bytes for other Unicode characters.
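These codes can be inspected directly in Python. The sketch below uses the built-ins `ord`, `format` and `str.encode`; the sample characters and the dictionary keys are arbitrary choices for illustration, written as escape sequences so the file itself stays plain ASCII.

```python
code = ord("A")                   # code point of uppercase A
seven_bit = format(code, "07b")   # 7-bit ASCII pattern
eight_bit = format(code, "08b")   # 8-bit Extended ASCII pattern

print(code, seven_bit, eight_bit)  # 65 1000001 01000001

# Number of bytes Python's UTF-8 encoder produces for a few sample characters.
samples = {
    "A": "A",
    "e-acute": "\u00e9",
    "Devanagari A": "\u0905",
    "emoji": "\U0001F600",
}
utf8_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
print(utf8_lengths)
```

The lengths come out as 1, 2, 3 and 4 bytes respectively, which is what "variable-length" means in practice.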

• ASCII codes are also used for numerals that are not used in calculations, such as Social
Security numbers and phone numbers.
• Plain, unformatted text is sometimes called ASCII text and is stored in a so-called text file
with a name ending in .txt.
• On Apple devices these files are labeled “Plain Text.” In Windows, these files are labeled
“Text Document”.
• ASCII text files contain no formatting. To create documents with styles and formats,
formatting codes have to be embedded in the text.

• Microsoft Word produces formatted text and creates documents in DOCX format.
• Apple Pages produces documents in PAGES format.
• Adobe Acrobat produces documents in PDF format.
• The HTML markup language used for Web pages produces documents in HTML format.

BITS AND BYTES
• All the data stored and transmitted by digital devices is encoded as bits.
• Terminology related to bits and bytes is extensively used to describe storage capacity and
network access speed.
• The word bit, an abbreviation for binary digit, can be further abbreviated as a lowercase b.
• A group of eight bits is called a byte and is usually abbreviated as an uppercase B.
• When reading about digital devices, you’ll frequently encounter references such as 90
kilobits per second, 1.44 megabytes, 2.8 gigahertz, and 2 terabytes.
• Kilo, mega, giga, tera, and similar terms are used to quantify digital data.

• Use bits for data rates, such as Internet connection speeds, and movie download speeds.
• Use bytes for file sizes and storage capacities.
• 104 KB: Kilobyte (KB or Kbyte) is often used when referring to the size of small
computer files.
• 56 Kbps: Kilobit (Kb or Kbit) can be used for slow data rates, such as a 56 Kbps (kilobits
per second) dial-up connection.
• 50 Mbps: Megabit (Mb or Mbit) is used for faster data rates, such as a 50 Mbps (megabits
per second) Internet connection.
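Because file sizes are quoted in bytes and data rates in bits, estimating a transfer time needs a factor of eight. The sketch below is only illustrative: the function name `download_seconds` is made up, the 104 KB file is hypothetical, and the prefixes are taken as powers of ten (1 KB = 1,000 bytes), which is an assumption.

```python
BITS_PER_BYTE = 8


def download_seconds(file_size_bytes, rate_bits_per_second):
    """Time to transfer a file, converting its size from bytes to bits first."""
    return file_size_bytes * BITS_PER_BYTE / rate_bits_per_second


# Hypothetical 104 KB file over the two kinds of link mentioned above.
size = 104 * 1000
dial_up = download_seconds(size, 56 * 1000)            # 56 Kbps
broadband = download_seconds(size, 50 * 1000 * 1000)   # 50 Mbps

print(round(dial_up, 2), "seconds on dial-up")
print(round(broadband, 4), "seconds on broadband")
```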

CHARACTERISTICS OF ALGORITHM
• Algorithms:
• A finite sequence of instructions, each of which has a clear meaning and can be performed
with a finite amount of effort in a finite length of time.
• An algorithm is a finite set of instructions that accomplishes a particular task

STRUCTURE AND PROPERTIES OF ALGORITHM:
• An algorithm has the following structure
1. Input Step
2. Assignment Step
3. Decision Step
4. Repetitive Step
5. Output Step

PROPERTIES OF ALGORITHM
• An algorithm is endowed with the following properties:
1. Finiteness: An algorithm must terminate after a finite number of steps.
2. Definiteness: The steps of the algorithm must be precisely defined or unambiguously specified.
3. Generality: An algorithm must be generic enough to solve all problems of a particular class.
4. Effectiveness: The operations of the algorithm must be basic enough to be carried out with
pencil and paper. They should not be so complex as to warrant writing another algorithm for the
operation.
5. Input-Output: The algorithm must have certain initial and precise inputs, and outputs that may
be generated both at its intermediate and final steps.

PRACTICAL ALGORITHM DESIGN ISSUES:
• To save time (Time Complexity): A program that runs faster is a better program.
• To save space (Space Complexity): A program that saves space over a competing program
is considered desirable.

EFFICIENCY OF ALGORITHMS
• The performances of algorithms can be measured on the scales of time and space.
• The performance of a program is the amount of computer memory and time needed to
run a program.
• We use two approaches to determine the performance of a program.
• One is analytical and the other is experimental.
• In performance analysis we use analytical methods, while in performance measurement
we conduct experiments.

• Time Complexity:
• The time complexity of an algorithm or a program is a function of the running time of the
algorithm or a program.
• In other words, it is the amount of computer time it needs to run to completion.
• Space Complexity:
• The space complexity of an algorithm or program is a function of the space needed by the
algorithm or program to run to completion.

• The time complexity of an algorithm can be computed either by an empirical or
theoretical approach.
• The empirical or a posteriori testing approach calls for implementing the complete
algorithms and executing them on a computer for various instances of the problem.
• The time taken by the execution of the programs for various instances of the problem
are noted and compared.
• The algorithm whose implementation yields the least time is considered as the best
among the candidate algorithmic solutions.
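The empirical approach might look like the following Python sketch. The two candidate algorithms (a linear and a binary search) are chosen here purely as an example and do not come from the slides; `time_it` is a hypothetical helper built on the standard `time.perf_counter`. Measured times depend on the machine, so no expected output is shown.

```python
import time


def linear_search(items, target):
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1


def binary_search(items, target):
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1


def time_it(func, items, target, repeats=20):
    """Return the best wall-clock time over several runs, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(items, target)
        best = min(best, time.perf_counter() - start)
    return best


for n in (1_000, 10_000, 100_000):
    data = list(range(n))  # sorted input
    target = n - 1         # last element: worst case for linear search
    print(n, time_it(linear_search, data, target), time_it(binary_search, data, target))
```

Running both implementations on the same instances and comparing the noted times is exactly the procedure described above; the implementation with the smaller times would be preferred.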

ANALYZING ALGORITHMS
• Suppose M is an algorithm, and suppose n is the size of the input data.
• Clearly the complexity f(n) of M increases as n increases.
• It is usually the rate of increase of f(n) that we want to examine, and this is done by
comparing f(n) with some standard functions.
• The most common computing times are
• O(1), O(log n), O(n), O(n log n), O(n^2), O(n^3), O(2^n).
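To get a feel for how differently these standard functions grow, the following Python sketch tabulates them for a few input sizes. The dictionary name `growth` and the sample sizes are arbitrary, and taking the logarithm to base 2 is an assumption made for the example.

```python
import math

# Standard comparison functions, listed from slowest to fastest growing.
growth = {
    "1": lambda n: 1,
    "log n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
    "2^n": lambda n: 2 ** n,
}

for n in (8, 16, 32):
    row = ", ".join(f"{name}={f(n):g}" for name, f in growth.items())
    print(f"n={n}: {row}")
```

At n = 32 the quadratic term is 1,024 while the exponential term already exceeds four billion.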

EXAMPLES

• The total frequency counts of the program segments A, B and C, given by 1, (3n+1) and
(3n^2+3n+1) respectively, are expressed as O(1), O(n) and O(n^2).
• These are referred to as the time complexities of the program segments since they are
indicative of the running times of the program segments.
• In a similar manner, the space complexity of a program, which is the amount of memory it
requires for its execution, can also be expressed in mathematical notation.

REASONS FOR ANALYZING ALGORITHMS
• To predict the resources that the algorithm requires
• Computational time (CPU consumption).
• Memory space (RAM consumption).
• Communication bandwidth consumption.
• To predict the running time of an algorithm
• Total number of primitive operations executed.

PROGRAM: HOW TO CREATE PROGRAMS?
• Requirements:
• Make sure you understand the information you are given (the input) and what results you are
to produce (the output).
• Try to write down a rigorous description of the input and output which covers all cases.
• Design:
• You may have several data objects (such as a maze, a polynomial, or a list of names).
• For each object there will be some basic operations to perform on it (such as print the maze,
add two polynomials, or find a name in the list.)
• Assume that these operations already exist in the form of procedures and write an algorithm
which solves the problem according to the requirements.
• Use a notation which is natural to the way you wish to describe the order of processing.

• Analysis:
• If you can think of another algorithm, then write it down.
• Next try to compare the two algorithms you have in hand.
• It may already be possible to tell if one will be more desirable than the other.
• If you cannot distinguish between two, choose one to work on for now.
• Refinement and coding:
• You must now choose representations for your data objects and write algorithms for each of
the operations on these objects

• Verification:
• Verification consists of three distinct aspects: program proving, testing and debugging.
• Each of these is an art.
• Before executing your program you should attempt to prove it is correct.
• Testing is the art of creating sample data upon which to run your program.
• If the program fails to run correctly then debugging is needed to determine what went wrong
and how to correct it.

EXAMPLE OF CREATING PROGRAM
• Suppose we devise a program for sorting a set of n >= 1 integers.
• A simple solution is: "from those integers which remain unsorted, find the smallest and
place it next in the sorted list."
for i:=1 to n do
begin
  examine a[i] to a[n] and suppose the smallest integer is at a[j];
  interchange a[i] and a[j];
end;

• There now remain two clearly defined subtasks:
• (i) to find the minimum integer and
• (ii) to interchange it with a[i].
• This latter problem can be solved by the code
• t:=a[i]; a[i]:=a[j]; a[j]:=t;

• The first subtask can be solved by assuming the minimum is a[i], comparing a[i] with a[i+1],
a[i+2], ... and, whenever a smaller element is found, regarding it as the new minimum.
• Eventually a[n] is compared to the current minimum and we are done.
• Putting all these observations together we get the procedure sort.
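The combined procedure is not reproduced in these notes, so the following Python version is a reconstruction from the steps described rather than the original listing. The name `selection_sort` is a choice made here, and because Python lists are indexed from 0, the loop bounds differ by one from the a[1]..a[n] notation used above.

```python
def selection_sort(a):
    """Sort list `a` in place by repeatedly selecting the smallest remaining item."""
    n = len(a)
    for i in range(n):
        j = i                      # assume the minimum is at a[i]
        for k in range(i + 1, n):  # compare against the remaining elements
            if a[k] < a[j]:
                j = k              # found a smaller element: new minimum
        a[i], a[j] = a[j], a[i]    # interchange a[i] and a[j]
    return a


print(selection_sort([5, 2, 9, 1, 5]))  # [1, 2, 5, 5, 9]
```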

ANALYZING PROGRAMS
• There are many criteria upon which we can judge a program, for instance:
• Does it do what we want it to do?
• Does it work correctly according to the original specifications of the task?
• Is there documentation which describes how to use it and how it works?
• Are procedures created in such a way that they perform logical sub-functions?
• Is the code readable?

• There are other criteria for judging programs which have a more direct relationship to
performance.
• These have to do with computing time and storage requirements of the algorithms.
• Performance evaluation can be loosely divided into 2 major phases:
• (a) a priori estimates and
• (b) a posteriori testing.

CONSIDER THE EXAMPLES
x:=x+1;
  We assume that the statement x:=x+1 is not contained within any loop, either explicit or
  implicit. Then its frequency count is one.

for i:=1 to n do
  x:=x+1;
  Now the same statement will be executed n times.

for i:=1 to n do
  for j:=1 to n do
    x:=x+1;
  Now it will be executed n*n times.
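The three frequency counts can be checked by letting the statement itself do the counting. In this Python sketch the function names are made up for illustration; each function starts x at zero, so the returned value equals the number of times the increment was executed.

```python
def count_single():
    x = 0
    x += 1              # executed once
    return x


def count_one_loop(n):
    x = 0
    for i in range(n):
        x += 1          # executed n times
    return x


def count_nested(n):
    x = 0
    for i in range(n):
        for j in range(n):
            x += 1      # executed n * n times
    return x


print(count_single(), count_one_loop(10), count_nested(10))  # 1 10 100
```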

END OF UNIT 01
• References:
• Ellis Horowitz and Sartaj Sahni, “Fundamentals of Data Structures”.
