New Developments in the CPU Architecture
María–Almudena García–Fraile Fraile, Student of the North East Wales Institute, NEWI
Abstract - This document gives a general idea of what a CPU is: its design, how CPUs are implemented, their beginnings, a brief history, the problems that they present and some new lines of investigation.
I. INTRODUCTION
The CPU, or Central Processing Unit, is the main component of the computer: it is the component that interprets the instructions and processes the data of the programs stored in the computer.
A. Basic CPU Design
First of all, I am going to start by explaining the beginnings of CPU architecture, for a better comprehension of the new architectures.
A good question to start with could be: how does a CPU perform its work?
At the beginning, CPU designers constructed their processors using logic gates that executed a fixed set of instructions. In order to use a reasonably small number of logic gates they had to restrict the number and complexity of the commands that their CPUs could recognize. This small set of commands is the CPU’s instruction set.
B. The Beginning
Early programs, before the von Neumann architecture, were hard-wired into the circuitry.
The first advance in computer design was the programmable computer system, which allowed the computer to be easily rewired using a sequence of sockets and plug wires. A program was a set of rows of sockets, and each row represented one operation during the execution of the program. With this old scheme, the number of possible instructions was limited by the number of sockets one could physically place on each row.
Fig 1: Patch Panel Programming [1]
However, CPU designers quickly discovered that with a small amount of additional logic circuitry they could reduce the number of sockets required considerably. They did this by assigning a numeric code to each instruction and then encoding that instruction as a binary number, which also considerably decreased the execution time of a program.
Fig 2: Patch Panel Programming [1]
This was the first advance towards the von Neumann architecture: the concept of a stored program.
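The stored-program concept can be sketched with a toy machine whose instructions are encoded as numbers and kept in the same memory as the data (a hypothetical three-instruction set, purely illustrative — not any real ISA):

```python
# Toy stored-program machine: instructions are encoded as numbers and
# live in the same memory as the data they operate on.
LOAD, ADD, HALT = 0, 1, 2   # numeric opcodes, as in early encoded ISAs

# memory holds both the program (addresses 0-5) and the data (6-7)
memory = [LOAD, 6,   # acc = mem[6]
          ADD, 7,    # acc += mem[7]
          HALT, 0,   # stop
          40, 2]     # the data

acc, pc = 0, 0
while True:
    op, arg = memory[pc], memory[pc + 1]   # fetch
    pc += 2
    if op == LOAD:                         # decode and execute
        acc = memory[arg]
    elif op == ADD:
        acc += memory[arg]
    elif op == HALT:
        break

print(acc)  # 42
```

Because the program is just numbers in memory, it can be replaced or even modified without any rewiring — the advance the text describes.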
II. THE VON NEUMANN ARCHITECTURE
The architecture in which a single instruction is fetched into the CPU, then decoded and executed, is called the von Neumann architecture, after John von Neumann, who described it in 1945.
Virtually every electronic computer built since then has been rooted in this architecture.
A. Brief History
The first computer built with this type of architecture was the Manchester Mark I, which ran its first program in 1948, executing it out of its ninety-six-word memory; it executed an instruction in 1.2 milliseconds (some current computers are rated in excess of one thousand million instructions per second).
Fig 3: The Manchester Mark I (source: http://www.histoire-informatique.org/musee/2_3_7.html)
Over the years, a number of computers have been claimed to be “non-von Neumann”, and many have been at least partially so. More and more emphasis is being placed on the necessity of such machines, in order to achieve more usable and productive systems.
B. Explanation
Firstly, it is necessary to fully understand this architecture, in order to comprehend and appreciate what new alternatives must be found.
Von Neumann described a general-purpose computing machine containing four main organs:
The Arithmetic Unit.
The Memory Unit.
The Control Unit.
The Connections between them.
To von Neumann, the key to building this device was its ability to store the data, the intermediate results of computation, and also the instructions that brought about the computation.
Therefore, why not encode the instructions in numeric form and store instructions and data in the same memory? This is frequently viewed as the principal contribution of the von Neumann architecture.
He defined the control organ as the one that would automatically execute the coded instructions stored in memory. He said that orders and data can reside in the same memory “if the machine can in some fashion distinguish a number from an order”. And yet, there is no distinction between the two in memory.
Von Neumann was actually very interested in the design of the arithmetic unit, whose capabilities were limited to the performance of some arbitrary subset of the possible arithmetic operations. He observed that there is a compromise between the desire for speed of operation and the desire for simplicity, and this issue continued to dominate design decisions for many years; it is still a problem now.
All these concepts that von Neumann gave us provided the foundations for all of the early computers, and some of them are still with us today.
In 1982, Myers defined four properties that characterize the von Neumann architecture. The first property is that “instructions and data are distinguished only implicitly through usage”, the second is that “the memory is a single memory, sequentially addressed”, the third is that “the memory is one-dimensional”, and finally, the fourth is that “the meaning of data is not stored with it”.
C. Von Neumann Inconsistencies
These properties are inconsistent with the high-level languages used today, which is why we can think that something different from the von Neumann architecture is required.
One problem the von Neumann architecture presents is that all the data, the locations of the data and the operations must travel between memory and CPU one word at a time; this is the von Neumann bottleneck.
Some advances have been introduced to relieve this problem, for example index registers and general-purpose registers, indirect addressing, hardware interrupts, input and output in parallel with CPU execution, virtual memory, cache memory and the use of multiple memory modules.
In spite of the numerous improvements that have been introduced, the problem persists today.
D. “Non-von Neumann” Machines
The characteristics that a “non-von Neumann” machine may present are the following:
McKeeman proposed the “language directed” design, in which the instructions themselves determine, with a set of bits, whether they must operate on an integer, a real, a character or some other data type. The computer then only needs one ADD operation, for example, and this yields more compact programs in terms of the bottleneck problem (at the cost of more expensive hardware).
Another proposal to avoid the von Neumann bottleneck is the use of programs that operate on structures or conceptual units, not on words. Functions are defined without naming data, and these functions are combined to build a program. An example of a language designed for this type of architecture is LISP.
A third proposal tries to replace the notion of defining computation as a sequence of discrete operations. In the conventional scheme, the programmer defines the order in which the operations will be executed, and the program counter follows this order as the control executes the instructions.
But the most difficult task connected with adopting new architectures is that it is hard to think about them with von Neumann-oriented minds.
III. THE PIPELINE ARCHITECTURE
This architecture divides instruction execution into stages; while one instruction is in its execution stage, another one can be decoded, which is a big improvement in computer system architecture.
Fig 4: Generic 4-stage pipeline (source: Wikipedia, “Instruction pipeline”)
With the pipeline architecture, the overall processing speed is increased, but each individual datum is not processed faster. That is, the pipeline architecture improves the throughput of the whole workload, but not the latency of each task.
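The throughput-versus-latency distinction above can be illustrated with a toy timing model (a sketch assuming an idealized k-stage pipeline with one-cycle stages and no stalls or hazards):

```python
# Toy timing model for an ideal k-stage pipeline with one-cycle stages:
# without pipelining, n instructions take n * k cycles; with pipelining,
# the first instruction takes k cycles and then one finishes per cycle.
def unpipelined_cycles(n, k):
    return n * k

def pipelined_cycles(n, k):
    return k + (n - 1)

n, k = 1000, 4
print(unpipelined_cycles(n, k))  # 4000
print(pipelined_cycles(n, k))    # 1003
# The latency of any single instruction is still k cycles in both cases:
# pipelining improves throughput, not per-instruction latency.
```

For large n the pipelined machine approaches one instruction per cycle, a k-fold throughput gain, even though no individual instruction got faster.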
IV. MULTIPROCESSOR SYSTEMS
Parallel processing usually means that we will have more than one processor.
There are different ways to organize the processors and the memory; some of them are explained below.
A. Flynn’s Taxonomy
Flynn classifies parallel computer architectures in terms of the number of concurrent instruction streams and data streams available in the architecture.
He gives four categories:
SISD: Single-instruction, single-data. A single instruction stream is executed by a single processor operating on data held in a single memory.
MISD: Multiple-instruction, single-data. A single data stream is fed to a set of processors, each with its own control unit, and each executes a different instruction stream on that data.
SIMD: Single-instruction, multiple-data. Many simple processors, each with its own local memory, all execute the same instruction at the same time on their own data.
MIMD: Multiple-instruction, multiple-data. Multiple instruction streams perform actions simultaneously on two or more pieces of data.
B. CC-NUMA
CC-NUMA stands for cache-coherent Non-Uniform Memory Access.
This architecture maintains cache coherence across the shared memory, using inter-processor communication between the cache controllers.
V. MEMORY PROTECTION
A. Definition
In a computer, memory is the location where information in use by the operating system, software programs, hardware devices, etcetera, is stored.
But sharing memory between them can cause collisions, so it is necessary to protect the memory.
B. Concurrency
Concurrency is a way of computing in which many instructions are carried out simultaneously. This implies the need for communication and synchronization in order to obtain good parallel program performance.
This way of computing brings some advantages: for example, it decreases the time needed to process a program and reduces the amount of memory needed to do it. But it also presents some disadvantages, such as complexity and higher costs.
The architectures that support concurrent programming are:
Single processor: only one CPU.
Several processors: two or more CPUs within a single computer system.
Distributed programming: different parts of a program run at the same time on two or more computers communicating over a network.
C. Solutions for concurrency
There are many solutions offered to the problem of concurrency; some of them are hardware solutions and others are implemented in software.
Hardware solutions:
Test and Set: tests and, if the condition allows it, writes a memory location as a single atomic operation. It returns the current value and sets the location to one.
Compare and Swap: takes a memory address, an expected value and a new value; if the address holds the expected value, the swap occurs atomically.
Software solutions:
Semaphores: their mission is to restrict access to shared resources.
Critical Sections: controlled sections of code, protected in order to avoid collisions.
Monitors: their mission is to synchronize two or more tasks that use a shared resource, locking and unlocking determined parts of the code.
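As a hedged illustration of the Test and Set primitive described above, the following Python sketch builds a spinlock from an emulated test-and-set operation. Real hardware provides the atomicity in a single instruction; here a small internal `threading.Lock` stands in for that atomicity, so this is a model of the idea, not a real hardware primitive:

```python
import threading

class TestAndSetLock:
    """Spinlock built on a test-and-set primitive. On real hardware,
    test-and-set is one atomic instruction; here its atomicity is
    emulated with a small internal Lock."""
    def __init__(self):
        self._flag = False
        self._atomic = threading.Lock()  # stands in for hardware atomicity

    def _test_and_set(self):
        with self._atomic:
            old = self._flag
            self._flag = True      # set the location to one
            return old             # return the previous value

    def acquire(self):
        while self._test_and_set():  # spin until the old value was zero
            pass

    def release(self):
        self._flag = False

counter = 0
lock = TestAndSetLock()

def work():
    global counter
    for _ in range(1000):
        lock.acquire()
        counter += 1   # critical section protected by the spinlock
        lock.release()

threads = [threading.Thread(target=work) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 2000
```

The spinlock guarantees that only one thread is in the critical section at a time, so the final count equals the total number of increments.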
VI. SOME NEW PROCESSORS
A. Intel Core 2 Quad
The Intel Core 2 Quad is a family of Intel processors (2007) with four 64-bit cores. Each one is actually two Core 2 Duo dies in the same package, which gives us four real cores.
Fig 5: Intel Core 2 Extreme QX6700 [11]
B. AMD Quad Core Chip
This microprocessor, codenamed Barcelona, is oriented to servers; it places four cores on a single die in order to give them more power.
Fig 6: AMD Quad-Core Chip [12]
C. Sun presents Niagara
This microprocessor is oriented to servers and it has eight cores on the same die, in order to give them more power.
Fig 7: Niagara Microprocessor [14]
VII. THE FUTURE
It is well known that there is a limit to how many cores can fit on the same chip, which is why processor designers are looking for new generations of chips.
Tile is the next generation of chip design. It consists of a number of processor cores, each paired with a router, connected end to end so that the chip looks like the grid map of a city. Instructions travel along their route back and forth across the chip, and different instructions can run in parallel simultaneously without having to wait for one another.
Intel has detailed a prototype 80-core processor made up of tiles, but it is only a prototype, with no immediate plans to develop a product from it.
Nowadays, chip makers are studying parallel computing, because simply packing ever more cores into the same chip will limit its capacity.
Alan Jay Smith, from the University of California, says that “everyone’s got the same problem. They have got more real estate on the chip than they can usefully spend on a uniprocessor, and a uniprocessor runs very hot”. He thinks that “everyone is working on parallelism because you can build it now more effectively”. And finally he adds that “people think in a linear way. Most programs out there are linear. Converting the software into a parallel form where you can have computation going on in multiple processors at once is hard”.
Development tools, compilers and programmers need to make an effort to start programming in a parallel way. The best approach would be smart compilers that automatically divide a linear program into several parallel threads; C# 3.0, for example, has some support for automatically parallel thread code.
But it is very complicated for compiler developers to make automatic or simple mechanisms that exploit parallelism in a normal linear program. So programmers need to move on from Object-Oriented Programming techniques to Parallel Programming.
Service-Oriented Architecture (SOA) helps effectively in this effort, as every user is served by a thread, allowing effective use of each core.
VIII. CONCLUSIONS
It is true that nowadays nearly all the computers that we know are based on the von Neumann architecture, and in recent years many advances have been made on the subject, in order to solve the problems that this architecture presents.
There have been some attempts to introduce “non-von Neumann” architectures, but in the end the programs developed for them still ran on the von Neumann architecture.
The latest investigations and developments point us towards new architectures that exploit parallelism, and they have given us new microprocessors such as the Intel Core 2 Quad from Intel, the Quad-Core chip from AMD and the Niagara from Sun Microsystems.
But all of this runs into the obstacle of how many cores can be put in the same CPU while avoiding the problems of increasing temperature, increasing circuit complexity and the communication between cores, and this brings us to the newest line of investigation, the tile architectures.
Even efforts to improve the ability of computers to handle CPU temperature have a limit. Air cooling improved in previous years, and nowadays some computers, even personal computers, ship with water cooling systems. But these improvements in heat dissipation are reaching their limit in the trade-off between cost and efficiency.
A new CPU architecture is needed; other solutions do not solve the main problem, they only extend the life of the von Neumann architecture.
Integrating more cores into a CPU chip is a complex engineering task that only leads to a small efficiency gain. CPU companies need a high Research and Development budget in order to obtain a small amount of improvement; that budget would be better spent on new architectures.
To sum up, I want to add that at present some very big corporations need very powerful CPUs for their servers. Day after day many CPU designers work to meet that need, but more powerful CPUs always bring more complexity and higher costs.
REFERENCES
[1] CPU Architecture. Chapter 4. http://webster.cs.ucr.edu/AoA/Linux/PDFs/CPUArchitecture.pdf
[2] AMD: Next CPU Architecture will be completely different. http://www.custompc.co.uk/news/602511/amd-next-cpu-architecture-will-be-completely-different.html
[3] Tile is the next hot multicore chip design. http://pcworld.about.com/od/cpuarchitecture/Tile-is-the-next-hot-multicore.htm
[4] CS 6220: Concurrency in Hardware. http://www.cs.usu.edu/~jerry/Classes/6220/Notes/hardware.html
[5] Concurrency Solutions. http://www.ayende.com/Blog/archive/2008/01/08/Concurrency-Solutions.aspx
[6] Concurrency Control. http://en.wikipedia.org/wiki/Concurrency_control
[7] CPU Socket. http://en.wikipedia.org/wiki/List_of_CPU_sockets
[8] List of Intel Microprocessors. http://en.wikipedia.org/wiki/List_of_Intel_microprocessors
[9] List of AMD Microprocessors. http://en.wikipedia.org/wiki/List_of_AMD_microprocessors
[10] Intel Core 2 Quad. http://es.wikipedia.org/wiki/Core_2_Quad
[11] Intel Core 2 Extreme QX6700 (Quad Core) – BeHardware, by Marc Prieur. http://www.behardware.com/art/imprimer/642/
[12] AMD Pins Hopes on Barcelona Quad-Core Chips. http://www.wired.com/techbiz/it/news/2007/09/barcelona
[13] Sun presents Niagara. http://www.vnunet.es/Actualidad/Noticias/Infraestructuras/Hardware/20051115023
[14] The BabelFish Blog. http://bblfish.net/blog/page6.html
[15] Welcome to Hot Chips 19. http://www.hotchips.org/hc19/main_page.htm
[16] Application-Customized CPU Design, by Jeffrey Brown. http://www-128.ibm.com/developerworks/power/library/pa-fpfxbox/?ca=dgr-lnxw07XBoxDesign
[17] Processor Design: An Introduction. http://www.gamezero.com/team-0/articles/math_magic/micro/index.html