1
Focus on the engineering paradigm in IR.
Start with some historical context.
2
I think several of you will have heard of this man, even if you don’t recognize him by photo.
3
Claude Shannon (1916-2001)
An American mathematician, electronic engineer, and cryptographer who is credited with
founding "information theory" and who made major contributions to the fields of electrical
engineering and computer science. He did graduate work at MIT, where he worked with
4
Vannevar Bush
And, for the purposes of our story, more specifically with Bush's differential analyzer,
an analog calculator used in World War II to calculate ballistics tables.
Containing more than a thousand gears, the machine took up an entire room. It was tediously
programmed by physically changing the gears with a screwdriver and wrench, and the output
was displayed as graphs. Shannon, however, was a tinkerer by nature and liked working with
the machine, and so he would program the calculator with other scientists' equations.
While studying the relay switches on the differential analyzer as they went about formulating
an equation, Shannon noted that the switches were always either open or closed, on or
off. This led him to think about a mathematical way to describe the open and closed states,
and he recalled the logical theories of mathematician George Boole (which he had studied as
an undergrad at Michigan).
5
George Boole
Who in the mid-1800s advanced what he called the logic of thought, in which all equations
were reduced to a binary system consisting of zeros and ones.
By reducing information to a series of ones and zeros, Shannon wrote, information could be
processed using on-off switches. He also suggested that these switches could be connected
in such a way as to perform more complex operations, going beyond simple
‘yes’ and ‘no’ statements to ‘and’, ‘or’, and ‘not’ operations.
In his thesis, A Symbolic Analysis of Relay and Switching Circuits, Shannon proved that Boolean
algebra could be used to simplify the arrangement of the relays that were the building blocks
of the electromechanical automatic telephone exchanges of the day.
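To make the relay idea concrete, here is a minimal sketch, not taken from Shannon's thesis. It assumes the usual reading that contacts wired in series behave like 'and' and contacts wired in parallel behave like 'or'. The two example circuits and all function names are hypothetical, chosen only for illustration.

```python
from itertools import product


def series(*contacts):
    """Contacts in series conduct only if every contact is closed ('and')."""
    return all(contacts)


def parallel(*branches):
    """Branches in parallel conduct if at least one branch conducts ('or')."""
    return any(branches)


def original_circuit(a, b, c):
    # Hypothetical four-contact design: (a then b) alongside (a then c).
    return parallel(series(a, b), series(a, c))


def simplified_circuit(a, b, c):
    # Three-contact design: a, followed by b and c side by side.
    return series(a, parallel(b, c))


def equivalent(f, g, n_inputs):
    """Compare two circuits on every combination of open/closed contacts."""
    states = product([False, True], repeat=n_inputs)
    return all(f(*s) == g(*s) for s in states)


circuits_match = equivalent(original_circuit, simplified_circuit, 3)
print(circuits_match)
```

The two circuits agree on all eight switch settings, so the three-contact version can stand in for the four-contact one. That is the kind of saving the algebra makes systematic rather than a matter of individual ingenuity.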
6
If you’re building a long distance telephone system, a more efficient and systematic way to
arrange these relays is a huge deal. Before this, switching circuits had been
designed ad hoc by individuals. This was a huge early success for Shannon.
7
His work also proved fundamental to digital circuit design and the development of computers
once it became widely known in the electrical engineering community during and after World
War II.
It was while working at Bell Labs that Shannon introduced the term ‘bit’ (short for binary digit) as
a measure of information. A bit is the amount of information stored by a digital device or
other physical system that exists in one of two possible distinct states: open or closed,
punched or not. More precisely, you can define a bit as the information that is gained when
the value of such a binary variable becomes known.
You could now quantify how much information was in a message by how many bits of
information it contained.
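As a rough illustration of counting bits, here is a small sketch. It assumes every outcome is equally likely, which is the simplest case; Shannon's general measure weights outcomes by their probabilities. The function names are hypothetical.

```python
import math


def bits_for_choice(n_outcomes):
    """Bits gained on learning which of n equally likely outcomes occurred."""
    return math.log2(n_outcomes)


def bits_in_message(n_symbols, alphabet_size):
    """Bits in a message of n symbols, each drawn uniformly from the alphabet."""
    return n_symbols * math.log2(alphabet_size)


one_switch = bits_for_choice(2)       # a single open/closed switch
eight_way = bits_for_choice(8)        # one choice among eight possibilities
punched_row = bits_in_message(8, 2)   # eight punched-or-not positions
print(one_switch, eight_way, punched_row)
```

A single two-state device carries one bit, a choice among eight possibilities carries three, and eight independent punched-or-not positions carry eight.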
During the war he joined Bell Labs where he wrote…
8
The Mathematical Theory of Communication, which was declassified and published in 1948.
And this brings us to Shannon’s conclusion. He was an enormously successful engineer who had
written what is arguably the most important master's thesis of the 20th century, who crossed
paths with Einstein as a Research Fellow, and who worked with Turing during the war.
Viewing information from a communication perspective he wrote: “Frequently the messages
have meaning; that is they refer to or are correlated according to some system with certain
physical or conceptual entities. These semantic aspects of communication are irrelevant to
the engineering problem.”
Shannon founded the field of Information Theory, and his work underlies a lot of modern
Natural Language Processing.
9
This should also bring to mind something we’ve seen in class.
“As pointed out earlier, the method to be developed here is a probabilistic one based on the
physical properties of written texts. No consideration is to be given to the meaning of words
or the arguments expressed by word combinations.”
This reduces information retrieval to an engineering problem.
Argue: There is a critical difference between these two. It’s important to remember what
engineering problem Shannon was working on: how to electrically transmit messages quickly
and cheaply. The phone line doesn’t care about the meaning of a conversation.
Your word processors don’t care about the meaning of words.
10
Is the paradigm of science and engineering applicable to information studies and particularly
IR?
11
To illustrate this, Wilson quotes a sentence from Gibbon’s The History of the Decline and Fall of
the Roman Empire:
“The Latins of Constantinople were on all sides and pressed their sole hope the last delay of
ruin was in the division of their Greek and Bulgarian and of this hope they were deprived by
the arms and policy of John Vataces emperor of Nice.”
"Classification by subjects would be an exceedingly useful method if it were practicable, but
experience shows it to be a logical absurdity." W. Stanley Jevons, 1905