2014
MASTERS ENGINEERING PROJECT
RICHARD DENNIS
500198 | University of Portsmouth
Development of a Tor library
Supervisor: Dr Gareth Owen
Moderator: Dr Nick Savage
Abstract
Given the increasing popularity of Tor following the Edward Snowden revelations, the dark net has
been the subject of much debate, both amongst academics and in the media. Existing Tor libraries
require Tor to be installed on the user's PC and are currently incapable of conducting an attack against
Tor, which would be advantageous to law enforcement agencies on local, national and international
levels. This leaves room for further development within the field and it was with these reasons in mind
that the current study was undertaken.
This project aims to create an intuitive application that is not reliant on an existing Tor client, and
to advance the field by creating a Tor library that is capable of providing a foundation for further
development that could lead to the de-anonymization of Tor users. The outcome of this project was
that a fully functioning independent library was created that, following extensive testing, was capable
of conducting an initial Denial of Service attack against a Tor node.
The results of this attack were inconclusive; however, this serves to demonstrate that the developer
has fulfilled his objective of creating a forward-looking library that provides a solid base for future
development in the field. It is hoped that the research and code developed throughout this project
will contribute to the development of an attack on Tor that can de-anonymise users of the network
and thus make a valuable contribution to law enforcement and the field of cyber security.
Acknowledgements
I wish to thank Dr. Gareth Owen for providing valuable guidance and support throughout the course
of the project in his role as project supervisor.
Contents
1. Introduction
1.1. Introduction
1.2. Rationale
1.3. The problem
1.4. Objectives
1.5. Constraints
1.6. Project Deliverables
1.7. Report Structure overview
2. Literature Review
2.1. Introduction
2.2. Analysis
What is Tor?
History of Tor
How Tor works
Hidden services
Who uses Tor and for what?
Current Tor development projects
Current attacks on Tor
Conclusion
3. Project Management
3.1. Brief description of chapter
3.2. Select methodology
3.3. Requirements elicitation
3.4. Risk analysis
3.5. Schedule
3.6. Professional issues
3.7. Conclusion of the section
4. Application Development
4.1. Iteration 1
4.1.1. Requirements
4.1.2. Design
4.1.3. Implementation
4.1.4. Testing
4.1.5. Moving forward from Iteration 1
4.2. Iteration 2
4.2.1. Requirements
4.2.2. Design
4.2.3. Implementation
4.2.4. Testing
4.2.5. Moving forward from Iteration 2
4.3. Iteration 3
4.3.1. Requirements
4.3.2. Design
4.3.3. Implementation
4.3.4. Testing
4.3.5. Moving forward from Iteration 3
4.4. Iteration 4
4.4.1. Requirements
4.4.2. Design
4.4.3. Implementation
4.4.4. Testing
4.5. Iteration 5
4.5.1. Requirements
4.5.2. Design
4.5.3. Implementation
4.5.4. Testing
5. Evaluation
6. Summary, conclusion and recommendations
7. Bibliography
8. Appendixes
Appendix 1 – Project Initialization document
Appendix 2 – Ethical checklist
Appendix 3 – Original planned project schedule
Appendix 4 – Project schedule updated due to iteration 1 overrunning
Appendix 5 – Project schedule updated due to iteration 2 overrunning
Appendix 6 – Suitability analysis questionnaire
Appendix 7 – Black box testing questionnaire
Appendix 8 – Evaluating functional requirements
Appendix 9 – Evaluating non-functional requirements
Appendix 10 – Actual project schedule
List of Figures
Figure 1 - Onion routing circuit
Figure 2 - Hidden service architecture
Figure 3 - Agile methodology diagram
Figure 4 - Table showing possible elicitation methods
Figure 5 - Table containing potential risks and countermeasures
Figure 6 - Table containing requirements for iteration 1
Figure 7 - Non-functional requirements
Figure 8 - Code snippet showing connection to a Tor node
Figure 9 - Testing results for iteration 1
Figure 10 - Functional requirements for iteration 2
Figure 11 - Code snippet showing the creation of Diffie Hellman public and private keys
Figure 12 - Code snippet showing the Tor Circuit Class
Figure 13 - Code snippet showing the decoding of a Create cell
Figure 14 - Code snippet showing the decryption function
Figure 15 - Code snippet showing the creation of the Stream cell
Figure 16 - Code snippet showing the conversion of error codes to English
Figure 17 - Testing results iteration 2
Figure 18 - Requirements iteration 3
Figure 19 - Code snippet showing the creation of a rendezvous point
Figure 20 - Code snippet showing how the service is downloaded and saved to a text file
Figure 21 - Testing results iteration 3
Figure 22 - Requirements for iteration 4
Figure 23 - Code snippet showing how the service descriptor data is extracted
Figure 24 - Code snippet showing base 64 decoding
Figure 25 - Code snippet showing dynamic methods to extract data from a service descriptor
Figure 26 - Code snippet showing how to connect to an introduction point
Figure 27 - Code snippet showing the packet creation of stream to a hidden service
Figure 28 - Testing results iteration 4
Figure 29 - Diagram showing the bandwidth saturation attack
Figure 30 - Configuration screenshot showing the Max bandwidth allowed per day on a Tor node
Figure 31 - Code snippet showing a function to allow code to run every 24 hours
Figure 32 - Code snippet showing the DoS attack
1. Introduction
1.1. Introduction
This introduction discusses the motivation and reasons behind this project, as well as the project
objectives and constraints before concluding with a brief summary of all the chapters in this report.
1.2. Rationale
Since the Edward Snowden revelations regarding government surveillance, a growing number of people
have sought privacy and anonymity when using the internet. The Onion Router (hereafter
known as Tor) provides such a service, and is widely used by people wanting to remain anonymous on
the internet and in countries where internet censorship is a major issue.
As well as providing anonymity to internet users, Tor also allows a website or service to be hosted
within the Tor network; these are known as hidden services. A normal website is hosted on the
internet; the hosting location can be found and visitors to the site can be monitored, but a hidden
service provides the person hosting the service and the site’s users with anonymity. This can lead to
websites containing illegal material such as child pornography or drugs being hosted on Tor as both
the users and the host are guaranteed complete anonymity. This demonstrates that being anonymous
on the internet brings with it a lot of issues, one of which is accountability. For example, in the event
that somebody had used the internet for nefarious means, how can we prove they accessed a certain
website or illegal service?
1.3. The problem
Despite the increased exposure and popularity of Tor, development in the field is currently almost
non-existent. Although many research papers look at theoretical attacks on Tor (Borisov et al., 2013;
Biryukov et al., 2013b; Jansen, 2014), there is not currently a Tor application, library or framework that
would facilitate these theoretical attacks.
Currently, to use Tor, a user downloads and installs it to their machine. The application is very limited
in terms of its use; it is currently only designed to connect a user to the Tor network and allow them
to use a pre-configured web browser to use the internet anonymously.
Although Tor is an open-source project, meaning development of the Tor client is possible, the sheer size
and complexity of the application makes developing it extremely challenging. This means there is no
room to easily build upon or further develop the current application, for example to increase the
security of Tor or to program attacks to de-anonymise users.
It is clear that there is a need for an application that can connect a user to the Tor network, which
provides all the functionality of the current Tor application but which is also able to be expanded and
built up, with a final goal of implementing an attack such as a denial of service or de-anonymization
attack.
1.4. Objectives
The objective of the project is to create a Tor library which, unlike previous Tor libraries, does not
require the user to install Tor. This will make this project unique. It will also make the project more
challenging as the application will need to use the Tor protocol to communicate with the Tor network.
This involves challenges such as encrypting and decrypting packets to Tor nodes and communicating
with hidden services. This project aims to develop a fully functional library whose functionality is
comparable to that of the Tor client and, in the wider context, which will be able to be used for future
development for projects such as attacking or hacking Tor.
1.5. Constraints
As with all projects, many constraints were faced during the development of the project, both internal
and external.
• Hard deadline – 12th September 2014. On this day all project deliverables must be handed in
to the client.
• Zero budget
• Availability of software – Although the software used will be open source due to the budget
constraint, support or access to the software may be limited.
• Knowledge of the problem is limited; gaining the necessary understanding of Tor, as well as
learning the new skills required for the project, will require a large amount of time.
• Other commitments will interfere with the project; good project and time management are
required.
1.6. Project Deliverables
The list of project deliverables for this project is:
• A Tor library that can connect to Tor, Tor’s hidden services and which can be expanded on
• A report to document the development of the application consisting of:
  o Literature review
  o Project Management
  o Specification and discussions of the requirements
  o Application Development
  o Summary of the project
  o Evaluation against requirements
  o Conclusion of the project
  o Bibliography
1.7. Report Structure overview
Chapter | Title | Synopsis
1 | Introduction |
2 | Literature review | Firstly, this chapter analyses what methods are available to ensure the most appropriate documents were chosen to review. Next is the literature review itself, which covers the following topic areas: What is Tor? (historical and technical discussion); How does Tor provide its users with anonymity and privacy?; Uses of Tor; What is the social impact of Tor?; Alternatives to Tor.
3 | Project management | Discusses why the method of project management chosen for this project was the most appropriate choice and the process that would be used to elicit requirements for the project. Next, a risk analysis and various countermeasures are discussed before concluding with a brief discussion of the intended schedule and any relevant professional issues that could arise.
4 | Application Development | This chapter looks at each iteration stage of development. The requirements of each iteration are discussed, followed by a description of the design process and the implementation procedure. The testing of each iteration is explained, concluding with which features will be carried forward or dropped from the application in the following iteration.
5 | Evaluation against requirements | Evaluates whether the project has met the requirements outlined earlier in the development. It will look into how the requirements have been met or exceeded, before looking at requirements that have not been met, providing an explanation of why these have not been met and a recommendation of how these may be achieved.
6 | Conclusion of the project | Reflects on the project as a whole. Looks at how the project has been developed, what mistakes were made during the development as well as what has been achieved. This chapter discusses what the project has brought to the field of information security and possible future development for the project.
7 | Bibliography | Lists all the references used throughout this report, as well as sources used throughout the development of the project.
8 | Appendixes | Contain all the figures, tables, and sections of code along with any other information such as testing strategies that accompany the report.
2. Literature Review
2.1. Introduction
Tor used to be unheard of, except within the tech community and amongst users of illegal sites.
However, since 2013, when the Edward Snowden leaks revealed that GCHQ and the NSA are unable
to de-anonymize all Tor users, Tor has been cast into the public eye. Now, with 200,000 daily connected
users (Tor Project, 2014), Tor and its use are debated more than ever.
This report examines Tor in great detail: what it is, who uses it and for what reasons. It will then discuss
the illegal side of Tor, before looking into why Tor needs to be investigated and what has been done
so far.
Further research and a review of the relevant literature are carried out throughout the project, as
research into the Tor protocol as well as research into current attacks, not necessarily against Tor, will
be required.
2.2. Analysis
What is Tor?
Tor is an open source project that is a decentralized low latency mix network of specially configured
nodes, commonly called relays or bridges, which transmits only TCP traffic through virtual tunnels
from a client to a destination, usually, but not exclusively, the Internet.
Tor has been incorrectly described as having a single authority and being a Virtual Private Network
(VPN) or a peer-to-peer (P2P) network (Hurley et al., 2013, p. 1); but this is not the case.
Tor is described by McCoy et al. as a “privacy enhancing system, designed to protect the privacy of
Internet users from traffic analysis” (McCoy et al., 2008, p. 63). Syverson elaborates on this, stating
that Tor provides anonymous connections to the Internet providing protection against traffic analysis
as well as eavesdropping (Syverson, Goldschlag & Reed, 1997, p. 44). Both groups of scholars agree
that Tor provides anonymity.
As well as providing users with anonymity and privacy, Tor can be a tool for anti-censorship. Endorsed
by the Electronic Frontier Foundation and other civil liberties groups, one use of Tor is as a means of
communication between journalists, whistle-blowers and human rights workers. (Levine, 2014)
Tor is also the network supporting what the media commonly refer to as the 'Darknet', due to the
encrypted nature of the network and its association with illegal, or ‘dark’, activities.
History of Tor
Roger Dingledine is largely credited as being one of Tor’s creators; in 2004 he was part of the team
that released the paper Tor: The Second-Generation Onion Router (Dingledine, Mathewson &
Syverson, 2004). Levine explains how the concept behind Tor, “Onion routing”, can be traced back to
1995, and this should be considered the origin of Tor (Levine, 2014).
Michael Reed, one of Tor’s creators, revealed Tor was originally created for military intelligence usage,
such as open source intelligence gathering, and the reason for releasing it open source was to provide
better cover traffic to hide what the network was really being used for (Reed, 2011). Reed expands
upon this, adding that Tor was not designed for helping dissidents in repressive countries or criminals
to avoid law enforcement (Reed, 2011).
Until 2013, Tor was almost unheard of except amongst technology and criminal circles. In 2013,
Edward Snowden made a series of high profile disclosures about several global surveillance programs.
One leaked document, “Tor Stinks”, made international news, and describes how the NSA tried to
compromise Tor anonymity (The Guardian, 2013).
Silk Road, a Tor hidden service which is also known as the eBay for drugs, was an online marketplace
where users could buy and sell drugs around the world (Barratt, 2012, p. 683) and was hailed as being
a “criminal innovation” (Aldridge & Décary-Hétu, 2014). It was taken offline in 2013 by the FBI. This
was the first public demonstration of a government agency taking down a website hosted on Tor, and
attracted significant media attention internationally (Greenberg, A., 2013b).
However, the effectiveness of the closure of the Silk Road has been questioned by Greenberg, who
reports that Silk Road 2.0, an updated and improved Silk Road, was online just one month after the
closure of Silk Road (Greenberg, A., 2013a).
Since these leaks and the FBI takedown of the Silk Road, Tor has never been more popular (Jeffries,
2013); its popularity has sky-rocketed and it now has around 2.5 million monthly users (Tor Project,
2014).
How Tor works
Tor is built on second-generation onion routing. Syverson et al. (2000, p. 1) state that one of the major
reasons for the change of design from generation 1 to generation 2 was to be able to release the source
code for public distribution, as the patent on generation 1 onion routing prevented this. Dingledine et al. (2004, p. 1)
describe how another of the major reasons for changing the design of onion routing was to solve
“many critical design and deployment issues that were never resolved” as well as stating the “design
has not been updated in years”.
Tor creates a “circuit” through several, usually three, Tor nodes: a guard (entry) node, a relay, and an
exit node. Borisov et al. (2007) strengthen the case for selecting three nodes by
demonstrating how increased circuit lengths compromise anonymity.
With the nodes selected, and the user wishing to send a message through the Tor network, the
message is encrypted in layers, like an onion, with the keys of all three nodes.
The Diffie-Hellman (DH) handshake is used between each relay and the client to create the session
key (Jagerman et al., 2014, p. 4). However the DH handshake has also been criticized, with some
scholars proposing an ElGamal key agreement based protocol, although this has currently not been
implemented (Øverlier & Syverson, 2007, p. 4). Catalano is also critical of the current onion routing
protocol, claiming it has a high round complexity which affects the running time, although he agrees
that Tor onion routing provides forward secrecy and is secure (Catalano et al., 2011, p. 255). Loesing
argues that, because of the DH key exchange, “building circuits in Tor is a time consuming task”,
although he fails to fully justify his reasons for this statement (Loesing, 2009, p. 48).
Dingledine et al. (2004), Øverlier and Syverson (2007), and Loesing (2009) all acknowledge that using
DH to create the circuits in Tor achieves perfect forward secrecy, with Loesing explaining how if an
attacker were to collect and store all traffic, to try and force the nodes to decrypt it would be
ineffective due to the “telescoping” approach (2009, p. 45). This means that once the session keys are
deleted a relay cannot be forced to decrypt old traffic. In a similar vein, Øverlier & Syverson (2007, p.
3), demonstrate how, due to the forward secrecy feature of Tor, attacks do not succeed.
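The key agreement described above can be illustrated with a short sketch. It is a toy example rather than the Tor handshake itself: the prime `P` (the Mersenne prime 2^127 - 1), the generator `G`, the SHA-256 key derivation and the function names `dh_keypair` and `session_key` are all assumptions chosen to keep the example small. A real deployment would use a far larger group and would also authenticate the relay.

```python
import hashlib
import secrets

# Toy parameters (assumptions for illustration only; far too small for real use).
P = 2**127 - 1  # a Mersenne prime
G = 3

def dh_keypair():
    """Return an ephemeral (private, public) Diffie-Hellman key pair."""
    private = secrets.randbelow(P - 3) + 2  # value in the range [2, P - 2]
    public = pow(G, private, P)
    return private, public

def session_key(own_private, peer_public):
    """Derive a 32-byte session key from the shared DH secret."""
    shared = pow(peer_public, own_private, P)
    # The shared secret is below 2**127, so it always fits in 16 bytes.
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

# Client and relay each generate an ephemeral pair and exchange public values.
client_private, client_public = dh_keypair()
relay_private, relay_public = dh_keypair()

# Both sides arrive at the same key without ever sending it over the wire.
client_key = session_key(client_private, relay_public)
relay_key = session_key(relay_private, client_public)
```

Because both private values are ephemeral, discarding them along with the session key once the circuit is closed leaves nothing from which recorded traffic could later be decrypted, which is the forward secrecy property discussed above.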
The message is encrypted with the last node’s key first, then the encrypted message is encrypted again
with the second node’s key, before finally encrypting this message with the first node’s key.
Onion routing is shown in the image below:
Figure 1 - Onion routing circuit
Dingledine et al. (2004, p. 5) provide an explanation of how a message that is to be sent through an
onion routing network is then decrypted, with the message arriving at and being decrypted by the first
node; here the only information that will be known is which node to send the packet to. This will be
done at each node through the circuit, until the message is sent out to the Internet. The removal of a
layer of encryption is like peeling the layers of an onion, hence the name onion routing.
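The layering and peeling described above can be sketched as follows. This is a minimal illustration under stated assumptions: `keystream_xor` is a toy stream cipher built from SHA-256 and stands in for whatever real cipher the network uses, and the key values and the names `wrap_onion` and `peel_layer` are hypothetical. Because the toy cipher is a plain XOR, it does not by itself enforce the order in which layers are removed.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHA-256 based keystream.

    Applying it twice with the same key returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

def wrap_onion(message: bytes, hop_keys: list) -> bytes:
    """Encrypt with the last hop's key first, finishing with the first hop's key."""
    cell = message
    for key in reversed(hop_keys):
        cell = keystream_xor(key, cell)
    return cell

def peel_layer(cell: bytes, key: bytes) -> bytes:
    """Remove one layer of encryption, as a single node on the circuit would."""
    return keystream_xor(key, cell)

# Hypothetical session keys for the guard, relay and exit nodes.
keys = [b"guard-key", b"relay-key", b"exit-key"]
cell = wrap_onion(b"GET / HTTP/1.1", keys)

# Each node in turn peels its own layer; only after the exit node is the
# original message visible.
for key in keys:
    cell = peel_layer(cell, key)
```

After the loop, `cell` holds the original request again, mirroring the point at which the exit node forwards the message to the Internet.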
Apart from the three main nodes, Entry, relay, and exit, there is another type of relay, a bridge; this is
a relay that is not listed on the consensus, and is used as the first hop in the circuit to route traffic out
of a certain country.
Ling et al. (2012, p. 2381) conclude that Tor bridges are critical to counter censorship blocking. Winter
and Lindskog reinforce this viewpoint by explaining how China is blocking all public relays in the
consensus, stating that only “1.6% of public relays are able to be connected to” (2012, p. 11).
Since bridges are not listed in the consensus, they are harder for a censor to find and block.
Hidden services
Services such as websites can also be hosted within the Tor network. These are known as “Hidden
Services” and can only be visited through Tor. Tor Hidden Services were added in 2004, when the
second generation of onion routing was developed (Dingledine et al., 2004).
Loesing (2009, p. 36) describes how Tor allows a TCP-based service to be accessed while concealing
the IP address of the hosting server. This enables a user (Alice) to connect to another user’s
(Bob’s) server without knowing where, or who, it is. Loesing (2009, p. 40) also brilliantly sums up the
hidden service design as, “connecting two circuits created by client and server on a common
rendezvous point”.
Biryukov et al. (2013, p. 82) expand on Loesing’s explanation, stating that all communication between
a client and the hidden service is done through a rendezvous point (RP), which connects circuits from
the user to the hidden service. These RPs are mutually agreed points (Dingledine et al., 2004).
[Figure 1 diagram labels: Data; Guard node; Relay node; Exit node; Tor protocol; Unencrypted unless using HTTPS]
Dingledine et al. (2004) also discuss how a client connects to a hidden service. The client first uses
the hidden service descriptor, found by searching the distributed hash table (a lookup service), to
learn which introduction points (IPs) are servicing the hidden service. The client then chooses an RP
and tells the hidden service, via one of its IPs, which RP will be used; the RP is then used to
communicate with the hidden service.
Who uses Tor and for what?
Tor was originally designed for military intelligence gathering (Reed, 2011); however, it has now
become more diverse. Whilst it is still used by the military and law enforcement, its users now
include activists reporting abuse from danger zones (Tor Project, 2014b).
This is particularly relevant given the continued fighting in areas such as Iran and Syria; Tor has
been instrumental in getting reports from within these countries to the outside world. Tor was even
awarded the FSF's Award for Projects of Social Benefit for its role in the revolutions in the Middle
East (Sullivan, 2011).
Since the Snowden revelations about the surveillance program PRISM, a data mining program used by
the NSA and GCHQ to store Internet communications from companies such as Google and Yahoo, Tor
has been increasingly used by normal people seeking privacy online and wanting to prevent
government agencies from monitoring them. Munson validated this assumption by showing how Tor
usage has increased dramatically since the Snowden leaks, which he feels suggests that Tor is being
used by users seeking privacy online (Munson, 2013). Arma (2013), however, contradicts Munson’s
conclusion, stating that, due to the increase of “ESTABLISH_RENDEZVOUS requests”, this growth is
instead likely to be from a botnet, although he also acknowledges some growth can be attributed to
activists in Syria and the United States.
The distribution of copyrighted digital material has moved from peer-to-peer to the “Darknet” (Wood,
2010, p. 1); this contradicts the Tor project message that users are only using Tor for good, and legal,
means. Wood believes users of the “Darknet” exploit the anonymity provided by Tor to illegally
download material without being able to be traced. Biryukov et al. (2013b, p. 2) strengthen the
argument that Tor is used for illegal purposes; they show that 44% of hidden services were hosting
illegal content such as drugs, pornography and illegally shared copyrighted material.
Arguably the most famous hidden service is Silk Road, also known as the eBay for drugs (Barratt, 2012,
p. 683). Reportedly, the Silk Road had between 30,000 and 150,000 active customers (Christin, 2012,
p. 2). The Silk Road was used to make significant numbers of transactions; Konrad (2013) reports that
the Silk Road’s owner and operator Ross William Ulbricht handled $1.2 billion of transactions in the
2.5 years before FBI seizure. This clearly shows how popular illegal activities are on Tor, greatly
contradicting the Tor project’s stance that Tor is used exclusively for good and instead showing that
Tor is being used to conduct illegal activity anonymously, thus making traceability and accountability
extremely difficult.
Figure from Dingledine et al., 2004, p. 3
Figure 2 - Hidden service architecture
Cox (2014) cites Bartlett’s radical views about Tor. Bartlett sets out to discover “the range of things
that people do under the conditions of anonymity” (Cox, 2014) and implies that, under the cover of
anonymity, people will do more distressing and ‘dark’ things, for example talking to “trolls on pro-
anorexia forums” (Cox, 2014). He goes on to suggest that human nature finds a haven on the Internet
and uses the tools available to enable this; in this example Tor allows these illegal and immoral
activities to be conducted anonymously.
While this view regarding the reason why Tor is being used for illegal purposes may be a radical one,
it does not hide the fact that a large percentage of Tor is used for illegal activities by criminals using
the blanket of Tor’s anonymity as protection from prosecution.
Current Tor development projects
With Tor being open source, developers are free to use the source code to modify and develop their
own applications using Tor. As well as being able to modify the source code, there are libraries that
exist which allow developers to use Tor for their projects.
Stem, a Python controller library for Tor, requires Tor to be installed on the machine and controls Tor
through the control port (usually port 9051). Winter demonstrates how this can be used to create
circuits as well as streams within these circuits, although he implies that the lack of features and
functionality in Stem inspired him to create his own application rather than using the existing Stem
library (2014, p. 3). However, Atagar praises Stem for its friendly API and documentation, while
simultaneously criticizing the lack of backward compatibility it offers (Atagar, 2012).
Txtorcon, previously TorCtl, is another Python controller library. It works in the exact same way, using
the control port to control Tor. TorCtl library was used for the development of the Torbutton Firefox
extension. Meejah (2014) describes how Txtorcon communicates with the Tor network as being “an
asynchronous API to speak the Tor client protocol in Python” and believes the main goal of Txtorcon
is to enable applications to use the Tor network to improve people's privacy and anonymity on the
Internet.
Atagar draws a comparison between both libraries, praising both because they have “extensive test
suites and are being very actively maintained” (Atagar, 2012) and noting that both just control the
Tor client.
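To make the control-port approach concrete, the following is a minimal sketch of the text framing used on Tor's control port: commands are ASCII lines ending in CRLF, and each reply line starts with a three-digit status code followed by "-" (more lines follow) or a space (final line). It does not use Stem or Txtorcon and is not their API; the function names are hypothetical and the version string in the example is only illustrative. It is written for Python 3 so that it runs as shown.

```python
def format_command(keyword, *args):
    """Build one control-port command line, e.g. b'GETINFO version\\r\\n'."""
    return (" ".join((keyword,) + args) + "\r\n").encode("ascii")


def parse_reply(raw):
    """Split a raw reply into (status_code, text) pairs.

    A line such as '250-version=...' is a middle line; a line such as
    '250 OK' (code followed by a space) ends the reply.
    """
    parsed = []
    for line in raw.decode("ascii").split("\r\n"):
        if not line:
            continue
        code, separator, text = int(line[:3]), line[3:4], line[4:]
        parsed.append((code, text))
        if separator == " ":
            break  # final line of this reply
    return parsed


if __name__ == "__main__":
    print(format_command("GETINFO", "version"))
    # Illustrative reply only; a real client would read this from the socket.
    print(parse_reply(b"250-version=0.2.4.23\r\n250 OK\r\n"))
```

A controller library wraps this exchange in a friendlier API, but the dependency is the same: a running Tor client must be listening on the control port.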
Current attacks on Tor
With such a diverse client base, it is easy to see who may want to attack Tor, and their reasons for
doing so. For example, law enforcement will want to catch paedophiles who access hidden services
containing child abuse, whilst regimes such as China or Iran want to prevent users accessing Tor and
so may try to take down Tor altogether.
Denial-of-Service (DoS)
Wood and Stankovic (2002, p. 55) describe a DoS attack as “any event that diminishes or eliminates a
network’s capacity to perform its expected function”. Borisov et al. agree with Wood and Stankovic’s
statement and also claim that, as well as the blanket DoS which affects the whole network, a selective
DoS attack can target just one small section of the network (Borisov et al., 2007).
Unlike Wood and Stankovic (2002, p. 55), Wang et al. (2004, p. 193) describe a DoS attack more
narrowly as the flooding of a node with traffic, such as requests, until the node is unable to function.
Jansen (2014) details a selective DoS attack that targets either the entry or the exit node. The
attack works by requesting data from a source such as a file server and then having the client stop
reading from the TCP connection. By exploiting Tor’s flow control, the client continues to request
more data from the source, causing the memory use of the target node to grow until the OS terminates
the Tor application on the relay. Jansen (2014) acknowledges the effectiveness of the attack but also
discusses several simple defences that would prevent it, such as the implementation of authenticated
SendMe cells to prevent the flow control from being misused.
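The flow-control weakness described above can be illustrated with a toy model. The sketch below is not Jansen's attack code and sends no network traffic; it only counts cells in an in-memory queue. It assumes Tor's circuit-level window of 1000 cells, topped up by 100 cells for each SendMe, and treats each round as "the sender uses its whole window". The function and parameter names are hypothetical. It is written for Python 3.

```python
def buffered_cells(rounds, client_reads, window=1000, increment=100):
    """Return how many cells are left queued at the relay after `rounds`.

    client_reads=True models an honest client that reads the data before
    acknowledging it; client_reads=False models a client that stops reading
    but keeps sending SendMe cells, as in the attack described above.
    """
    package_window = window  # cells the sender is still allowed to send
    queued = 0               # cells buffered at the relay, waiting for the client
    for _ in range(rounds):
        sent = package_window
        package_window = 0
        queued += sent
        sendmes = sent // increment
        if client_reads:
            # The honest client drains the queue before acknowledging.
            queued -= sendmes * increment
        # Every SendMe reopens the sender's window, read or not.
        package_window += sendmes * increment
    return queued


if __name__ == "__main__":
    print("honest client:", buffered_cells(5, client_reads=True))
    print("non-reading client:", buffered_cells(5, client_reads=False))
```

In this model the queue stays empty for the honest client and grows by a full window each round for the non-reading client, which is why authenticating SendMe cells is proposed as a defence.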
De-anonymization of hidden services
Rob Jansen expands on his aforementioned DoS attack so that it can be used to de-anonymize hidden
services, building on the attack first developed by Biryukov, Pustogarov and Weinmann (2013). This
attack requires the current guard of a selected hidden service (HS) to be known and taken offline,
forcing the HS to select a new guard node; this is repeated until the attacker’s guard node is picked.
However, this requires the attacker to run a compromised guard relay, and Jansen is critical of this
attack’s chances of success. He suggests that if middle guards were used, then the attack would not
work. Jansen also criticises the attack’s method of taking a node offline, stating that if the node was
correctly configured to prevent a sniper attack, then this would fail. He goes on to suggest that if
the node were to simply reboot after it was taken offline, this alone would be enough to render the
attack ineffective (Jansen, 2014).
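The "repeat until the attacker's guard is picked" step can be put into rough numbers. The sketch below is a simplification that is not taken from the cited papers: it assumes each forced re-selection independently picks the attacker's relay with a fixed probability f (for example, the attacker's share of guard selection weight), so the chance of success within k rounds is 1 - (1 - f)^k. The function names are hypothetical.

```python
import math


def probability_selected(fraction, rounds):
    """Chance that the attacker's guard is chosen at least once in `rounds` picks."""
    return 1.0 - (1.0 - fraction) ** rounds


def rounds_needed(fraction, target):
    """Smallest number of forced re-selections that reaches `target` probability."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - fraction))


if __name__ == "__main__":
    # With a 10% share, how many guards must be knocked offline for even odds?
    print(rounds_needed(0.10, 0.5))
    print(round(probability_selected(0.10, 7), 3))
```

Under these assumptions the number of guards that must be taken offline grows quickly as the attacker's share shrinks, which is one reason the defences Jansen mentions matter.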
ASN (2013) frame this attack in a new light; instead of trying to prevent the attack, they suggest that
the core design of Hidden Services is flawed and in need of a redesign if it is to continue to be secure
and effective.
Conclusion
The review has enabled the developer to gain a solid understanding of what Tor is, how it works and
who it is used by. In addition, it has provided the opportunity for existing Tor libraries to be reviewed. By
comparing Stem and Txtorcon it became clear that both offer the same level of functionality, but in
some areas this is at such a low level that they can be considered to be severely lacking, as they only
allow an application to control the Tor client and not control Tor directly. From this section it is clear
to see this is an area of research that is significantly lacking, and would benefit greatly from more
research and development.
The current attacks on Tor section is also extremely relevant to this project; it analysed two
different attacks, a DoS attack and an attack which de-anonymised hidden services. This section in
particular featured analysis from several authors who were not in agreement over the effectiveness
of the attacks demonstrated, but all agree on one thing: Tor is vulnerable to attacks.
Overall this literature review has demonstrated that while Tor is a network that has been around for
over 10 years, and has been heavily researched, there are still many opposing viewpoints regarding
certain topics such as attacks. It has shown why Tor is more popular today than it ever has been, and
the impact that the anonymity of Tor has on the usage of Tor.
Tor development libraries were shown to be an area in which little research has been conducted, and
where the current solutions have been met heavily with criticism and opposing views.
3. Project Management
3.1. Brief description of chapter
This chapter discusses the project and requirement elicitation methodologies that were used for this
project. It also considers the project’s potential risks and the countermeasures that were implemented
to negate these, as well as discussing the original schedule designed for the project.
3.2. Select methodology
Project management has been used in various forms for centuries, but only since the 1950s has the
modern concept of project management existed (Kwak, 2003, p. 1). The origin of modern project
management is disputed, although there are records of it being used in the 1950s by US defence,
Xerox, Bell Laboratories and even NASA.
Modern project management can be defined as the “application of knowledge, skills and techniques
to execute projects effectively and efficiently” (PMI, 2014).
Since the 1950s, project management has become a necessity on all projects; it reduces the risk of the
project failing, or the wrong features being developed and ensures that the project is completed
efficiently and on time.
The Agile Method of project management was chosen for this project. The Agile Method favours
“working software over comprehensive documentation” (Paulk, 2002, p. 15). In this project the client
only expects a working application and this report to be produced; no other documentation is
required. This makes the Agile Method a good choice for this project, as it permits development to be
started immediately and provides the client with what they want, whereas methods such as the
Waterfall method focus a lot more on documentation, thus delaying the start of development.
Another reason for choosing this method was that it promotes adaptability throughout the project’s
lifecycle. Unlike other methods, such as the Waterfall Model, the Agile Method allows and encourages
changes to be made as and when they are required. This ensures development is continuous and will
not be stopped by a requirement that cannot be implemented. This method also allows for regular
testing and for working software to be shown frequently to the client; this promotes client feedback,
and any necessary changes can be immediately implemented with little cost to the design and
development of the project. This would not be possible with other traditional methodologies, in which
changes can be costly and can usually only be implemented at the end of the development. This
feature also mitigates risks and ensures that a quality application is produced; any bugs or issues are
identified within an iteration (a short timescale of around three weeks) and can be dealt with at that
point, rather than having long-lasting effects on the application as would be the case with the
Waterfall model.
A further benefit of showing the client frequent iterations of the
project is that they get to see the progress that is being made,
which on a complex project allows them to gain a sense of the
challenges faced by the developer and to feel that they are
having an input in the application’s development. If Waterfall or
Spiral models are used, the client only receives the end product
and thus does not gain the same understanding of the project
and the issues that the developer faced. The Agile Method can
therefore be argued to lead to better customer relations.
Figure from: Stack Exchange, 2013.
Figure 3 - Agile methodology diagram
The Agile Method also promotes continual improvement of the application by taking positive, and
negative, aspects of the current iteration forward to be expanded upon in the next iteration. In other
methods, this can only be achieved at the end of development, ready for the next major release. This
ensures that positive features are capitalised on and promoted, optimising the quality of the
application.
A further benefit of this method over other traditional methods, such as the Waterfall or Spiral models,
is that if the development is running behind schedule and the deadline is likely to be missed, it is
possible to liaise with the client and choose to focus on core requirements. Dropping any non-critical
requirements from the development plan ensures that a working application, albeit one with reduced
functions, can be handed to the client instead of having an application that is only half-developed.
This means that using this method increases the chances of the developer meeting the deadline by
having the project moving consistently forward and not grinding to a halt by trying to achieve goals
and requirements that cannot be implemented.
3.3. Requirements elicitation
Well thought out, well-structured requirements generally lead to a more successful project, one in
which client expectations are met and the delivered application is fit for purpose (Hickey & Davies, 2002). It is
therefore important to ensure that the requirements gathered provide sufficient detail, are realistic
and achievable within the timeframe of the project.
“Requirements elicitation is the process of seeking, uncovering, acquiring and elaborating
requirements for computer based systems” (Zowghi & Coulin, 2005, p. 1).
This process may be time consuming, but it helps to prevent a project going over-budget, being
delivered late or failing to provide the required functionality (Jones, 1995, p. 86).
Using the Agile Method means that requirements will be gathered at the start of each iteration. This allows
the requirements to take into account any issues faced or lessons learned during the previous iteration
and differs from the Waterfall Method, where the requirements for the entire project are thought of
prior to starting development. Using the Agile Method means that the requirements elicitation
method will need to be run several times throughout the course of the project. This makes choosing
an elicitation method critical; a method that takes a long time to get results would be an inappropriate
choice as it would delay the development of the project. For this reason a questionnaire would be
considered an inappropriate method of requirement elicitation for this project, as the time spent
waiting for replies to the questionnaire could eat into valuable development time.
There is more to requirement elicitation than the client simply telling the developer what they want;
a more detailed research process is required. This involves finding exactly what the client realistically
expects from the application as well as looking at similar projects that have been developed to see
what features these offer that could be integrated into the development of the application. A
combination of this information can be condensed into well-structured and achievable requirements
that can be implemented into the project.
The elicitation methods that were considered for this project are shown in the table below, along with
a summary of their appropriateness for the project:
Method | Outcome | Appropriate?
Interview the client (may be formal or informal, in person or via online correspondence) | Greater understanding of what is expected, issues with current solution etc., however the quality of answers is dependent on questions asked. | Yes – This is a core method that will be used, this will allow us to understand exactly what the client is expecting, the reasoning behind the project and the time scale in which it is required. Keeping in constant communication with the client will promote customer satisfaction.
Interview the end users (may be formal or informal, in person or via online correspondence) | Understand what the users actually want, if the client and end users expectations are the same, again the quality of answers is dependent on the questions asked. | Yes – another core method that will be used to see if the user wants the same features etc. as the client, also allows information to be gathered that could have been missed from the client interview.
Prototyping | Rapid prototyping would develop a small section of functionality; this could be used to get feedback on the section from users or the client, and could be used to estimate a timescale for the project. | No – High cost of failed prototypes, not required due to the chosen agile methodology, however a prototype would allow evaluation of the proposed approach to development.
Case study | A report which will allow the understanding of the current system/application. An example of a case study model is the critical incident technique which observes the human interaction of the current system (Woolsey, 1986). | No – This project does not build upon an existing application, and as case studies are almost always retrospective, it is not appropriate for this project.
Brainstorming | Could be conducted alongside the interview process; a way for all ideas that may not necessarily have been discussed during the interview to be presented. | Yes – Will allow for the members involved to discuss ideas that may not be core to the project, but ideas that are revolutionary, or never before done, but would be welcomed if possible.
Figure 4 - Table showing possible elicitation methods
As the above table shows, three different methods were selected. These are interviews both with the
client and with the potential end-users of the application, as well as brainstorming sessions which will
be run in conjunction with the interviews. Using several different methods, and focusing on the end-
user as well as the client, ensures that the functions of the application meet the client’s expectations
and results in a usable application that meets the requirements of the end-user. Brainstorming with
the client allows the developer to pick up on any implicit requirements that have not been explicitly
stated by the client but that would still greatly benefit the application and ensure the client is satisfied
with the final product.
3.4. Risk analysis
As with any project there is risk involved, and although using the Agile method mitigates risk, there
are still some risks to the project. The table below shows the possible risks, their potential impact level
and finally what reduction strategy will be in place to ensure they are mitigated as much as possible
during the project development.
Risk | Impact level | Reduction strategy
Tor network unavailable – Internet access or the Tor network unavailable | High | Use a Tor simulator such as the Tor Path Simulator (TorPS).
Poor productivity – Developer’s motivation inhibits the project’s development | High | Set 20 hours a week minimum for the project, more when needed. Setting small milestones will increase motivation and productivity. Regular meetings with the client will ensure the milestones are met.
Technical risk – Project is too complex to implement | High | Regular meetings with the client ensuring they are kept up to date with the development, and adjust the requirements to allow for a work around if possible.
Programmatic risk – Customer changes their mind about wanting the project developed | High | Find another client or adapt the project to cater for the client’s change of heart.
Inherent schedule flaws – due to the uniqueness of the project, it is difficult to estimate and schedule. | Medium | Better to overestimate than underestimate timescales; use the Agile methodology to renegotiate the schedule with client.
Requirements Inflation - more features that were not identified at the beginning of the project emerge that threaten estimates and timelines. | Medium | Keep in constant contact with the client with regular meetings etc., only accept more features if timescale allows.
Specification Breakdown – Only during the development does a conflicting requirement become apparent. | Medium | Contact the client, work out a solution that would have the lowest impact.
Insufficient resources – Unable to develop the project due to not having access to a required resource. | Medium | See if the resource is really required, look for ways to reduce resource use previously in the project, and try to gain the required resource.
Incorrect budget estimation – Overall cost of the project starts to increase and spiral | Low | There is a budget of zero for this project, to maintain this open source software and libraries will be used.
Figure 5 - Table containing potential risks and countermeasures
3.5. Schedule
The project has a hard deadline of September 12th 2014, at which point the application, all
documentation and the accompanying report must be completed. A Gantt chart showing the planned
schedule can be found in Appendix 3. As the chart shows, some additional time has been allowed to
factor in potential delays during the project. However, due to the nature of the project and the
methodology used, it is possible that some iterations will take less time than others, and that
iterations will be added or removed. This is a very adaptable schedule, and is only used as a base as
it is likely to change once development begins.
3.6. Professional issues
Appendix 2 contains the ethical checklist that accompanies this project. This document revealed that
there were no ethical concerns raised by this project. To ensure copyrighted code is not used, only
open source libraries and code will be used and, should code be required from other sources, it will
only be used after getting the express written permissions from the author/owner. Should any
questionnaires or user feedback be required during the testing phases of the project then this will be
conducted anonymously, and all respondents will be under no pressure from the developer to take
part.
User information will never be put at risk and the creation of this application will at no point
compromise users’ data or identity. Any attacks that are developed as part of this project will be
conducted on a closed network where the developer has complete control of the Tor node. This
ensures that users of the Tor network are not, at any point, affected by the development of this
project.
Should a situation arise during the development of the application in which a potential professional
issue arises, this will be dealt with before it occurs to ensure that the project never breaks any ethical
codes or laws.
3.7. Conclusion of the section
This section has justified the use of the Agile Method as a project methodology, having fully considered
its advantages and disadvantages over more traditional models, like the Waterfall model.
Furthermore, the importance of requirements was considered and the methods used to elicit the
requirements for this project were discussed. Potential risks of the project were analysed and
countermeasures were implemented to mitigate the possible effects of these risks. Finally an
estimated schedule for the development of the application was drawn up, which factors in some
unforeseen delays during the development stages. With these aspects of project management in
place, a smoother development should be possible and the application should meet the requirements
and be delivered to the client by the deadline.
4. Application Development
4.1. Iteration 1
4.1.1. Requirements
All requirements for this project will be split into two categories: functional requirements and non-
functional requirements. Functional requirements describe what the software should do whilst non-
functional requirements judge the operation of the software. By their nature, non-functional
requirements can be difficult to evaluate because they tend to be based on the subjective opinion of
the assessor rather than being fact-based.
During the requirements elicitation for this iteration, all of the non-functional requirements were
elicited. Unless otherwise stated, these will be presumed to apply to each iteration of the project,
although they will only be discussed in this section of the report. In the case of this project, the non-
functional requirements can be considered as principles based on the ISO 9126-1 software quality
model (ISO, 2001) which the project should aim to meet and can therefore not be attributed to one
specific iteration.
The functional requirements for this iteration were elicited using the aforementioned methods and
are shown below:
Requirement | Importance Level
Connect to the Tor network | High
Send and receive a version cell | High
Decode NetInfo cell to extract data from it | High
Handle errors from destroy cells | Medium
Figure 6 - Table containing requirements for iteration 1
The non-functional requirements for this project can be seen in the table below:
Quality Characteristics | Requirement | Importance Level
Portability | Able to run the application without installing Tor | High
Portability | Able to run on multiple platforms (Windows, Mac, Linux) | Medium
Portability | Not require the application to be installed to run | Medium
Reliability | No more than 10 bugs on delivery | High
Efficiency | Use as little computational resources as possible such as RAM. (No more than 1 GB of RAM) | Low
Usability | No GUI | High
Usability | Precise and constructive error messages | High
Usability | Documentation | High
Usability | Universal naming standard | High
Dependability | Able to operate normally or abnormally without threat to life or environment | Med
Legal | Only use open source software | High
Maintainability | Able to expand the system to incorporate new features, fix defects or deal with new technology. | High
Adaptability | Able to change the system to handle additional domain concepts | Med
Figure 7 - Non-functional requirements
The importance of considering both functional and non-functional requirements when developing the
application can be seen from the first functional requirement: to be able to connect to the Tor
network. Clearly, this is a critical requirement; failure to connect to the network will prevent the
project from being continued. One simple way to connect to the Tor network would be to install Tor
and allow the application to use the Tor client. However, this would conflict with the first
non-functional requirement: to not need to install a Tor client in order to use the application. Failure to consider both
functional and non-functional requirements during the development of the application could result in
some of the requirements being contradictory and thus not all of the requirements would be able to
be met.
The second functional requirement, to be able to send a version cell, requires a packet to be sent to a
Tor node informing it of the protocol version the client wishes to use. This packet must fulfil the
criteria outlined in the Tor protocol specification document (Dingledine & Matthewson, n.d.). This
should be a simple requirement to achieve. This requirement is critical to the development of the
project as it sets up the communication between the client and a Tor node.
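As an illustration of this requirement, the sketch below packs and unpacks a version cell using the layout given in the Tor protocol specification: a 2-byte circuit ID (zero), a 1-byte command (7 for VERSIONS), a 2-byte payload length, then one 2-byte big-endian number per supported version. It is not the project's actual code; the function names are hypothetical, and it is written for Python 3 so that it runs as shown, whereas the project itself targets Python 2.x. Sending the bytes over the TLS connection to the node is left out.

```python
import struct

CMD_VERSIONS = 7  # command byte of a VERSIONS cell in the Tor specification


def build_versions_cell(versions):
    """Pack a VERSIONS cell advertising the given link protocol versions."""
    payload = b"".join(struct.pack(">H", version) for version in versions)
    # Header: circuit ID 0, command, payload length (all big-endian).
    return struct.pack(">HBH", 0, CMD_VERSIONS, len(payload)) + payload


def parse_versions_cell(cell):
    """Return the list of versions advertised in a received VERSIONS cell."""
    _circ_id, command, length = struct.unpack_from(">HBH", cell, 0)
    if command != CMD_VERSIONS:
        raise ValueError("not a VERSIONS cell")
    payload = cell[5:5 + length]
    if len(payload) != length or length % 2:
        raise ValueError("malformed VERSIONS payload")
    return [version for (version,) in struct.iter_unpack(">H", payload)]


def negotiate(ours, theirs):
    """Pick the highest version both sides support, or None if there is none."""
    common = set(ours) & set(theirs)
    return max(common) if common else None


if __name__ == "__main__":
    cell = build_versions_cell([3])
    print(cell)
    print(negotiate([3], parse_versions_cell(build_versions_cell([1, 2, 3, 4]))))
```

Both sides send such a cell and then use the highest version they have in common.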
The third requirement, to decode a NetInfo cell, will likely prove to be challenging. The data contained
within the NetInfo cell must be extracted accurately and in the correct order.
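The ordering issue can be seen in a short decoding sketch. Per the Tor protocol specification, a NetInfo payload holds a 4-byte timestamp, the address the sender sees for the other party, a 1-byte count, and then that many of the sender's own addresses; each address is a type byte (0x04 for IPv4, 0x06 for IPv6), a length byte and the raw value. The code below is a hedged example rather than the project's implementation: the function names and the returned dictionary keys are hypothetical, it decodes only the payload (not the cell header), and it is written for Python 3.

```python
import ipaddress
import struct

ADDR_IPV4 = 0x04
ADDR_IPV6 = 0x06


def _read_address(payload, offset):
    """Decode one type/length/value address; return (text_or_None, new_offset)."""
    atype, alen = struct.unpack_from(">BB", payload, offset)
    offset += 2
    raw = payload[offset:offset + alen]
    if len(raw) != alen:
        raise ValueError("truncated address")
    if atype == ADDR_IPV4 and alen == 4:
        address = str(ipaddress.IPv4Address(raw))
    elif atype == ADDR_IPV6 and alen == 16:
        address = str(ipaddress.IPv6Address(raw))
    else:
        address = None  # unknown address type: skipped, not decoded
    return address, offset + alen


def decode_netinfo(payload):
    """Extract the timestamp and addresses from a NetInfo cell payload."""
    (timestamp,) = struct.unpack_from(">I", payload, 0)
    other_address, offset = _read_address(payload, 4)
    count = payload[offset]
    offset += 1
    own_addresses = []
    for _ in range(count):
        address, offset = _read_address(payload, offset)
        own_addresses.append(address)
    return {"timestamp": timestamp,
            "other_address": other_address,
            "own_addresses": own_addresses}


if __name__ == "__main__":
    sample = (struct.pack(">I", 1400000000)
              + bytes([ADDR_IPV4, 4, 192, 168, 0, 1])    # address seen for the peer
              + bytes([1])                               # one own address follows
              + bytes([ADDR_IPV4, 4, 10, 0, 0, 2]))
    print(decode_netinfo(sample))
```

Because every field's position depends on the lengths of the fields before it, reading them out of order or with the wrong length corrupts everything that follows.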
The first three functional requirements were all considered to be critical; these requirements provide
the base upon which the application can be developed. Failure to meet these requirements at this
stage of the project could jeopardise the entire project as they provide key functionality to the
application. The fourth functional requirement - to handle data from a destroy cell - is also important,
but is not a critical requirement as, although it is desirable, it will not affect the application’s
functionality. Therefore, in this iteration, the first three functional requirements should be prioritised.
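For the fourth requirement, a destroy cell can be turned into a precise error message by reading its reason byte. The sketch below assumes the fixed 512-byte cell layout with a 2-byte circuit ID used by the early link protocol versions, command 4 for DESTROY, and the reason codes listed in the Tor protocol specification. The function name and message wording are hypothetical, and the code is written for Python 3.

```python
import struct

CELL_LEN = 512    # fixed cell size with a 2-byte circuit ID
CMD_DESTROY = 4   # command byte of a DESTROY cell

# Reason codes carried in the first payload byte of a DESTROY cell.
DESTROY_REASONS = {
    0: "NONE", 1: "PROTOCOL", 2: "INTERNAL", 3: "REQUESTED",
    4: "HIBERNATING", 5: "RESOURCELIMIT", 6: "CONNECTFAILED",
    7: "OR_IDENTITY", 8: "CHANNEL_CLOSED", 9: "FINISHED",
    10: "TIMEOUT", 11: "DESTROYED", 12: "NOSUCHSERVICE",
}


def describe_destroy(cell):
    """Return an error message for a DESTROY cell, or None for any other cell."""
    if len(cell) != CELL_LEN:
        raise ValueError("expected a %d-byte cell" % CELL_LEN)
    circ_id, command = struct.unpack_from(">HB", cell, 0)
    if command != CMD_DESTROY:
        return None
    reason = cell[3]
    name = DESTROY_REASONS.get(reason, "UNKNOWN")
    return "circuit %d destroyed: %s (reason code %d)" % (circ_id, name, reason)


if __name__ == "__main__":
    # Circuit 5, DESTROY, reason 5, zero padding up to 512 bytes.
    cell = struct.pack(">HB", 5, CMD_DESTROY) + bytes([5]) + bytes(508)
    print(describe_destroy(cell))
```

Mapping the reason byte to a name in one place also supports the usability requirement for precise and constructive error messages.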
As already established, the non-functional requirements (NFRs) will affect the entire project and their
importance should not be under-estimated. The portability NFRs may seem simple to achieve, but
fulfilling these requirements will have major impacts on the project, and will, for example, have an
effect on the programming language chosen as it must be cross-platform compatible and be capable
of being used to achieve the functional requirements.
Usability, maintainability and adaptability NFRs should be simple to implement and can be said to be
of critical importance to the project. This project intends to create a library suitable for further
development; an application with poor usability features would not be chosen over existing Tor
libraries and therefore, if this application is to be successful, the usability NFRs need to be met.
The legal NFR of only using open source software needs to be achieved as the project has a budget of
zero. This is therefore a simple, but critical, requirement to implement.
The reliability NFR, to have no more than ten bugs on delivery, was explicitly mentioned by the client.
However, success against this requirement could prove challenging to assess. Whilst testing may
show that there are few or no bugs in the application, this might not be a true representation of the
application because there may be bugs in the application that did not show up during testing.
4.1.2. Design
By using the Agile Method, the upfront design is minimized; the developer only designs what is
required for each iteration, which dramatically reduces the large upfront design cost that other
methodologies incur. Moreover, by only implementing the design as and when it is required, risk is
reduced and the developer ensures that all the necessary features are designed. Implementing the
entire design in one go could lead to features not being used etc. making it confusing to the end user.
This does not mean that features designed in earlier iterations will not be carried over to later
iterations of the application.
Despite being the first iteration, some design decisions made here will impact the rest of the
application. An example of a design decision that will affect the entire application is the programming
language used as this will not be able to be changed after the first iteration without dramatic
consequences. This makes the choice of programming language a critical design decision.
There were three key contenders for programming languages: C, Python and Java. C was discounted
as the author has considerably less experience in this language than either Python or Java. To decide
which of these languages was more suitable for this project, the advantages and disadvantages of
each were considered. Python was found to be the more suitable language for this project, as existing
Tor libraries use Python and it makes sense to use the language that Tor developers are already using
as it will help to achieve the application’s goal of being used for future development. Another reason
for choosing Python over Java was that extracting bytes from packets of network data is much easier
in Python than in Java. Despite this, Java was a serious contender due to the developer’s considerable
experience in the language and the speed at which Java can run, which can be up to ten times quicker
than Python. The decision was further complicated by the fact
that both languages are cross-platform compatible and therefore would both be able to achieve the
non-functional portability requirements. Python is not without its disadvantages in relation to this
project; at the start of the project the developer was relatively inexperienced in this language, and
threading in Python is extremely hard and has been strongly criticised as being “fundamentally
broken” (Wittber, 2009). The deciding factor was that the client implied that he had a preference for
Python being used for this application.
The version of Python to be used was also seriously considered, with the final choice being Python 2.x.
Despite being the older version of the language, this was deemed the most appropriate version of the
language to use as several existing libraries anticipated to be used to provide functionality are
currently only fully compatible with Python 2.x. While some libraries have 3.x versions available, these
still contain bugs and tend to be considered to be in Beta mode.
The operating system used to develop the application is an unimportant decision, as the portability
NFRs state that the application must be compatible with all operating systems and Python can be used
on all operating systems. The only potential issue is that Python will have to be installed on Mac and
Windows operating systems, although it comes preinstalled on Linux. This also applies to the libraries
that the developer expects to use throughout the project. However, the developer’s personal
preference for developing applications is to use Linux and consequently this operating system will be
used to develop the entire application.
As discussed in the requirements and specification section of this report, the application does not
require a GUI as it would bring no benefit to the application. Designing a user-friendly, efficient and
scalable GUI would take considerable time and the absence of a GUI significantly reduces the
complexity of the design section. The time saved by not having to design and develop a GUI will be
invested in further increasing the quality of the code, as well as in trying to implement more of the
requirements.
4.1.2.1. Design features of a library
To achieve the usability NFR, it is important to consider the way that the library will be designed. A
poorly designed library would likely not be used for future development as developers would probably
opt for one of the existing libraries if it were significantly easier to use. It is therefore important to
ensure that design structure is simple to use, is intuitive and promotes efficiency.
To enhance usability, the single responsibility principle will be implemented; this means that each
component implemented in the library should only be responsible for a single section of functionality
or a single feature. This makes it easier for the user to understand precisely what they can expect from
each function of the application, which should help to make users feel confident in further developing
the application in the future.
Two popular naming conventions are used for Python: mixed case, and lower case with words separated
by underscores. The Python PEP 8 documentation recommends the latter, as it is claimed that this
improves readability (Van Rossum, 2014). Therefore, this naming convention was chosen as it would
further help to achieve the usability NFR.
The names of the relevant functions and variables also needed to be considered. Variable names such
as x, y, etc. are extremely poor names because they do not give any information about the data they contain.
It was decided that all names should provide as much information as possible whilst remaining a sensible
length. This will facilitate easy development and usability as there should be no confusion over what
a variable contains or what a function will do; the name should make this information clear to the
user.
Both the chosen naming convention and the descriptive variable names help to meet the NFR of
maintainability – making it easier for the current developer to work on the application as well as for
users to further develop it in the future.
To further achieve the usability NFRs, detailed comments about all functions within the application
will be required. These should provide the user with information concerning the required input, what
the function does and what the function will return.
It could be argued that the above features are not strictly necessary as a good application would
always be favoured over a lesser application, however the amount of effort required to make these
significant improvements is negligible and the implementation of these decisions could potentially
increase the speed of development by making it easier for the developer to identify pre-existing
functions.
4.1.2.2. Version control
Version control is an essential feature to be implemented in the application. Although it will not affect
the development of the project, it is a safeguard that means were anything to go wrong the code can
be retrieved from a specified point. It also offers the ability to track all changes made to the code,
which will help locate bugs within the application. For this project, Git was selected over Subversion.
This decision was based on the personal preference of both the developer and the client, who also
uses Git and therefore it was easy to share code between the parties involved in the application’s
creation.
4.1.2.3. Design conclusions
This section required more time than was previously anticipated because so many of the
design decisions that needed to be made in the first iteration would have effects upon the entire
development of the project. It was therefore essential that sufficient thought and consideration was
put into these decisions, as failure to make the right choice would lead to greater delays later in the
project development.
It could also be argued that creating such a detailed design in the first iteration will speed up the
development process and ensure that potential issues will be averted as a result of the decisions made
in this section.
4.1.3. Implementation
The aim of this iteration was to implement all four functional requirements, as detailed at the start of
this section. Furthermore it was hoped that as many of the NFRs as possible would also be achieved
during this time.
Developing the application required several pieces of software to be selected. The most important
tool was Sublime Text 2, the text editor in which the code was written. While
it is argued that an Integrated Development Environment (IDE) is more appropriate for developing
code, due to the extensive testing and debugging functionality that IDEs provide, they are more
complicated to use than a text editor and the testing environment may not be suitable for this
application. It is also the developer’s preference to use a text editor, as he has more experience of this
method. By using the extensive testing functionality that Python provides, no negative effects of using
a text editor over an IDE will be present in the final application.
The developer tried to implement the functional requirements in order of their importance; for
example, connecting to Tor was the first step undertaken.
To do this an SSL connection was made to the Tor node. It used the Tor node’s IP address and ORPort.
This was easily implemented and only required the three lines of code shown below:
Figure 8 - Code snippet showing connection to a Tor node
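As an illustration only (this is not the project's own three-line listing), a connection of this kind might be sketched as follows. The sketch is written for Python 3 rather than the Python 2.x used in the project, the function names are hypothetical, and certificate checks are disabled on the assumption that Tor relays present self-signed certificates.

```python
import socket
import ssl


def make_tor_ssl_context():
    """Build a TLS context for talking to a Tor relay.

    Assumption: relays use self-signed certificates, so the usual
    hostname and certificate-authority checks are switched off here.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.check_hostname = False       # must be disabled before verify_mode
    context.verify_mode = ssl.CERT_NONE
    return context


def connect_to_node(ip_address, or_port, timeout=10):
    """Open a TLS-wrapped TCP connection to a relay's IP address and ORPort."""
    raw_socket = socket.create_connection((ip_address, or_port), timeout=timeout)
    return make_tor_ssl_context().wrap_socket(raw_socket)
```

A caller would pass the relay's address and ORPort, for example `connect_to_node("198.51.100.7", 9001)`, where the address shown is a placeholder rather than a real relay.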
While this is a simple method to connect to the Tor node, it works and is less complicated than other
methods and it was felt best to avoid over-complicating things where possible. However, an
improvement was almost immediately thought of. This method requires the user to know the IP
address and ORPort of the Tor node, which may not be easy to find out. To simplify the method, and
increase usability, it was decided that users should be able to enter either the nickname or the IP
address and ORPort of the node that they want to connect to. This was not implemented during this
iteration, as it was felt that this should be suggested to the client at the end of this iteration and, if
approved, implemented in the following iteration.
The second requirement, to be able to send and receive a version cell, was the next requirement to
be implemented. It was decided that, because all cells need to be created in the same format and
following the same protocol instructions, a function would be created to automatically pack a cell to
the correct format, thus preventing any code duplication. This helps to achieve the usability and
maintainability NFRs. A build cell function was therefore created, which takes the command to be
used and the payload and packs them into the correct cell format. This is shown in the
code below:
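The project's own listing is not reproduced in this text. A minimal sketch of a cell-packing function, written for Python 3 and assuming the older link protocol layout (2-byte circuit ID, 1-byte command, 509-byte padded payload, with VERSIONS sent as a variable-length cell), might look like this. The names `build_cell` and `build_versions_cell` are illustrative and are not taken from the project.

```python
import struct

CELL_PAYLOAD_LEN = 509   # fixed payload size assumed for link protocols 1-3
CMD_VERSIONS = 7         # VERSIONS is a variable-length cell


def build_cell(circuit_id, command, payload):
    """Pack a cell as CircID (2 bytes) | Command (1 byte) | payload."""
    if command == CMD_VERSIONS or command >= 128:
        # Variable-length cell: a 2-byte length field precedes the payload.
        return struct.pack(">HBH", circuit_id, command, len(payload)) + payload
    if len(payload) > CELL_PAYLOAD_LEN:
        raise ValueError("payload too long for a fixed-length cell")
    # Fixed-length cell: pad the payload with zero bytes up to 509 bytes.
    padded = payload.ljust(CELL_PAYLOAD_LEN, b"\x00")
    return struct.pack(">HB", circuit_id, command) + padded


def build_versions_cell(versions=(3,)):
    """Build a VERSIONS cell listing the supported link protocol versions."""
    payload = b"".join(struct.pack(">H", version) for version in versions)
    return build_cell(0, CMD_VERSIONS, payload)
```

Keeping the padding and header layout in one function means every other cell type only has to supply a command number and a payload.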
The decoding of the NetInfo cell was perhaps the most challenging requirement to be implemented
in this iteration. This was because the developer was still relatively inexperienced with Python and the
Tor protocol documentation is not very clear and contains several ambiguities. Despite these
challenges, the NetInfo cell was successfully decoded, although the process overran the timescale
allocated to this section.
To promote efficient code, the developer used an ‘If’ statement to dynamically extract data contained
within the packet. The NetInfo cell could contain multiple IP addresses or multiple formats of IP
address (i.e. IPv4 or IPv6). An appropriate, but somewhat inefficient, method would be to run
a separate ‘If’ statement for every possible eventuality. This, however, would mean at least eight ‘If’
statements would be required just to extract the client’s IP address. As the code below shows, the
developer managed to use a single ‘If-elif’ statement, which dramatically reduces the chances
of errors in the code and increases readability for users.
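The following is a hedged sketch of how a single if/elif on the address type can replace a chain of separate checks; it is not the project's code. It is written for Python 3 and assumes the NetInfo payload layout of a 4-byte timestamp, one type/length/value address, a 1-byte address count and then that many further addresses. The function names and dictionary keys are hypothetical.

```python
import ipaddress
import struct

ADDR_TYPE_IPV4 = 4
ADDR_TYPE_IPV6 = 6


def read_address(payload, offset):
    """Read one Type | Length | Value address; return (text, next_offset)."""
    addr_type, addr_len = struct.unpack_from(">BB", payload, offset)
    value = payload[offset + 2:offset + 2 + addr_len]
    if addr_type == ADDR_TYPE_IPV4 and addr_len == 4:
        address = str(ipaddress.IPv4Address(value))
    elif addr_type == ADDR_TYPE_IPV6 and addr_len == 16:
        address = str(ipaddress.IPv6Address(value))
    else:
        address = None  # unknown type: skipped by using the length byte
    return address, offset + 2 + addr_len


def decode_netinfo(payload):
    """Decode a NetInfo payload into a dictionary of its fields."""
    (timestamp,) = struct.unpack_from(">I", payload, 0)
    our_address, offset = read_address(payload, 4)
    address_count = payload[offset]
    offset += 1
    their_addresses = []
    for _ in range(address_count):
        address, offset = read_address(payload, offset)
        their_addresses.append(address)
    return {
        "timestamp": timestamp,
        "our_address": our_address,
        "their_addresses": their_addresses,
    }
```

Because the length byte is always honoured, an unrecognised address type does not desynchronise the parsing of the addresses that follow it.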
Due to the complexity of decoding the NetInfo cell, this iteration was already starting to fall behind
schedule. The decision was therefore made not to implement the handling of the destroy cell as part
of this iteration as it does not affect the core functionality and was merely a desirable, rather than a
core, requirement. However, it was mentioned to the client and it was agreed that this feature will be
implemented in a future iteration.
4.1.4. Testing
Although testing may “Often feel like an exercise in futility or at best a waste of time” (Arbuckle, 2010,
p. 1), it is a critical area of development. Testing ensures the software functions according to the
expectations defined by the requirements/specifications. The overall aim of testing is to find bugs or
issues that would negatively affect the functionality of the application, its usability and/or
maintainability.
For this iteration, two kinds of testing will be conducted: functional testing, which verifies that a
function performs as expected using a small subset of inputs, and white box testing, where the tester
has full knowledge of the implementation.
To enable the application to be thoroughly tested, several testing methods were considered. It was
decided that a combination of the unittest framework and manual testing was the most appropriate
approach for this application. This is because unittest makes it possible to quickly test a large
number of input values and it is part of the Python standard library. The results from unittest are also
displayed in a very comprehensible manner, making it easy to locate and fix bugs. Manual testing also
has its advantages: it can cover features of the application that a unit test might not be able to
exercise, and manually testing each function allows a realistic user scenario to be tested.
To ensure that there are no anomalies in the results, for each testing round each test will be run three
times. Should any issues be presented, these will be investigated and corrected before re-running the
tests to ensure the bugs have been removed. This process will be continued until all bugs are
eradicated from the application.
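To illustrate the approach described above, the sketch below tests a small hypothetical function, `pack_port`, against several input values with unittest and runs the suite three times. The function under test and the helper `run_suite` are assumptions made for this example and are not part of the project; the code is written for Python 3.

```python
import struct
import unittest


def pack_port(port):
    """Hypothetical function under test: pack a port as 2 big-endian bytes."""
    return struct.pack(">H", port)


class PackPortTest(unittest.TestCase):
    def test_known_values(self):
        cases = {0: b"\x00\x00", 80: b"\x00\x50", 9001: b"\x23\x29", 65535: b"\xff\xff"}
        for port, expected in cases.items():
            # subTest reports each failing input separately.
            with self.subTest(port=port):
                self.assertEqual(pack_port(port), expected)

    def test_out_of_range_port_is_rejected(self):
        with self.assertRaises(struct.error):
            pack_port(70000)


def run_suite(rounds=3):
    """Run the suite several times and record whether each round passed."""
    outcomes = []
    for _ in range(rounds):
        suite = unittest.defaultTestLoader.loadTestsFromTestCase(PackPortTest)
        result = unittest.TextTestRunner(verbosity=0).run(suite)
        outcomes.append(result.wasSuccessful())
    return outcomes
```

Repeating the run is mainly useful for tests that touch the network, where results can vary between runs; a pure function such as this one gives the same result every time.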
Test No. | Test | Test method | Succeeded? | Comments
1 | Can connect to a node | Unittest | Yes |
2 | Passes correct value to version function | Unittest and manually | Yes |
3 | Creates the correct version cell | Unittest | Yes |
4 | Sends the version cell | Manual | Yes |
5 | Receives the NetInfo cell | Unittest | Yes |
6 | Able to extract the payload of a NetInfo cell | Unittest | Yes |
7 | Successfully able to extract the data contained within the payload | Unittest | Yes |
8 | Store the extracted data as a dictionary | Unittest | Yes |
9 | Create the payload of a NetInfo cell to be sent | Unittest and manually | First round: No; second round: Yes | The first round of testing showed an error where the IP addresses were displayed as negative values because they were not being formatted correctly. Once this was fixed, the test passed.
10 | Builds the NetInfo cell correctly to be sent | Unittest and manually | Yes |
11 | Send the NetInfo cell to the first node | Unittest and manually | Yes |
Figure 9 - Testing results for iteration 1
As can be seen in the testing results above, eleven tests were conducted for this iteration. All
functions were thoroughly tested, with ten tests being passed first time. One test, however, failed.
This test was to ensure that the correct payload of the NetInfo cell was created. The payload
proved to be incorrect as negative IP address values were being passed. This obviously cannot
be allowed, and was found to be the result of the IP addresses having been formatted using
the signed char format rather than the required unsigned char format. Once this had been changed,
the test was rerun and was successfully passed.
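The effect of the signed and unsigned char formats can be reproduced with Python's struct module, where `b` is a signed char and `B` an unsigned char. The sketch below is an illustration with hypothetical function names, not the project's code, and is written for Python 3.

```python
import struct


def pack_ipv4(address):
    """Pack a dotted-quad IPv4 address into 4 bytes using unsigned chars."""
    octets = [int(part) for part in address.split(".")]
    return struct.pack("4B", *octets)


def unpack_ipv4(packed):
    """Correct decoding: each byte is read as an unsigned char (0 to 255)."""
    return ".".join(str(octet) for octet in struct.unpack("4B", packed))


def unpack_ipv4_signed(packed):
    """Buggy decoding for comparison: signed chars turn bytes above 127 negative."""
    return ".".join(str(octet) for octet in struct.unpack("4b", packed))


packed = pack_ipv4("192.168.1.10")
print(unpack_ipv4(packed))         # 192.168.1.10
print(unpack_ipv4_signed(packed))  # -64.-88.1.10
```

Any octet of 128 or above wraps to a negative number when read as a signed char, which matches the symptom of negative values described above.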
4.1.5. Moving forward from Iteration 1
While this iteration has overrun the allotted time by two weeks, and not all of the functional
requirements have been met, for the most part it can be considered a success. A new project plan has
been created to show this overrun and how the delay has been taken into account in future iterations
to ensure that the project is still completed on time. This plan can be found in appendix 4.
The delay arose because a detailed design section was developed; this should enable future design
work to be completed more quickly and efficiently. The three functional requirements that
were implemented have been implemented successfully and to a high standard; for example, the use
of functions to reduce code duplication helps to achieve many of the usability NFRs.
The majority of the NFRs have already been achieved, which is a significant achievement in such a
small amount of time.
The testing of the implemented features was a success, despite one function requiring a bug to be
dealt with. Moving forward to iteration 2, the recommendation of using an Onion router nickname as
well as the IP address to connect to a Tor node will be suggested to the client and the timescale will
be altered to take the complexity of the Tor protocol documentation into account.
4.2. Iteration 2
4.2.1. Requirements
During the demonstration of the previous iteration to the client, he was generally pleased with the
development to date. The unachieved requirement of handling data in the destroy cells was
mentioned to him and he explicitly requested that this be completed in this iteration. It was also
decided to implement the use of Onion router nicknames to identify Tor nodes in addition to the
existing IP addresses to facilitate usability. This discussion, as well as several informal interviews
conducted with potential end-users of the application, elicited the following requirements.
Requirement | Importance Level
Create a circuit through the Tor Network | High
Create a circuit of any length | High
Create a stream through Tor to a web server | High
Able to retrieve webpages from an internet web server through Tor | High
Create a circuit using specified nodes | Medium
Create multiple streams through Tor to a web server | Medium
Handle errors from destroy cells | Low
Figure 10 - Functional requirements for iteration 2
It was evident that no further NFRs needed to be added to the original specification and that the
existing NFRs should be carried forward into this iteration.
The requirement of creating a circuit through the Tor network is perhaps the most challenging
requirement faced in the project to date. To achieve this requirement, the calculation of shared keys
between nodes will need to be achieved. The encryption of packets will also need to be implemented
if this requirement is to be achieved. This is extremely difficult and the Tor documentation is, once
again, full of ambiguities and proves a major challenge to developers.
In light of this challenging requirement, the predicted timescale has increased from three weeks to
four and the hours allocated to the project have been increased in order to develop this aspect of the
project.
However, once this requirement is implemented it will provide a base upon which the application can
be developed. It is therefore critical that this requirement be fulfilled during this iteration.
The requirement of being able to create a circuit of any length should be easily achieved once the
requirement of being able to create a circuit has been successfully implemented, as it expands on the
code used for this process.
The creation of a stream through Tor to a web server is also likely to be a simple requirement to fulfil,
as this will, once again, expand on the circuit that has been created. Overall, this iteration contains
some very difficult requirements to achieve, with the majority of the requirements being dependent
on the successful creation of a circuit through Tor.
4.2.2. Design
The design for this iteration builds on the design section from the previous iteration; however, one
small design feature needed to be considered before implementation. With the greater use of
cells, each requiring a different command to be associated with it, it was important that the
method used to identify the cell was clear. There were two possible solutions: to use the command ID
number of the cell or to use the English name of the command. This was considered at great length.
Using the English name would make the code easier to use; however, it would also increase the chance
of errors being introduced into a program, e.g. through a spelling mistake. In addition, with so many
different commands with similar names, it might be more confusing for the user to decide which
command to use. It was therefore decided to use the number-based commands when setting the command
in packets, due to the lower risk of a user entering an incorrect value, the reduced chance of errors
caused by spelling mistakes, and the simpler implementation, as no formatting issues need to be
considered. It was also chosen above the English string version as this is what is currently used in
the Tor protocol.
4.2.3. Implementation
The key requirement for this iteration was to build a circuit through the Tor network. This needed to
be split into two parts. The first challenge was to connect to the first node, and once this was successful
the developer had to connect to the second and subsequent nodes. This was split into two parts
because the cells needing to be sent are different. For example, the cell sent to the first node needs
to be a ‘create’ cell, and a ‘created’ cell would be expected back. For the second node onwards, an
‘extend’ cell would be sent and an ‘extended’ cell would be received back.
Here an implementation decision needed to be made. Tor currently uses two handshake methods for
establishing encryption keys (Dingledine & Mathewson, n.d.): the NTOR and TAP protocols. No particular
advantage to either method was identified for this project; however, the TAP handshake is slightly
easier to implement and is the original Tor handshake, which means that it can be used on nodes
running older versions of Tor, whereas the newer NTOR protocol may not be supported. In addition, more
documentation is available for the TAP handshake, and for these reasons it was chosen as the method
to be used in the project.
To create a ‘create’ cell, the Diffie-Hellman protocol must first be used to create a shared key which
only the client and the first node know. A function was created to calculate the client’s data for the
handshake, preventing code duplication, as this calculation will be required
extensively for circuit creation throughout the application.
The code below shows the DH protocol being conducted, with the creation of x (the private key) and
X (the public key that will be sent to the Tor node). The public key is encrypted with the remote onion
router’s key. To prevent errors being introduced into the application, the decision was made to
use a hybridEncryption function that had already been created by Dr Gareth Owen. This was chosen over
creating a new implementation because the time for this iteration was fast running out due to its
complexity. It allowed more time to be spent on other sections rather than redeveloping something
that already has a proven success rate. Finally, it was chosen because it is thoroughly tested and is
therefore unlikely to introduce bugs into the application.
Figure 11 - Code snippet showing the creation of Diffie Hellman public and private keys
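The original listing is not reproduced in this text. As a hedged sketch of the key generation step only, the Python 3 code below creates x and X using generator 2 and the 1024-bit prime of the second Oakley group (RFC 2409), which the TAP handshake is understood to use; the constants and the 320-bit private key length should be checked against the Tor specification before being relied upon. The hybrid encryption of X with the onion router's key is deliberately left out, and all function names are hypothetical.

```python
import secrets

DH_GENERATOR = 2
DH_PRIME = int(
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
    "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
    "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
    "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
    "49286651ECE65381FFFFFFFFFFFFFFFF", 16)
DH_LEN = 128             # bytes in a public value or shared secret
PRIVATE_KEY_BITS = 320   # assumed private exponent length


def generate_dh_keypair():
    """Return (x, X): the private exponent and the public value as bytes."""
    x = 0
    while x < 2:
        x = secrets.randbits(PRIVATE_KEY_BITS)
    public_value = pow(DH_GENERATOR, x, DH_PRIME)
    return x, public_value.to_bytes(DH_LEN, "big")


def compute_shared_secret(private_key, peer_public_bytes):
    """Combine our private exponent with the peer's public value."""
    peer_public = int.from_bytes(peer_public_bytes, "big")
    if not 1 < peer_public < DH_PRIME - 1:
        raise ValueError("peer public value is out of range")
    return pow(peer_public, private_key, DH_PRIME).to_bytes(DH_LEN, "big")
```

Both sides arrive at the same shared secret: the client computes it from x and the node's public value, and the node computes it from its own private exponent and X.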
Once the payload and the client’s half of the Diffie-Hellman key had been calculated using the above
function, the packet was created using the build cell function as previously implemented, thus
dramatically reducing code duplication. It is important that the client’s private key, x, is stored, as
it will be needed to decrypt the packets that are received. A variable within the TorCircuit class
is used for this. This method of storing the key was chosen over, for example, storing all keys in an
array, as it makes the key easier to use later on in the application, with less chance of error.
The Tor circuit class is shown below:
Figure 12 - Code snippet showing the Tor Circuit Class
It was then important to receive and decode the received ‘created’ cell as this would complete the
handshake and provide the server’s public key and the shared key. This means that packets could be
encrypted to the first node, the first step of the circuit. To retrieve the data contained within the
packet, the payload was extracted and, by using the Tor documentation, which states
where in the packet each piece of data is located, we were able to obtain the public key, the derived
key data and a unique key shared between the client and the first node.
This is shown in the code below:
Figure 13 - Code snippet showing the decoding of a Create cell
As shown in the code above, the calculated KH, Df, Db, Kf and Kb are returned to TorHop. TorHop is a
class which handles the creation of circuits. Because the values of these variables are saved in
the class, they can easily be retrieved later on, when they will be needed for encrypting and
decrypting packets.
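How KH, Df, Db, Kf and Kb might be derived can be sketched as follows. This is an illustration rather than the project's code: it is written for Python 3, assumes the TAP key derivation in which SHA-1 is applied to the shared secret followed by a one-byte counter, and assumes a ‘created’ payload made up of a 128-byte DH public value followed by 20 bytes of derived key data. The function names are hypothetical.

```python
import hashlib
import hmac

DH_LEN = 128     # bytes of DH public value in the 'created' payload
HASH_LEN = 20    # SHA-1 digest length
KEY_LEN = 16     # AES-128 key length


def kdf_tap(shared_secret, length):
    """Expand the secret as SHA1(K0 | 0x00) | SHA1(K0 | 0x01) | ..."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha1(shared_secret + bytes([counter])).digest()
        counter += 1
    return stream[:length]


def split_created_payload(payload):
    """Split a 'created' payload into (server public value, received KH)."""
    return payload[:DH_LEN], payload[DH_LEN:DH_LEN + HASH_LEN]


def derive_hop_keys(shared_secret):
    """Slice the expanded key material into KH, Df, Db, Kf and Kb, in that order."""
    material = kdf_tap(shared_secret, 3 * HASH_LEN + 2 * KEY_LEN)
    fields = (("KH", HASH_LEN), ("Df", HASH_LEN), ("Db", HASH_LEN),
              ("Kf", KEY_LEN), ("Kb", KEY_LEN))
    keys, offset = {}, 0
    for name, size in fields:
        keys[name] = material[offset:offset + size]
        offset += size
    return keys


def handshake_is_valid(keys, received_kh):
    """Check that the KH we derived matches the one sent by the node."""
    return hmac.compare_digest(keys["KH"], received_kh)
```

The order of the slices matters: if Kf and Kb are swapped, every cell sent or received on that hop will fail to decrypt.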
This method of extracting the shared key data, as well as various other keys, could be reused for
decoding received ‘extended’ cells, as these share the same methods of encryption. The only
difference is that the ‘extended’ cells would first require decryption as they would be encrypted with
other Onion node keys.
The method to decrypt the cells is shown below:
Figure 14 - Code snippet showing the decryption function
This was extremely challenging to implement as it required the public and shared keys of the node as
well as other encryption variables to be correctly stored as these would later be used in decryption. It
was vital to get these in the right order, as otherwise the packets would not be correctly decrypted or
encrypted.
During the implementation of this requirement, the Tor protocol became a hindrance to the
development. This was mostly down to the fact that Tor does not send back a cell if an
incorrectly configured cell has been sent, thus not giving the user any feedback about what has gone
wrong. It was also discovered that, if a cell was received, it would be either a destroy cell that
contained little to no information or a relay cell that could not be decrypted to provide any useful
information because the data was not being extracted properly. This was a very frustrating time for the
developer as many days were spent trying to chase down an error without knowing where to even start
looking for it.
After several weeks of chasing down errors, a circuit could finally be created and thus the first two
requirements of this iteration were successfully completed. By this point, however, the project was
running behind schedule – this can be attributed to the complexity of Tor and the accompanying
documentation. Fortunately, from this point onwards, this iteration’s remaining requirements proved
simple to implement as creating a stream used the circuit and encryption previously implemented and
it was simply a case of sending a specially configured relay cell through the Tor network. This is shown
in the code below:
Figure 15 - Code snippet showing the creation of the Stream cell
As shown in the above code section, the creation of a stream simply required the host name or IP
address along with the port. These are passed to the build cell function, which builds a relay cell
containing the data; the cell is then encrypted with the selected nodes’ encryption keys and sent. Once
again this was created as a function within the TorCircuit class.
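A minimal sketch of the payload for such a stream request is given below. It is not the project's code: it is written for Python 3 and assumes the relay header layout of command (1 byte), recognized (2 bytes), stream ID (2 bytes), digest (4 bytes) and length (2 bytes), followed by 498 bytes of data, with the target given as a NUL-terminated "host:port" string. The digest is left as zero here; in a real client it would be filled in from the running digest before the payload is encrypted. The function name is hypothetical.

```python
import struct

RELAY_BEGIN = 1        # relay command that opens a stream
RELAY_DATA_LEN = 498   # data bytes available after the 11-byte relay header


def build_relay_begin_payload(stream_id, host, port):
    """Build the unencrypted 509-byte relay payload that requests a stream."""
    data = "{}:{}".format(host, port).encode("ascii") + b"\x00"
    if len(data) > RELAY_DATA_LEN:
        raise ValueError("target address is too long")
    # Header fields: command, recognized (0), stream ID, digest (0 here), length.
    header = struct.pack(">BHHIH", RELAY_BEGIN, 0, stream_id, 0, len(data))
    return header + data.ljust(RELAY_DATA_LEN, b"\x00")
```

The resulting 509 bytes would then be encrypted once for each hop in the circuit and placed in a relay cell.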
The regularly received destroy cells provided the perfect opportunity to achieve the requirement
carried forward from Iteration 1 - to handle the destroy cells, which now inform the user of the cause
of the error.
Figure 16 - Code snippet showing the conversion of error codes to English
The above code snippet shows how the error code contained within the destroy cell is passed to the
function, which looks this code up in a dictionary containing all error codes and their meanings
before returning the English error message to the user.
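A dictionary lookup of this kind might be sketched as follows. The reason codes and their wording are paraphrased from the Tor protocol specification as understood at the time of writing and should be checked against it; the function name is hypothetical, the code is written for Python 3, and the sketch assumes the reason code is the first byte of the destroy cell's payload.

```python
DESTROY_REASONS = {
    0: "NONE: no reason given",
    1: "PROTOCOL: Tor protocol violation",
    2: "INTERNAL: internal error",
    3: "REQUESTED: a client sent a TRUNCATE command",
    4: "HIBERNATING: relay is not currently operating",
    5: "RESOURCELIMIT: out of memory, sockets or circuit IDs",
    6: "CONNECTFAILED: unable to reach the relay",
    7: "OR_IDENTITY: connected to the relay, but its identity was not as expected",
    8: "OR_CONN_CLOSED: the connection carrying this circuit died",
    9: "FINISHED: the circuit has expired",
    10: "TIMEOUT: circuit construction took too long",
    11: "DESTROYED: the circuit was destroyed without a client TRUNCATE",
    12: "NOSUCHSERVICE: request for an unknown hidden service",
}


def describe_destroy_reason(payload):
    """Translate the reason byte of a destroy cell payload into English."""
    reason_code = payload[0]
    return DESTROY_REASONS.get(reason_code, "UNKNOWN: reason code %d" % reason_code)
```

Falling back to an "UNKNOWN" message keeps the function usable if a relay sends a code that is not in the table.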
This method was chosen over simply informing the user that a destroy cell has been received
because it helps to achieve the NFR of usability. Providing the user with a more detailed and
in-depth error message allows for easier debugging.
4.2.4. Testing
As with the first iteration, testing will be conducted using unittest and manually. However, unlike the
first iteration, which only looked at functional and white box testing, this testing section also needs to
consider integration.
An integration test verifies that the parameters passed between modules are handled correctly. It is
used when a module is developed at a later stage than the module it interacts with. This is an important
testing area to complete as it will ensure that functions that were developed in the first iteration are
capable of being used for functionality developed in this iteration. Although the effectiveness of
this test can be argued, with some suggesting it is a waste of time, the little time it adds to the testing
makes it worthwhile, especially if a bug is found, as this can be quickly fixed before it affects
functions that rely on it later in development.
The testing strategy can be seen below. It shows each test to be conducted, how it was conducted
and whether the test was successful. As with the first iteration, in each round the tests were run three
times to ensure no anomalies were present in the results.
Test No. | Test | Test method | Succeeded? | Comments
12 | Connect to the first hop | Unittest | Yes |
13 | Calculate the shared key between client and node | Unittest | |
14 | Create a CREATE cell containing the relevant data | Unittest | Yes |
15 | Send the CREATE cell to the first node | Unittest | Yes |
16 | Receive the CREATED cell back from the first node | Unittest | Yes |
17 | Ensure a cell of cmd 3 is received | Unittest | Yes |
18 | Extract the payload of the CREATED cell | Unittest | Yes |
19 | Extract the first node's half of the key | Unittest | Yes |
20 | Calculate KH, Df, Db, Kf, Kb from the payload | Unittest | Yes |
21 | Check the derived key data is the same as KH | Unittest | Yes |
22 | Calculate the shared key | Unittest | Yes |
23 | Ensure the nodes entered in the array are correctly passed to the function | Unittest / Manually | First round: No; second round: Yes | Failed the first round due to being passed as a single value for all nodes, rather than a value for each node. This was a simple change to make, and the test then passed the second round of testing.
24 | Search the consensus by a node's nickname for its IP address and OR port | Unittest | Yes |
25 | Packs the IP address and OR port of a selected node in the right format | Unittest | Yes |
26 | Calculate the shared key half | Unittest | Yes |
27 | Build the EXTEND cell | Unittest | Yes |
28 | Correctly encrypt the packet | Unittest | Yes |
29 | Ensure the packet count is correct | Unittest | Yes |
30 | Send the packet to the correct node | Unittest | Yes |
31 | Receive an EXTENDED packet back | Unittest | Yes |
32 | Handle a destroy cell correctly | Unittest | Yes |
33 | Extract the payload of the EXTENDED packet | Unittest | Yes |
34 | Decrypt correctly the payload of an EXTENDED packet | Unittest | Yes |
35 | Ensure a RELAY_EXTENDED cell is received | Unittest | Yes |
36 | Calculate shared key | Unittest | Yes |
37 | Extract derivative key data from the payload | Unittest | Yes |
38 | Ensure KH and derivative key data are the same | Unittest | Yes |
39 | Ensure KH, Df, Db, Kf, Kb are updated to the TorHop object | Unittest | Yes |
40 | Ensure a stream can be created to a specified webserver | Unittest / Manually | First round: No; second round: Yes | This failed first time due to incorrect formatting of the target webserver. Once the IP address or web address and the port were corrected, the issue was solved and the second test was passed.
41 | Ensure the payload of the stream packet is correctly formatted | Unittest / Manually | Yes |
42 | Correctly create the stream relay cell | Unittest | Yes |
43 | Correctly encrypts the packet to allow it to be sent through the network | Unittest | Yes |
44 | Ensure a packet is received back from the stream request and handled appropriately | Unittest | Yes |
45 | Ensure a RELAY_CONNECTED cell is received | Unittest | Yes |
46 | Check the data (GET request) is correctly formatted to a packet | Unittest / Manually | Yes |
47 | Ensure the packet is encrypted correctly | Unittest | Yes |
48 | Ensure a packet is received back and handled appropriately | Unittest | Yes |
49 | Ensure all the data is received | Unittest / Manually | First round: No; second round: Yes | In the first round only a single packet was received and it did not contain all the data. This showed that more than a single packet must be read, which was implemented using a while true loop, allowing the test to pass the second round of testing.
Figure 17 - Testing results iteration 2
As the above test results show, several tests failed the first time. This was to be
expected in such a complex iteration; fortunately, the three tests that failed were easily
corrected. For example, test 49 (Ensure all the data is received) failed because it was wrongly
assumed that all data would be received in a single packet. Once this was found not to be the case,
a simple loop was implemented to ensure all packets were received. The test was then re-run and
passed.
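The fix for test 49 can be sketched as follows. This is an illustration under stated assumptions rather than the project's own loop: recv_relay_cell is a hypothetical callable that returns the next decrypted relay cell as a (command, payload) tuple, and the relay command numbers are those given in the Tor protocol specification. The loop is bounded so that a misbehaving relay cannot keep it running forever.

```python
RELAY_DATA = 2  # relay command numbers from the Tor protocol specification
RELAY_END = 3


def receive_all(recv_relay_cell, max_cells=10000):
    """Collect RELAY_DATA payloads until the exit relay sends RELAY_END.

    recv_relay_cell is a hypothetical callable returning (command, payload),
    or None once the connection has closed.
    """
    chunks = []
    for _ in range(max_cells):  # bounded, unlike a bare "while True"
        cell = recv_relay_cell()
        if cell is None:
            break
        command, payload = cell
        if command == RELAY_DATA:
            chunks.append(payload)
        elif command == RELAY_END:
            break
        # Other relay commands (for example SENDME) are ignored in this sketch.
    return b"".join(chunks)
```

The response is complete only once RELAY_END arrives or the connection closes, which is why reading a single cell returned partial data in the first round.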
Overall, the tests that failed did so not because of issues in the functionality of the application
but because of developer error. This shows the importance of testing, so that issues such as these
can be picked up and fixed early in development.
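Test 40 in Figure 17 is an example of such an error: the stream could not be opened until the target address and port were formatted correctly. In the Tor protocol specification the RELAY_BEGIN payload starts with a NUL-terminated, lowercase "address:port" string. The helper below is a hypothetical sketch of validating and building that string; it omits the optional flags field and does not handle IPv6 literals.

```python
def begin_payload(host: str, port: int) -> bytes:
    """Build the ADDRPORT part of a RELAY_BEGIN payload (hypothetical helper)."""
    host = host.strip().lower()
    # Reject empty hosts, embedded whitespace, and hosts that already contain
    # a colon (such as "example.com:80" passed in by mistake).
    if not host or ":" in host or any(c.isspace() for c in host):
        raise ValueError(f"badly formatted target host: {host!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"{host}:{port}".encode("ascii") + b"\x00"
```

Rejecting a malformed target before the cell is built gives the developer an immediate, local error instead of a silent failure on the network.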
4.2.5. Moving forward from Iteration 2
As with the first iteration, this iteration overran the predicted timescale, this time by two weeks.
It will therefore be necessary to increase the amount of time dedicated to development in the
next iterations. A new project plan has been created to show this, and how the delay has been taken
into account in future iterations, to still ensure the project is completed on time. This can be found
in appendix 5.
The main reason for the delay is that the project is proving much harder than first expected.
The lack of error messages provided by Tor is the most difficult feature of the Tor
protocol to handle, and days have been spent trying to debug the software without knowing what
was going wrong. In normal application development, if something goes wrong an error message
is presented to the developer to indicate where the error occurred, but with the Tor
protocol this does not happen. A further delay is caused by the confusing Tor protocol
documentation, which contains many inconsistencies. These make it difficult to understand exactly
what is required for each function, and on several occasions help from the Tor community has
been required to understand certain points.
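What little diagnostic information the protocol does provide is terse. For example, a DESTROY cell carries a single reason byte. The sketch below maps those bytes to the names listed in the Tor protocol specification so that they can at least be logged in a readable form. The function name describe_destroy is an assumption for illustration, and this is not necessarily how the project's own error-code conversion works.

```python
# DESTROY cell reason codes as listed in the Tor protocol specification.
DESTROY_REASONS = {
    0: "NONE: no reason given",
    1: "PROTOCOL: Tor protocol violation",
    2: "INTERNAL: internal error",
    3: "REQUESTED: a client sent a TRUNCATE command",
    4: "HIBERNATING: relay is not currently operating",
    5: "RESOURCELIMIT: out of memory, sockets, or circuit IDs",
    6: "CONNECTFAILED: unable to reach relay",
    7: "OR_IDENTITY: relay identity was not as expected",
    8: "CHANNEL_CLOSED: the connection carrying this circuit died",
    9: "FINISHED: the circuit has expired",
    10: "TIMEOUT: circuit construction took too long",
    11: "DESTROYED: circuit destroyed without a client TRUNCATE",
    12: "NOSUCHSERVICE: request for unknown hidden service",
}


def describe_destroy(reason_code: int) -> str:
    """Return a readable description of a DESTROY reason byte."""
    return DESTROY_REASONS.get(
        reason_code, f"UNKNOWN: unrecognised reason code {reason_code}"
    )
```

A code such as PROTOCOL still says only that something was malformed, not which field, so it narrows the search without replacing manual debugging.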
However, all requirements in this iteration were completed, including the requirement that was not
completed in iteration 1. All functional requirements have been completed while also satisfying the
NFRs of maintainability and usability. It would have been quicker to develop the functions without
considering these requirements, but considering them during development will ensure an
application that meets and exceeds the client's expectations.
Development of a Tor library

  • 1. 2014 MASTERS ENGINERING PROJECT RICHARD DENNIS 500198 | University of Portsmouth Development of a Tor library Supervisor: Dr Gareth Owen Moderator: Dr Nick Savage
  • 2. Abstract Given the increasing popularity of Tor following the Edward Snowdon revelations, the dark net has been the subject of much debate, both amongst academics and in the media. Existing Tor libraries require Tor to be installed on the user's PC and are currently incapable of conducting an attack against Tor, which would be advantageous to law enforcement agencies on local, national and international levels. This leaves room for further development within the field and it was with these reasons in mind that the current study was undertaken. Aiming to create an intuitive application that is not reliant on an existing Tor client, this project aims to radicalise the field by creating a Tor library that is capable of providing a foundation for further development that could lead to the de-anonymization of Tor users. The outcome of this project was that a fully functioning independent library was created that, following extensive testing, was capable of conducting an initial Denial of Service attack against a Tor node. The results of this attack were inconclusive, however this serves to demonstrate that the developer has fulfilled his objective of creating a forward-looking library that provides a solid base for future development in the field. It is hoped that the research and code developed throughout this project will contribute to the development of an attack on Tot that can de-anonymise users of the network and thus make a valuable contribution to law enforcement and the field of cyber security.
  • 3. Acknowledgements I wish to thank Dr. Gareth Owen for providing valuable guidance and support throughout the course of the project in his role as project supervisor
  • 4. Contents 1. Introduction ........................................................................................................................................1 1.1. Introduction ............................................................................................................................1 1.2. Rationale.................................................................................................................................1 1.3. The problem............................................................................................................................1 1.4. Objectives................................................................................................................................1 1.5. Constraints..............................................................................................................................2 1.6. Project Deliverables ................................................................................................................2 1.7. Report Structure overview......................................................................................................2 2. Literature Review............................................................................................................................4 2.1. Introduction ............................................................................................................................4 2.2. 
Analysis ...................................................................................................................................4 What is Tor?....................................................................................................................................4 History of Tor ..................................................................................................................................4 How Tor works................................................................................................................................5 Hidden services...............................................................................................................................6 Who uses Tor and for what?...........................................................................................................7 Current Tor development projects .................................................................................................8 Current attacks on Tor ....................................................................................................................8 Conclusion.......................................................................................................................................9 3. Project Management ....................................................................................................................10 3.1. Brief description of chapter..................................................................................................10 3.2. Select methodology ..............................................................................................................10 3.3. Requirements elicitation.......................................................................................................11 3.4. Risk analysis ..........................................................................................................................13 3.5. 
Schedule................................................................................................................................14 3.6. Professional issues................................................................................................................14 3.7. Conclusion of the section......................................................................................................14 4. Application Development .............................................................................................................15 4.1. Iteration 1 .............................................................................................................................15 4.1.1. Requirements................................................................................................................15 4.1.2. Design............................................................................................................................17 4.1.3. Implementation ............................................................................................................19 4.1.4. Testing...........................................................................................................................21 4.1.5. Moving forward from Iteration 1..................................................................................23 4.2. Iteration 2 .............................................................................................................................24
  • 5. 4.2.1. Requirements................................................................................................................24 4.2.2. Design............................................................................................................................24 4.2.3. Implementation ............................................................................................................25 4.2.4. Testing...........................................................................................................................29 4.2.5. Moving forward from Iteration 1..................................................................................32 4.3. Iteration 3 .............................................................................................................................33 4.3.1. Requirements................................................................................................................33 4.3.2. Design............................................................................................................................33 4.3.3. Implementation ............................................................................................................34 4.3.4. Testing...........................................................................................................................36 4.3.5. Moving forward from iteration 3..................................................................................38 4.4. Iteration 4 .............................................................................................................................39 4.4.1. Requirements................................................................................................................39 4.4.2. Design............................................................................................................................39 4.4.3. 
Implementation ............................................................................................................40 4.4.4. Testing...........................................................................................................................44 4.5. Iteration 5 .............................................................................................................................47 4.5.1. Requirements................................................................................................................47 4.5.2. Design............................................................................................................................47 4.5.3. Implementation ............................................................................................................49 4.5.4. Testing...........................................................................................................................51 5. Evaluation .....................................................................................................................................52 6. Summary, conclusion and recommendations ..............................................................................54 7. Bibliography ..................................................................................................................................55 8. 
Appendixes....................................................................................................................................59 Appendix 1 – Project Initialization document...................................................................................59 Appendix 2 – Ethical checklist...........................................................................................................66 Appendix 3 – Original planned project schedule..............................................................................73 Appendix 4 – Project schedule updated due to iteration 1 overrunning .........................................74 Appendix 5 – Project schedule updated due to iteration 2 overrunning .........................................75 Appendix 6 – Suitability analysis questionnaire ...............................................................................76 Appendix 7 – Black box testing questionnaire..................................................................................77 Appendix 8 – Evaluating functional requirements ...........................................................................78 Appendix 9 - Evaluating Non-Functional requirements....................................................................80 Appendix 10 – Actual project schedule ............................................................................................81
  • 6. List of Figures Figure 1 - Onion routing circuit...............................................................................................................6 Figure 2 - Hidden service architecture....................................................................................................7 Figure 3 - Agile methodology diagram..................................................................................................10 Figure 4 - Table showing possible elicitation methods.........................................................................12 Figure 5 - Table containing potential risks and countermeasures........................................................13 Figure 6 - Table containing requirements for iteration 1 .....................................................................15 Figure 7 - Non-functional requirements ...............................................................................................16 Figure 8 - Code snippet showing connection to a Tor node.................................................................19 Figure 9 - Testing results for iteration 1................................................................................................22 Figure 10 - Functional requirements for iteration 2.............................................................................24 Figure 11 - Code snippet showing the creation of Diffie Hellman public and private keys..................26 Figure 12 - Code snippet showing the Tor Circuit Class........................................................................26 Figure 13 - Code snippet showing the decoding of a Create cell..........................................................27 Figure 14 - Code snippet showing the decryption function..................................................................27 Figure 15 - Code snippet showing the creation of the Stream cell.......................................................28 Figure 16 - Code snippet showing 
the conversion of error codes to english .......................................29 Figure 17 - Testing results iteration 2 ...................................................................................................31 Figure 18 - Requirements iteration 3....................................................................................................33 Figure 19 - Code snippet showing the creation of a rendezous point..................................................35 Figure 20 - Code snippet showing how the service is downloaded and saved to a text file ................36 Figure 21- Testing results iteration 3....................................................................................................38 Figure 22 - Requirements for iteration 4 ..............................................................................................39 Figure 23 - Code snippet showing how the service descriptor data is extracted.................................41 Figure 24 - Code snippet showing base 64 decoding............................................................................41 Figure 25 - Code snippet showing dynamic methods to extract data from a service descriptor.........42 Figure 26 - Code snippet showing how to connect to an induction point............................................43 Figure 27- Code snippet showing the packet creation of stream to a hidden service .........................44 Figure 28 - Testing results iteration 4...................................................................................................45 Figure 29 - Diagram showing the bandwidth saturation attack ...........................................................48 Figure 30 - Configuration screenshot showing the Max bandwidth allowed per day on a Tor node ..49 Figure 31 - Code snippet showing a function to allow code to run every 24 hours.............................49 Figure 32 - Code snippet showing the DoS attack 
................................................................................50
  • 7. 1 | P a g e 1. Introduction 1.1. Introduction This introduction discusses the motivation and reasons behind this project, as well the project objectives and constraints before concluding with a brief summary of all the chapters in this report. 1.2. Rationale Since the Edward Snowden revelations regarding government surveillance, more and more people have been wanting privacy and anonymity when using the internet. The Onion Router (hereafter known as Tor) provides such a service, and is widely used by people wanting to remain anonymous on the internet and in countries where internet censorship is a major issue. As well as providing anonymity to internet users, Tor also allows a website or service to be hosted within the Tor network, these are known as hidden services. A normal website is hosted on the internet; the hosting location can be found and visitors to the site can be monitored, but a hidden service provides the person hosting the service and the site’s users with anonymity. This can lead to websites containing illegal material such as child pornography or drugs being hosted on Tor as both the users and the host are guaranteed complete anonymity. This demonstrates that being anonymous on the internet brings with it a lot of issues, one of which is accountability. For example, in the event that somebody had used the internet for nefarious means, how can we prove they accessed a certain website or illegal service? 1.3. The problem Despite the increased exposure and popularity of Tor, development in the field is currently almost non-existent. Although many research papers look at theoretical attacks on Tor (Borisov et al, 2013; Biryukov et al; 2013b; Jansen, 2014), there is not currently a Tor application, library or framework that would facilitate these theoretical attacks. Currently, to use Tor, a user downloads and installs it to their machine. 
The application is very limited in terms of its use; it is currently only designed to connect a user to the Tor network and allow them to use a pre-configured web browser to use the internet anonymously. Although is an open-source project, meaning development of the Tor client is possible, the sheer size and complexity of the application makes developing it extremely challenging. This means there is no room to easily build upon or further develop the current application, for example to increase the security of Tor or to program attacks to de-anonymise users. It is clear that there is a need for an application that can connect a user to the Tor network, which provides all the functionality of the current Tor application but which is also able to be expanded and built up, with a final goal of implementing an attack such as a denial of service or de-anonymization attack. 1.4. Objectives The objective of the project is to create a Tor library which, unlike previous Tor libraries, does not require the user to install Tor. This will make this project unique. It will also make the project more challenging as the application will need to use the Tor protocol to communicate with the Tor network. This involves challenges such as encrypting and decrypting packets to Tor nodes and communicating with hidden services. This project aims to develop a fully functional library whose functionality is comparable to that of the Tor client and, in the wider context, which will be able to be used for future development for projects such as attacking or hacking Tor.
  • 8. 2 | P a g e 1.5. Constraints With all projects many constraints faced during the development of the project, both internally and externally.  Hard deadline – 12th September 2014. On this day all project deliverables must be handed in to the client.  Zero budget  Availability of software – Although the software used will be open source due to the budget constraint, support or access to the software may be limited.  Knowledge of the problem is limited, and in order to gain the necessary understanding of Tor, as well as learning new skills required for the project, will require a large amount of time.  Other commitments will interfere with the project; good project and time management are required. 1.6. Project Deliverables The list of project deliverables for this project is:  A Tor library that can connect to Tor, Tor’s hidden services and which can be expanded on  A report to document the development of the application consisting of: o Literature review o Project Management o Specification and discussions of the requirements o Application Development o Summary of the project o Evaluation against requirements o Conclusion of the project o Bibliography 1.7. Report Structure overview Chapter Title Synopsis 1 Introduction 2 Literature review Firstly this chapter analyses what methods are available to ensure the most appropriate documents were chosen to review. Next is the literature review itself, this will cover the topic areas of:  What is Tor? (Historical and technical discussion)  How does Tor provide its users with anonymity and privacy?  Uses of Tor  What is the social impact of Tor?  Alternatives to Tor 3 Project management Discusses why the method of project management chosen for this project was the most appropriate choice and the process that would be used to elicit requirements for the project. 
Next, a risk analysis and various countermeasures are discussed before concluding with a brief discussion of the intended schedule and any relevant professional issues that could arise. 4 Application Development This chapter looks at each iteration stage of development. The requirements of each iteration are discussed, followed by a description of the design process and the implementation
  • 9. 3 | P a g e procedure. The testing of each iteration is explained as well as concluding which features will be carried forward or dropped from the application in the following iteration. 5 Evaluation against requirements Evaluates whether the project has met the requirements outlined earlier in the development. It will look into how the requirements have been meet or exceeded, before looking at requirements that have not been met, providing an explanation of why these have not been met and a recommendation of how these may be achieved. 6 Conclusion of the project Reflects on the project as a whole. Looks at how the project has been developed, what mistakes were made during the development as well as what has been achieved. This chapter discusses what the project has brought to the field of information security and possible future development for the project. 7 Bibliography Lists all the references used throughout this report, as well as sources used throughout the development of the project. 8 Appendixes Contain all the figures, tables, and section of code along with any other information such as testing strategies that accompany the report.
2. Literature Review
2.1. Introduction
Tor used to be unheard of, except within the tech community and amongst users of illegal sites. However, since 2013, when the Edward Snowden leaks revealed that GCHQ and the NSA are unable to de-anonymize all Tor users, Tor has been cast into the public eye. Now with 200,000 daily connected users (Tor Project, 2014), Tor and its use are debated more than ever.

This report examines Tor in detail: what it is, who uses it and for what reasons. It then discusses the illegal side of Tor, before looking into why Tor needs to be investigated and what has been done so far. Further research and review of the relevant literature are carried out throughout the project, as research into the Tor protocol and into current attacks, not necessarily against Tor, will be required.

2.2. Analysis
What is Tor?
Tor is an open source project: a decentralized, low-latency mix network of specially configured nodes, commonly called relays or bridges, which transmits only TCP traffic through virtual tunnels from a client to a destination, usually, but not exclusively, the Internet. Tor has been incorrectly described as having a single authority and as being a Virtual Private Network (VPN) or a peer-to-peer (P2P) network (Hurley et al., 2013, p. 1), but this is not the case.

Tor is described by McCoy et al. as a “privacy enhancing system, designed to protect the privacy of Internet users from traffic analysis” (McCoy et al., 2008, p. 63). Syverson et al. elaborate on this, stating that Tor provides anonymous connections to the Internet, protecting against traffic analysis as well as eavesdropping (Syverson, Goldschlag & Reed, 1997, p. 44). Both groups of scholars agree that Tor provides anonymity. As well as providing users with anonymity and privacy, Tor can be a tool for anti-censorship.
Endorsed by the Electronic Frontier Foundation and other civil liberties groups, one use of Tor is as a means of communication between journalists, whistle-blowers and human rights workers (Levine, 2014). Tor is also the network supporting what the media commonly refer to as the 'Darknet', due to the encrypted nature of the network and its association with illegal, or ‘dark’, activities.

History of Tor
Roger Dingledine is largely credited as one of Tor’s creators; in 2004 he was part of the team that released the paper Tor: The Second-Generation Onion Router (Dingledine, Mathewson & Syverson, 2004). Levine explains how the concept behind Tor, “onion routing”, can be traced back to 1995, and argues that this should be considered the origin of Tor (Levine, 2014). Michael Reed, one of Tor’s creators, revealed that Tor was originally created for military intelligence usage, such as open source intelligence gathering, and that the reason for releasing it as open source was to provide better cover traffic to hide what the network was really being used for (Reed, 2011). Reed adds that Tor was not designed to help dissidents in repressive countries or to help criminals avoid law enforcement (Reed, 2011).
Until 2013, Tor was almost unheard of except amongst technology and criminal circles. In 2013, Edward Snowden made a series of high profile disclosures about several global surveillance programs. One leaked document, “Tor Stinks”, made international news and describes how the NSA tried to compromise Tor anonymity (The Guardian, 2013).

Silk Road, a Tor hidden service also known as the eBay for drugs, was an online marketplace where users could buy and sell drugs around the world (Barratt, 2012, p. 683) and was hailed as a “criminal innovation” (Aldridge & Décary-Hétu, 2014). It was taken offline in 2013 by the FBI. This was the first public demonstration of a government agency taking down a website hosted on Tor, and it attracted significant media attention internationally (Greenberg, A., 2013b). However, the effectiveness of the closure has been questioned by Greenberg, who reports that Silk Road 2.0, an updated and improved Silk Road, was online just one month after the closure of Silk Road (Greenberg, A., 2013a).

Since these leaks and the FBI takedown of the Silk Road, Tor has never been more popular (Jeffries, 2013); its popularity has sky-rocketed and it now has around 2.5 million monthly users (Tor Project, 2014).

How Tor works
Tor is built on second-generation onion routing. Syverson et al. (2000, p. 1) state that one of the major reasons for the change of design from generation 1 to generation 2 was to allow the source code to be released publicly, as the patent on generation 1 onion routing prevented this. Dingledine et al. (2004, p. 1) describe how another major reason for changing the design was to solve “many critical design and deployment issues that were never resolved”, also stating that the “design has not been updated in years”.

Tor creates a “circuit” through several, usually three, Tor nodes: a guard (entry) node, a relay and an exit node. Borisov et al. (2007) strengthen the case for selecting three nodes by demonstrating how increased circuit length compromises anonymity.

With the nodes selected, and the user wishing to send a message through the Tor network, the message is encrypted like an onion with the keys of all three nodes. A Diffie-Hellman (DH) handshake is used between each relay and the client to create the session key (Jagerman et al., 2014, p. 4). However, the DH handshake has also been criticized, with some scholars proposing a protocol based on ElGamal key agreement, although this has not currently been implemented (Øverlier & Syverson, 2007, p. 4). Catalano is also critical of the current onion routing protocol, claiming it has a high round complexity which affects the running time, although he agrees that Tor onion routing provides forward secrecy and is secure (Catalano et al., 2011, p. 255). Loesing argues that using the DH key exchange makes building circuits in Tor “a time consuming task”, although he fails to fully justify this statement (Loesing, 2009, p. 48).

Dingledine et al. (2004), Øverlier and Syverson (2007), and Loesing (2009) all acknowledge that using DH to create the circuits in Tor achieves perfect forward secrecy. Loesing explains that if an attacker were to collect and store all traffic, later trying to force the nodes to decrypt it would be ineffective because of the “telescoping” approach (2009, p. 45): once the session keys are deleted, a relay cannot be forced to decrypt old traffic. In a similar vein, Øverlier & Syverson (2007, p. 3) demonstrate how attacks fail because of Tor's forward secrecy.

When the onion is built, the message is encrypted with the last node’s key first, then the encrypted message is encrypted again with the second node’s key, before finally encrypting this message with the first node’s key.
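The handshake and layering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Tor's actual cryptography: the DH group (`P`, `G`) is a toy one, the SHA-256 keystream merely stands in for a real stream cipher, and all function names are hypothetical.

```python
import hashlib
import secrets

# Toy Diffie-Hellman group: an assumption for illustration only, far too
# small for real use and not the group Tor uses.
P = 2**127 - 1
G = 3


def dh_keypair() -> tuple[int, int]:
    """Return a (private, public) pair for the toy group."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def dh_session_key(private: int, peer_public: int) -> bytes:
    """Both sides compute the same shared secret and hash it into a key."""
    shared = pow(peer_public, private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (stand-in for a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer; XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: bytes, hop_keys: list[bytes]) -> bytes:
    """Client side: encrypt for the last hop first and the first hop last."""
    onion = message
    for key in reversed(hop_keys):
        onion = xor_layer(key, onion)
    return onion


def peel(onion: bytes, hop_keys: list[bytes]) -> list[bytes]:
    """Relay side: each hop, in circuit order, strips its own layer."""
    views = []
    for key in hop_keys:
        onion = xor_layer(key, onion)
        views.append(onion)  # what this hop forwards to the next one
    return views


def build_circuit_keys(hops: int = 3) -> tuple[list[bytes], list[bytes]]:
    """Run one toy handshake per hop; return (client_keys, relay_keys)."""
    client_keys, relay_keys = [], []
    for _ in range(hops):
        c_priv, c_pub = dh_keypair()
        r_priv, r_pub = dh_keypair()
        client_keys.append(dh_session_key(c_priv, r_pub))
        relay_keys.append(dh_session_key(r_priv, c_pub))
    return client_keys, relay_keys


client_keys, relay_keys = build_circuit_keys()
MESSAGE = b"GET / HTTP/1.1"
onion = wrap(MESSAGE, client_keys)
views = peel(onion, relay_keys)
```

In this sketch only the final entry of `views`, the exit node's view, equals the original message; the guard and the middle relay each see data that is still encrypted. The private DH values go out of scope once `build_circuit_keys` returns, which loosely mirrors the forward-secrecy argument: after the session keys are discarded, recorded traffic can no longer be decrypted. Because each layer here is an XOR, the layers happen to commute; the wrapping order would matter with a cipher whose layers do not.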
Onion routing is shown in the image below:

Figure 1 - Onion routing circuit (data passes from the client through the guard node, relay node and exit node over the Tor protocol; traffic leaving the exit node is unencrypted unless HTTPS is used)

Dingledine et al. (2004, p. 5) explain how a message sent through an onion routing network is then decrypted. The message arrives at, and is decrypted by, the first node; the only information revealed there is which node to send the packet to next. This is repeated at each node in the circuit until the message is sent out to the Internet. The removal of a layer of encryption is like peeling the layers of an onion, hence the name onion routing.

Apart from the three main nodes (entry, relay and exit), there is another type of relay, the bridge. This is a relay that is not listed in the consensus and is used as the first hop in the circuit to route traffic out of a certain country. Ling et al. (2012, p. 2381) conclude that Tor bridges are critical to countering censorship blocking. Winter and Lindskog reinforce this viewpoint by explaining how China is blocking all public relays in the consensus, stating that only “1.6% of public relays are able to be connected to” (2012, p. 11); because bridges are not publicly listed, they cannot be blocked in the same way.

Hidden services
Services such as websites can also be hosted within the Tor network. These are known as “Hidden Services” and can only be visited through Tor. Tor Hidden Services were added in 2004, when the second generation of onion routing was developed (Dingledine et al., 2004). Loesing (2009, p. 36) describes how Tor allows a TCP-based service to be accessed while concealing the IP address of the hosting server. This enables a user (Alice) to connect to another user's (Bob's) server without knowing where, or who, it is. Loesing (2009, p. 40) also neatly sums up the hidden service design as “connecting two circuits created by client and server on a common rendezvous point”.

Biryukov et al. (2013, p. 82) expand on Loesing’s explanation, stating that all communication between a client and the hidden service is done through a rendezvous point (RP), which connects circuits from the user to the hidden service. These RPs are mutually agreed points (Dingledine et al., 2004).
Dingledine et al. (2004) also discuss how to connect to a hidden service. The client first uses the hidden service descriptor to search the distributed hash table, a lookup service that tells the client which introduction points (IPs) are serving the hidden service. The client then tells the hidden service, via one of these IPs, which RP will be used to communicate with it.

Figure 2 - Hidden service architecture (figure from Dingledine et al., 2004, p. 3)

Who uses Tor and for what?
Tor was originally designed for military intelligence gathering (Reed, 2011); however, its user base has since become more diverse. Whilst it is still used by the military and law enforcement, its users now include activists reporting abuse from danger zones (Tor Project, 2014b). This is particularly relevant given the continued fighting in areas such as Iran and Syria; Tor has been instrumental in getting information from within these countries to the outside world. Tor was even awarded the FSF's Award for Projects of Social Benefit for its role in the revolutions in the Middle East (Sullivan, 2011).

Since the Snowden revelations about PRISM, a data mining program used by the NSA and GCHQ to store Internet communications from companies such as Google and Yahoo, Tor has been increasingly used by ordinary people seeking privacy online and wanting to prevent government agencies from monitoring them. Munson supports this assumption by showing how Tor usage has increased dramatically since the Snowden leaks, which he feels suggests that Tor is being used by users seeking privacy online (Munson, 2013). Arma (2013), however, contradicts Munson’s conclusion, stating that, due to the increase in “ESTABLISH_RENDEZVOUS requests", this growth is instead likely to come from a botnet, although he also acknowledges that some growth can be attributed to activists in Syria and the United States.

The distribution of copyrighted digital material has moved from peer-to-peer to the “Darknet” (Wood, 2010, p. 1); this contradicts the Tor Project's message that users are only using Tor for good, and legal, means. Wood believes users of the “Darknet” exploit the anonymity provided by Tor to download material illegally without being traced. Biryukov et al. (2013b, p. 2) strengthen the argument that Tor is used for illegal purposes; they show that 44% of hidden services were hosting illegal content such as drugs, pornography and illegally copied material.

Arguably the most famous hidden service is Silk Road, also known as the eBay for drugs (Barratt, 2012, p. 683). Reportedly, the Silk Road had between 30,000 and 150,000 active customers (Christin, 2012, p. 2). The Silk Road was used to make significant numbers of transactions; Konrad (2013) reports that the Silk Road’s owner and operator, Ross William Ulbricht, handled $1.2 billion of transactions in the 2.5 years before the FBI seizure. This clearly shows how popular illegal activities are on Tor, greatly contradicting the Tor Project’s stance that Tor is used exclusively for good and instead showing that Tor is being used to conduct illegal activity anonymously, thus making traceability and accountability extremely difficult.

Cox (2014) cites Bartlett’s radical views about Tor. Bartlett sets out to discover “the range of things that people do under the conditions of anonymity” (Cox, 2014) and implies that, under the cover of anonymity, people will do more distressing and ‘dark’ things, for example talking to “trolls on pro-anorexia forums” (Cox, 2014). He goes on to suggest that human nature finds a haven on the Internet and uses the tools available to enable this; in this example Tor allows these illegal and immoral activities to be conducted anonymously. While this view of why Tor is being used for illegal purposes may be a radical one, it does not hide the fact that a large percentage of Tor is used for illegal activities by criminals using the blanket of Tor’s anonymity as protection from prosecution.

Current Tor development projects
With Tor being open source, developers are free to use the source code to modify and develop their own applications using Tor. As well as being able to modify the source code, developers can use existing libraries that allow them to use Tor in their projects.

Stem, a Python controller library for Tor, requires Tor to be installed on the machine and controls Tor through the control port (usually port 9051). Winter demonstrates how this can be used to create circuits, as well as streams within those circuits, although he implies that the lack of features and functionality in Stem inspired him to create his own application rather than use the existing Stem library (2014, p. 3). However, Atagar praises Stem for its friendly API and documentation, while simultaneously criticizing the lack of backward compatibility it offers (Atagar, 2012).

Txtorcon, previously TorCtl, is another Python controller library. It works in exactly the same way, using the control port to control Tor.
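To make concrete what "controlling Tor through the control port" involves, the sketch below speaks the text-based control protocol directly over a socket rather than through Stem or Txtorcon. It is a hedged illustration: it assumes a locally installed Tor with its control port open on 9051 and password (or no) authentication, it does not handle cookie authentication or multi-line data replies, and the function names are hypothetical rather than taken from either library.

```python
import socket


def parse_reply(raw: str) -> tuple[int, list[str]]:
    """Split a control-port reply into its final status code and text lines."""
    lines = [line for line in raw.split("\r\n") if line]
    code = int(lines[-1][:3])
    # Each line is "<3-digit code><separator><text>"; keep only the text.
    return code, [line[4:] for line in lines]


def send_command(sock: socket.socket, command: str) -> tuple[int, list[str]]:
    """Send one command and read until the final reply line arrives."""
    sock.sendall(command.encode("ascii") + b"\r\n")
    buffer = ""
    while True:
        chunk = sock.recv(4096).decode("ascii")
        if not chunk:
            raise ConnectionError("control connection closed")
        buffer += chunk
        if buffer.endswith("\r\n"):
            last = buffer[:-2].split("\r\n")[-1]
            # The final line of a reply has a space after the status code.
            if len(last) >= 4 and last[3] == " ":
                return parse_reply(buffer)


def query_version(host: str = "127.0.0.1", port: int = 9051,
                  password: str = "") -> str:
    """Authenticate to a local Tor client and ask it for its version."""
    with socket.create_connection((host, port), timeout=5) as sock:
        code, _ = send_command(sock, f'AUTHENTICATE "{password}"')
        if code != 250:
            raise PermissionError("control port rejected authentication")
        _, lines = send_command(sock, "GETINFO version")
        return lines[0].removeprefix("version=")
```

Calling `query_version()` only works when a Tor client is already running on the machine, which is exactly the dependency that the controller libraries share: they steer an existing Tor process and do not implement the Tor protocol themselves.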
The TorCtl library was used for the development of the Torbutton Firefox extension. Meejah (2014) describes Txtorcon as “an asynchronous API to speak the Tor client protocol in Python” and believes its main goal is to enable applications to use the Tor network to improve people's privacy and anonymity on the Internet. Atagar draws a comparison between the two libraries, noting that both have “extensive test suites and are being very actively maintained” (Atagar, 2012) and that both only control the Tor client.

Current attacks on Tor
With such a diverse client base, it is easy to see who may want to attack Tor, and their reasons for doing so. For example, law enforcement will want to catch paedophiles who access hidden services containing child abuse material, whilst regimes such as China or Iran want to prevent users accessing Tor and so may try to take down Tor altogether.

Denial-of-Service (DoS)
Wood and Stankovic (2002, p. 55) describe a DoS attack as “any event that diminishes or eliminates a network’s capacity to perform its expected function”. Borisov et al. agree with this definition and add that, as well as the blanket DoS which affects the whole network, a selective DoS attack can target just one small section of the network (Borisov et al., 2007). In contrast to Wood and Stankovic (2002, p. 55), Wang et al. (2004, p. 193) describe a DoS attack more precisely as the flooding of a node with traffic, such as requests, until the node is unable to function.
Jansen (2014) details a selective DoS attack that targets either the entry or the exit node. The attacker requests data from a source such as a file server and then has the client stop reading from the TCP connection. By exploiting Tor's flow control, the attacker can keep requesting more data from the source, causing the memory use of the target node to grow until the OS terminates the Tor application on the relay. Jansen (2014) acknowledges the attack and its effectiveness, but is also critical of it; he discusses several simple defences that would prevent it, such as implementing authenticated SendMe cells to stop the flow control being misused.

De-anonymization of hidden services
Jansen extends his aforementioned DoS attack, which he calls the sniper attack, so that it can be used to de-anonymize hidden services, building on the attack first developed by Biryukov, Pustogarov and Weinmann (2013). This attack requires the current guard of a selected hidden service to be known and taken offline, forcing the hidden service to select a new guard node; this is repeated until the attacker’s guard node is picked. However, this requires the attacker to run a compromised guard relay, and Jansen is critical of the attack’s chances of success. He suggests that if middle guards were used, the attack would not work. Jansen also criticises the attack’s method of taking a node offline, stating that it would fail if the node were correctly configured to prevent a sniper attack. He further suggests that if the node were simply to reboot after it was taken offline, this alone would be enough to render the attack ineffective (Jansen, 2014).

ASN (2013) frames this attack in a new light: instead of trying to prevent the attack, they suggest that the core design of Hidden Services is flawed and in need of a redesign if it is to continue to be secure and effective.
Conclusion
The review has enabled the developer to gain a solid understanding of what Tor is, how it works and who it is used by. In addition, it has provided the opportunity for existing Tor libraries to be reviewed. By comparing Stem and Txtorcon it became clear that both offer the same level of functionality, but in some areas this is at such a low level that they can be considered severely lacking, as they only allow an application to control the Tor client and not to control Tor directly. From this section it is clear that this is an area of research that is significantly lacking and would benefit greatly from more research and development.

The section on current attacks on Tor is also extremely relevant to this project. It analysed two different attacks: a DoS attack and an attack which de-anonymised hidden services. This section in particular featured analysis from several authors who were not in agreement over the effectiveness of the attacks demonstrated, but who all agree on one thing: Tor is vulnerable to attacks.

Overall, this literature review has demonstrated that while Tor has been around for over 10 years and has been heavily researched, there are still many opposing viewpoints on topics such as attacks. It has shown why Tor is more popular today than it has ever been, and the impact that the anonymity of Tor has on its usage. Tor development libraries were shown to be an area in which little research has been conducted, and where the current solutions have been met with heavy criticism and opposing views.
3. Project Management
3.1. Brief description of chapter
This chapter discusses the project and requirement elicitation methodologies that were used for this project. It also considers the project’s potential risks and the countermeasures that were implemented to negate these, as well as discussing the original schedule designed for the project.

3.2. Select methodology
Project management has been used in various forms for centuries, but only since the 1950s has the modern concept of project management existed (Kwak, 2003, p. 1). The origin of modern project management is disputed, although there are records of it being used in the 1950s by US defence, Xerox, Bell Laboratories and even NASA. Modern project management can be defined as the “application of knowledge, skills and techniques to execute projects effectively and efficiently” (PMI, 2014). Since the 1950s, project management has become a necessity on all projects; it reduces the risk of the project failing or of the wrong features being developed, and ensures that the project is completed efficiently and on time.

The Agile Method of project management was chosen for this project. The Agile Method favours “working software over comprehensive documentation” (Paulk, 2002, p. 15). In this project the client only expects a working application and this report to be produced; no other documentation is required. This makes the Agile Method a good choice for this project, as it permits development to be started immediately and provides the client with what they want, whereas methods such as the Waterfall method focus a lot more on documentation, thus delaying the start of development.

Another reason for choosing this method was that it promotes adaptability throughout the project’s lifecycle. Unlike other methods, such as the Waterfall Model, the Agile Method allows and encourages changes to be made as and when they are required.
This ensures development is continuous and will not be stopped by a requirement that cannot be implemented.

This method also allows for regular testing and for working software to be shown frequently to the client; this promotes client feedback, and any necessary changes can be implemented immediately with little cost to the design and development of the project. This would not be possible with other traditional methodologies, in which changes can be costly and can usually only be implemented at the end of the development. This feature also mitigates risks and ensures that a quality application is produced; any bugs or issues are identified within an iteration (a short timescale of around three weeks) and can be dealt with at that point, rather than having long-lasting effects on the application as would be the case with the Waterfall model.

A further benefit of showing the client frequent iterations of the project is that they get to see the progress that is being made, which on a complex project allows them to gain a sense of the challenges faced by the developer and to feel that they are having an input in the application’s development. If the Waterfall or Spiral models are used, the client only receives the end product and thus does not gain the same understanding of the project and the issues that the developer faced. The Agile Method can therefore be argued to lead to better customer relations.

Figure 3 - Agile methodology diagram (figure from Stack Exchange, 2013)
The Agile Method also promotes continual improvement of the application by taking positive, and negative, aspects of the current iteration forward to be expanded upon in the next iteration. In other methods, this can only be achieved at the end of development, ready for the next major release. This ensures that positive features are capitalised on and promoted, optimising the quality of the application.

A further benefit of this method over other traditional methods, such as the Waterfall or Spiral models, is that if the development is running behind schedule and the deadline is likely to be missed, it is possible to liaise with the client and choose to focus on core requirements. Dropping any non-critical requirements from the development plan ensures that a working application, albeit one with reduced functionality, can be handed to the client instead of an application that is only half-developed. Using this method therefore increases the chances of the developer meeting the deadline, by keeping the project moving consistently forward rather than grinding to a halt over goals and requirements that cannot be implemented.

3.3. Requirements elicitation
Well thought out, well-structured requirements generally lead to a more successful project, one which meets client expectations and delivers an application that is fit for purpose (Hickey & Davies, 2002). It is therefore important to ensure that the requirements gathered provide sufficient detail and are realistic and achievable within the timeframe of the project.

“Requirements elicitation is the process of seeking, uncovering, acquiring and elaborating requirements for computer based systems” (Zowghi & Coulin, 2005, p. 1). This process may be time consuming, but it helps to prevent a project going over-budget, being delivered late or failing to provide the required functionality (Jones, 1995, p. 86).

Using the Agile Method means that requirements will be gathered at the start of each iteration.
This allows the requirements to take into account any issues faced or lessons learned during the previous iteration and differs from the Waterfall Method, where the requirements for the entire project are thought of prior to starting development. Using the Agile Method means that the requirements elicitation method will need to be run several times throughout the course of the project. This makes choosing an elicitation method critical; a method that takes a long time to get results would be an inappropriate choice as it would delay the development of the project. For this reason a questionnaire would be considered an inappropriate method of requirement elicitation for this project, as the time spent waiting for replies to the questionnaire could eat into valuable development time. There is more to requirement elicitation than the client simply telling the developer what they want; a more detailed research process is required. This involves finding exactly what the client realistically expects from the application as well as looking at similar projects that have been developed to see what features these offer that could be integrated into the development of the application. A combination of this information can be condensed into well-structured and achievable requirements that can be implemented into the project.
The elicitation methods that were considered for this project are shown in the table below, along with a summary of their appropriateness for the project:

Method | Outcome | Appropriate?
Interview the client (may be formal or informal, in person or via online correspondence) | Greater understanding of what is expected, issues with the current solution etc.; however, the quality of answers is dependent on the questions asked. | Yes – This is a core method that will be used. It will allow us to understand exactly what the client is expecting, the reasoning behind the project and the timescale in which it is required. Keeping in constant communication with the client will promote customer satisfaction.
Interview the end users (may be formal or informal, in person or via online correspondence) | Understand what the users actually want and whether the client's and end users' expectations are the same; again, the quality of answers is dependent on the questions asked. | Yes – Another core method that will be used to see whether the users want the same features etc. as the client; it also allows information to be gathered that could have been missed in the client interview.
Prototyping | Rapid prototyping would develop a small section of functionality; this could be used to get feedback on the section from users or the client, and could be used to estimate a timescale for the project. | No – High cost of failed prototypes; not required due to the chosen Agile methodology, although a prototype would allow evaluation of the proposed approach to development.
Case study | A report which allows the current system/application to be understood. An example of a case study model is the critical incident technique, which observes human interaction with the current system (Woolsey, 1986). | No – This project does not build upon an existing application, and as case studies are almost always retrospective, the method is not appropriate for this project.
Brainstorming | Could be conducted alongside the interview process; a way to present ideas that may not have been discussed during the interview. | Yes – Will allow the members involved to discuss ideas that may not be core to the project, or that are revolutionary or never before done, but would be welcomed if possible.

Figure 4 - Table showing possible elicitation methods

As the above table shows, three different methods were selected. These are interviews with both the client and the potential end-users of the application, as well as brainstorming sessions which will be run in conjunction with the interviews. Using several different methods, and focusing on the end-user as well as the client, ensures that the functions of the application meet the client’s expectations and results in a usable application that meets the requirements of the end-user. Brainstorming with the client allows the developer to pick up on any implicit requirements that have not been explicitly stated by the client but that would still greatly benefit the application and ensure the client is satisfied with the final product.
3.4. Risk analysis
As with any project there is risk involved, and although using the Agile Method mitigates risk, some risks to the project remain. The table below shows the possible risks, their potential impact level and the reduction strategy that will be in place to ensure they are mitigated as much as possible during the project development.

Risk | Impact level | Reduction strategy
Tor network unavailable – Internet access or the Tor network unavailable | High | Use a Tor simulator such as the Tor Path Simulator (TorPS).
Poor productivity – Developer’s motivation inhibits the project’s development | High | Set a minimum of 20 hours a week for the project, more when needed. Setting small milestones will increase motivation and productivity. Regular meetings with the client will ensure the milestones are met.
Technical risk – Project is too complex to implement | High | Hold regular meetings with the client to keep them up to date with the development, and adjust the requirements to allow for a workaround if possible.
Programmatic risk – Customer changes their mind about wanting the project developed | High | Find another client or adapt the project to cater for the client’s change of heart.
Inherent schedule flaws – Due to the uniqueness of the project, it is difficult to estimate and schedule | Medium | Better to overestimate than underestimate timescales; use the Agile methodology to renegotiate the schedule with the client.
Requirements inflation – More features that were not identified at the beginning of the project emerge and threaten estimates and timelines | Medium | Keep in constant contact with the client through regular meetings etc.; only accept more features if the timescale allows.
Specification breakdown – A conflicting requirement only becomes apparent during development | Medium | Contact the client and work out the solution that would have the lowest impact.
Insufficient resources – Unable to develop the project due to not having access to a required resource | Medium | See if the resource is really required, look for ways to reduce resource use earlier in the project, and try to gain the required resource.
Incorrect budget estimation – Overall cost of the project starts to increase and spiral | Low | There is a budget of zero for this project; to maintain this, open source software and libraries will be used.

Figure 5 - Table containing potential risks and countermeasures
3.5. Schedule
The project has a hard deadline of September 12th 2014, at which point the application, all documentation and the accompanying report must be completed. A Gantt chart showing the planned schedule can be found in Appendix 3. As the chart shows, some additional time has been allowed to factor in potential delays during the project. However, due to the nature of the project and the methodology used, it is possible that some iterations will take less time than others, and more or fewer iterations can be added. This is a very adaptable schedule, and is only used as a base, as it is likely to change once development begins.

3.6. Professional issues
Appendix 2 contains the ethical checklist that accompanies this project. This document revealed that there were no ethical concerns raised by this project. To ensure copyrighted code is not used, only open source libraries and code will be used and, should code be required from other sources, it will only be used after getting the express written permission of the author/owner.

Should any questionnaires or user feedback be required during the testing phases of the project, this will be conducted anonymously, and respondents will be under no pressure from the developer to take part. User information will never be put at risk and the creation of this application will at no point compromise users' data or identity.

Any attacks that are developed as part of this project will be conducted on a closed network where the developer has complete control of the Tor node. This ensures that users of the Tor network are not, at any point, affected by the development of this project.

Should a potential professional issue arise during the development of the application, it will be dealt with before it occurs to ensure that the project never breaks any ethical codes or laws.

3.7. Conclusion of the section
This section has justified the use of the Agile Method as a project methodology, having fully considered its advantages and disadvantages over more traditional models, like the Waterfall model. Furthermore, the importance of requirements was considered and the methods used to elicit the requirements for this project were discussed. Potential risks of the project were analysed and countermeasures were implemented to mitigate the possible effects of these risks. Finally, an estimated schedule for the development of the application was drawn up, which factors in some unforeseen delays during the development stages. With these aspects of project management in place, a smoother development should be possible and the application should meet the requirements and be delivered to the client by the deadline.
4. Application Development
4.1. Iteration 1
4.1.1. Requirements
All requirements for this project will be split into two categories: functional requirements and non-functional requirements. Functional requirements describe what the software should do, whilst non-functional requirements judge the operation of the software. By their nature, non-functional requirements can be difficult to evaluate because they tend to be based on the subjective opinion of the assessor rather than being fact-based.
During the requirements elicitation for this iteration, all of the non-functional requirements were elicited. Unless otherwise stated, these will be presumed to apply to each iteration of the project, although they will only be discussed in this section of the report. In the case of this project, the non-functional requirements can be considered as principles based on the ISO 9126-1 software quality model (ISO, 2001) which the project should aim to meet, and they can therefore not be attributed to one specific iteration.
The functional requirements for this iteration were elicited using the aforementioned methods and are shown below:
Requirement | Importance Level
Connect to the Tor network | High
Send and receive a version cell | High
Decode NetInfo cell to extract data from it | High
Handle errors from destroy cells | Medium
Figure 6 - Table containing requirements for iteration 1
The non-functional requirements for this project can be seen in the table below:
Quality Characteristic | Requirement | Importance Level
Portability | Able to run the application without installing Tor | High
Portability | Able to run on multiple platforms (Windows, Mac, Linux) | Medium
Portability | Not require the application to be installed to run | Medium
Reliability | No more than 10 bugs on delivery | High
Efficiency | Use as few computational resources, such as RAM, as possible (no more than 1 GB of RAM) | Low
Usability | No GUI | High
Usability | Precise and constructive error messages | High
Usability | Documentation | High
Usability | Universal naming standard | High
Dependability | Able to operate normally or abnormally without threat to life or environment | Medium
Legal | Only use open source software | High
Maintainability | Able to expand the system to incorporate new features, fix defects or deal with new technology | High
Adaptability | Able to change the system to handle additional domain concepts | Medium
Figure 7 - Non-functional requirements
The importance of considering both functional and non-functional requirements when developing the application can be seen from the first functional requirement: to be able to connect to the Tor network. Clearly, this is a critical requirement; failure to connect to the network would prevent the project from continuing. One simple way to connect to the Tor network would be to install Tor and allow the application to use the Tor client. However, this would conflict with the first non-functional requirement: to not need to install a Tor client in order to use the application. Failure to consider both functional and non-functional requirements during development could result in contradictory requirements, not all of which could then be met.
The second functional requirement, to be able to send a version cell, requires a packet to be sent to a Tor node informing it of the protocol version the client wishes to use. This packet must fulfil the criteria outlined in the Tor protocol specification document (Dingledine & Mathewson, n.d.). This should be a simple requirement to achieve, but it is critical to the development of the project as it sets up the communication between the client and a Tor node.
The third requirement, to decode a NetInfo cell, will likely prove to be challenging. The data contained within the NetInfo cell must be extracted accurately and in the correct order.
The first three functional requirements were all considered to be critical; these requirements provide the base upon which the application can be developed.
Failure to meet these requirements at this stage of the project could jeopardise the entire project, as they provide key functionality to the application. The fourth functional requirement, to handle data from a destroy cell, is also important but is not critical: although it is desirable, it will not affect the application's core functionality. Therefore, in this iteration, the first three functional requirements should be prioritised.
As already established, the non-functional requirements (NFRs) will affect the entire project and their importance should not be underestimated. The portability NFRs may seem simple to achieve, but fulfilling them will have a major impact on the project. For example, they affect the choice of programming language, which must be cross-platform compatible as well as capable of achieving the functional requirements.
The usability, maintainability and adaptability NFRs should be simple to implement and are of critical importance to the project. This project intends to create a library suitable for further development; an application with poor usability would not be chosen over existing Tor libraries, and therefore the usability NFRs need to be met if this application is to be successful.
The legal NFR of only using open source software needs to be achieved as the project has a budget of zero. This is therefore a simple, but critical, requirement to implement.
The reliability NFR, to have no more than ten bugs on delivery, was explicitly mentioned by the client. However, this could prove to be a challenging requirement to assess. Whilst testing may
show that there are few or no bugs in the application, this might not be a true representation of the application because there may be bugs that did not show up during testing.
4.1.2. Design
By using the Agile Method, the upfront design is minimised; the developer only designs what is required for each iteration, which dramatically reduces the large upfront design cost that other methodologies incur. Moreover, by only implementing the design as and when it is required, risk is reduced and the developer ensures that all the necessary features are designed. Implementing the entire design in one go could lead to unused features, which would make the library confusing to the end user. This does not mean that features designed in earlier iterations will not be carried over to later iterations of the application.
Despite this being the first iteration, some design decisions made here will impact the rest of the application. An example is the programming language used, as this cannot be changed after the first iteration without dramatic consequences. This makes the choice of programming language a critical design decision.
There were three key contenders for the programming language: C, Python and Java. C was discounted as the author has considerably less experience in this language than in either Python or Java. To decide which of the remaining languages was more suitable for this project, the advantages and disadvantages of each were considered. Python was found to be the more suitable language, as existing Tor libraries use Python and it makes sense to use the language that Tor developers are already using, since this will help to achieve the application's goal of being used for future development. Another reason for choosing Python over Java was that it is much easier and more effective to extract bytes from packets of network data in Python than in Java.
Despite this, Java was a serious contender due to the developer's considerable experience in the language and the speed at which Java can run, which can be up to ten times quicker than Python. The decision was further complicated by the fact that both languages are cross-platform compatible and would therefore both be able to achieve the non-functional portability requirements. Python is not without its disadvantages in relation to this project: at the start of the project the developer was relatively inexperienced in the language, and threading in Python is extremely hard and has been strongly criticised as being "fundamentally broken" (Wittber, 2009). The deciding factor was that the client implied a preference for Python being used for this application.
The version of Python to be used was also seriously considered, with the final choice being Python 2.x. Despite being the older version of the language, this was deemed the most appropriate version to use, as several existing libraries that are expected to provide functionality are currently only fully compatible with Python 2.x. While some libraries have 3.x versions available, these still contain bugs and tend to be considered beta releases.
The operating system used to develop the application is an unimportant decision, as the portability NFRs state that the application must be compatible with all operating systems and Python can be used on all operating systems. The only potential issue is that Python will have to be installed on Mac and Windows operating systems, although it comes preinstalled on Linux. This also applies to the libraries that the developer expects to use throughout the project. However, the developer's personal preference is to develop on Linux, and consequently this operating system will be used to develop the entire application.
As discussed in the requirements and specification section of this report, the application does not require a GUI as it would bring no benefit to the application. Designing a user-friendly, efficient and scalable GUI would take considerable time, and the absence of a GUI significantly reduces the complexity of the design section. The time saved by not having to design and develop a GUI will be invested in further increasing the quality of the code, as well as in trying to implement more of the requirements.
4.1.2.1. Design features of a library
To achieve the usability NFR, it is important to consider the way that the library will be designed. A poorly designed library would likely not be used for future development, as developers would probably opt for one of the existing libraries if it were significantly easier to use. It is therefore important to ensure that the design structure is simple to use, is intuitive and promotes efficiency.
To enhance usability, the single responsibility principle will be applied; this means that each component implemented in the library should only be responsible for a single section of functionality or a single feature. This makes it easier for users to understand precisely what they can expect from each function of the application, which should help them feel confident in further developing the application in the future.
Two naming conventions are popular in Python: mixed case, and lower case with words separated by underscores. The Python PEP 8 documentation recommends that words be separated by underscores, as this is claimed to improve readability (Van Rossum, 2014). This naming convention was therefore chosen, as it further supports the usability NFR.
The names of the relevant functions and variables also needed to be considered. Variable names such as x, y, etc. are extremely poor names, as they do not give any information about the data they contain. It was decided that all names should provide as much information as possible whilst remaining a sensible length.
This will facilitate easy development and usability, as there should be no confusion over what a variable contains or what a function will do; the name should make this information clear to the user. Both the chosen naming convention and the descriptive variable names help to meet the NFR of maintainability, making it easier for the current developer to work on the application as well as for users to further develop it in the future.
To further achieve the usability NFRs, detailed comments about all functions within the application will be required. These should provide the user with information concerning the required input, what the function does and what the function will return.
It could be argued that the above features are not strictly necessary, as a good application would always be favoured over a lesser application. However, the effort required to make these significant improvements is negligible, and implementing these decisions could increase the speed of development by making it easier for the developer to identify pre-existing functions.
4.1.2.2. Version control
Version control is an essential safeguard for the application. Although it will not affect the development of the project, it means that, were anything to go wrong, the code could be retrieved from a specified point. It also offers the ability to track all changes made to the code, which will help locate bugs within the application. For this project, Git was selected over Subversion. This decision was based on the personal preference of both the developer and the client, who also uses Git, which made it easy to share code between the parties involved in the application's creation.
4.1.2.3. Design conclusions
This section required more time than was previously anticipated, because so many of the design decisions that needed to be made in the first iteration would affect the entire development of the project. It was therefore essential that sufficient thought and consideration was put into these decisions, as failure to make the right choice would lead to greater delays later in the project. It could also be argued that creating such a detailed design in the first iteration will speed up the development process and avert potential issues later on.
4.1.3. Implementation
The aim of this iteration was to implement all four functional requirements, as detailed at the start of this section. Furthermore, it was hoped that as many of the NFRs as possible would also be achieved during this time.
Developing the application required several pieces of software to be selected. The most important tool was Sublime Text 2, the text editor in which the code was written. While it is argued that an Integrated Development Environment (IDE) is more appropriate for developing code, due to the extensive testing and debugging functionality that IDEs provide, they are more complicated to use than a text editor and their testing environment may not be suitable for this application. It is also the developer's preference to use a text editor, as he has more experience of this method. By using the extensive testing functionality that Python provides, no negative effects of using a text editor over an IDE should be present in the final application.
The developer tried to implement the functional requirements in order of their importance; connecting to Tor was therefore the first step undertaken. To do this, an SSL connection was made to the Tor node, using the node's IP address and ORPort.
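The connection step described above can be illustrated with a minimal sketch. This is not the project's own listing: it is written for Python 3 rather than the Python 2.x used in the project, and the function names `make_relay_tls_context` and `connect_to_relay` are hypothetical. It assumes that the relay presents a self-signed certificate, so ordinary certificate-authority and hostname checks are switched off.

```python
import socket
import ssl


def make_relay_tls_context():
    """Build a TLS client context for connecting to a Tor relay (illustrative)."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Assumption: the relay uses a self-signed certificate, so the usual
    # hostname and certificate-authority checks are disabled here.
    # check_hostname must be cleared before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def connect_to_relay(ip_address, or_port, timeout=10.0):
    """Open a TCP connection to ip_address:or_port and wrap it in TLS."""
    raw_socket = socket.create_connection((ip_address, or_port), timeout=timeout)
    return make_relay_tls_context().wrap_socket(raw_socket)
```

A caller would supply the IP address and ORPort of a node under their own control, for example `connect_to_relay("192.0.2.10", 9001)`, where both values are placeholders.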
This was easily implemented and only required the three lines of code shown below:
Figure 8 - Code snippet showing connection to a Tor node
While this is a simple way to connect to the Tor node, it works and is less complicated than other methods, and it was felt best to avoid over-complicating things where possible. However, an improvement was identified almost immediately. This method requires the user to know the IP address and ORPort of the Tor node, which may not be easy to find out. To simplify the method and increase usability, it was decided that users should be able to enter either the nickname or the IP address and ORPort of the node that they want to connect to. This was not implemented during this iteration, as it was felt that it should be suggested to the client at the end of this iteration and, if approved, implemented in the following iteration.
The second requirement, to be able to send and receive a version cell, was the next to be implemented. Because all cells need to be created in the same format, following the same protocol instructions, it was decided that a function would be created to pack a cell into the correct format automatically, thus preventing any code duplication. This supports the usability and maintainability NFRs. A build cell function was therefore created, which takes the command to be used and the payload and packs them into the correct cell format. This is shown in the code below:
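As an illustration of the idea, a cell-packing function of this kind might look like the sketch below. It is not the project's own code: it is written for Python 3, the names `build_cell` and `build_versions_cell` and the explicit `circ_id` parameter are assumptions, and it assumes the older link-protocol layout with a 2-byte circuit ID, where fixed-length cells carry a 509-byte payload and variable-length cells (such as the versions cell, command 7) carry a 2-byte length field.

```python
import struct

CELL_PAYLOAD_LEN = 509   # payload size of a fixed-length cell
CMD_VERSIONS = 7         # variable-length cell
CMD_NETINFO = 8          # fixed-length cell


def build_cell(circ_id, command, payload=b""):
    """Pack a circuit ID, command number and payload into a cell."""
    if command == CMD_VERSIONS or command >= 128:
        # Variable-length cell: circuit ID, command, payload length, payload.
        return struct.pack(">HBH", circ_id, command, len(payload)) + payload
    if len(payload) > CELL_PAYLOAD_LEN:
        raise ValueError("payload too long for a fixed-length cell")
    # Fixed-length cell: the payload is padded with zero bytes to 509 bytes.
    return struct.pack(">HB", circ_id, command) + payload.ljust(CELL_PAYLOAD_LEN, b"\x00")


def build_versions_cell(versions=(3,)):
    """Build a versions cell listing the link protocol versions offered."""
    payload = b"".join(struct.pack(">H", version) for version in versions)
    return build_cell(0, CMD_VERSIONS, payload)
```

Keeping the packing logic in one function means that every later cell type only has to supply its command number and payload.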
The decoding of the NetInfo cell was perhaps the most challenging requirement to be implemented in this iteration. This was because the developer was still relatively inexperienced with Python, and the Tor protocol documentation is not very clear and contains several ambiguities. Despite these challenges, the NetInfo cell was successfully decoded, although the work overran the time allotted to it.
To promote efficient code, the developer used an 'if' statement to dynamically extract the data contained within the packet. The NetInfo cell can contain multiple IP addresses in multiple formats (i.e. IPv4 or IPv6). An appropriate but somewhat inefficient method would be to run a separate 'if' statement for every possible eventuality; this, however, would require at least eight 'if' statements just to extract the client's IP address. As the code below shows, the developer managed to use a single 'if-elif' statement, dramatically reducing the chance of errors in the code and increasing readability for users.
Due to the complexity of decoding the NetInfo cell, this iteration was already starting to fall behind schedule. The decision was therefore made not to implement the handling of the destroy cell as part of this iteration, as it does not affect the core functionality and was a desirable, rather than a core, requirement. However, this was mentioned to the client and it was agreed that the feature would be implemented in a future iteration.
4.1.4. Testing
Although testing may "often feel like an exercise in futility or at best a waste of time" (Arbuckle, 2010, p. 1), it is a critical area of development. Testing ensures the software functions according to the expectations defined by the requirements/specifications.
The overall aim of testing is to find bugs or issues that would negatively affect the functionality of the application, its usability and/or maintainability.
For this iteration, two kinds of testing will be conducted: functional testing, which verifies that a function performs as expected using a small subset of inputs, and white box testing, where the tester has full knowledge of the implementation.
To enable the application to be thoroughly tested, several testing methods were considered. It was decided that a combination of the unittest framework and manual testing was the most appropriate approach for this application. This is because unittest makes it possible to quickly test a large number of input values and is heavily integrated with Python. The results from unittest are also displayed in a very comprehensible manner, making it easy to locate and fix bugs. Manual testing also has its advantages, such as being able to test features of the application that a unittest might not be able to cover, and manually testing each function allows a realistic user scenario to be tested.
To ensure that there are no anomalies in the results, each test will be run three times in each testing round. Should any issues arise, these will be investigated and corrected before re-running the tests to ensure the bugs have been removed. This process will be continued until all bugs are eradicated from the application.
Test No. | Test | Test method | Succeeded? | Comments
1 | Can connect to a node | Unittest | Yes |
2 | Passes correct value to version function | Unittest and manual | Yes |
3 | Creates the correct version cell | Unittest | Yes |
4 | Sends the version cell | Manual | Yes |
5 | Receives the NetInfo cell | Unittest | Yes |
6 | Able to extract the payload of a NetInfo cell | Unittest | Yes |
7 | Successfully able to extract the data contained within the payload | Unittest | Yes |
8 | Store the extracted data as a dictionary | Unittest | Yes |
9 | Create the payload of a NetInfo cell to be sent | Unittest and manual | First round: No; second round: Yes | The first round of testing revealed an error where the IP addresses were being displayed as negative numbers because they were not being formatted correctly. Once this was fixed, the test passed.
10 | Builds the NetInfo cell correctly to be sent | Unittest and manual | Yes |
11 | Send the NetInfo cell to the first node | Unittest and manual | Yes |
Figure 9 - Testing results for iteration 1
As can be seen in the testing results above, eleven tests were conducted for this iteration. All functions were thoroughly tested, with ten tests being passed first time. One test, however, failed.
This test was to ensure that the correct payload of the NetInfo cell was created. The creation of the NetInfo cell proved to be incorrect, as negative IP address values were being passed. This obviously cannot be allowed, and was found to be because the IP addresses had been formatted using the signed char format rather than the required unsigned char format. Once this had been changed, the test was rerun and passed.
4.1.5. Moving forward from Iteration 1
While this iteration overran its allotted time by two weeks, and not all of the functional requirements were met, for the most part it can be considered a success. A new project plan has been created to reflect this delay and to show how it has been taken into account in future iterations, so that the project is still completed on time; this can be found in Appendix 4. The delay arose because a detailed design section was developed, which should enable future design work to be completed more quickly and efficiently.
The three functional requirements that were implemented have been implemented successfully and to a high standard; for example, the use of functions to reduce code duplication helps to achieve many of the usability NFRs. The majority of the NFRs have already been achieved, which is a significant achievement in such a small amount of time. The testing of the implemented features was a success, despite one function requiring a bug to be fixed.
Moving forward to iteration 2, the use of an Onion router nickname as well as the IP address to connect to a Tor node will be suggested to the client, and the timescale will be altered to take the complexity of the Tor protocol documentation into account.
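To illustrate the NetInfo handling from this iteration, including the signed versus unsigned formatting bug found during testing, a minimal sketch is given below. It is not the project's own code: it is written for Python 3, all function and key names are hypothetical, and it assumes the NetInfo payload layout of a 4-byte timestamp, one address (type, length, value) for the other party, a 1-byte address count and then that many addresses for the sender, with type 4 meaning IPv4 and type 6 meaning IPv6.

```python
import ipaddress
import struct
import unittest

ADDR_TYPE_IPV4 = 4
ADDR_TYPE_IPV6 = 6


def pack_address(ip_text):
    """Encode an IPv4 or IPv6 address as type, length, value."""
    packed = ipaddress.ip_address(ip_text).packed
    addr_type = ADDR_TYPE_IPV4 if len(packed) == 4 else ADDR_TYPE_IPV6
    # "B" is an unsigned char; the signed "b" format is what produces
    # negative values for bytes above 127.
    return struct.pack(">BB", addr_type, len(packed)) + packed


def unpack_address(payload, offset):
    """Decode one address at offset; return (text or None, next offset)."""
    addr_type, length = struct.unpack_from(">BB", payload, offset)
    raw = payload[offset + 2:offset + 2 + length]
    if addr_type == ADDR_TYPE_IPV4 and length == 4:
        address = str(ipaddress.IPv4Address(raw))
    elif addr_type == ADDR_TYPE_IPV6 and length == 16:
        address = str(ipaddress.IPv6Address(raw))
    else:
        address = None  # unknown address type: skipped using its length
    return address, offset + 2 + length


def build_netinfo_payload(timestamp, other_address, own_addresses):
    """Build the payload of a NetInfo cell to be sent."""
    payload = struct.pack(">I", timestamp) + pack_address(other_address)
    payload += struct.pack(">B", len(own_addresses))
    for address in own_addresses:
        payload += pack_address(address)
    return payload


def decode_netinfo_payload(payload):
    """Extract the NetInfo fields and store them in a dictionary."""
    (timestamp,) = struct.unpack_from(">I", payload, 0)
    other_address, offset = unpack_address(payload, 4)
    count = payload[offset]
    offset += 1
    own_addresses = []
    for _ in range(count):
        address, offset = unpack_address(payload, offset)
        own_addresses.append(address)
    return {"timestamp": timestamp,
            "other_address": other_address,
            "own_addresses": own_addresses}


class NetInfoTests(unittest.TestCase):
    def test_round_trip(self):
        payload = build_netinfo_payload(1400000000, "192.168.0.1", ["10.0.0.2"])
        decoded = decode_netinfo_payload(payload)
        self.assertEqual(decoded["other_address"], "192.168.0.1")
        self.assertEqual(decoded["own_addresses"], ["10.0.0.2"])

    def test_signed_format_gives_negative_octets(self):
        raw = ipaddress.ip_address("192.168.0.1").packed
        self.assertEqual(struct.unpack("4b", raw), (-64, -88, 0, 1))
        self.assertEqual(struct.unpack("4B", raw), (192, 168, 0, 1))
```

The second test reproduces the kind of fault described in test 9: interpreting the octets 192 and 168 as signed chars yields -64 and -88. The test class can be run with `python -m unittest` in the same way as the unittest-based tests described in the testing section.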
4.2. Iteration 2
4.2.1. Requirements
When the previous iteration was demonstrated to the client, he was generally pleased with the development to date. The unachieved requirement of handling data in the destroy cells was mentioned to him and he explicitly requested that this be completed in this iteration. It was also decided to implement the use of Onion router nicknames to identify Tor nodes, in addition to the existing IP addresses, to improve usability. This discussion, as well as several informal interviews conducted with potential end-users of the application, elicited the following requirements.
Requirement | Importance Level
Create a circuit through the Tor network | High
Create a circuit of any length | High
Create a stream through Tor to a web server | High
Able to retrieve webpages from an internet web server through Tor | High
Create a circuit using specified nodes | Medium
Create multiple streams through Tor to a web server | Medium
Handle errors from destroy cells | Low
Figure 10 - Functional requirements for iteration 2
It was evident that no further NFRs needed to be added to the original specification and that the existing NFRs should be carried forward into this iteration.
The requirement of creating a circuit through the Tor network is perhaps the most challenging requirement faced in the project to date. To achieve it, the shared keys between the client and each node must be calculated, and the encryption of packets must also be implemented. This is extremely difficult, and the Tor documentation is, once again, full of ambiguities and proves a major challenge to developers. In light of this, the predicted timescale has increased from three weeks to four and the hours allocated to the project have been increased in order to develop this aspect of the project.
However, once this requirement is implemented it will provide a base upon which the application can be developed. It is therefore critical that this requirement be fulfilled during this iteration.
The requirement of being able to create a circuit of any length should be easily achieved once a single circuit can be created, as it expands on the code used for that process. The creation of a stream through Tor to a web server is also likely to be a simple requirement to fulfil, as this will, once again, build on the circuit that has been created.
Overall, this iteration contains some very difficult requirements, with the majority of them being dependent on the successful creation of a circuit through Tor.
4.2.2. Design
The design for this iteration builds on the design section from the previous iteration; however, one small design feature needed to be considered before implementation. With the greater use of cells, each requiring a different command to be associated with it, it was important that the method used to identify the cell was clear. There were two possible solutions: to use the command ID number of the cell or to use the English name of the command. This was considered at great length. Using the English name would make the code easier to use; however, it would also increase the chance of errors being introduced into a program, e.g. through a spelling mistake. In addition, with so many different commands having similar names, it might be more confusing for the user to decide which command to use. It was therefore decided to use the number-based commands when setting the command in a packet, due to the lower risk of a user entering an incorrect value, the reduced chance of errors from spelling mistakes, and the simpler implementation, as no formatting issues need to be considered. The numeric form was also chosen over the English string version because it is what the Tor protocol itself uses.
4.2.3. Implementation
The key requirement for this iteration was to build a circuit through the Tor network. This needed to be split into two parts. The first challenge was to connect to the first node; once this was successful, the developer had to connect to the second and subsequent nodes. The work was split in this way because the cells that need to be sent are different. The cell sent to the first node needs to be a 'create' cell, and a 'created' cell is expected back. For the second node onwards, an 'extend' cell is sent and an 'extended' cell is received back.
Here an implementation decision needed to be made. Tor currently uses two handshake methods for establishing circuit keys (Dingledine & Mathewson, n.d.): the NTOR and TAP protocols. For this project's purposes there is no decisive advantage to either method; however, the TAP handshake is slightly easier to implement and is the original Tor method, which means that it can be used on nodes running older versions of Tor, whereas the newer NTOR protocol may not be supported by them.
In addition, more documentation is available for the TAP handshake, and for these reasons it was chosen as the method to be used in the project.
To create a 'create' cell, the Diffie-Hellman (DH) protocol must first be used to create a shared key which only the client and the first node know. A function was created to calculate the client's data for the handshake, preventing code duplication, as this calculation is used extensively for circuit creation throughout the application. The code below shows the DH protocol being conducted, with the creation of x (the private key) and X (the public key that will be sent to the Tor node).
The public key is encrypted with the onion router's onion key. To avoid introducing errors into the application, the decision was made to use a hybridEncryption implementation that had already been created by Dr Gareth Owen. This was preferred over creating a new implementation because the time for this iteration was fast running out due to its complexity; it allowed more time to be spent on other sections rather than redeveloping something that already has a proven success rate. Finally, it was chosen because it is thoroughly tested, and thus should not introduce any bugs into the application.
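The key generation described above can be sketched as follows. This is an illustrative Python 3 example, not the project's listing, and the function names are hypothetical. It assumes the Diffie-Hellman parameters used by the TAP handshake: generator 2 and the 1024-bit prime of the Second Oakley Group from RFC 2409, section 6.2 (the constant below should be checked against that document before any real use), with a 320-bit private key. The hybrid encryption of X with the onion key is deliberately left out.

```python
import secrets

# Assumed modulus: 1024-bit prime from RFC 2409, section 6.2.
DH_P = int(
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1"
    "29024E088A67CC74020BBEA63B139B22514A08798E3404DD"
    "EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245"
    "E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED"
    "EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE65381"
    "FFFFFFFFFFFFFFFF",
    16,
)
DH_G = 2
DH_LEN = 128           # length in bytes of a public value or shared secret
PRIVATE_KEY_BITS = 320


def generate_dh_keypair():
    """Return (x, X): the private exponent and the public value as bytes."""
    x = secrets.randbits(PRIVATE_KEY_BITS)
    public_value = pow(DH_G, x, DH_P)
    return x, public_value.to_bytes(DH_LEN, "big")


def dh_shared_secret(x, peer_public_bytes):
    """Combine our private exponent with the peer's public value."""
    peer_public = int.from_bytes(peer_public_bytes, "big")
    # Reject degenerate public values before using them.
    if not 1 < peer_public < DH_P - 1:
        raise ValueError("invalid Diffie-Hellman public value")
    return pow(peer_public, x, DH_P).to_bytes(DH_LEN, "big")
```

Both sides arrive at the same 128-byte secret: the client computes it from x and the node's public value, and the node computes it from its own private exponent and X.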
Figure 11 - Code snippet showing the creation of Diffie Hellman public and private keys
Once the payload and the client's half of the Diffie-Hellman key had been calculated using the above function, the packet was created using the build cell function implemented previously, thus avoiding code duplication.
The client's private key, x, must be stored, as it is needed to derive the keys used to decrypt the packets that are received. For this, a variable within the TorCircuit class has been used. This method of storing the key was chosen over, for example, storing all keys in an array, as it makes the key easier to use later in the application, with less chance of error. The Tor circuit class is shown below:
Figure 12 - Code snippet showing the Tor Circuit Class
It was then important to receive and decode the 'created' cell, as this completes the handshake and provides the server's public key and the shared key. This means that packets can be encrypted to the first node, the first step of the circuit. To retrieve the data, the payload of the packet was extracted and, using the Tor documentation, which states where in the packet each piece of data is located, the developer was able to extract the public key, the derived key data and a unique key shared between the client and the first node.
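A minimal sketch of this decoding step is given below. It is not the project's own code: it is Python 3, the function names are hypothetical, and it assumes the TAP layout in which the 'created' payload holds the node's 128-byte DH public value followed by a 20-byte derivative key hash, with key material expanded from the shared secret by hashing it with SHA-1 together with an incrementing counter byte. The shared secret itself is taken as an input.

```python
import hashlib
import hmac

DH_LEN = 128    # bytes of DH public value in the 'created' payload
HASH_LEN = 20   # SHA-1 output length
KEY_LEN = 16    # AES-128 key length


def parse_created_payload(payload):
    """Split a 'created' payload into the node's DH value and key hash."""
    server_public = payload[:DH_LEN]
    derivative_key_hash = payload[DH_LEN:DH_LEN + HASH_LEN]
    return server_public, derivative_key_hash


def kdf_tap(shared_secret, length):
    """Expand the shared secret: SHA1(K0 | 0x00) | SHA1(K0 | 0x01) | ..."""
    output = b""
    counter = 0
    while len(output) < length:
        output += hashlib.sha1(shared_secret + bytes([counter])).digest()
        counter += 1
    return output[:length]


def derive_hop_keys(shared_secret):
    """Cut the expanded key material into the five values used per hop."""
    material = kdf_tap(shared_secret, 3 * HASH_LEN + 2 * KEY_LEN)
    return {
        "KH": material[0:20],    # proves the node knows the shared secret
        "Df": material[20:40],   # forward digest seed
        "Db": material[40:60],   # backward digest seed
        "Kf": material[60:76],   # forward encryption key
        "Kb": material[76:92],   # backward encryption key
    }


def handshake_succeeded(shared_secret, derivative_key_hash):
    """Compare the KH we derive with the one the node sent back."""
    expected = derive_hop_keys(shared_secret)["KH"]
    return hmac.compare_digest(expected, derivative_key_hash)
```

If the comparison fails, the node did not derive the same secret and the circuit should not be used.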
  • 33. The decoding is shown in Figure 13.
Figure 13 - Code snippet showing the decoding of a Create cell
As shown in the code above, the calculated KH, Df, Db, Kf and Kb are returned to TorHop. TorHop is a class that handles the creation of circuits. Saving these values in the class means they can easily be retrieved later, when they are needed for encrypting and decrypting packets. This method of extracting the shared key data, as well as the various other keys, could be reused for decoding received ‘extended’ cells, as these share the same methods of encryption. The only difference is that ‘extended’ cells first require decryption, as they are encrypted with the keys of the other onion nodes. The method used to decrypt the cells is shown below:
Figure 14 - Code snippet showing the decryption function
This was extremely challenging to implement, as it required the public and shared keys of the node, as well as other encryption variables, to be stored correctly for later use in decryption. It was vital to get these in the right order, as otherwise the packets would not be correctly decrypted or encrypted.
  • 34. During the implementation of this requirement, the Tor protocol became a hindrance to development. This was mostly because Tor does not send back a cell if an incorrectly configured cell has been sent, so the user receives no feedback about what has gone wrong. It was also discovered that, when a cell was received, it would be either a destroy cell containing little to no information or a relay cell that could not be decrypted to provide any useful information because the data had not been extracted properly. This was a very frustrating time for the developer, as many days were spent trying to chase down an error without knowing where to even start looking for it.
After several weeks of chasing down errors, a circuit could finally be created, and the first two requirements of this iteration were successfully completed. By this point, however, the project was running behind schedule; this can be attributed to the complexity of Tor and its accompanying documentation. Fortunately, from this point onwards, the iteration’s remaining requirements proved simple to implement: creating a stream used the circuit and encryption already implemented, and it was simply a case of sending a specially configured relay cell through the Tor network. This is shown in the code below:
Figure 15 - Code snippet showing the creation of the Stream cell
As shown in the above code section, creating a stream simply requires the host name or IP address along with the port. These are passed to the build cell function, which builds a relay cell containing the data; the cell is then encrypted with the selected nodes’ encryption keys and sent. Once again, this was created as a function within the TorCircuit class.
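As a rough illustration of the stream request, the sketch below builds the data portion of a begin relay cell from a host and port. The function name is hypothetical, and the layout (a lower-case address, a colon, the decimal port and a terminating NUL byte) is an assumption based on the Tor specification rather than a copy of the project's code. IPv6 literals and the optional flags field are left out.

```python
def build_begin_payload(host: str, port: int) -> bytes:
    """Return 'host:port' followed by a NUL byte for a stream request."""
    host = host.strip().lower()
    if not host:
        raise ValueError("host must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"{host}:{port}".encode("ascii") + b"\x00"


# Example: the data a client might place in the relay cell for a web request.
payload = build_begin_payload("Example.com", 80)
```

Validating the target before the cell is built gives the caller an immediate error for a badly formatted address, instead of waiting for the network to reject or ignore the cell.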
The regularly received destroy cells provided the perfect opportunity to achieve the requirement carried forward from Iteration 1 - to handle the destroy cells, which now inform the user of the cause of the error.
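A minimal sketch of this kind of lookup is given below. The names are hypothetical and the wording of each message is this sketch's own; the numeric reason codes are listed as they are understood from the Tor specification and should be checked against it. The sketch assumes the reason code is the first byte of the destroy cell's payload.

```python
# Reason codes carried in a destroy cell, mapped to plain English.
DESTROY_REASONS = {
    0: "No reason given",
    1: "Tor protocol violation",
    2: "Internal error at the node",
    3: "Circuit teardown was requested by the client",
    4: "Node is hibernating and not accepting circuits",
    5: "Node is out of resources (memory, sockets or circuit IDs)",
    6: "Unable to connect to the next node",
    7: "Connected to a node whose identity was not as expected",
    8: "The connection carrying this circuit was closed",
    9: "The circuit has expired",
    10: "Circuit construction timed out",
    11: "The circuit was destroyed without a client request",
    12: "Request for an unknown hidden service",
}


def describe_destroy(payload: bytes) -> str:
    """Translate the first payload byte of a destroy cell into a message."""
    if not payload:
        raise ValueError("destroy cell payload is empty")
    code = payload[0]
    return DESTROY_REASONS.get(code, f"Unknown reason code {code}")
```

Falling back to a message that includes the raw code means an unrecognised reason is still reported to the user instead of raising an error.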
  • 35. Figure 16 - Code snippet showing the conversion of error codes to English
Figure 16 shows how the error code contained within the destroy cell is passed to a function that looks the code up in a dictionary of all error codes and their meanings, then returns the English error message to the user. This approach was chosen over simply informing the user that a destroy cell had been received because it supports the NFR of usability: a more detailed error message allows for easier debugging.
4.2.4. Testing
As with the first iteration, testing was conducted using unittest and manually. However, unlike the first iteration, which only looked at functional and white box testing, this testing section also needs to consider integration. An integration test verifies that the parameters passed between modules are handled correctly, and is used when a module is developed at a later stage than the module it interacts with. This is an important area to test, as it ensures that functions developed in the first iteration can be used by the functionality developed in this iteration. Although the effectiveness of this test can be argued, with some suggesting it is a waste of time, the little time it adds to testing makes it worthwhile, especially if a bug is found, as it can be fixed quickly before it affects functions developed later that may use it.
The testing strategy can be seen below. It shows each test conducted, how it was conducted and whether it succeeded. As with the first iteration, in each round the tests were run three times to ensure no anomalies were present in the results.
Test No. | Test | Test method | Succeeded? | Comments
12 | Connect to the first hop | Unittest | Yes |
13 | Calculate the shared key between client and node | Unittest | |
14 | Create a CREATE cell containing the relevant data | Unittest | Yes |
15 | Send the CREATE cell to the first node | Unittest | Yes |
  • 36.
16 | Receive the CREATED cell back from the first node | Unittest | Yes |
17 | Ensure a cell of cmd 3 is received | Unittest | Yes |
18 | Extract the payload of the CREATED cell | Unittest | Yes |
19 | Extract the first node’s half of the key | Unittest | Yes |
20 | Calculate KH, Df, Db, Kf, Kb from the payload | Unittest | Yes |
21 | Check the derived key data is the same as KH | Unittest | Yes |
22 | Calculate the shared key | Unittest | Yes |
23 | Ensure the nodes entered in the array are correctly passed to the function | Unittest / Manually | First round: No; Second round: Yes | Failed the first round due to a single value being passed for all nodes, rather than a value for each node. This was a simple change to make, after which it passed the second round of testing.
24 | Search the consensus by a node’s nickname for its IP address and OR port | Unittest | Yes |
25 | Pack the IP address and OR port of a selected node in the right format | Unittest | Yes |
26 | Calculate the shared key half | Unittest | Yes |
27 | Build the EXTEND cell | Unittest | Yes |
28 | Correctly encrypt the packet | Unittest | Yes |
29 | Ensure the packet count is correct | Unittest | Yes |
30 | Send the packet to the correct node | Unittest | Yes |
31 | Receive an EXTENDED packet back | Unittest | Yes |
32 | Handle a destroy cell correctly | Unittest | Yes |
33 | Extract the payload of the EXTENDED packet | Unittest | Yes |
34 | Correctly decrypt the payload of an EXTENDED packet | Unittest | Yes |
35 | Ensure a RELAY_EXTENDED cell is received | Unittest | Yes |
  • 37.
36 | Calculate the shared key | Unittest | Yes |
37 | Extract derivative key data from the payload | Unittest | Yes |
38 | Ensure KH and derivative key data are the same | Unittest | Yes |
39 | Ensure KH, Df, Db, Kf, Kb are updated in the TorHop object | Unittest | Yes |
40 | Ensure a stream can be created to a specified webserver | Unittest / Manually | First round: No; Second round: Yes | Failed the first time due to incorrect formatting of the target webserver. Correcting the IP address or web address and the port solved the issue, and the second test was passed.
41 | Ensure the payload of the stream packet is correctly formatted | Unittest / Manually | Yes |
42 | Correctly create the stream relay cell | Unittest | Yes |
43 | Correctly encrypt the packet to allow it to be sent through the network | Unittest | Yes |
44 | Ensure a packet is received back from the stream request and handled appropriately | Unittest | Yes |
45 | Ensure a RELAY_CONNECTED cell is received | Unittest | Yes |
46 | Check the data (GET request) is correctly formatted into a packet | Unittest / Manually | Yes |
47 | Ensure the packet is encrypted correctly | Unittest | Yes |
48 | Ensure a packet is received back and handled appropriately | Unittest | Yes |
49 | Ensure all the data is received | Unittest / Manually | First round: No; Second round: Yes | In the first round only a single packet was received, and it did not contain all the data. This showed that more than a single packet must be read, which was implemented using a while true loop and allowed the test to pass the second round.
Figure 17 - Testing results iteration 2
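The fix recorded for test 49 can be sketched as a loop that keeps reading relay cells until the stream ends. This is an illustration only: `next_relay_cell` is a hypothetical callable standing in for the library's receive-and-decrypt step, and the relay command numbers (2 for data, 3 for end) are taken from the Tor specification as an assumption.

```python
RELAY_DATA = 2  # assumed relay command number for stream data
RELAY_END = 3   # assumed relay command number for stream close


def read_stream(next_relay_cell) -> bytes:
    """Collect data cells until an end cell arrives or the source runs dry.

    next_relay_cell is a hypothetical callable that returns a
    (relay_command, data) tuple for each decrypted relay cell, or None
    when nothing more can be read.
    """
    chunks = []
    while True:
        cell = next_relay_cell()
        if cell is None:
            break
        command, data = cell
        if command == RELAY_DATA:
            chunks.append(data)
        elif command == RELAY_END:
            break
        # Any other relay command is ignored in this sketch.
    return b"".join(chunks)


# Example with canned cells in place of a live circuit.
cells = iter([(RELAY_DATA, b"HTTP/1.1 200 OK\r\n"), (RELAY_DATA, b"body"), (RELAY_END, b"\x06")])
response = read_stream(lambda: next(cells, None))
```

Passing the cell source in as a callable also makes the loop straightforward to exercise from unittest without a network connection.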
  • 38. As shown in the above test results, several tests failed the first time. This was to be expected in such a complex iteration; fortunately, the three tests that failed were easily corrected. For example, test 49 (Ensure all the data is received) failed because it was wrongly assumed that all data would be received in a single packet. Once this was found not to be the case, a simple loop was implemented to ensure all packets were received, the test was re-run and it passed. Overall, the tests that failed did so not because of issues in the functionality of the application but because of developer error. This shows the importance of testing, so that issues such as these can be picked up and fixed early in development.
4.2.5. Moving forward from Iteration 1
As with the first iteration, this iteration also overran the predicted timescale, by two weeks. It will therefore be necessary in the next iterations to increase the amount of time dedicated to development. A new project plan has been created to show this, and how the delay has been taken into account in future iterations to ensure the project is still completed on time; this can be found in appendix 5. The main reason for the delay is that the project is much harder than first expected. The lack of error messages provided by Tor is proving to be the most difficult feature of the Tor protocol to handle, and days have been spent trying to debug the software without knowing what was going wrong. In normal application development, if something goes wrong an error message is presented to the developer indicating where the error occurred, but with the Tor protocol this does not happen.
A further delay is caused by the confusing Tor protocol documentation, which contains many inconsistencies, making it difficult to understand exactly what is required for each function; on several occasions help from the Tor community has been required to understand certain points. However, all requirements in this iteration were completed, including the requirement that was not completed in iteration 1. All functional requirements have been completed, while also satisfying the NFRs of maintainability and usability. It would have been quicker to develop the functions without considering these requirements, but considering them during development will ensure an application that meets and exceeds the client’s expectations.