Table of Contents
ERD Conversion
Overview
Amendments Analysis
Entity Conversion
Attribute Conversion
Relationship Conversion
Column Name Formatting
Identifiers
Key Analysis
Primary Keys
Foreign Keys
Table Order and Dependency Analysis
Datatype Analysis
Attribute Analysis
MySQL Engine Choice
Default Values
Test Data Analysis
SQL Query design
Database Size
Database Security
User Control Analysis
Security Measure Analysis
Constraints Analysis
Testing and Error Handling
View Creation
Data Deletion
Prerequisites
References
ERD Conversion
Overview
Below is the final ERD from assignment one, showing how the database was originally planned. On further
review, the tables were not fully suited to a functioning database, so the design has since been modified
and improved (see Amendments Analysis for details). Most of the command-line input shown below was
entered into MySQL through the macOS terminal.
Amendments Analysis
Below are some examples of how the ERD from assignment one has been changed to better suit the
database design, along with the reason for each change.
Table | Change | Reason
Locations | Removed Room Count | Room count is an unnecessary field to store within a table. Because Rooms are linked to a location through its ID, simple queries can find this information without it taking up space in the database (see SQL Query design for examples).
Users | Removed Name | Name is an unnecessary field within the Users table because many users can have the same name, so it cannot serve as a reliable search field. Users are instead found through the EIN (unique identifier), which holds all information on the user within the company.
AdditionalInfo | Changed Relationship | AdditionalInfo is intended to hold less frequently accessed information for a Room. It is not linked to the location at all and should therefore be connected to the Room.
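The reasoning for removing Room Count can be illustrated with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for the MySQL server, with cut-down versions of the Locations and Rooms tables; the sample rows (including 'Example Site') are made up for illustration and are not the report's test data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Locations (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Rooms (
        ID INTEGER PRIMARY KEY,
        LocationID INTEGER NOT NULL REFERENCES Locations (ID),
        Name TEXT
    );
    INSERT INTO Locations (ID, Name) VALUES (1, 'Lillian'), (2, 'Example Site');
    INSERT INTO Rooms (LocationID, Name) VALUES
        (1, 'Lillian Reception'), (1, 'Lillian Hall'), (2, 'Example Office');
""")

# Room count is derived on demand instead of being stored on Locations.
room_counts = conn.execute("""
    SELECT Locations.ID, Locations.Name, COUNT(Rooms.ID) AS RoomCount
    FROM Locations
    LEFT JOIN Rooms ON Rooms.LocationID = Locations.ID
    GROUP BY Locations.ID, Locations.Name
    ORDER BY Locations.ID
""").fetchall()
print(room_counts)
```

Because the count is computed from the Rooms rows themselves, it cannot drift out of date when rooms are added or removed, which is the main risk of storing it as a column.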
Entity Conversion
Below are some examples of how the ERD tables were converted into functioning entities within the
database.
Locations(ID, Name, Address, City, Country, StatusID, DateCreated)
Initial table entities:
CREATE TABLE Locations (
    ID INT,
    Name TEXT,
    Address TEXT,
    City TEXT,
    Country TEXT,
    StatusID INT,
    DateCreated DATE);
ERD Entity | Database Entity | Intended Type
Location ID | ID | Integer
Location Name | Name | Text
Address | Address | Text + Integer
City | City | Text
Country | Country | Text
DateCreated | DateCreated | Date
Room Count | [REDACTED] | [REDACTED]
Users(ID, EIN, PermissionsID)
Initial table entities:
CREATE TABLE Users (
    ID INT,
    EIN INT,
    PermissionsID INT);
ERD Entity | Database Entity | Intended Type
User ID | ID | Integer
EIN | EIN | Integer
Permissions ID | PermissionsID | Integer
Attribute Conversion
Below are some examples of how the entities were given attributes to match their intended purpose.
Initial entity attributes:
CREATE TABLE Locations (
    ID INT NOT NULL AUTO_INCREMENT,
    Name VARCHAR(64),
    Address TEXT NOT NULL,
    City VARCHAR(32),
    Country VARCHAR(32),
    StatusID INT NOT NULL,
    DateCreated DATE NOT NULL,
    PRIMARY KEY (ID));
Database Entity | Intended Attributes | Reason
ID | Cannot be null; should auto increment | Identifiers act as the primary key for this table. No two entries should have the same primary key, otherwise a row cannot be found reliably.
Address | Cannot be null | The main purpose of the database is to provide the information needed to find locations and check their rooms.
StatusID | Cannot be null | StatusID acts as a foreign key to the table ‘Status’. All locations are required to have a status.
DateCreated | Cannot be null | For locations to have scheduled checks by users, they need a valid start date.
Initial entity attributes:
CREATE TABLE Users (
    ID INT NOT NULL AUTO_INCREMENT,
    EIN INT NOT NULL UNIQUE,
    PermissionsID INT NOT NULL,
Database Entity | Intended Attributes | Reason
ID | Cannot be null; should auto increment | Identifiers act as the primary key for this table. No two entries should have the same primary key, otherwise a row cannot be found reliably.
EIN | Cannot be null; should be unique | The EIN is the company’s unique identifier for each user. It can be used to look up the user's credentials.
PermissionsID | Cannot be null; should default to read only | Every user should have permissions for security reasons. If permissions are not explicitly set, they should default to read only.
Relationship Conversion
Below are some examples of how the ERD relationships were converted into functioning primary and
foreign keys in the database.
Rooms Table Relationships
FOREIGN KEY (LocationID) REFERENCES Locations (ID)
A location can have many rooms. For this reason, each row in Rooms should contain the LocationID of
its corresponding location as a foreign key.
Status Lookup Table Relationships
FOREIGN KEY (StatusID) REFERENCES Status (ID)
The Locations and Rooms tables both need a status on every row. A lookup table allows these values to
be referenced by ID; the Status table therefore holds the full list of statuses.
Column Name Formatting
Identifiers
For table identifiers, a naming format was adopted that avoids repeating the table name in the ID column.
This makes queries more readable. For example, the Locations table has a column named "ID". Writing
Locations.ID instead of Locations.LocationID makes sense because the ID is already qualified by the
table name.
Key Analysis
Primary Keys
Lookup tables only require a very small number of entries. Their identifiers therefore do not need to
hold large numbers, so it is more efficient to use a datatype such as 'SMALLINT' for these columns (see
Datatype Analysis for byte sizes).
Primary Table | Primary Key | Datatype
Status | ID | SMALLINT
CheckStatus | ID | SMALLINT
Locations | ID | INT(11)
Rooms | ID | INT(11)
CheckHistory | ID | INT(11)
Permissions | ID | SMALLINT
Users | ID | INT(11)
UserAccess | ID | INT(11)
AdditionalInfo | N/A | N/A
Foreign Keys
Storing lookup values as foreign keys speeds up querying. For example, the 'Status' table holds the 3
status names a location can have, so a query only needs an identifier lookup. Locations.StatusID can
therefore be 1, 2 or 3. Integer datatypes also take up less space than repeated text, making the overall
database smaller. This also leaves scope for the future of the database: it can grow much bigger without
slowing down queries on the data inside.
Primary Table | Primary Key | Foreign Table | Foreign Key
Locations | ID | Status | StatusID
Rooms | ID | Locations | LocationID
Rooms | ID | Status | StatusID
CheckHistory | ID | Rooms | RoomID
CheckHistory | ID | CheckStatus | CheckStatusID
Users | ID | Permissions | PermissionsID
UserAccess | ID | Locations | LocationID
UserAccess | ID | Users | UserID
AdditionalInfo | ID | Locations | LocationID
AdditionalInfo | ID | Users | OwnerID
Table Order and Dependency Analysis
Because of the foreign keys discussed above, the tables must be created in a specific order. A table
cannot be created if a foreign key within it references a table that does not yet exist. Below is an
order that allows the tables to be generated without errors within MySQL.
Table | Creation Order | Reason
Status | 1 | Status is referenced by foreign keys in both Locations and Rooms, which act as the main tables for this database. As a lookup table it has no dependencies and can therefore be generated first.
CheckStatus | 2 | The same applies to CheckStatus. As a lookup table it does not depend on any other table, so it should be one of the first tables to be generated.
Locations | 3 | Locations can now be generated because the table its foreign key references (Status) exists. It is required before Rooms can be generated. Locations has the most dependent tables (3), which makes it a high priority in the order of generation.
Rooms | 4 | All the tables Rooms depends on now exist, so it can be created.
CheckHistory | 5 | CheckHistory holds data about Rooms, so it naturally follows the Rooms table in creation order.
Permissions | 6 | Permissions is needed by Users and therefore must be generated before it. Much like Status and CheckStatus, it acts as a lookup table and has no dependencies.
Users | 7 | Users is needed for UserAccess to act as a linking table between Users and Locations.
UserAccess | 8 | No other table depends on UserAccess, so it is low in priority.
AdditionalInfo | 9 | AdditionalInfo is the lowest priority table: no other table depends on it and it has no primary key, so it can be generated last.
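The ordering rule described above is a topological sort over the foreign key dependencies. The sketch below is illustrative only and was not part of the original project: it takes the dependencies from the foreign key table in this report, uses Python's standard graphlib module to produce one valid order, and uses a hypothetical helper, is_valid_order, to confirm that the order chosen in the report is also valid.

```python
from graphlib import TopologicalSorter

# Each table maps to the tables its foreign keys reference
# (taken from the foreign key table in this report).
dependencies = {
    "Status": set(),
    "CheckStatus": set(),
    "Permissions": set(),
    "Locations": {"Status"},
    "Rooms": {"Locations", "Status"},
    "CheckHistory": {"Rooms", "CheckStatus"},
    "Users": {"Permissions"},
    "UserAccess": {"Locations", "Users"},
    "AdditionalInfo": {"Locations", "Users"},
}

# static_order() yields every table after all of the tables it references.
creation_order = list(TopologicalSorter(dependencies).static_order())
print(creation_order)


def is_valid_order(order, deps):
    """Return True if every table appears after the tables it references."""
    position = {table: index for index, table in enumerate(order)}
    return all(position[ref] < position[table]
               for table, refs in deps.items() for ref in refs)


# The order chosen in the report is one of several valid orders.
report_order = ["Status", "CheckStatus", "Locations", "Rooms", "CheckHistory",
                "Permissions", "Users", "UserAccess", "AdditionalInfo"]
print(is_valid_order(report_order, dependencies))
```

Several orders satisfy the constraints, so the order printed by the sorter may differ from the report's while still being correct. Dropping tables needs the reverse of a valid creation order.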
Datatype Analysis
Choosing accurate datatypes is important for a fast and effective database. Below is the reasoning
behind each datatype present in the database.
Attribute | Type | Size (bytes) | Analysis
Locations.Name | VARCHAR(128) | 129 | VARCHAR is appropriate for fields that need to be searchable and are expected to be queried. 128 is a suitable length for most VARCHAR fields. Each length was researched; for example, the longest place name in the world is 85 characters long (Gilbert, 2019).
Locations.DateCreated | DATE | 3 | A location's creation time does not need to be more precise than the day. This will be a frequently searched field, so its data size should be as small as possible.
CheckHistory.LastChecked | DATETIME | 8 | The last check needs to be as precise as possible because scheduled checks are set 8 months after a location's last check date. This field is searched less often and can therefore hold the extra information and data size.
Users.EIN | INT(11) | 4 | Integers such as unique identifiers can become increasingly large as a database grows. Fortunately, integers do not take up much space: a 4-byte INT stores values up to 2,147,483,647 (the display width of 11 does not change this range), so large numbers can be stored without risk of running out of room.
AdditionalInfo.Description | TINYTEXT | 255 | Descriptions can contain a lot of text but do not need to be searched, so a text type is suitable for holding a large number of characters. TINYTEXT is used instead of TEXT because descriptions do not need to be big and TEXT datatypes take up a large number of bytes (w3schools, 2019).
Status.ID | SMALLINT | 2 | Primary keys within lookup tables are limited by how many values need to be added. Tables like Status and CheckStatus will never grow as much as other tables, so SMALLINT reduces the size of the data while preserving scope.
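As a rough check on the integer sizes above, the value range of a signed integer follows directly from its byte size. The sketch below is plain arithmetic rather than anything MySQL-specific; the helper name signed_range is made up for this example, and the EIN is the one used in the test data.

```python
# Standard MySQL storage sizes (in bytes) for the integer types in this report.
INTEGER_BYTES = {"SMALLINT": 2, "INT": 4}


def signed_range(num_bytes):
    """Return (min, max) for a two's-complement integer of num_bytes bytes."""
    bits = num_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for type_name, num_bytes in INTEGER_BYTES.items():
    low, high = signed_range(num_bytes)
    print(f"{type_name}: {num_bytes} bytes, {low} to {high}")

smallint_max = signed_range(INTEGER_BYTES["SMALLINT"])[1]
int_max = signed_range(INTEGER_BYTES["INT"])[1]

# The 9-digit test EIN fits in a signed INT but not in a SMALLINT.
test_ein = 493166438
print(test_ein <= int_max, test_ein <= smallint_max)
```

One consequence is that a 10-digit EIN above 2,147,483,647 would not fit in a signed INT, so a larger type such as BIGINT would be needed if EINs of that size were ever issued.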
Attribute Analysis
Below are some examples of the attributes used on table columns and the reasons why they are needed.
Table Name | Column Name | Attributes | Reason
Status | ID | NOT NULL, AUTO_INCREMENT | Identifiers are a critical part of a table, allowing each row to be located and queried. For this reason, all identifier fields require a value. They are also referenced by many other tables in the database and would cause problems if left blank. Identifiers also increment automatically to ensure no two rows are given the same identifier.
Locations | DateCreated | NOT NULL | Unlike an identifier, the date a Location is created matters directly to the customer’s requirements, which is why this field is always required. The 8-month room check interval would not work without a known start date for the information being added to the database.
Users | EIN | NOT NULL, UNIQUE | EINs are the company’s user identifiers. With these numbers, users can be located and information can be found about a particular customer. They are required to be unique to prevent a user being created twice, whether deliberately or through human error.
AdditionalInfo | Description | (No Attributes) | The description field, while useful, is not a searchable field and is not required by any other table in the database. For this reason, it does not need any attributes.
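The effect of NOT NULL and UNIQUE on Users.EIN can be shown with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL, so the column types and AUTOINCREMENT keyword are SQLite's, but the constraints behave in the same spirit. The helper try_insert and the second EIN value are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Users (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        EIN INTEGER NOT NULL UNIQUE,
        PermissionsID INTEGER NOT NULL
    )
""")
conn.execute("INSERT INTO Users (EIN, PermissionsID) VALUES (493166438, 3)")


def try_insert(ein, permissions_id):
    """Attempt an insert and report whether a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO Users (EIN, PermissionsID) VALUES (?, ?)",
            (ein, permissions_id),
        )
        return "inserted"
    except sqlite3.IntegrityError as error:
        return f"rejected: {error}"


duplicate_result = try_insert(493166438, 1)  # same EIN as the first row
missing_result = try_insert(None, 1)         # EIN left blank
new_result = try_insert(493166439, 1)        # a different, made-up EIN
print(duplicate_result, missing_result, new_result, sep="\n")

user_count = conn.execute("SELECT COUNT(*) FROM Users").fetchone()[0]
print(user_count)
```

Only the first row and the row with the new EIN are stored; the duplicate and the blank EIN are both refused by the database itself rather than relying on the application to check.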
MySQL Engine Choice
The MySQL engine chosen for this database was InnoDB, version 8.0.15. Below are some of the features
that InnoDB has over MyISAM and the reasons I chose to use this engine.
InnoDB Engine | Reason
Row-level lock | This is better for write-heavy tables, which suits this database, unlike MyISAM’s table-level lock, which is designed more for read-heavy tables.
Crash recovery | InnoDB uses automatic crash recovery to help prevent loss of data or security breaches.
Implements foreign keys | Foreign keys are critical to a relational database such as the one being created. This database does not use user-defined partitioning, which allows the use of InnoDB’s foreign keys (Frühwirt et al., 2010).
Table compression | Compression is useful for boosting the raw performance and scalability of a database. Data sizes are smaller, so more information can be stored without directly affecting performance (Potter, 2018).
Default Values
Default values are important for table columns that directly affect how other values behave within the
database. ‘Status’ and ‘Permissions’ columns affect how data is queried and changed, and therefore need
to be set to specific values as seen below.
Table | Column | Default Value | Reason
Locations | StatusID | 1 | If a Location is added to the database by a user, it is assumed to be ‘Live’ (Status.ID = 1) at that time, unless specified otherwise by the user.
Rooms | StatusID | 1 | If a Room is added to the database by a user, it is assumed to be ‘Live’ (Status.ID = 1) at that time, unless specified otherwise by the user.
CheckHistory | CheckStatusID | 2 | When a Location is added to the database, by default it should be ‘required’ (CheckStatus.ID = 2) for a ‘RoomCheck’.
Users | PermissionsID | 1 | If a user is added to the database, they should by default be set to ‘read’ (Permissions.ID = 1) to prevent manipulation of the database by unauthorised users.
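A column default of this kind can be sketched as follows. The example uses Python's sqlite3 module as a stand-in for MySQL and cut-down Permissions and Users tables; it shows a user inserted without a PermissionsID picking up the 'Read' permission.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Permissions (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    INSERT INTO Permissions (ID, Name) VALUES (1, 'Read');
    CREATE TABLE Users (
        ID INTEGER PRIMARY KEY,
        EIN INTEGER NOT NULL UNIQUE,
        PermissionsID INTEGER NOT NULL DEFAULT 1 REFERENCES Permissions (ID)
    );
""")

# No PermissionsID is supplied, so the column default (1) applies.
conn.execute("INSERT INTO Users (EIN) VALUES (493166438)")

row = conn.execute("""
    SELECT Users.EIN, Permissions.Name
    FROM Users
    JOIN Permissions ON Permissions.ID = Users.PermissionsID
""").fetchone()
print(row)
```

The default only applies when the column is omitted from the insert, so a user can still be given a different permission explicitly.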
Test Data Analysis
Below is example test data used to populate the developed database, ready for querying and testing.
Table | Test Data Example | Reason
Status | insert into Status (Name) values ('Live'); | The status columns on the Locations and Rooms tables default to 'Live'. For this reason, it was necessary to include this in the test data.
CheckStatus | insert into CheckStatus (Name) values ('Required'); | The purpose of the database is to alert users of rooms that require checking. This is a suitable name for rooms whose check period has expired.
Locations | insert into Locations (Name, Address, City, Country, StatusID, DateCreated) values ('Lillian', '42183 Dryden Plaza', 'Kuching', 'Malaysia', 1, '2017/01/20'); | Locations have a large dataset. For this reason, a data generator was used to create random locations. The generated addresses were required to include numbers to make full use of the field's datatype (Mockaroo, 2019).
Rooms | insert into Rooms (LocationID, Name, StatusID, DateCreated) values (1, 'Lillian Reception', 1, '2017/01/20'); | Rooms are required to match a location's details. For test purposes the names were generic, but included the Location name for ease of viewing and querying.
CheckHistory | insert into CheckHistory (RoomID, CheckStatusID, LastChecked) values (1, 1, '2016/12/04 18:43:24'); | CheckHistory uses a datatype not used elsewhere, DATETIME. For this reason, the field needed to contain a specific date and time to make the most of this column. This was also important for querying.
Permissions | insert into Permissions (Name) values ('Read'); | All Users are set to Read Only by default, so this was a necessary value to include in the test data.
Users | insert into Users (EIN, PermissionsID) values (493166438, 3); | EINs were specified as integers of 9 or more digits, and these test values had to be unique to prevent two users having the same credentials.
AdditionalInfo | insert into AdditionalInfo (LocationID, OwnerID, Type, Description) values (1, 10, 'Amusement', 'Arcade and restaurant/bar'); | Description fields are expected to be far longer. However, because the test data needed to be relevant to the location it was describing, manually typing long descriptions for each location was not feasible.
UserAccess | insert into UserAccess (LocationID, UserID) values (1, 1); | This test data set was generated randomly, provided each user was assigned one or more locations and there was a good variation in access count for the sake of querying.
SQL Query design
After completion of assignment one, the questions proposed for querying the database turned out to be
too simple. Some of the queries below have been made more complex in order to demonstrate the
structure and relationships of the information in the database.
Query | SQL Format
Which user owns the most locations and how many do they own? | SELECT OwnerID, COUNT(OwnerID) AS value_occurrence FROM AdditionalInfo GROUP BY OwnerID ORDER BY value_occurrence DESC LIMIT 1;
What date were locations created in a specific country? | SELECT DateCreated FROM Locations WHERE Country = 'China';
How many locations were created this year? | SELECT COUNT(ID) AS LocationCount FROM Locations WHERE DateCreated BETWEEN '2019-01-01' AND CURDATE();
Find all the rooms that contain "Hall" in their name | SELECT Name FROM Rooms WHERE Name LIKE '%Hall%';
How many users have access to a specific location? | SELECT COUNT(UserID) FROM UserAccess WHERE LocationID = 1;
Which Rooms have failed their checks in a specific year? | SELECT RoomID FROM CheckHistory WHERE CheckStatusID = 2 AND LastChecked BETWEEN '2017-01-01 00:00:00' AND '2017-12-31 23:59:59';
How will I know when a specific room has overrun its 8-month required check interval? | SELECT RoomID, LastChecked, CASE WHEN LastChecked < (DATE_SUB(CURDATE(), INTERVAL 8 MONTH)) THEN 'Check Required' ELSE 'Check Not Required' END AS Required FROM CheckHistory WHERE RoomID = 7 ORDER BY LastChecked DESC LIMIT 1;
Which user has access to the most locations on the database? | SELECT UserID, COUNT(UserID) AS AccessOccurrence FROM UserAccess GROUP BY UserID ORDER BY AccessOccurrence DESC LIMIT 1;
How many users have execute permissions on the database? | SELECT COUNT(ID) AS UserCount FROM Users WHERE PermissionsID = 3;
How many rooms have passed their checks compared to failed? | SELECT (SELECT COUNT(RoomID) FROM CheckHistory WHERE CheckStatusID = 1) - (SELECT COUNT(RoomID) FROM CheckHistory WHERE CheckStatusID = 2);
Find the average number of locations a user has access to | CREATE VIEW UserAccessCount AS SELECT UserID, COUNT(UserID) AS LocationCount FROM UserAccess GROUP BY UserID; SELECT ROUND(AVG(LocationCount), 2) AS AverageUserAccess FROM UserAccessCount;
Find a specific user's EIN alongside the locations they have access to | SELECT LocationID, EIN FROM UserAccess JOIN Users ON UserAccess.UserID = Users.ID WHERE Users.ID = 5;
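The 8-month check query can be tried out with the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL, so SQLite's date(?, '-8 months') replaces DATE_SUB(CURDATE(), INTERVAL 8 MONTH), and the "current" date is passed in as a parameter so the result does not depend on when the code is run. The helper check_status and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CheckHistory (
        ID INTEGER PRIMARY KEY,
        RoomID INTEGER NOT NULL,
        LastChecked TEXT NOT NULL  -- stored as 'YYYY-MM-DD HH:MM:SS' text
    );
    INSERT INTO CheckHistory (RoomID, LastChecked) VALUES
        (7, '2016-12-04 18:43:24'),
        (7, '2018-03-10 09:15:00'),
        (8, '2019-01-15 09:00:00');
""")


def check_status(room_id, today):
    """Return (LastChecked, verdict) for the room's most recent check."""
    return conn.execute("""
        SELECT LastChecked,
               CASE WHEN LastChecked < date(?, '-8 months')
                    THEN 'Check Required' ELSE 'Check Not Required' END
        FROM CheckHistory
        WHERE RoomID = ?
        ORDER BY LastChecked DESC
        LIMIT 1
    """, (today, room_id)).fetchone()


# Treat 1 May 2019 as the current date, so the cut-off is 1 September 2018.
room_7 = check_status(7, "2019-05-01")
room_8 = check_status(8, "2019-05-01")
print(room_7)
print(room_8)
```

Ordering by LastChecked in descending order with LIMIT 1 means only the most recent check for the room is judged, so older history rows do not trigger a false alert.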
Database Size
Here are the table sizes for the database containing all the test data. Although the test dataset is
small, it provides evidence of how datatypes impact table sizes: the ‘AdditionalInfo’ table contains a
text datatype, making it the biggest table in size.
Table Name | Table Rows | Size (KB)
AdditionalInfo | 19 | 60
CheckHistory | 60 | 50
CheckStatus | 2 | 20
Locations | 19 | 30
Permissions | 3 | 20
Rooms | 54 | 50
Status | 2 | 20
UserAccess | 48 | 50
Users | 10 | 50
Database Security
User Control Analysis
Below are the main users that will have access to the database, the permissions they should be granted,
and the reasons for each.
User Access | Permissions | Reason
Root Administrator | Database script, ALL PRIVILEGES | The root user should have all privileges, allowing modification, deletion and creation across all areas of the database and script.
Customer (Manager) | SELECT, UPDATE, INSERT | Managers utilising the database should be allowed to add/remove/update information within the database while reading the contents.
Customer (General) | SELECT, UPDATE | General access permissions should not go further than reading the information and updating rows when required. All users have unique identifiers and can therefore be traced if incorrect changes are made.
Server Administrators | Database script, DELETE, INSERT, SELECT, UPDATE | Server admins should be able to access the test database in order to improve the database. However, they should not be allowed to remove or create tables for security reasons; only the root user should have this power.
Security Measure Analysis
Security Measure | Action | Reason
Removing all users | DROP USER 'name'@'localhost'; DELETE FROM mysql.user WHERE authentication_string = ''; FLUSH PRIVILEGES; | Remove users before accessing the database locally, especially users with no password in the test environment, to prevent unauthorised access through fake users or the creation of new ones. The privilege cache is then reset to ensure the changes have taken effect.
Eradicating User/File privileges | SET @name = '%name%'; SELECT * FROM `permissions` WHERE `user` LIKE @name; | Removing blank users and/or users with admin privileges prevents anyone other than yourself from accessing or modifying data.
Removing Test Database | DROP DATABASE RoomChecker; | Starting the local server fresh eliminates the chance of unauthorised access to, or modification of, data in the current database. A script has been created to speed up the process of producing a new database.
Database Script | .txt file containing the script config for the above points, table generation and test data | This prevents the root user from missing any security precautions by keeping all required commands in one script. It also speeds up the process of generating the local database.
Constraints Analysis
Below are the constraints from assignment one, revised after the database design was finalised.
Attribute | Constraint | Reason
Users.EIN | Required to be 9 or more digits. | A user's EIN (unique identifier) is required to be 9 or more digits to ensure everyone’s is unique within the company. This means a larger datatype is needed in the Users table because large integers are being stored.
Locations.Address | Large byte-size datatype and non-searchable field | Because addresses can be very long, this field is required to be a ‘TEXT’ datatype to allow for size. However, this means the address field is not intended to be searchable.
Users | Not name searchable | As discussed in assignment one, names cannot be used to identify users as many users have the same name. This was resolved through the use of an EIN, which can be looked up on the company portal to find the specific user's details.
Testing and Error Handling
Below is an example of a useful view that builds on the database's tables to present more specific
information.
View Creation
CREATE VIEW UserAccessCount AS SELECT UserID, COUNT(UserID) AS LocationCount FROM UserAccess
GROUP BY UserID;
UserID | LocationCount
1 | 12
2 | 8
3 | 7
4 | 2
5 | 2
6 | 2
7 | 5
8 | 3
9 | 4
10 | 3
The database was kept in a creation script for ease of running on the server. However, it was not easy
to keep modifying the tables in the creation script. To help with this, ALTER TABLE commands were
used to test and modify the database directly through the terminal. Below are some example ALTER TABLE
commands run to help achieve the final database.
ALTER TABLE Users DROP COLUMN Name;
ALTER TABLE AdditionalInfo ADD FOREIGN KEY (UserID) REFERENCES Users(ID);
ALTER TABLE Locations DROP COLUMN RoomCount;
Data Deletion
Because of the data dependencies within the database, when top-level data is removed the dependent rows
in the tables that reference it must be removed as well.
UserAccess > AdditionalInfo > CheckHistory > Rooms > Locations
Table | Dependency Command | Reason
UserAccess | FOREIGN KEY (LocationID) REFERENCES Locations (ID) ON DELETE CASCADE | Relies on the Locations ID column. A user cannot have access to a Location that does not exist.
AdditionalInfo | FOREIGN KEY (LocationID) REFERENCES Locations (ID) ON DELETE CASCADE | This table presents extra information surrounding a location and therefore shouldn’t exist if the location is removed.
CheckHistory | FOREIGN KEY (RoomID) REFERENCES Rooms (ID) ON DELETE CASCADE | If a location is removed, the rooms within that location should also be removed. Those rooms should therefore no longer have CheckHistory information.
Rooms | FOREIGN KEY (LocationID) REFERENCES Locations (ID) ON DELETE CASCADE | Rooms shouldn’t exist if the location they belong to is removed.
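The cascade chain from Locations through Rooms to CheckHistory can be demonstrated with the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL with cut-down tables and made-up rows. Note that SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on, whereas InnoDB enforces them by default.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE Locations (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Rooms (
        ID INTEGER PRIMARY KEY,
        LocationID INTEGER NOT NULL,
        Name TEXT,
        FOREIGN KEY (LocationID) REFERENCES Locations (ID) ON DELETE CASCADE
    );
    CREATE TABLE CheckHistory (
        ID INTEGER PRIMARY KEY,
        RoomID INTEGER NOT NULL,
        LastChecked TEXT,
        FOREIGN KEY (RoomID) REFERENCES Rooms (ID) ON DELETE CASCADE
    );
    INSERT INTO Locations (ID, Name) VALUES (1, 'Lillian'), (2, 'Example Site');
    INSERT INTO Rooms (ID, LocationID, Name) VALUES
        (1, 1, 'Lillian Reception'), (2, 2, 'Example Office');
    INSERT INTO CheckHistory (RoomID, LastChecked) VALUES
        (1, '2016-12-04 18:43:24'), (2, '2019-01-15 09:00:00');
""")

# Deleting a location removes its rooms, which in turn removes their history.
conn.execute("DELETE FROM Locations WHERE ID = 1")

remaining_rooms = conn.execute("SELECT ID FROM Rooms").fetchall()
remaining_checks = conn.execute("SELECT RoomID FROM CheckHistory").fetchall()
print(remaining_rooms, remaining_checks)
```

A single DELETE on Locations is enough here; without ON DELETE CASCADE the same statement would be rejected while dependent rows still exist, and the rows would have to be removed manually in dependency order.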
Prerequisites
To ensure the database can be created and run properly by any external party, below are some
requirements that specify the environment in which this database runs and how to ensure it works on
any other machine.
- MySQL local server, run through a terminal on a Linux machine
- The required-defaults attribute on each column should be turned off in the MySQL server config
- The script should be run to build the database with all test data: 'sh script.sh'
- InnoDB version 8.0.15 was used during the making of this database
References
Gilbert, K. (2019) The longest location name in the world. [Internet] Available at: https://www.worldatlas.com/articles/the-10-longest-place-names-in-the-world.html (Accessed: 22nd April 2019)
Potter, J. (2018) MySQL Performance: MyISAM vs InnoDB. [Internet] Available at: https://www.liquidweb.com/kb/mysql-performance-myisam-vs-innodb/ (Accessed: 27th April 2019)
Mockaroo (2019) Mockaroo Data Generator. [Internet] Available at: https://mockaroo.com/ (Accessed: 1st May 2019)
W3schools (2019) SQL Data Types for MySQL. [Internet] Available at: https://www.w3schools.com/sql/sql_datatypes.asp (Accessed: 2nd May 2019)
Frühwirt, P. et al. (2010) InnoDB Database Forensic. International Conference on Advanced Information Networking and Applications. IEEE