Data input and editing in GIS involves collecting, digitizing, and correcting geospatial data to build a GIS database. There are two main types of data sources: digital data which may require processing or conversion, and non-digital hardcopy data which must be digitized. Common digitizing methods include on-screen digitizing, scanning, and geocoding of hardcopy maps and plans. Data editing aims to correct locational, topological, and attribute errors introduced during data input. Thorough planning is required to determine the best data collection and input methods based on factors like data format, source, accuracy needs, and project requirements.
2. Definition
• GIS data capture, input and editing
means the identification, collection,
digitization and correction of errors for
the data necessary in the building of a
GIS database.
3. Data Input Methods
Two main types of data sources:
• Digital – May require processing, importing,
conversion
• Non-digital – Hardcopy e.g. plan/map
5. Digital Map Data
For a GIS project, the first thing is to analyze the
data requirement and collect the data.
• Already exist in digital form: find them or buy
them
• Already exist but not in digital form: geocode,
digitize or scan
• Don’t exist: capture from reality (by GPS,
remote sensing, etc.)
6. Data Input Methods
Digital data
• Importing e.g. GPS data having spatial locations
• On-screen digitising – tracing features from digital
imagery
• Geo-locating – some digital data that do not have
a spatial location need to be geo-rectified.
• Converting – data in a spatial format that cannot be
viewed/analysed by a particular software package need to
be converted to the required format. Conversion tools are
provided by most GIS software packages.
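As a rough illustration of the importing route, the sketch below reads GPS waypoints from delimited text and turns them into GeoJSON-style point features. The column names (`name`, `lat`, `lon`) and the sample values are assumptions made for this example and do not reflect any particular receiver's export format.

```python
import csv
import io

def gps_csv_to_features(text):
    """Convert CSV text with name, lat, lon columns into GeoJSON-style point features."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        lat = float(row["lat"])
        lon = float(row["lon"])
        # Reject coordinates outside the valid geographic range.
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise ValueError(f"coordinate out of range: {row}")
        features.append({
            "type": "Feature",
            # GeoJSON stores positions as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": row["name"]},
        })
    return features

# Hypothetical waypoints, for illustration only.
SAMPLE = "name,lat,lon\nwell_1,-27.50,153.01\nwell_2,-27.47,153.03\n"
features = gps_csv_to_features(SAMPLE)
```

A real import would also record the datum and coordinate system of the GPS data, which this sketch leaves out.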
7. Data Input Methods
Digital data
• Geo-referencing – a process of associating locations
in a spatial dataset to real, geographic space using a
coordinate system.
8. Data Input Methods
Hardcopy data
• Geocoding is the process of finding associated
geographic coordinates (often expressed as
lat/long) from other geographic data and involves
conversion of spatial information into digital
form, for example, embedding geographic
location information within a digital photo.
• Geocoding involves capturing the map, and
sometimes also the attributes
• Example: scanning (raster), digitizing (vector)
9. Geocoding
• Geocoding is the
process of finding
associated
geographic
coordinates (often
expressed as lat/long)
from other geographic
data.
• Address matching is
the most common
form of geocoding.
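Address matching can be sketched as a lookup of a cleaned-up address string in a reference table of known addresses and coordinates. The reference table, the addresses and the `normalise` helper below are hypothetical; production geocoders use far more tolerant matching (abbreviations, street ranges, interpolation).

```python
def normalise(address):
    """Upper-case, drop punctuation and collapse whitespace so small typing differences still match."""
    cleaned = "".join(ch for ch in address.upper() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

def geocode(address, reference):
    """Return (lat, lon) for an address found in the reference table, else None."""
    return reference.get(normalise(address))

# Hypothetical reference table: normalised address -> (lat, lon).
REFERENCE = {
    normalise("12 High St, Springfield"): (-27.4698, 153.0251),
    normalise("3 Mill Rd, Springfield"): (-27.4710, 153.0302),
}

match = geocode("12 high st,  springfield", REFERENCE)
no_match = geocode("99 Unknown Ave", REFERENCE)
```

Unmatched addresses return `None` here so that they can be listed and resolved by hand.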
11. Digitizing
• Captures map data by
tracing lines from a map
by hand
• Uses a cursor and an
electronically-sensitive
tablet
• Result is a string of
points with (x, y) values
Tally Systems (1998)
12. Digitizing
Steps of digitizing with digitizing tablet:
• Stable base map
• Fix to tablet
• Digitize control
• Determine coordinate transformation
• Trace features
• Proof plot
• Edit
• Clean and build
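The "determine coordinate transformation" step is commonly done by fitting an affine transformation to the digitized control points. The following is a minimal sketch under that assumption, using a least-squares fit in plain Python; the function names and the control-point values are invented for illustration, and a given digitizing package may use a different transformation.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, v):
    """Solve the 3x3 system m * p = v by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("need at least three non-collinear control points")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]
        solution.append(det3(mc) / d)
    return solution

def fit_affine(tablet_pts, map_pts):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f."""
    rows = [(x, y, 1.0) for x, y in tablet_pts]
    # Normal equations: (A^T A) p = A^T target
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]

    def solve_for(targets):
        atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
        return solve3(ata, atb)

    x_params = solve_for([p[0] for p in map_pts])
    y_params = solve_for([p[1] for p in map_pts])
    return x_params, y_params

def apply_affine(params, x, y):
    """Transform tablet coordinates (x, y) into map coordinates."""
    (a, b, c), (d, e, f) = params
    return a * x + b * y + c, d * x + e * y + f

# Hypothetical control points: tablet inches -> map metres.
tablet = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
ground = [(500000.0, 6000000.0), (501000.0, 6000000.0),
          (500000.0, 6001000.0), (501000.0, 6001000.0)]
params = fit_affine(tablet, ground)
```

Once fitted, `apply_affine(params, x, y)` converts every traced vertex. With more than three control points, the residuals at the control points give a check on how well the map was registered.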
13. Digitizing
Digitizing has traditionally been the largest cost of a GIS project, but it is a
one-time cost: the data can be re-used in various projects, although they require maintenance
• Digitizing is the transformation of information from analog
format, such as a paper map, to digital format, so that it can
be stored and displayed with a computer.
– Using digitizing table
– On screen digitizing
• There are some main issues to consider before digitizing
commences:
– For what purpose will the data be used?
– What coordinate system will be used for the project?
– What is the accuracy of the layers to be associated?
– What is the accuracy of the map being used?
– What is the expected accuracy of the result for the project?
14. On screen digitizing
• Same principle as digitizing on a tablet,
but a different process
• First, create a vector file with the right
projection in ArcCatalog (if you use
ArcGIS)
• Then use the GIS editor functions to
create new features
• If something goes wrong, you can also
modify features
15. Data Input Methods
Hardcopy data
• Scanning – a device (scanner) senses the binary, grey
tone or colour values of the analogue (paper)
document and outputs them as a series of pixels in
parallel scan lines. The actual scanning is done by a
scanning head, which senses reflected or transmitted
light and turns the light intensity into
a pixel value.
16. Scanning
• Places a map on a glass plate, and
passes a light beam over it
• Measures the reflected light
intensity
• Result is a grid of pixels
• Image size and resolution are
important
• Features can “drop out”
Scanning uses a scanner to convert the analogue
map into a computer-readable form automatically.
17. Scanning
• Flat bed - document mounted on a flat surface
• Drum - document mounted on drum surface
• DPI (dots per inch)
• File size
• Raster to vector conversion (“vectorisation”)
18. Data Input Methods
Parameters for evaluating a Scanner
a) Scanner resolution: the smallest detail the scanner can
resolve, expressed in dpi (dots per inch).
b) The larger the dpi, the finer the resolution, the slower
the scan and the greater the data volume, and vice versa.
c) Maximum document size: modern large format
scanners are drum scanners.
d) Binary/grey tone/colour capability: best scanners
provide all the three capabilities.
e) Radiometric range: number of grey tone levels
(standard one is 256)
f) Geometric accuracy: how much data is distorted by the
scanning process?
g) Weight
h) Maximum document thickness: typically 3-5mm
i) Price.
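The trade-off between resolution and data volume can be made concrete with a small calculation. This sketch estimates the pixel dimensions and uncompressed size of a scan; the sheet size and bit depth in the example are arbitrary choices, and real files are usually smaller once compressed.

```python
def scan_size(width_in, height_in, dpi, bits_per_pixel):
    """Return (columns, rows, uncompressed bytes) for a scanned document."""
    cols = round(width_in * dpi)
    rows = round(height_in * dpi)
    # Round up to whole bytes (matters for 1-bit binary scans).
    size_bytes = (cols * rows * bits_per_pixel + 7) // 8
    return cols, rows, size_bytes

# A 24 x 36 inch map sheet scanned in 8-bit grey tone (256 levels).
low = scan_size(24, 36, 300, 8)
high = scan_size(24, 36, 600, 8)
```

Doubling the dpi doubles both the number of columns and the number of rows, so the data volume grows by a factor of four.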
19. Data Input Methods
Advantages of Scanning
1. Fast means of digitizing large or dense data formats
2. Process is largely automatic and puts minimum strain
on operator
3. Output data can be easily integrated with satellite
remotely sensed data.
Disadvantages of Scanning
1. High cost of hardware/software
2. Very intensive manuscript preparation
20. Data Sources for GIS
• Analogue maps
• Existing digital data, e.g. DEM
• Aerial photographs
• Digital RS/Satellite images
• Ground survey – e.g. GPS data
• Reports and publications
21. GIS maps are digital not analog
• Analog (paper) map vs. digital (electronic) map
• Real (touchable) vs. virtual maps (“inside the
computer”)
• Conversion
22. Finding Existing Map Data
• Map libraries
• Reference books
• State and local agencies
• Federal agencies
• Commercial data suppliers
23. Map Libraries
• via network searches, CDs, etc.
• often archival data http://www.nla.gov.au/map/
24. State and local agencies
• Land parcel (DCDB)
• Land use map
• Place name
• Digital elevation model (DEM)
• Soil
• Vegetation maps
• Aerial photography
25. State and local agencies (cont’d)
EPA (Environmental Protection Agency)
• Regional ecosystem map
• National parks
• Wildlife
Bureau of Statistics
• Population
• Housing
• Income
• Other socio-economic data
26. State and local agencies (cont’d)
Bureau of Meteorology
• rainfall
• temperature
• satellite image
• Street map
• Aerial photography
• Land use / zoning
• Asset and infrastructure
30. Field data collection - Surveying
Use equipment or
technique to
measure:
• Direction/Bearing
• Distance
• Position / location
(x,y,z)
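Measured bearings and distances are turned into positions with simple plane trigonometry. The sketch below assumes a flat (projected) coordinate system with bearings measured clockwise from grid north; the function names are illustrative.

```python
import math

def traverse_point(x0, y0, bearing_deg, distance):
    """Coordinates reached from (x0, y0) along a bearing measured clockwise from grid north."""
    theta = math.radians(bearing_deg)
    return x0 + distance * math.sin(theta), y0 + distance * math.cos(theta)

def bearing_and_distance(x0, y0, x1, y1):
    """Inverse problem: bearing (degrees, 0-360) and distance from point 0 to point 1."""
    dx, dy = x1 - x0, y1 - y0
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return bearing, math.hypot(dx, dy)

# From a station at (1000, 2000), 50 m due east.
east_x, east_y = traverse_point(1000.0, 2000.0, 90.0, 50.0)
```

Over long distances the curvature of the earth matters and geodetic formulas are needed instead.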
31. Field data collection - GPS
Global Positioning Systems (GPS)
• space-based radio positioning
systems
• could give 3D position, velocity,
and time 24 hours a day, in all
weather
• worldwide coverage
• NAVSTAR system (U.S. Department of Defense)
• GLONASS system (Russian Federation)
32. Remote Sensing & Imagery
― the science of acquiring, processing, and
interpreting images and related data from aircraft
and spacecraft that record the interaction between
matter and electromagnetic energy. (Sabins, 1997)
• Sensor measures EMR reflected from or emitted by
the earth’s surface.
• In most cases the data are in digital format and can be
input directly to a GIS
36. GIS Data Capture Process
• Data Identification - identify user information/data needs and
select appropriate data sources.
• Analogue data collection – Collect & assess analogue maps
on quality, completeness and complexity.
• Analogue Data Preparation - Identify features to be digitized
and assign feature codes to them.
• Digitization and Editing - Converting analogue data to digital.
During editing, the digital data are displayed, checked and
corrected for errors. Enter attribute data.
37. Spatial Data Acquisition Methods for GIS
Data for GIS must be in digital format. Data acquisition accounts
for about 60% - 80% of the cost of a GIS project.
There are a number of issues that should be considered:
• Data format: vector or raster
• The nature of the source data
• The potential losses that may occur in transition
• The project requirement
• Storage space, software and hardware
• Requirement for data sharing with other systems or software
• Requires careful planning and constant management
38. Spatial Data Acquisition Methods for GIS
(cont’d)
Methods:
• Manual digitising and scanning of analogue
maps
• Image data input and conversion to a GIS
• Direct data entry
• Transfer of data from existing digital sources
• Maps: scales, resolution, accuracy
39. Data Input Methods
Issues of Data Input
1. A balance is required between the time & effort spent on data
capture and the time spent editing errors from the technique used.
2. Cost and/or time required for the capture process.
3. The necessary format of the resulting data.
4. The required accuracy and precision of the data.
Accuracy refers to the degree of error between
estimated/modelled and actual location.
Precision is based on level of detail in data definition.
5. Availability of similar data in useable format.
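One common way to put a number on positional accuracy is the root-mean-square error (RMSE) between digitized points and check points of known position. This is offered as an illustrative measure, not one prescribed by the slides; the coordinates are made up.

```python
import math

def positional_rmse(digitised, reference):
    """Root-mean-square positional error between matching digitised and reference points."""
    if len(digitised) != len(reference) or not digitised:
        raise ValueError("need two non-empty point lists of equal length")
    squared = [(dx - rx) ** 2 + (dy - ry) ** 2
               for (dx, dy), (rx, ry) in zip(digitised, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical check points: one is 5 units off, the other is exact.
rmse = positional_rmse([(3.0, 4.0), (10.0, 10.0)], [(0.0, 0.0), (10.0, 10.0)])
```

Coordinates stored to many decimal places are precise but can still have a large RMSE, which is the distinction between precision and accuracy drawn above.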
40. Transfer data from existing digital sources
The following questions need to be considered:
• What is the age of the data?
• Where did it come from?
• In what medium was it originally produced?
• What is the area coverage of the data?
• At what map scale was the data digitized?
• What projection, coordinate system, and datum were used
in maps?
• What was the density of observations used for its
compilation?
• How accurate are positional and attribute features?
• In what format is the data kept?
42. Attribute data
• Logically can be thought of as
in a flat file
• Table with rows and columns
• Attributes by records
• Entries called values
43. Data editing
• Data editing aims to remove the errors that arise
during encoding of geographic data.
• The errors include locational, topological and
attribute data errors.
• Location errors are positional inaccuracies of
digitised features including:
– Locational displacement of features
– Double digitising
– Omission error (missing features)
– Commission errors – features erroneously included
in the data when they should have been excluded.
45. Pseudo Nodes
• Pseudo nodes are unintentional nodes that occur along a line or a
polygon.
• They can be due to a misplaced point or to pushing the wrong button
during digitising
• To correct pseudo nodes, first determine whether they are
indeed errors
– This can be achieved by comparing with the data sources
• Incorrect pseudo nodes can be selected and deleted
manually
• GIS software can also do it automatically
46. Dangling Nodes
• A dangling node is defined as a single node connected
to a single line entity.
• They result from three possible mistakes:
• failure to close a polygon (unclosed polygon)
• failure to connect the node to the object it was supposed
to be connected to (undershoot)
• going beyond the entity you were supposed to connect to
(overshoot)
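Candidate dangling nodes can be found by counting how many line ends meet at each node: a node used by only one line end is dangling. The sketch below assumes lines are lists of coordinate tuples and that connected lines share exactly identical end coordinates; it also lists nodes where exactly two line ends meet, which is the usual signature of a pseudo node.

```python
from collections import Counter

def classify_nodes(lines):
    """Count line ends per node and return (dangling, pseudo) node lists.

    Only the first and last vertex of each line are treated as nodes.
    """
    degree = Counter()
    for line in lines:
        degree[line[0]] += 1
        degree[line[-1]] += 1
    dangling = sorted(n for n, d in degree.items() if d == 1)
    pseudo = sorted(n for n, d in degree.items() if d == 2)
    return dangling, pseudo

# Hypothetical line work: three lines meet at (5, 0), two at (9, 0).
lines = [
    [(0, 0), (5, 0)],
    [(5, 0), (5, 5)],
    [(5, 0), (9, 0)],
    [(9, 0), (9, 4)],
]
dangling, pseudo = classify_nodes(lines)
```

Not every flagged node is an error (a dead-end road legitimately ends in a dangling node), so the output is a list to review, not a list to delete.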
49. Correcting dangling nodes
Dangling node errors are identified by a graphic symbol different from
the one used for pseudo nodes and actual nodes
50. Manual Digitizing
Common errors that require editing
Figure 5.13 Examples of spatial error in vector data (Heywood, Cornelius and Carver)
51. Some common digitizing errors
• Slivers
• Duplicate lines
• Duplicate nodes
• Unended lines
• Gaps
52. Correcting dangling nodes
• If the dangling node is an open polygon, the software alerts you by
providing the number of complete polygons in the dataset.
• If the number is different from expected the errors can then be
corrected manually
• In the case of an open polygon, you merely move one of the nodes to
connect with the other.
• For undershoots, the node is identified and is moved or 'snapped' to the
object to which it should have been connected
• Overshoot errors are corrected by identifying the intended line
intersection point and 'clipping' the line so that it connects where it is
supposed to.
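Snapping an undershoot can be sketched as moving the dangling node to the nearest point on the line it should touch, provided that point lies within a snapping tolerance. The tolerance value and function names below are assumptions for illustration; GIS packages expose this as a configurable snapping or fuzzy tolerance.

```python
import math

def nearest_point_on_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    # Projection parameter, clamped so the result stays on the segment.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)

def snap_node(node, segment, tolerance):
    """Move node onto segment if it lies within tolerance; otherwise leave it unchanged."""
    target = nearest_point_on_segment(node, segment[0], segment[1])
    if math.dist(node, target) <= tolerance:
        return target
    return node

# A line end stops 0.2 units short of the vertical line x = 5.
target_line = ((5.0, 0.0), (5.0, 10.0))
snapped = snap_node((4.8, 2.0), target_line, tolerance=0.5)
untouched = snap_node((3.0, 2.0), target_line, tolerance=0.5)
```

Too large a tolerance will snap together features that are genuinely separate, so the value is normally chosen in relation to the map scale and the digitizing accuracy.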
53. Topological errors
• Topological errors are those that violate
topology rules defined by either the GIS
software or user.
• For example:
– only one point may exist at a given location.
– Lines must intersect at a node.
– Overlapping polygons do not exist.
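A rule such as "only one point may exist at a given location" can be checked mechanically. This sketch groups point features whose coordinates coincide after rounding; the rounding precision and the sample IDs are assumptions, and points that fall on either side of a rounding boundary may be missed.

```python
from collections import defaultdict

def find_coincident_points(points, decimals=3):
    """Group point IDs whose coordinates are identical after rounding."""
    by_location = defaultdict(list)
    for point_id, (x, y) in points.items():
        by_location[(round(x, decimals), round(y, decimals))].append(point_id)
    return [ids for ids in by_location.values() if len(ids) > 1]

# Hypothetical point layer: "a" and "b" are effectively the same location.
points = {"a": (1.0, 2.0), "b": (1.0004, 2.0), "c": (3.0, 4.0)}
violations = find_coincident_points(points)
```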
55. Attribute Errors and editing
• Attribute errors are incorrect labels attached
to spatial data.
• Attribute data errors are more difficult to
identify than locational errors, as they are not
apparent until later in data
processing and analysis.
56. Attribute Errors and editing
• Attribute data errors may include:
– Incorrect assignment of features' unique
identifiers.
– Missing data records or too many records.
– Missing attribute
– Incorrect attribute value
• Attribute data errors may result from:
– Observation or measurement errors.
– Data entry errors
– Outdated data
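Some of the attribute errors listed above can be caught with simple checks on the attribute table. The sketch below treats the table as a list of records and looks for duplicate identifiers, missing values and out-of-range values; the field names (`id`, `landuse`, `area_ha`) and the valid range are invented for the example.

```python
from collections import Counter

def check_attributes(records, required, valid_ranges):
    """Return a list of (record index, problem) tuples for a flat attribute table."""
    problems = []
    id_counts = Counter(rec.get("id") for rec in records)
    for i, rec in enumerate(records):
        if id_counts[rec.get("id")] > 1:
            problems.append((i, "duplicate id"))
        for field in required:
            if rec.get(field) in (None, ""):
                problems.append((i, f"missing {field}"))
        for field, (low, high) in valid_ranges.items():
            value = rec.get(field)
            if value not in (None, "") and not (low <= value <= high):
                problems.append((i, f"{field} out of range"))
    return problems

# Hypothetical land-use table with three deliberate errors.
records = [
    {"id": 1, "landuse": "forest", "area_ha": 12.5},
    {"id": 2, "landuse": "", "area_ha": 8.0},
    {"id": 2, "landuse": "urban", "area_ha": -3.0},
]
problems = check_attributes(records, ["landuse"], {"area_ha": (0.0, 10000.0)})
```

Checks like these cannot detect a plausible but wrong value, such as a forest parcel labelled as pasture; those still need comparison against the source data.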
57. Attribute Errors and editing
• Attribute data editing in GIS mainly includes
the following operations on the attributes associated
with features and their values:
– Adding
– Deleting
– Updating