Introduction

 The quality and consistency of geoscientific data
 management practices across the minerals
 exploration and mining industry vary greatly.
 This occurs for a variety of reasons:
 •   Budgetary constraints
 •   Technical knowledge constraints
 •   Lack of appreciation of the value of data within the organisation
 •   Lack of an IT Dept (or lack of integration with the IT group)
 •   Staff turnover
 •   Technology change-overs/updates
 •   “Islands” of accountability



Introduction


 The following slides list the primary factors
 conducive to best-practice geoscientific data
 management.

 Each principle is discussed first to give a clear
 understanding of what it involves and why, along
 with, in some instances, an example of its
 application or requirement.
Principle 1: Centralised Data Management

  Simply put, this is the practice of having all geoscience data in a
  centralised location, preferably not on an operations site, on fully
  maintained and monitored servers, with fully tested back-up systems,
  in an industry-standard server room. This data may be derived from
  site/project-based databases, or replicated to them – the main
  point is that the version on site is not the only copy.

  It is not strictly necessary for the site copy of the data to be
  maintained; however, experience has shown that site personnel
  have a significantly better attitude towards the quality of the
  data when a site copy is maintained, as they have a stronger sense
  of ownership of the data and view the centralised storage as
  merely a backup.
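
  The replication itself would normally be handled by the DBMS or the DHDMS
  vendor's own tools. Purely as a minimal sketch of the principle – the site
  copy is never the only copy – the Python snippet below copies a site SQLite
  database to a central, backed-up location on a schedule. The paths and
  database names are hypothetical assumptions, not part of the original slides.

```python
"""Minimal sketch: nightly copy of a site database to central storage.

Assumptions (hypothetical): the site database is SQLite and the central
server share is reachable at //HQ-SERVER/geodata. A production system
would use the DBMS's own replication/backup facilities instead.
"""
import sqlite3
from datetime import date
from pathlib import Path

SITE_DB = Path("C:/geology/site_project.db")            # site copy (hypothetical path)
CENTRAL_DIR = Path("//HQ-SERVER/geodata/site_project")  # central, backed-up location

def replicate_to_central() -> Path:
    CENTRAL_DIR.mkdir(parents=True, exist_ok=True)
    target = CENTRAL_DIR / f"site_project_{date.today():%Y%m%d}.db"
    with sqlite3.connect(SITE_DB) as src, sqlite3.connect(target) as dst:
        src.backup(dst)   # consistent online copy, even while the site DB is in use
    return target

if __name__ == "__main__":
    print(f"Central copy written to {replicate_to_central()}")
```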
Principle 2: Standard Geoscientific Legend

   This is a standard set of observational data codes used across all
   sites/projects. Multiple legends within one organisation
   frequently cause problems both in the database and at the point
   of data capture.
   Problems at the point of capture emerge because the different legends
   “cross-breed”: a geologist transferred to a new site will sometimes
   use codes from a previous posting, either out of preference or by
   force of habit. If this practice is allowed to persist for some time,
   the result is contaminated data that is frequently useless.
   Problems at the database end are the result of either differing
   legend codes being stored in the same field, or multiple fields
   being created for each data type to cater for each legend. The
   former is confusing, whilst the latter is inefficient as well as
   confusing.
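
   One common way to enforce a single legend is to hold the approved codes in
   a lookup table and have the database reject anything else. The sketch below
   uses SQLite purely for illustration – a production DHDMS would apply the
   same idea in its own schema – and the table and code names are hypothetical.

```python
"""Minimal sketch: a single, shared legend enforced by a foreign key.
SQLite is used only for illustration; table and code names are hypothetical.
"""
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # enforce referential integrity

con.executescript("""
CREATE TABLE legend_lithology (        -- one legend for every site/project
    code        TEXT PRIMARY KEY,      -- e.g. 'GRT'
    description TEXT NOT NULL
);
CREATE TABLE geology_log (
    hole_id    TEXT NOT NULL,
    depth_from REAL NOT NULL,
    depth_to   REAL NOT NULL,
    lithology  TEXT NOT NULL REFERENCES legend_lithology(code)
);
INSERT INTO legend_lithology VALUES ('GRT', 'Granite'), ('BAS', 'Basalt');
""")

# A code from the approved legend loads normally...
con.execute("INSERT INTO geology_log VALUES ('DH001', 0.0, 12.5, 'GRT')")

# ...while a code from another site's legend is rejected at the database.
try:
    con.execute("INSERT INTO geology_log VALUES ('DH001', 12.5, 20.0, 'GRX')")
except sqlite3.IntegrityError as err:
    print("Rejected non-standard code:", err)
```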
Principle 3: Standard Geoscientific Data Model

   This is simply the use of one data model across the organisation
   rather than having a different data model for each mining
   operation or exploration project. Some companies run one data
   model for their mining operations and another for their
   exploration.
   Ultimately these data sets should come together so that all
   data for a project, deposit or terrane is in one location and the
   maximum value can be achieved from analysis of the data.
   Complete data sets such as this are essential for understanding
   the geological setting and the processes involved in forming the
   deposit, thereby providing predictive tools for discovering
   another.
Principle 4: Same System Digital Data Logging

   This is the use of a single digital data-logging system across the
   organisation rather than a different logging tool for each mining
   operation or exploration project.
   As with the standard legend and the standard data model, a single
   logging system means that geologists moving between sites and
   projects do not have to relearn how data is captured, the same
   validation rules are applied everywhere, and the logged data flows
   into the standard data model without conversion or re-mapping.
Principle 5: Direct Data Transfer

   This principle requires that the data is either transferred directly
   from the data collection tool to the database, or is transferred via
   a secure facility, e.g. Acquire’s Briefcase mechanism.
   Systems where data is exported to a text-based file are open to,
   and frequently subjected to, manual editing that sits outside the
   validation controls inherent in the system. This can result in
   contaminated data in the database, or difficulty in loading the
   data, which then requires support from specialised users.
   Furthermore, these files are commonly not transferred
   immediately to the database and are therefore exposed to the
   risks of loss and multiple versions.
   A further point here is that importers must be constructed, have
   validation coded in, and subsequently be maintained to enable
   the importing of the exported, text-based file.
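
   Purely to illustrate the contrast with a text-file hand-off, the sketch
   below shows a logging tool writing an interval straight into the database
   through a single validated insert routine, so no intermediate file exists
   to be edited outside the system's validation. The table, field names and
   validation rules are hypothetical; a real system would use the DHDMS
   vendor's own transfer facility.

```python
"""Minimal sketch: validated, direct insert from the logging tool to the database.
The table, field names and validation rules are hypothetical.
"""
import sqlite3

def insert_interval(con: sqlite3.Connection, hole_id: str,
                    depth_from: float, depth_to: float, lithology: str) -> None:
    """Apply the system's validation, then write directly to the database."""
    if depth_to <= depth_from:
        raise ValueError(f"{hole_id}: interval {depth_from}-{depth_to} is inverted")
    if not lithology.isalpha():
        raise ValueError(f"{hole_id}: '{lithology}' is not a legend code")
    con.execute(
        "INSERT INTO geology_log (hole_id, depth_from, depth_to, lithology) "
        "VALUES (?, ?, ?, ?)",
        (hole_id, depth_from, depth_to, lithology),
    )
    con.commit()   # the record is in the database immediately, not in a loose file

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE geology_log (hole_id, depth_from, depth_to, lithology)")
insert_interval(con, "DH001", 0.0, 4.2, "GRT")
```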
Principle 6: Digital Sample Submission

   • This is where all sampling data is derived from a digital data
     collection tool and submitted to the lab digitally (a minimal sketch
     follows this list). Where physical sampling sheets are required, there
     should be a facility associated with the data collection tool to
     provide a printed version.
   • While barcode tags are now a common technology for assisting in
     managing samples, they still have issues: they must be manually
     handled at several points in the transport and processing of the
     sample, and they can be difficult to read when dirty and/or wet.
   • It is recommended that RFID technology be used to manage samples,
     as this eliminates the multiple-point manual handling of the samples
     to obtain their sample numbers. RFID tags are now extremely
     affordable and readily available. Even in a small hole of a hundred
     samples, the time saved by avoiding having to find and scan each
     barcode is significant. Depending on how samples are placed, this may
     also remove the risk of injury through bending over or physically
     lifting the samples.
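
   As an illustration of a fully digital submission, the sketch below builds a
   dispatch manifest from the samples recorded in the collection tool (each
   carrying its RFID tag ID) and sends it to the laboratory electronically.
   The laboratory endpoint, field names and analysis code are hypothetical;
   the actual format would be whatever the lab accepts (CSV, JSON or a portal).

```python
"""Minimal sketch: digital sample dispatch from the collection tool to the lab.
The endpoint URL, field names and analysis code are hypothetical.
"""
import json
import requests  # pip install requests

samples = [  # as captured by the digital logging tool, keyed by RFID tag
    {"rfid_tag": "E28011700000020ABC123001", "sample_id": "S10001",
     "hole_id": "DH001", "depth_from": 0.0, "depth_to": 1.0},
    {"rfid_tag": "E28011700000020ABC123002", "sample_id": "S10002",
     "hole_id": "DH001", "depth_from": 1.0, "depth_to": 2.0},
]

dispatch = {
    "dispatch_id": "DISP-2024-001",
    "analysis_suite": "Au-AA25",      # requested analysis (hypothetical code)
    "samples": samples,
}

# Submit the manifest digitally; a printed copy can be generated from the
# same data if a physical sheet must accompany the samples.
response = requests.post("https://lab.example.com/api/dispatches",
                         data=json.dumps(dispatch),
                         headers={"Content-Type": "application/json"},
                         timeout=30)
response.raise_for_status()
```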
Principle 7: Automated Assay Loading

  This principle involves the assays being imported directly into the
  database without the opportunity for manual editing by personnel.
  The idea behind this is very similar to that behind Principle 5 (Direct
  Data Transfer) – the analogy is between data and a piece of surgical
  equipment: the more hands that come into contact with it, the dirtier it
  gets. By preventing personnel from manually interacting with, or editing,
  the data before it reaches the database, the cleaner it stays.
  There are a multitude of ways of achieving this:
  • Emails from the lab may be delivered to a common folder where a
       batch process extracts the results and loads them into the database
       (sketched after this list).
  • The laboratory concerned may have a portal or some other web-
       hosted access through which the DHDMS can access the assay data
       for loading.
  • The laboratory has direct access to the DHDMS and loads the data
       directly.
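
  A minimal sketch of the first option above: a batch process that picks up
  lab result files from a common drop folder and loads them straight into the
  database with no manual editing step. The folder, file layout and table
  names are hypothetical; a real implementation would also archive the files
  and check batch QAQC before (or as part of) loading.

```python
"""Minimal sketch: unattended loading of lab assay files from a drop folder.
Folder, file layout and table names are hypothetical.
"""
import csv
import sqlite3
from pathlib import Path

DROP_FOLDER = Path("//HQ-SERVER/lab_inbox")      # where the lab's emailed files land
LOADED_FOLDER = DROP_FOLDER / "loaded"           # processed files are moved here

def load_assay_files(con: sqlite3.Connection) -> int:
    """Load every pending assay CSV; return the number of rows inserted."""
    LOADED_FOLDER.mkdir(exist_ok=True)
    rows = 0
    for csv_file in sorted(DROP_FOLDER.glob("*.csv")):
        with csv_file.open(newline="") as fh:
            for rec in csv.DictReader(fh):
                con.execute(
                    "INSERT INTO assay_staging (sample_id, element, value_ppm) "
                    "VALUES (?, ?, ?)",
                    (rec["SampleID"], rec["Element"], float(rec["Value_ppm"])),
                )
                rows += 1
        con.commit()
        csv_file.rename(LOADED_FOLDER / csv_file.name)   # never load a file twice
    return rows
```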
Principle 8: Drillhole Data Staging

   The recommendation here is that the data is loaded into the
   database but is not available to general users or to any
   extraction/reporting facilities until it has been approved (i.e. checked
   that all relevant data is present, QAQC is acceptable, etc.). Ultimately,
   what is to be avoided is unapproved data being used in what may be
   critical calculations or decisions.
   An example would be a geochemist including assay data in an extract
   he ran, only for the geologist responsible for the data to later reveal
   that it failed its QAQC and was subsequently re-assayed by the
   laboratory. Meanwhile the geochemist is unaware that he has some
   poor-quality data that has been superseded.
   While this principle is intimately linked to the following one and may
   at first appear to be the same, they are in fact separate, as many
   companies apply Principle 9 but not Principle 8.
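
   A minimal way to express staging in a database is an "approved" flag that
   general users and reporting views never see past. The SQLite sketch below
   is illustrative only; the names are hypothetical and a production DHDMS
   would implement staging through its own facilities.

```python
"""Minimal sketch: staged assay data hidden from general users until approved.
SQLite and the names used are purely illustrative.
"""
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE assay_staging (
    sample_id TEXT,
    au_ppm    REAL,
    approved  INTEGER NOT NULL DEFAULT 0    -- 0 = loaded, awaiting checks/QAQC
);

-- General users and extract/reporting tools query this view only,
-- so unapproved assays never reach calculations or decisions.
CREATE VIEW assay AS
    SELECT sample_id, au_ppm FROM assay_staging WHERE approved = 1;
""")

con.execute("INSERT INTO assay_staging (sample_id, au_ppm) VALUES ('S10001', 1.25)")
print(con.execute("SELECT COUNT(*) FROM assay").fetchone()[0])   # 0 until approved

con.execute("UPDATE assay_staging SET approved = 1 WHERE sample_id = 'S10001'")
print(con.execute("SELECT COUNT(*) FROM assay").fetchone()[0])   # 1 after approval
```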
Principle 9: Drillhole Signoff/Approval

   This principle is centred around assigning accountability for the
   quality of the data to the person responsible for it. This is the
   logical next step from the previous principle and records the name
   of the approver against the data.
   Elements of a sense of ownership of the data, as discussed in
   Principle 1, are equally valid here.
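
   Extending the staging idea, recording accountability only requires the
   approver's identity and the approval time to be stored against the data.
   The table, column and function names below are hypothetical.

```python
"""Minimal sketch: recording who approved a drillhole's data, and when.
Table, column and function names are hypothetical.
"""
import sqlite3
from datetime import datetime, timezone

def approve_hole(con: sqlite3.Connection, hole_id: str, userid: str) -> None:
    """Mark every record for the hole as approved and record the approver."""
    con.execute(
        "UPDATE drillhole_data "
        "SET approved = 1, approved_by = ?, approved_on = ? "
        "WHERE hole_id = ?",
        (userid, datetime.now(timezone.utc).isoformat(), hole_id),
    )
    con.commit()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE drillhole_data "
            "(hole_id TEXT, approved INTEGER DEFAULT 0, approved_by TEXT, approved_on TEXT)")
con.execute("INSERT INTO drillhole_data (hole_id) VALUES ('DH001')")
approve_hole(con, "DH001", "j.smith")   # accountability now sits with j.smith
```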
Principle 10: Audit Trail Facility

   The recommendation here is that all inserts, deletes and modifications
   made in the database are logged (date/time, userid, previous value)
   in sufficient detail to allow a rollback to occur if required. This
   then allows for the correction of data contaminated whilst in the
   database, whether by accident or malicious intent.
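
   Most database platforms can implement this with triggers that copy the
   previous value and a timestamp into an audit table on every change. The
   SQLite sketch below shows the shape of the idea for updates only; the names
   are hypothetical, and in a production DHDMS the logged-in userid would come
   from the database session rather than the application.

```python
"""Minimal sketch: trigger-based audit trail capturing previous values.
SQLite illustration only; a production system would also log the session userid.
"""
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE assay (sample_id TEXT PRIMARY KEY, au_ppm REAL);

CREATE TABLE assay_audit (
    changed_at TEXT,      -- date/time of the change
    sample_id  TEXT,
    old_au_ppm REAL,      -- previous value, enabling rollback if required
    new_au_ppm REAL
);

CREATE TRIGGER assay_update_audit AFTER UPDATE ON assay
BEGIN
    INSERT INTO assay_audit VALUES
        (datetime('now'), OLD.sample_id, OLD.au_ppm, NEW.au_ppm);
END;
""")

con.execute("INSERT INTO assay VALUES ('S10001', 1.25)")
con.execute("UPDATE assay SET au_ppm = 9.99 WHERE sample_id = 'S10001'")
print(con.execute("SELECT * FROM assay_audit").fetchall())
# -> the old value (1.25) is preserved and the edit can be reversed
```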
Principle 11: External Database Audits

  This is a self-explanatory principle – external audits provide an
  independent assessment of the quality of the data stored and the
  processes used in obtaining and approving it. Remembering that
  large investment decisions may be made on the basis of this data,
  it is essential that this process occur on a semi-regular basis.
Principle 12: Database Photo Management
  While storing photos within a database is a recent technology (e.g. SQL
  Filestream), it is recommended for the following reasons:
  • Current folder-based systems do not easily allow for integration into
      other systems or software packages
  • Accidental or malicious deletion may not be recoverable in folder-
      based systems
  • There is currently no useful way to store metadata about the
      image(s)
  • Folder-based systems do not cater well for ATV/OTV images or
      images from emerging technologies such as Hylogger.
  Standardised Folder-based Photo Management
  • If, due to budget or technology restrictions, it is not possible to
      implement Principle 12, then a folder-based system is still better
      than no system at all. In this case it is imperative that the correct
      permissions be set up on the folders/system to minimise the risk of
      accidental or malicious deletion. Further steps should also be taken
      to regularly back up the system for the same reason.
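
  For the database-stored option above, a minimal sketch of a core-photo table
  using SQL Server FILESTREAM is shown below, including the metadata columns
  (hole, depth interval, date) that folder-based systems cannot hold against
  the image. The connection string, table and file names are hypothetical, and
  the target database is assumed to already have FILESTREAM enabled with a
  filegroup configured.

```python
"""Minimal sketch: core photos with metadata in SQL Server via FILESTREAM.
Connection string, table and column names are hypothetical; the target
database is assumed to already have a FILESTREAM filegroup set up.
"""
import pyodbc  # pip install pyodbc

DDL = """
CREATE TABLE dbo.CorePhoto (
    PhotoId    UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    HoleId     NVARCHAR(20) NOT NULL,
    DepthFrom  FLOAT NOT NULL,
    DepthTo    FLOAT NOT NULL,
    TakenOn    DATETIME2 NULL,
    Image      VARBINARY(MAX) FILESTREAM NULL   -- image bytes streamed to disk by SQL Server
);
"""

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=HQ-SQL01;"
    "DATABASE=GeoPhotos;Trusted_Connection=yes;"
)
cur = conn.cursor()
cur.execute(DDL)

# Insert a photo together with its metadata, so it can be queried and
# integrated with other systems like any other database record.
with open("DH001_0.0-4.2m.jpg", "rb") as fh:
    cur.execute(
        "INSERT INTO dbo.CorePhoto (HoleId, DepthFrom, DepthTo, TakenOn, Image) "
        "VALUES (?, ?, ?, SYSUTCDATETIME(), ?)",
        "DH001", 0.0, 4.2, pyodbc.Binary(fh.read()),
    )
conn.commit()
```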
Principle 13: One GIS Software Standard

  A simple principle, though one that often gets overruled by
  personnel in islands of accountability standing their ground and
  insisting that they need a particular system despite the fact that
  no one else in the organisation is using it.
  The advantages of a single GIS standard are obvious:
  • Potential savings on licensing costs
  • Elimination of conflict with IT groups who logically want to
     reduce the number of applications they need to cater for
  • Elimination of doubled-up data, i.e. data stored once for each
     system, which creates the potential for multiple, unsynchronised
     versions (“multiple truths”).
  • Elimination of the constant conversion of data from one package
     to another, which allows for mistakes and contamination,
     particularly where coordinate system conversions are involved.
Principle 14: Controlled GIS Data

  This principle is primarily concerned with avoiding multiple truths
  and lost data. In the application of this principle all GIS data is
  published to a structured area and users are expected to access
  this area for their GIS data. Other data sets brought into the
  organisation must go through this process of being published
  prior to use.
  Implementation of this principle may be done simply with a folder
  structure where proper permissions have been set to avoid
  deletion or over-writing of the published data. A more
  sophisticated option would be an environment such as Sharepoint
  where data can be checked in and out with full version control.
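
  Where the simple folder-structure option is used, publishing can be as
  little as copying the approved dataset into the controlled area, recording
  who published it and when, and making the files read-only (with the real
  access control handled by IT through permissions on the share). The sketch
  below is a minimal, hypothetical illustration of that publish step; the
  paths and log layout are assumptions.

```python
"""Minimal sketch: publishing a GIS dataset to the controlled area.
Paths are hypothetical; real access control would be via share/folder permissions.
"""
import json
import shutil
import stat
from datetime import datetime, timezone
from pathlib import Path

PUBLISHED_AREA = Path("//HQ-SERVER/gis_published")

def publish_dataset(source: Path, theme: str, userid: str) -> Path:
    """Copy an approved dataset into the published structure and log the version."""
    target_dir = PUBLISHED_AREA / theme
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copy2(source, target)
    target.chmod(stat.S_IREAD)          # discourage accidental editing/deletion

    log = target_dir / "publish_log.json"
    history = json.loads(log.read_text()) if log.exists() else []
    history.append({"file": source.name, "published_by": userid,
                    "published_on": datetime.now(timezone.utc).isoformat()})
    log.write_text(json.dumps(history, indent=2))
    return target
```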
Principle 15: Centralised Grid Transformations

   Grid and coordinate conversions are a constant source of error
   and contamination within many organisations. Implementation of
   this principle involves a sophisticated system where grid
   definitions are entered into a database by a surveyor and their
   userid is recorded against the entry, in much the same vein as in
   Principle 9. The system must be capable of versioning these
   definitions as they do change over time.
   The database then produces a definition file that is accessed by
   the conversion software. The apparently complex part is then
   integrating your GIS and other packages to utilise this conversion
   software for all coordinate conversions.
   While the above may sound overly complex, the truth is that the
   architecture and execution are not particularly difficult. What
   this then allows for is:

                                      Continued on next slide
Principle 15: Centralised Grid Transformations           (cont)




   • Elimination of multiple versions of coordinate conversion
      formulas and macros that once released are impossible to
      control.
   • Following on from the above is the elimination of potentially
      expensive mistakes caused by using the wrong or outdated
      conversion facility.
   Standardised Grid Transformations
   • If it is not possible to implement Principle 15 as described
      above, then it is recommended that surveyor-approved
      transformation parameters or formulae are published to a
      central area where they can be accessed by users, in much the
      same way as discussed in Principle 14. This area is likely to be
      a folder structure and as such should have the correct
      permissions to prevent deletion or editing except by the
      surveyors.
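
   As a minimal sketch of the centralised approach described above, the
   surveyor-approved definition for each grid could be stored and versioned in
   a table, with a single conversion routine (built here on the pyproj
   library) reading that definition so every package performs its conversions
   the same way. The grid name, table layout and coordinates are hypothetical;
   a local two-point or affine mine grid would be stored as a proj pipeline or
   parameter string rather than an EPSG code.

```python
"""Minimal sketch: one central, versioned grid definition driving all conversions.
The grid definition shown is hypothetical; pyproj does the actual maths.
"""
import sqlite3
from pyproj import Transformer  # pip install pyproj

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE grid_definition (
    grid_name   TEXT,
    version     INTEGER,            -- definitions are versioned as they change
    approved_by TEXT,               -- surveyor accountable for the definition
    crs_from    TEXT,               -- stored as CRS identifiers or proj strings
    crs_to      TEXT
)""")
con.execute("INSERT INTO grid_definition VALUES "
            "('MineGrid_v2', 2, 'surveyor.jones', 'EPSG:28350', 'EPSG:4326')")

def transform(grid_name: str, easting: float, northing: float):
    """Look up the current approved definition and convert the coordinates."""
    crs_from, crs_to = con.execute(
        "SELECT crs_from, crs_to FROM grid_definition "
        "WHERE grid_name = ? ORDER BY version DESC LIMIT 1", (grid_name,)
    ).fetchone()
    t = Transformer.from_crs(crs_from, crs_to, always_xy=True)
    return t.transform(easting, northing)

print(transform("MineGrid_v2", 500000.0, 7500000.0))   # lon/lat via the one approved path
```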
Principle 16: Controlled Geophysics Data

  The principle in this instance is very similar to Principles 14 and 15,
  in that approved data is published to a central area, protected by
  permissions, where users go to access the processed geophysical
  data.
  With regard to the raw geophysical data, while this is almost
  useless to anybody but the geophysicists, the data should still be
  stored in a protected folder system to prevent the contamination
  or loss of the primary, unprocessed data.
Principle 17: Database Driven Tenement Management

     As with Principles 14 and 16, the principle here is that the
     authoritative tenement data is held in a controlled, central database,
     protected by permissions, rather than in individual spreadsheets or
     files scattered across the organisation.
     Managing the tenements from a database means users access a single,
     current version of the tenement information, and that this information
     can be integrated with the GIS and other systems as controlled,
     published data.
Principle 18: Exploration Embedded IT People

  This principle involves IT specialists embedded in, and paid for by,
  the exploration group, but who have a reporting line through to
  the company’s IT department. This is the preferable choice, as the
  personnel are fully exposed to the exploration group’s requirements,
  challenges and planning schedule, but remain grounded in the IT
  department’s requirements for standardisation, where feasible, and
  for security.
  Exploration-centric IT People
  Should Principle 18 not be a feasible option, then the
  organisation’s IT group should have support and architecture
  people for whom a significant part of their focus is the exploration
  group, and who are familiar with its (sometimes rapidly changing)
  requirements and with the limitations and demands of the remote
  environments in which exploration personnel frequently work.
