Beautiful Research Data
Kirsta Stapelfeldt, Coordinator
UTSC Library’s Digital Scholarship Unit
In this presentation
● Part One: preparing to create machine-readable data at the onset of a research endeavour
● Part Two: working with “messy” datasets
Benefits of machine-readable data
● Easier to query for new insights
● Easier to mount in a computing environment
● Easier to share with others
Just a .csv + Fusion Tables
● Fusion Tables is an experimental, web-based
Chrome app
● Took a spreadsheet that Natalie has been
working on and loaded it into the app
● Results have not been massaged at all
● We can expect additional benefits from
having structured data in the future
Part one
In which you have no research data...yet
Best Case Scenario
You start by utilizing some best practices
4 Pieces of low-hanging fruit...
1. No Word documents
● Use a database (even a spreadsheet), not .doc files
● avoid a lot of style information in your
research documents (such as bolding and
italicizing text, or moving things to other
areas of the page using the tab key or
spacebar)
● Why?
Look beyond the surface.
&nbsp; &nbsp; &nbsp; &nbsp; no thank you!
http://www.bartleby.com/103/33.html
Beauty is more than browser deep
http://www.gutenberg.org/ebooks/18827
2. Use consistent formats for
elements such as date & language
● e.g. dates recorded consistently where
possible (05/25/2014)
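As a minimal sketch of what consistent dates can look like in practice, the Python below rewrites a few date strings into a single form (YYYY-MM-DD). The list of accepted input formats and the function name normalize_date are assumptions made for illustration; the slide itself only shows 05/25/2014.

```python
from datetime import datetime

# Hypothetical list of formats found in the raw data; adjust to your own.
# Note that "%m/%d/%Y" reads 05/06/2014 as May 6, not June 5.
KNOWN_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%d.%m.%Y"]


def normalize_date(raw: str) -> str:
    """Return the date as YYYY-MM-DD, or raise ValueError if no format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next known format
    raise ValueError(f"Unrecognized date format: {raw!r}")


print(normalize_date("05/25/2014"))  # 2014-05-25
```

Raising an error on unknown formats, rather than guessing, keeps ambiguous values visible so they can be checked by hand.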
3. Taxonomies & Standards
● use controlled vocabularies for keywords,
place names, person names of relevance
o using an open format for a place name can make
geocoding much easier
o stay consistent in a given language
4. Text Encoding
● Ensure you are using Unicode (UTF-8)
● How do you know?
o Notepad can be your friend
o Test a sample between systems
http://www.string-functions.com/encodingerror.aspx
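One way to test a sample, sketched here in Python under the assumption that you can read the file's raw bytes, is to check whether those bytes decode cleanly as UTF-8. The function name is_valid_utf8 is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same word saved in two different encodings.
sample_utf8 = "café".encode("utf-8")
sample_latin1 = "café".encode("latin-1")

print(is_valid_utf8(sample_utf8))    # True
print(is_valid_utf8(sample_latin1))  # False
```

A passing check is only a hint: plain ASCII text also decodes as UTF-8, so it is still worth opening a sample in the target system and looking at accented characters.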
Changing the way you
think about your research
process
Draw a picture
1. Think small.
Atomistic information (what is the smallest
meaningful unit of information you are
collecting?)
For example:
● A person’s name, religion, and DOB
● Mention of a location or name
● Repeated occurrence
2. Connect the dots.
What are the relationships between your data
elements?
Useful tool: The Entity Relationship Diagram
Draft Dragomans
Content Model
Crow’s Foot Notation
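To make the idea of entities and relationships concrete, here is a small Python sketch of a one-to-many relationship (one person, many mentions). The entity names, fields, and sample values are all hypothetical and are not taken from the Dragomans content model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mention:
    """One occurrence of a person in a source document (hypothetical entity)."""
    source_id: str
    page: int


@dataclass
class Person:
    """A person described by atomistic attributes, with zero or more mentions."""
    name: str
    religion: str
    date_of_birth: str  # stored as text in one consistent format
    mentions: List[Mention] = field(default_factory=list)


# Invented sample data: one person linked to two mentions.
person = Person(name="Example Name", religion="Unknown", date_of_birth="1600-01-01")
person.mentions.append(Mention(source_id="doc-001", page=12))
person.mentions.append(Mention(source_id="doc-002", page=3))
print(len(person.mentions))  # 2
```

Each field holds one small unit of information, and the list on Person plays the role of the "many" end of a crow's foot.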
Exercise - Building an ERD
Beautiful Research Data (Structured Data and Open Refine)
Part two
Your data is a mess
Tools for dealing with messy data
● Regular Expressions
● Open Refine
Regular Expressions: Find &
Replace on Steroids
● Available in most productivity suites (iWork,
Microsoft Word, LibreOffice/OpenOffice)
● Often syntax is a little different across
systems
“The regular expression
(?<=\.) {2,}(?=[A-Z]) matches at least two
spaces occurring after period (.) and before an
upper case letter as highlighted in the text
above.”
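A minimal sketch of the quoted expression using Python's re module, written with the period escaped (\.) so that the lookbehind matches a literal period rather than any character. The function name and the sample sentence are invented for illustration.

```python
import re

# Two or more spaces that follow a period and precede an upper-case letter.
# The lookbehind and lookahead check the neighbours without consuming them.
DOUBLE_SPACE = re.compile(r"(?<=\.) {2,}(?=[A-Z])")


def collapse_sentence_spacing(text: str) -> str:
    """Replace each matched run of spaces between sentences with one space."""
    return DOUBLE_SPACE.sub(" ", text)


sample = "First sentence.   Second sentence.  third stays.  Fourth."
print(collapse_sentence_spacing(sample))
# First sentence. Second sentence.  third stays. Fourth.
```

The run of spaces before "third" is left alone because the next character is lower case, which shows how the lookahead narrows the match.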
Open Refine
● Similar to spreadsheet
software
● Installed on your computer,
but used through your
browser
● “Power Tool” for messy data
The following draws heavily from this lesson:
http://programminghistorian.org/lessons/cleaning-data-with-openrefine
(Thanks to Seth van Hooland, Ruben Verborgh, Max De Wilde)
Base Assumption of Open Refine
● You have “structured data”
● some consistent and machine-readable
logic has been applied to your data
o Excel, .csv, XML
● you may have structured data and not
know it
o Check export options from any software you
regularly use
1. Remove duplicates
2. Remove blanks
3. Make data atomistic (smallest meaningful
unit)
4. Keep terms/formats consistent
http://data.freeyourmetadata.org/powerhouse-museum/phm-collection.google-refine.tar.gz
Choose file & select
Next...
Set appropriate
options and “Create
Project”
Project is created with
75,814 rows.
1. Look for
Blank
Records
See if any RecordIDs
are blank by using a
numeric facet
“Non-numeric” rows
are blank.
Hovering over the cell
makes an “edit” link
visible
The “blank” fields actually
contained a single
whitespace character. You can delete
the whitespace and then
select “Apply to All Identical
Cells”.
A confirmation
message will always
show up noting what
you’ve done, and
giving you a chance to
“undo”
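The same check can be sketched outside the tool. The Python below, an illustration rather than what OpenRefine does internally, finds rows whose identifier is empty or holds only whitespace. The function name and the sample rows are invented.

```python
from typing import Dict, List


def find_blank_rows(rows: List[Dict[str, str]], column: str) -> List[int]:
    """Return indexes of rows whose value in `column` is empty or only whitespace."""
    return [
        index
        for index, row in enumerate(rows)
        if not (row.get(column) or "").strip()
    ]


# Invented sample rows; the second one looks blank but holds a single space.
rows = [
    {"Record ID": "1001", "Title": "Example A"},
    {"Record ID": " ", "Title": "Example B"},
    {"Record ID": "", "Title": "Example C"},
]
print(find_blank_rows(rows, "Record ID"))  # [1, 2]
```

Stripping before testing is the important part: a cell with one space is not equal to an empty string, which is why such rows are easy to miss.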
2. Look for
Duplicate Records
using Record ID
(since it should be
unique)
Sorting is a visual tool only unless you
“Reorder rows permanently”
“Blank down” will
delete the second
instance of a
duplicated “Record
ID”
Then, we can facet
the “Record ID”
column by blank
records.
The “true” facet
contains all the blank
records.
Clicking the “true” link
will narrow to the
blank records, which
can then be removed.
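For comparison, a rough Python equivalent of the sort, blank down, and remove sequence keeps the first row for each identifier and drops later repeats. This is a sketch under the assumption that the identifier should be unique; the function name and the sample rows are invented.

```python
from typing import Dict, Iterable, List


def drop_duplicate_ids(rows: Iterable[Dict[str, str]], id_column: str) -> List[Dict[str, str]]:
    """Keep the first row for each identifier and drop later repeats."""
    seen = set()
    unique_rows = []
    for row in rows:
        record_id = row[id_column]
        if record_id in seen:
            continue  # a later copy of an identifier we already kept
        seen.add(record_id)
        unique_rows.append(row)
    return unique_rows


# Invented sample rows with one repeated identifier.
rows = [
    {"Record ID": "1001", "Title": "Example A"},
    {"Record ID": "1002", "Title": "Example B"},
    {"Record ID": "1001", "Title": "Example A (copy)"},
]
print([row["Record ID"] for row in drop_duplicate_ids(rows, "Record ID")])
# ['1001', '1002']
```

Unlike the approach on the slides, this version does not need the rows to be sorted first, because it remembers every identifier it has seen.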
3. Make data
atomistic
“Categories” contains
numerous categories
separated by the “|”
character
You can tell the
system to split the
cells using this
character.
Now only single
categories appear.
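The split can be sketched in Python as follows. Note one assumption: this version repeats the other fields on every new row, which may differ from how OpenRefine lays out the split rows. The function name and the sample row are invented.

```python
from typing import Dict, List


def split_multivalued(rows: List[Dict[str, str]], column: str, separator: str = "|") -> List[Dict[str, str]]:
    """Produce one row per value found in a separator-delimited cell."""
    result = []
    for row in rows:
        values = [part.strip() for part in row[column].split(separator) if part.strip()]
        # Keep rows whose cell was empty, with an empty value, so nothing is lost.
        for value in values or [""]:
            new_row = dict(row)  # copy so the original row is not modified
            new_row[column] = value
            result.append(new_row)
    return result


# Invented sample row with three categories in one cell.
rows = [{"Record ID": "1001", "Categories": "Medals|Numismatics|Awards"}]
for row in split_multivalued(rows, "Categories"):
    print(row["Record ID"], row["Categories"])
```

After the split, each cell holds a single category, which is what makes faceting and counting meaningful.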
Creating a text facet
on “Categories” brings
up all the options in
this column.
We can “cluster” to
detect similar terms
that might have
variances in spelling
or capitalization
4. Make terms
consistent
This interface allows
you to select which
term is authoritative.
You can then merge
terms together.
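To give a feel for how similar terms can be detected, here is a simplified Python sketch of a "fingerprint" key: values that reduce to the same key are grouped as merge candidates. It is an assumption that this approximates the tool's behaviour; it leaves out steps such as accent normalization, and the sample terms are invented.

```python
import string
from collections import defaultdict
from typing import Dict, Iterable, List


def fingerprint(value: str) -> str:
    """Simplified key: lower-case, drop punctuation, sort the unique words."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))


def cluster(values: Iterable[str]) -> List[List[str]]:
    """Group distinct values that share a fingerprint; skip groups of one."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for value in values:
        key = fingerprint(value)
        if value not in groups[key]:
            groups[key].append(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


terms = ["Glass plate", "glass plate", "Plate, glass", "Photograph"]
print(cluster(terms))
# [['Glass plate', 'Plate, glass', 'glass plate']]
```

The sketch only proposes groups; choosing which spelling is authoritative remains a human decision, as in the interface shown on the slides.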
A couple of
additional
features...
The “Undo/Redo” tab
allows you to back up
in steps to the
creation of your
project, if you make a
mistake.
A “text filter” can allow
you to search in a
column (by regular
expression too!)
Refine has its own
expression language, GREL,
whose functions can be used to
transform
data.
https://github.com/OpenRefine/OpenRefine/wiki/GREL-Functions
A full list of these is
available on GitHub.
Finally, projects can be exported as Refine
projects, but also in a number of additional
structured formats.
Do this frequently.
Structured data is beautiful data. Make a plan
to create structured data during your research.
Clean legacy data, or data you inherit, by
becoming a regular expression (regex) expert
and/or using a tool like OpenRefine.
Go to your library or ITS department to see if you can get
support. Thanks for listening to me!


Editor's notes

  1. Take Dragomans File and load it into
  2. padani example
  3. Difficult to install on Windows?
  4. Here I have launched OpenRefine in my browser. The sample file I’m using is located at the url on the slide. Remember that a longer version of this tutorial is available at http://programminghistorian.org/lessons/cleaning-data-with-openrefine