SlideShare une entreprise Scribd logo
1  sur  24
Télécharger pour lire hors ligne
Accessibility
Considerations forVery
Large Datasets
Puneet Kishor
University of Wisconsin-Madison and Creative Commons
Monday, October 25, 2010
Acknowledgments to
CODATA for inviting me, Creative
Commons for funding my trip,
University of Wisconsin-Madison for
paying my salary, and most
importantly, the US Federal
Goverment for making all the data
available to anyone, anywhere
without any pre-conditions
Monday, October 25, 2010
Research context:
ecosystem process
modeling of very large
terrestrial ecosystems
Monday, October 25, 2010
Information by numbers
Monday, October 25, 2010
7
daily variablestm
axtm
in
taveprcpsrad
vpd
dayl
Monday, October 25, 2010
1
km2 cell
1 km
1 km
tm
axtm
in
taveprcpsrad
vpd
dayl
Monday, October 25, 2010
13million cells
4587
2889
.25
Monday, October 25, 2010
8401days
8400
Monday, October 25, 2010
111billion septets
.32
tm
axtm
in
taveprcpsrad
vpd
dayl
111.32b
Monday, October 25, 2010
725raw gigabytes
.78
Monday, October 25, 2010
10times as much in
a database
Monday, October 25, 2010
84GB of NetCDF format
in tar gzipped archives
Monday, October 25, 2010
2 3 4 5
8 9 10 11
6
12
7
13 14
1
4˚square chunks
Monday, October 25, 2010
“½”incomplete
documentation
Monday, October 25, 2010
0ways to query
the data
?X
Monday, October 25, 2010
1. Acquire NetCDF file of lat/lon values for each
cell from the weather data 1 km2 estimates
2. Dump lat/lon values to CSV with Panoply
3. Import into ArcMap as XY data
4. Export as shapefile
5. Assign WGS84 datum to shapefile in ArcCatalog
6. Reproject to Lambert Spherical (“US National
Atlas Equal Area”)
7. Separate by 2x2 degree tile using "tile_num"
attribute (so grid will match the netCDF met
files) using defination query in ArcMap and
exporting to individual shapefiles (256 tiles) as
"mask".
8. Open lambert points in qGIS and make 1km grid
(shapefile) for each 2x2 tile
9. Assign projection to output (EPSG:2163)
10. Add each new grid shapefile (one at a time) to
ArcMap with 2x2 Grid as separate layer
11. Select by location (select from grid x that
intersect mask x)
12. Export selected features of grid x (now will be
numbered sequentially by record in a way that
matches the met NetCDF “ncells”)
13. Clean up: delete extra fields from qGIS
(ID,MAXX,MINX,MAXY,MINY) add ncell_id (FID
+1) block_id, block_name
“10”times the work to
unpack the data
Monday, October 25, 2010
Many kinds of queries
f<variable> <location> <point in time>
avg(srad) at x,y on Dec 2, 2001
tmin for area on May 19, 1992
tmax at x,y on May 19, 1992
f<variable> <point location> <duration of time>
tave at x,y during the first quarter of 1983
sum(vpd) at x,y during the last week of Mar, 2003
Monday, October 25, 2010
accessible¦aksesəbəl¦
adjective
1 (of a place) able to be reached or entered : the town is
accessible by bus | the building has been made accessible to
disabled people.
• (of an object, service, or facility) able to be easily obtained or
used : making learning opportunities more accessible to adults.
• easily understood : his Latin grammar is lucid and accessible.
• able to be reached or entered by people in wheelchairs : it
provides specialized features such as nonslip floors and accessible
entrances.
2 (of a person, typically one in a position of authority or
importance) friendly and easy to talk to; approachable : he is more
accessible than most tycoons.
Monday, October 25, 2010
Accessible information
is easy to: find,
determine what one
can do with it, acquire,
and use
Monday, October 25, 2010
Factors that affect
accessibility: law;
technology; culture;
semantics; and
economics
Monday, October 25, 2010
Law makes sharing
permissible; technology
makes it possible; culture
makes it acceptable;
semantics make it
understandable; and
economics affordable
Monday, October 25, 2010
It is permissible,
acceptable, and
affordable to access
public sector
information, but not
necessarily possible or
understandableMonday, October 25, 2010
Goals of the new
storage: make the
information
technologically and
semantically accessible
Monday, October 25, 2010
Allow access by
providing user-
interface, application
programming interface
and documentation
Monday, October 25, 2010

Contenu connexe

En vedette

En vedette (7)

2 sharif
2 sharif2 sharif
2 sharif
 
10
1010
10
 
G T C N Exec Summ J M 1
G T C N  Exec  Summ  J M 1G T C N  Exec  Summ  J M 1
G T C N Exec Summ J M 1
 
How Many Errors Can Be In My Paper?
How Many Errors Can Be In My Paper?How Many Errors Can Be In My Paper?
How Many Errors Can Be In My Paper?
 
Your True Reality Short Version
Your True Reality  Short VersionYour True Reality  Short Version
Your True Reality Short Version
 
Funky Buddha Fashion Collection SS 14
Funky Buddha Fashion Collection SS 14Funky Buddha Fashion Collection SS 14
Funky Buddha Fashion Collection SS 14
 
Auto Enrolment: Are You Ready?
Auto Enrolment: Are You Ready?Auto Enrolment: Are You Ready?
Auto Enrolment: Are You Ready?
 

Dernier

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Dernier (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

3 kishor

  • 1. Accessibility Considerations forVery Large Datasets Puneet Kishor University of Wisconsin-Madison and Creative Commons Monday, October 25, 2010
  • 2. Acknowledgments to CODATA for inviting me, Creative Commons for funding my trip, University of Wisconsin-Madison for paying my salary, and most importantly, the US Federal Goverment for making all the data available to anyone, anywhere without any pre-conditions Monday, October 25, 2010
  • 3. Research context: ecosystem process modeling of very large terrestrial ecosystems Monday, October 25, 2010
  • 6. 1 km2 cell 1 km 1 km tm axtm in taveprcpsrad vpd dayl Monday, October 25, 2010
  • 11. 10times as much in a database Monday, October 25, 2010
  • 12. 84GB of NetCDF format in tar gzipped archives Monday, October 25, 2010
  • 13. 2 3 4 5 8 9 10 11 6 12 7 13 14 1 4˚square chunks Monday, October 25, 2010
  • 15. 0ways to query the data ?X Monday, October 25, 2010
  • 16. 1. Acquire NetCDF file of lat/lon values for each cell from the weather data 1 km2 estimates 2. Dump lat/lon values to CSV with Panoply 3. Import into ArcMap as XY data 4. Export as shapefile 5. Assign WGS84 datum to shapefile in ArcCatalog 6. Reproject to Lambert Spherical (“US National Atlas Equal Area”) 7. Separate by 2x2 degree tile using "tile_num" attribute (so grid will match the netCDF met files) using defination query in ArcMap and exporting to individual shapefiles (256 tiles) as "mask". 8. Open lambert points in qGIS and make 1km grid (shapefile) for each 2x2 tile 9. Assign projection to output (EPSG:2163) 10. Add each new grid shapefile (one at a time) to ArcMap with 2x2 Grid as separate layer 11. Select by location (select from grid x that intersect mask x) 12. Export selected features of grid x (now will be numbered sequentially by record in a way that matches the met NetCDF “ncells”) 13. Clean up: delete extra fields from qGIS (ID,MAXX,MINX,MAXY,MINY) add ncell_id (FID +1) block_id, block_name “10”times the work to unpack the data Monday, October 25, 2010
  • 17. Many kinds of queries f<variable> <location> <point in time> avg(srad) at x,y on Dec 2, 2001 tmin for area on May 19, 1992 tmax at x,y on May 19, 1992 f<variable> <point location> <duration of time> tave at x,y during the first quarter of 1983 sum(vpd) at x,y during the last week of Mar, 2003 Monday, October 25, 2010
  • 18. accessible¦aksesəbəl¦ adjective 1 (of a place) able to be reached or entered : the town is accessible by bus | the building has been made accessible to disabled people. • (of an object, service, or facility) able to be easily obtained or used : making learning opportunities more accessible to adults. • easily understood : his Latin grammar is lucid and accessible. • able to be reached or entered by people in wheelchairs : it provides specialized features such as nonslip floors and accessible entrances. 2 (of a person, typically one in a position of authority or importance) friendly and easy to talk to; approachable : he is more accessible than most tycoons. Monday, October 25, 2010
  • 19. Accessible information is easy to: find, determine what one can do with it, acquire, and use Monday, October 25, 2010
  • 20. Factors that affect accessibility: law; technology; culture; semantics; and economics Monday, October 25, 2010
  • 21. Law makes sharing permissible; technology makes it possible; culture makes it acceptable; semantics make it understandable; and economics affordable Monday, October 25, 2010
  • 22. It is permissible, acceptable, and affordable to access public sector information, but not necessarily possible or understandableMonday, October 25, 2010
  • 23. Goals of the new storage: make the information technologically and semantically accessible Monday, October 25, 2010
  • 24. Allow access by providing user- interface, application programming interface and documentation Monday, October 25, 2010