Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.
Enabling Efficient Cloud Workflows
Eugene Cheipesh
echeipesh@azavea.com
GitHub: @echeipesh
www.azavea.com
Cloud Native Workflows
“Hey, let's not download the whole file every time.”
Cloud Native
● Network is between storage and computation
● Storage is chunked and probably Key/Value
● Computation must s...
Cloud Optimized GeoTiff
● Defines an internal structure of GeoTiffs that is
optimized for streaming reading from blob
stor...
https://trac.osgeo.org/gdal/wiki/CloudOptimizedGeoTIFF
http://www.cogeo.org/
“A GeoTIFF file is a TIFF 6.0 file, and
inherits the file structure as described
in the corresponding portion of the
TIFF ...
TIFF
● Tagged Image File Format
● Binary image is separated out into
“segments”
● Contains metadata in a sequence of “tags...
GeoTiff Custom Tags
● GeoTiff adds custom tag GeoKeys
○ non-TIFF keys for CRS map space etc.
● This is how we know where t...
GeoTiff Streaming Read
● Step 1: Fetch the header
● Step 2: Decide what segments you want
● Step 3: Read the segments
Request: BBOX Mean
GeoTIFF: All of the above
COG: IFDs First - Rule #1
TIFF Segments
● Chunks of an image are stored at different
locations. This allows for parts of the image to
be decompresse...
TIFF: Strip Segments
COG: Strip Segments - Bad
TIFF: Tile Segments
COG: Use Tiled Segments - Rule #2
Request: BBOX Mean at 60m
Request: BBOX Mean at 60m
GeoTIFF: Has Overviews
COG: Provide Overviews - Rule #3
COG Rules
● Put IFDs first
● Sort IFDs by resolution
● Use Tiled segments
● Include overviews (in file)
COG Rules Continued ?
● Provenance Metadata
● Include Summary Statistics
● File naming convention
● Catalog suggestion
COG “Profiles”: Slippy Map
● Restrict CRS as EPSG:3857
● Align GeoTiff segments with slippy tile grid
● Align GeoTiff file...
COG World Coverage?
“You’re going to need more than one GeoTIFF.”
s3://[bucket]/[Shard]_[GeoHash]_[ISO_TIME]_[ID].tiff
GeoTrellis & COGs
COG: Tile Request
COG Tiler
● Application specific catalog
● Reprojection on read
● Memache gives responsiveness
● Raster metadata pushed in...
GeoTrellis Layers
GeoTrellis: Avro Layers
GeoTrellis Avro Layers
● Many small files, one per tile
○ PUT requests get expensive
● Data duplication hazzard
○ Probably...
GeoTrellis: COG Layers
GeoTrellis Strucuted COG
● Base zoom tiles are in base GeoTiff layer
● Introduces “DATE_TIME” tag
● Includes VRT for layer...
GeoTrellis: COG Pyramid
Unstructured COGs
Cloud Native thinking
GeoTrellis Unstructured COG
● Lots of good raster sources are already COG
● They don’t have a strict layout though
● Lets ...
COG: Tile Request (Again)
GeoTrellis Unstructured COG
● Requires mechanism to query for raster names
○ WFS, STAC, PostgreSQL, Scan
● COG provies “gr...
Coregistration Workflow
● Working with multiple raster layers with differing:
○ CRS, Resolution, Extent
● Must pick the wo...
Coregisteration Workflow
COGs for GeoTrellis
● Treating GeoTiff as a Key/Value tile store
○ With built in notion level of detail
● Turns an ingest ...
COGs in General
● GeoJSON of raster formats
● Gradated spec
● Set of best practices
● Driven by implementations
THANKS
Q: Is this a good idea ?
Q: So is this a standard ?
Q: Why not use MRF ?
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Prochain SlideShare
Chargement dans…5
×

2

Partager

Télécharger pour lire hors ligne

Cloud Optimized GeotTIFFs: enabling efficient cloud workflows

Télécharger pour lire hors ligne

Small changes to the way common GeoTIFF file is written has broad implications for enabling efficient use of raster data in cloud native workflows.

Livres associés

Gratuit avec un essai de 30 jours de Scribd

Tout voir

Cloud Optimized GeotTIFFs: enabling efficient cloud workflows

  1. 1. Enabling Efficient Cloud Workflows
  2. 2. Eugene Cheipesh echeipesh@azavea.com GitHub: @echeipesh www.azavea.com
  3. 3. Cloud Native Workflows “Hey, let's not download the whole file every time.”
  4. 4. Cloud Native ● Network is between storage and computation ● Storage is chunked and probably Key/Value ● Computation must scale ● Coordination is limited and risky
  5. 5. Cloud Optimized GeoTiff ● Defines an internal structure of GeoTiffs that is optimized for streaming reading from blob storage like S3 or Google Cloud Storage. ● That is, it’s best utilized for requests that use HTTP GET Range Requests
  6. 6. https://trac.osgeo.org/gdal/wiki/CloudOptimizedGeoTIFF
  7. 7. http://www.cogeo.org/
  8. 8. “A GeoTIFF file is a TIFF 6.0 file, and inherits the file structure as described in the corresponding portion of the TIFF spec. All GeoTIFF specific information is encoded in several additional reserved TIFF tags, and contains no private Image File Directories (IFD's), binary structures or other private information invisible to standard TIFF readers.” http://geotiff.maptools.org/spec/geotiff2.4.html#2.4 GeoTiff
  9. 9. TIFF ● Tagged Image File Format ● Binary image is separated out into “segments” ● Contains metadata in a sequence of “tags” ● Can contain metadata for multiple images in different “image file directories” (IDFs)
  10. 10. GeoTiff Custom Tags ● GeoTiff adds custom tag GeoKeys ○ non-TIFF keys for CRS map space etc. ● This is how we know where the pixels are on the surface of the earth (the geospatial bit). http://geotiff.maptools.org/spec/geotiff6.html#6.1
  11. 11. GeoTiff Streaming Read ● Step 1: Fetch the header ● Step 2: Decide what segments you want ● Step 3: Read the segments
  12. 12. Request: BBOX Mean
  13. 13. GeoTIFF: All of the above
  14. 14. COG: IFDs First - Rule #1
  15. 15. TIFF Segments ● Chunks of an image are stored at different locations. This allows for parts of the image to be decompressed and loaded separately. ● There are two methods for segment layout: Striped and Tiled
  16. 16. TIFF: Strip Segments
  17. 17. COG: Strip Segments - Bad
  18. 18. TIFF: Tile Segments
  19. 19. COG: Use Tiled Segments - Rule #2
  20. 20. Request: BBOX Mean at 60m
  21. 21. Request: BBOX Mean at 60m
  22. 22. GeoTIFF: Has Overviews
  23. 23. COG: Provide Overviews - Rule #3
  24. 24. COG Rules ● Put IFDs first ● Sort IFDs by resolution ● Use Tiled segments ● Include overviews (in file)
  25. 25. COG Rules Continued ? ● Provenance Metadata ● Include Summary Statistics ● File naming convention ● Catalog suggestion
  26. 26. COG “Profiles”: Slippy Map ● Restrict CRS as EPSG:3857 ● Align GeoTiff segments with slippy tile grid ● Align GeoTiff files with tile boundaries ● Serve tile endpoints directly from COG
  27. 27. COG World Coverage? “You’re going to need more than one GeoTIFF.”
  28. 28. s3://[bucket]/[Shard]_[GeoHash]_[ISO_TIME]_[ID].tiff
  29. 29. GeoTrellis & COGs
  30. 30. COG: Tile Request
  31. 31. COG Tiler ● Application specific catalog ● Reprojection on read ● Memache gives responsiveness ● Raster metadata pushed into files
  32. 32. GeoTrellis Layers
  33. 33. GeoTrellis: Avro Layers
  34. 34. GeoTrellis Avro Layers ● Many small files, one per tile ○ PUT requests get expensive ● Data duplication hazzard ○ Probably not authoritative ● Limited access ○ Requires GeoTrellis code to read
  35. 35. GeoTrellis: COG Layers
  36. 36. GeoTrellis Strucuted COG ● Base zoom tiles are in base GeoTiff layer ● Introduces “DATE_TIME” tag ● Includes VRT for layer files ● GeoTiff segments match tile size and layout ● Additional zoom levels stored in overviews ● Pyramid up when overviews “run out”
  37. 37. GeoTrellis: COG Pyramid
  38. 38. Unstructured COGs Cloud Native thinking
  39. 39. GeoTrellis Unstructured COG ● Lots of good raster sources are already COG ● They don’t have a strict layout though ● Lets “inline” ingest process into the batch job ● Slower but reduces data duplication
  40. 40. COG: Tile Request (Again)
  41. 41. GeoTrellis Unstructured COG ● Requires mechanism to query for raster names ○ WFS, STAC, PostgreSQL, Scan ● COG provies “gradual” wins over GeoTiff ● Reproject on read to match workflow
  42. 42. Coregistration Workflow ● Working with multiple raster layers with differing: ○ CRS, Resolution, Extent ● Must pick the working/target layout ● We can “push-down” this process into IO
  43. 43. Coregisteration Workflow
  44. 44. COGs for GeoTrellis ● Treating GeoTiff as a Key/Value tile store ○ With built in notion level of detail ● Turns an ingest workflow into query workflow ○ Use the metadata to plan the read ● Opens up the results ○ QGIS, GeoServer, GDAL
  45. 45. COGs in General ● GeoJSON of raster formats ● Gradated spec ● Set of best practices ● Driven by implementations
  46. 46. THANKS
  47. 47. Q: Is this a good idea ?
  48. 48. Q: So is this a standard ?
  49. 49. Q: Why not use MRF ?
  • ssuserc32eed1

    Aug. 15, 2019
  • wsh6759

    Apr. 1, 2019

Small changes to the way common GeoTIFF file is written has broad implications for enabling efficient use of raster data in cloud native workflows.

Vues

Nombre de vues

5 422

Sur Slideshare

0

À partir des intégrations

0

Nombre d'intégrations

2 686

Actions

Téléchargements

33

Partages

0

Commentaires

0

Mentions J'aime

2

×