How to build a Linked Data Platform (in the W3C sense, https://www.w3.org/TR/ldp/) without actually building one. We'll look at the rich set of services Amazon provides as part of AWS and see whether we can configure them to look like an LDP (spoiler: yes, we can).
Building Linked Data Platform with AWS
1. Building Linked Data Platform with AWS...
...without actually building one
Eugene Morozov
Twitter: @eugenemorozov
Blog: http://emorozov.com
LinkedIn: https://www.linkedin.com/in/emorozov
2. If you really want to hear about it...
● Dev background
● Financial services
● Some background in building APIs - integration, reporting, etc.
3. Project experience
● Creating a reporting API that can be used by clients
● Getting away from the assumption that we must actually build something
● Solution: reuse AWS S3 behind a thin veneer of custom authentication
5. Exposing data sets with Semantic Web standards
● Enter Linked Data Platform (https://www.w3.org/TR/ldp)
● Linked Data Platform (LDP) defines a set of rules for HTTP operations on web resources, some based on RDF, to provide an architecture for read-write Linked Data on the web.
6. LDP 101
● Resources
● Containers (Basic, Direct, Indirect)
● Content Types
● Headers
○ Mandates use of Link to specify resources and containers
○ Use of Accept-Post, Accept-Patch
● REST verbs
○ All of the standard verbs, with prescribed behaviour for containers
7. LDP 101
● Use headers to advertise what operations are available on the containers
○ Mandates use of Link to specify LDP resources and containers
○ Use of Accept-Post, Accept-Patch
● Suggests use of well-known terms DC, RDFS
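A minimal sketch of the header advertising described above, assuming a Basic Container that accepts Turtle and JSON-LD. The function name `ldp_headers`, the `Allow` values and the default media types are illustrative assumptions, not something taken from the slides; the `Link` type URIs come from the LDP namespace.

```python
LDP_NS = "http://www.w3.org/ns/ldp#"


def ldp_headers(is_container, post_types=("text/turtle", "application/ld+json")):
    """Return response headers that advertise an LDP resource or Basic Container.

    Hypothetical helper: the Allow values and default media types are assumptions.
    """
    # Every LDP resource advertises itself with a Link header of rel="type".
    types = ["Resource"]
    if is_container:
        types.append("BasicContainer")

    headers = {
        "Link": ", ".join(f'<{LDP_NS}{t}>; rel="type"' for t in types),
        "Allow": "GET, HEAD, OPTIONS",
    }
    if is_container:
        # Containers accept POST, so they also list the media types they take.
        headers["Allow"] += ", POST"
        headers["Accept-Post"] = ", ".join(post_types)
    return headers


if __name__ == "__main__":
    for name, value in ldp_headers(is_container=True).items():
        print(f"{name}: {value}")
```

A plain resource gets only the `ldp#Resource` type and no `Accept-Post`; a container gets both types plus the list of media types it accepts on POST.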
11. AWS 101 - S3
● Fairly rich API for storage via S3
● But that’s not quite enough to conform to LDP because of how headers are treated
12. AWS 101 - API Gateway
● API Gateway lets you define APIs in Swagger (with some custom extensions) and deploy them
● Can be used to proxy other AWS services
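As a rough sketch of what such a definition could look like, the function below builds a Swagger 2.0 document with a single GET method proxied to S3, and uses the integration's response mapping to inject the `Link` header that S3 itself will not return. The function name, the `/{item}` path, the title and the role ARN are hypothetical; the `x-amazon-apigateway-integration` layout follows the API Gateway Swagger extensions as I understand them and should be checked against the AWS documentation before use.

```python
def s3_proxy_swagger(bucket, region, role_arn):
    """Build a Swagger 2.0 dict with one GET method proxied to an S3 bucket.

    Hypothetical sketch: path, title and parameter names are assumptions.
    """
    # Static header values in API Gateway mappings are wrapped in single quotes.
    link = "'<http://www.w3.org/ns/ldp#Resource>; rel=\"type\"'"
    return {
        "swagger": "2.0",
        "info": {"title": "ldp-sketch", "version": "0.1"},
        "paths": {
            "/{item}": {
                "get": {
                    "parameters": [
                        {"name": "item", "in": "path",
                         "required": True, "type": "string"}
                    ],
                    "responses": {
                        "200": {
                            "description": "LDP resource",
                            "headers": {"Link": {"type": "string"}},
                        }
                    },
                    "x-amazon-apigateway-integration": {
                        "type": "aws",
                        "httpMethod": "GET",
                        # {item} stays a literal placeholder filled by API Gateway.
                        "uri": f"arn:aws:apigateway:{region}:s3:path/{bucket}/{{item}}",
                        "credentials": role_arn,
                        "requestParameters": {
                            "integration.request.path.item":
                                "method.request.path.item"
                        },
                        "responses": {
                            "default": {
                                "statusCode": "200",
                                "responseParameters": {
                                    "method.response.header.Link": link
                                },
                            }
                        },
                    },
                }
            }
        },
    }
```

The returned dict can be serialised with `json.dumps` and imported into API Gateway; a fuller definition would add the container methods (POST, OPTIONS) and the `Accept-Post` header in the same way.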
14. AWS 101 - CloudFront CDN
● Use of CDN to proxy requests through to API -> S3
● More control over routing
● Caching
● Lower cost per API request if caching on top of the API
15. AWS 101 - Cognito
● Rich APIs for access control
● Don’t need to expose IAM roles to the outside world
16. AWS 101 - CloudFormation
● CloudFormation for provisioning
● Declarative deployment
● Can complete LDP-in-a-box
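To make the "in-a-box" idea concrete, here is a minimal sketch of a CloudFormation template built as a Python dict: one S3 bucket for the content and one API Gateway REST API created from a Swagger body. The logical names `LdpBucket` and `LdpApi` and the function name are hypothetical, and a real template would also need an IAM role, a deployment and a stage.

```python
def ldp_in_a_box_template(swagger_doc):
    """Return a minimal CloudFormation template as a dict.

    Hypothetical sketch: logical resource names are assumptions, and the
    IAM role, deployment and stage a real stack needs are left out.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch of an LDP facade over S3",
        "Resources": {
            # Bucket that holds the flattened LDP resources.
            "LdpBucket": {"Type": "AWS::S3::Bucket"},
            # REST API defined declaratively from the Swagger document.
            "LdpApi": {
                "Type": "AWS::ApiGateway::RestApi",
                "Properties": {"Body": swagger_doc},
            },
        },
        "Outputs": {
            # Ref on an S3 bucket resolves to the bucket name.
            "BucketName": {"Value": {"Ref": "LdpBucket"}},
        },
    }
```

Serialising the result with `json.dumps` gives a template file that CloudFormation can provision declaratively.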
18. Demo
● Python, AWS
● Picking first two things off the list
○ Flatten content and push it into S3
○ Define subset of the LDP in Swagger
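The flattening step might look roughly like the sketch below, which is not the demo code itself. It assumes containers are represented as nested dicts and leaf resources as Turtle strings; `flatten` and `push` are hypothetical names, and `push` expects an object with a boto3-style `put_object` method to be passed in.

```python
def flatten(container, prefix=""):
    """Yield (key, body) pairs for every leaf resource in a nested container."""
    for name, value in sorted(container.items()):
        key = f"{prefix}{name}"
        if isinstance(value, dict):
            # A nested dict stands for a child container: recurse under "key/".
            yield from flatten(value, prefix=f"{key}/")
        else:
            yield key, value


def push(s3_client, bucket, container, content_type="text/turtle"):
    """Upload each flattened resource and return the list of keys written.

    s3_client is assumed to be a boto3 S3 client, or anything else that
    exposes put_object with the same keyword arguments.
    """
    keys = []
    for key, body in flatten(container):
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=body.encode("utf-8"),
            ContentType=content_type,
        )
        keys.append(key)
    return keys
```

With boto3 installed, `push(boto3.client("s3"), "my-bucket", tree)` would write one object per leaf; the bucket name here is a placeholder.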
19. Future work
● Would be great to add Route53, Cognito and CDN
● Would be good to have a fully conformant LDP definition in Swagger (doable)
● Perhaps going for a templated container in API Gateway, so we don’t need to explicitly flatten the containers
● Add billing into the LDP API, perhaps via the S3 Requester Pays feature or log analysis
● In-a-box approach with CloudFormation
20. Some references and inspiration
● MarkLogic case study https://d0.awsstatic.com/whitepapers/marklogic-on-aws.pdf
● RDF Data Management in Amazon Cloud
http://dl.acm.org/citation.cfm?id=2320790
● Automatic mapping of web APIs http://datalegend.net/assets/paper7.pdf
Hello, welcome to the third Semantic Web London Meetup.
My name is Eugene Morozov; I co-manage the engineering practice at Lab49.
We are a strategy, design and technology consulting firm, and we specialise in capital markets.
Fine print: whatever is on the screen is my own work, not work done for our clients.
Other TODO:
Look into https://github.com/awslabs/ecs-refarch-cloudformation
Serverless architectures (AWS API Gateway + Lambda)
https://github.com/awslabs/aws-big-data-blog/tree/master/aws-blog-titan-graph-database