Seravia in the Cloud
- 12. Seravia on AWS (diagram labels, in slide order): WWW; Data; Crawlware; ELB; EC2 (rails, mongo, mysql, sphinx); S3; EC2 (parsing, pentaho, ETL); S3; EMR (hadoop, hive, BI); EC2; S3; EC2 (rails, mongo, mysql, sphinx)
- 13. WWW architecture (diagram labels, in slide order): ELB; EC2 (webserver); S3; EC2 (webserver); EC2 (mongo); EC2 (sphinx); EC2 (mysql); EC2 (webserver, rails); EC2 (mongo); EC2 (sphinx)
- 14. Data Architecture (diagram labels, in slide order): EC2 (post-processing); EC2 (parsing, ETL); S3; EC2 (parsing, ETL); EMR (Hadoop, hive, BI); EC2 (post-processing); EC2 (post-processing). Data stages:
  1. Raw data: html, xml, text files
  2. Pre-processed: unrelated tsv files
  3. Analyzed: related tsv files and reports
  4. Post-processed: json documents
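
The four data stages on the Data Architecture slide can be illustrated with a small, self-contained sketch. This is not Seravia's code: the sample pages, field names, and function names below are all hypothetical, and the "analyzed" step, which the slide places on EMR with Hadoop and Hive, is replaced here by a plain in-memory join so the example runs on its own.

```python
import csv
import io
import json
import re

# Stage 1: raw data (html). Hypothetical sample pages, not real Seravia data.
RAW_PAGES = {
    "acme": "<html><h1>Acme Corp</h1><p class='city'>Hong Kong</p></html>",
    "globex": "<html><h1>Globex</h1><p class='city'>Singapore</p></html>",
}


def preprocess(raw_pages):
    """Stage 2: parse raw pages into separate, not yet related TSV tables."""
    names, cities = io.StringIO(), io.StringIO()
    name_writer = csv.writer(names, delimiter="\t", lineterminator="\n")
    city_writer = csv.writer(cities, delimiter="\t", lineterminator="\n")
    for key, html in sorted(raw_pages.items()):
        name = re.search(r"<h1>(.*?)</h1>", html).group(1)
        city = re.search(r"<p class='city'>(.*?)</p>", html).group(1)
        name_writer.writerow([key, name])
        city_writer.writerow([key, city])
    return names.getvalue(), cities.getvalue()


def analyze(names_tsv, cities_tsv):
    """Stage 3: relate the TSV tables by joining them on the shared key.

    Assumption: an in-memory join stands in for the Hadoop/Hive step.
    """
    cities = dict(csv.reader(io.StringIO(cities_tsv), delimiter="\t"))
    rows = []
    for key, name in csv.reader(io.StringIO(names_tsv), delimiter="\t"):
        rows.append((key, name, cities[key]))
    return rows


def postprocess(rows):
    """Stage 4: turn each related row into one json document."""
    return [
        json.dumps({"_id": key, "name": name, "city": city}, sort_keys=True)
        for key, name, city in rows
    ]


def run_pipeline(raw_pages):
    """Run stages 2 to 4 over the raw pages and return json documents."""
    names_tsv, cities_tsv = preprocess(raw_pages)
    return postprocess(analyze(names_tsv, cities_tsv))


if __name__ == "__main__":
    for document in run_pipeline(RAW_PAGES):
        print(document)
```

In the architecture on the slides, the intermediate files would presumably sit in S3 between stages rather than being passed as strings, and the json documents would be loaded into a store such as mongo; the sketch only shows how the shape of the data changes from stage to stage.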