
Getting Started on Hadoop


Best Practices

• Again, there are much more efficient ways to handle Hadoop Streaming
and Text Analytics…
• Unit tests, continuous integration, etc. are all great stuff, but “Big Data”
software engineering requires additional steps
• Sample data, measure data ratios and cluster behaviors, analyze in R,
visualize everything you can, calibrate any necessary “magic numbers”
• Develop and test code on a personal computer in an IDE, on the command line,
etc., using minimal data sets
• Deploy to staging cluster with larger data sets for integration tests and QA
• Run in production with A/B testing where feasible to evaluate changes
quantitatively
• Learn from others at meetups, unconfs, forums, etc.
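
One way to obtain the minimal data sets mentioned above is to draw a uniform random sample from the full input in a single pass. The sketch below uses reservoir sampling (Algorithm R); it is an illustration only, not the method used in the slides, and the function name `reservoir_sample`, the parameter `k`, and the default seed are all assumptions introduced here.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(records: Iterable[T], k: int, seed: int = 42) -> List[T]:
    """Return up to k records chosen uniformly at random in one pass.

    Hypothetical helper for building a small local test set; the input
    can be any iterable, such as the lines of a large log file.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample: List[T] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Pick a slot in [0, i]; keep the record only if the slot
            # falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                sample[j] = record
    return sample
```

For example, `reservoir_sample(open("events.log"), 1000)` would keep 1000 lines of a hypothetical `events.log` without loading the whole file into memory, which suits development on a personal computer before moving to a staging cluster.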

Published in: Technology
