2. What is Eudoxus - Disclaimer
• A distributed system that supports and streamlines
processes and operations related to the textbooks
distribution for the higher education institutions of Greece
• Stakeholder: Hellenic Ministry of Education
http://minedu.gov.gr/
• Developed, operated and supported by GRNET
http://grnet.gr/
• Users: Students, Secretariats, Publishers, Distribution
Points (e.g. Bookstores), related Ministry depts, Courier
service, Helpdesk- More than 270.000 individual users
• Multiple APIs and integration with other ISs
http://eudoxus.gr
3. Presentation Focus
• Design and software architecture
o by design based on 100% FOSS infrastructure
o full JEE stack, rich web clients (GWT/Java)
o scalability, performance and bottlenecks
• Building within a very tight schedule and ever changing
requirements
o agile methodology
o human environment
o “collateral damage”
4. Design Goals - Challenges
• Tight schedule, fixed milestones - deadlines
o Incrementally release modules in production
• Not finalized requirements
o Be flexible, handle changing requirements
• High availability
o Redundant architecture
o Live application updates – no downtime
5. Design Goals – Challenges (cont'd)
• Impossible to predict usage load
– design for scalability,
– elastically use infrastructure to serve growing usage,
– prepare for the worst
– efficient use of hardware infrastructure
• Safeguard the transactional nature of related processes
• Synchronize data and connect to other systems
6. Technology decisions
• Web-based user interfaces
• Support various APIs
• RESTFul APIs (JSON / XML payload)
• use SOAP-based web services for integration in
specific cases (e.g. courier service)
• Java-based core (business logic tier)
• Javascript-based / GWT-based (Java again!) rich GUIs
7. Technology decisions (cont'd)
• Back-end storage: RDBMS vs noSQL decision
o noSQL options offer better scalability than typical RDBMs
o needed at least a minimal transactional core - eventual
consistency is not acceptable
o dropped the initial hybrid approach - full RDBMS design
• Use caching as much as possible
• Authentication / Credentials
• Stateless mechanism (killed JAAS)
• Support both
– Shibboleth-based authentication for students
– form based login (other user classes)
9. Infrastructure Software
• Google Web Toolkit (GWT) for the rich GUI clients (browser)
• Apache, nginx front-ends
• Varnish web cache
• JBoss Application Server
o stateless EJBs implementing core business logic
o JPA ORM (hibernate substrate)
o JMS Queues and Message Driven Beans
o JMX MBeans for scheduled tasks and maintanance
• PostgreSQL RDBMS
• Solr (Lucene server) indexing / search server
11. Losing the HTTP Session
• No HTTP session => workers don't need to
replicate – synchronize
• Application state is kept in the client
• Authentication generates a short-lived auth token
for the specific user & session
• Token is kept in a cookie and send back with every
request (over SSL), be it a REST or GWT-remoting
call
– In GSS used the token to encrypt the header
• Completely stateless middle tier
14. The worker
• Stateless EJBs implement business functions
• POJOs (JPA) for the data model
• Mbeans for management tasks (no UI – access via
the jmx-console)
• Persistent JMS queues (hornetQ) & MDBs for
syncing with other systems, serializing reports,
handling file indexing (Solr)
• REST API implementaiton will move to JAX-RS
(Jersey)
15. The data store
• Single RDBMS (postgresql) instance with fallback
• “Plan B” alternatives
– Cold replicas for reports
– Sharding / hot replication (one writer / multiple readers)
– Moving parts of the data to NoSQL or even use Solr as
a NoSQL data store
• Currently we handle the load without sweating
16. More on caching
• Varnish caches specific HTML output and
REST responses for URLs that do not
change (Eudoxus controls this via http
headers)
• 2nd level caching - EHCache (per worker)
• We will be introducing distributed read-write
second level caching (working with Infinispan
right now)
17. Meeting Tight Schedules
• Typical setup
o VCS (mercurial), wiki for requirements, issue tracker
o dedicated release engineer, dedicated DBA
o local development -> test server -> production server
o FOSS dev tools (Eclipse, ant, etc)
• Agile approach
• Documentation was the first victim of the pressing
schedule :( “updated unit tests” was the second :o
• A few statistics on the team
o 5 months from scratch to production
o ~10 developers
o ~10 testers (functional testing)
o ~20 helpdesk members for end-user support
o Full GRNET NOC support