This is the introductory presentation from our second Talend community user group Bristol event, held in June 2015. Speakers include Ian Cray, Director of KETL Limited; Mike Newens of AgeUK; and Brad Flemming of Talend. Mike presented the charity's case study on adopting Talend as a data cleaning and profiling tool for its CRM data.
Talend community user group Bristol & SW UK event
1. www.ketl.co.uk
Agenda
1. Intro with Ian Cray
2. Age UK - from zero to (Talend) hero in a few short weeks with Mike Newens
3. Talend update with Brad Flemming
4. Demos with Joshua
2. A longer life gives you the chance
to learn more about nature, meet
more people, see more places,
and to help more people
Fauja Singh aged 102, Marathon runner
3. Age UK
Age UK's vision is for a world where
everyone can love later life. They do
this by inspiring, supporting and
enabling in a number of ways…
5. Age UK: Mike Newens
The implementation of a Talend
Data Management Platform was needed
to improve data quality and to
import data into the new CRM system.
6. How things were before Talend
[Diagram: Pre Talend Project (Raisers’ Edge CRM), MS SQL SSIS. Box labels: Raw Source Data; Raw Data Tables; Load Raw Data; Identify Standard Entities; Match; Create Import Files; Standard Data Interface; Raisers’ Edge Import Files; Data Clean; Raisers’ Edge CRM.]
7. What were the main challenges you wanted
the project to solve?
Dupes across files
Data exception handling
Data quality reporting
8. How we went about the project
4 day training course followed by mentoring and design advice.
We installed and configured Talend in the Development
environment, and delivered and documented the specification
for the UAT and LIVE environments.
We completed the end to end process of file management, cleaning,
mandatory field checking, de-duplication, error logging, auditing and
loading to MS Dynamics for the first 3 feeds.
We succeeded in creating a flexible, modular design that allows for
the quick addition of new feeds and cleaning rules.
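The modular idea on this slide can be illustrated outside Talend. What follows is a minimal Python sketch under stated assumptions, not the Age UK implementation: the feed name, the field names, and the `FeedConfig` and `process_feed` helpers are all hypothetical. It shows one way each feed could be described by configuration (mandatory fields plus a list of reusable cleaning rules), so that adding a feed means adding a configuration entry rather than new pipeline code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Rule = Callable[[Record], Record]


def trim_whitespace(rec: Record) -> Record:
    """Reusable cleaning rule: strip surrounding whitespace from every value."""
    return {k: v.strip() for k, v in rec.items()}


def upper_postcode(rec: Record) -> Record:
    """Reusable cleaning rule: upper-case the (hypothetical) 'postcode' field."""
    out = dict(rec)
    if "postcode" in out:
        out["postcode"] = out["postcode"].upper()
    return out


@dataclass
class FeedConfig:
    """Hypothetical per-feed configuration."""
    name: str
    mandatory_fields: List[str]
    rules: List[Rule] = field(default_factory=list)


def process_feed(cfg: FeedConfig, records: List[Record]) -> Tuple[List[Record], List[str]]:
    """Apply the feed's cleaning rules, then reject rows missing mandatory fields.

    Returns (clean_records, error_log).
    """
    clean: List[Record] = []
    errors: List[str] = []
    for i, rec in enumerate(records):
        for rule in cfg.rules:
            rec = rule(rec)
        missing = [f for f in cfg.mandatory_fields if not rec.get(f)]
        if missing:
            errors.append(f"{cfg.name} row {i}: missing {', '.join(missing)}")
        else:
            clean.append(rec)
    return clean, errors


# Hypothetical feed definition and made-up sample rows.
donations = FeedConfig(
    name="donations",
    mandatory_fields=["surname", "postcode"],
    rules=[trim_whitespace, upper_postcode],
)
rows = [
    {"surname": " Smith ", "postcode": "ab1 2cd"},
    {"surname": "", "postcode": "ab1 2cd"},
]
clean_rows, error_log = process_feed(donations, rows)
```

In this sketch a new feed would reuse the existing rule functions and only declare its own `FeedConfig`; rejected rows are logged rather than silently dropped.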
9. Talend versus previous toolset
SSIS rather clumsy
Heavy use of scripting
Talend offers more elegant solution
11. Project development
We are currently undergoing a rigorous testing process
that will not only test current code and rules but will also
provide a quantitative, documented and quick regression
testing process for when we add new feeds.
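One way to picture a quantitative regression check is to compare the summary counts of a run against a stored baseline. The Python sketch below is illustrative only: the metric names, the `summarise_run` and `regression_diff` helpers, and all numbers are assumptions, not details of the Age UK test process.

```python
from typing import Dict, Optional, Tuple


def summarise_run(clean: int, rejected: int, duplicates: int) -> Dict[str, int]:
    """Collect the headline counts of one pipeline run (hypothetical metrics)."""
    return {"clean": clean, "rejected": rejected, "duplicates": duplicates}


def regression_diff(
    baseline: Dict[str, int], current: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return every metric that differs, as {metric: (expected, actual)}."""
    keys = sorted(set(baseline) | set(current))
    return {
        k: (baseline.get(k), current.get(k))
        for k in keys
        if baseline.get(k) != current.get(k)
    }


# Made-up numbers: a stored baseline and two later runs on the same test files.
baseline = {"clean": 950, "rejected": 30, "duplicates": 20}
unchanged = summarise_run(950, 30, 20)
changed = summarise_run(945, 35, 20)

no_diff = regression_diff(baseline, unchanged)
some_diff = regression_diff(baseline, changed)
```

An empty result means the run matches the baseline; a non-empty result names each metric that moved, which gives a documented record of what changed after a new feed or rule was added.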
Test and Live environments have been specified but not
yet delivered. We have concise documentation for the
install process for Talend in each environment.
12. Benefits of the project
The CRM data team will now be
able to set up a DQM process
reducing costs/risks by eliminating
duplication and saving staff time.
13. How will the project impact on AgeUK?
Improved data availability and quality will have a
positive impact for just about every area of AgeUK as
CRM data is used in:
• Contact management
• Selections
• Reporting, Insight and Analysis
15. www.ketl.co.uk
13-14 Orchard Street, Bristol BS1 5EH
+44 (0)117 905 5323
info@ketl.co.uk @KETL_BI
Thanks for listening
Don’t forget to sign up to our Talend
community user group Bristol on
LinkedIn for regular updates
Ian Cray
call: 07813 899 046
email: Ian@ketl.co.uk
Editor's Notes
Having evaluated Talend prior to the project, we really didn't have any major concerns about the tool itself.
The real concern we had was related to the solution design. We had initial meetings with another Talend consultancy and we were not convinced that we could work together to produce a design that would lead to our goal of a generic framework.
Dupes across files - current SSIS processes de-dupe within a file and against our CRM but not across files
Data exception handling - current SSIS processes are rather cumbersome
Data quality reporting - current SSIS processes do not log data quality issues
These challenges were met by our more complete framework design, which we would have struggled to implement using SSIS.
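The "dupes across files" challenge can be sketched in a few lines of Python. This is an illustration under assumptions rather than the Talend job itself: the file names, the `surname` and `postcode` fields, and the `match_key` and `dedupe_across_files` helpers are hypothetical, and the matching here is exact on a normalised key, whereas a production matching process would likely use fuzzier rules.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
Key = Tuple[str, str]


def match_key(rec: Record) -> Key:
    """Build a normalised match key from the assumed surname and postcode fields."""
    surname = rec.get("surname", "").strip().lower()
    postcode = rec.get("postcode", "").replace(" ", "").upper()
    return (surname, postcode)


def dedupe_across_files(files: Dict[str, List[Record]]) -> Tuple[List[Record], List[str]]:
    """Keep the first record per match key across ALL files and log later repeats.

    Returns (unique_records, duplicate_log).
    """
    first_seen: Dict[Key, str] = {}  # match key -> file where it first appeared
    unique: List[Record] = []
    dupes: List[str] = []
    for filename, records in files.items():
        for rec in records:
            key = match_key(rec)
            if key in first_seen:
                dupes.append(
                    f"{filename}: duplicate of record first seen in {first_seen[key]}"
                )
            else:
                first_seen[key] = filename
                unique.append(rec)
    return unique, dupes


# Made-up sample data: the same person appears in two different feed files.
files = {
    "feed_a.csv": [{"surname": "Smith", "postcode": "AB1 2CD"}],
    "feed_b.csv": [
        {"surname": "smith ", "postcode": "ab12cd"},
        {"surname": "Jones", "postcode": "EF3 4GH"},
    ],
}
unique_records, duplicate_log = dedupe_across_files(files)
```

Because the lookup of seen keys is shared across every file in the batch, a repeat in a second file is caught even though each file is internally clean. The duplicate log also doubles as a simple data quality report.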
The most problematic aspect of SSIS was the rather clumsy way we were obtaining a modular design and code reuse. With SSIS we break down our processes into packages (which I suppose would map onto Talend Jobs).
Packages communicate through SSIS variables and database interfaces. We are also making heavy use of scripting within SSIS event handlers to implement exception handling and parameter passing, which means that our process design is not always transparent.
With Talend, however, we now have jobs, joblets, contexts, metadata, routines and a development tool, which together enable far more elegant solutions.
Process runs from left to right and from top to bottom
Microsoft Dynamics is not yet in production at AgeUK but we are now working on utilising the framework’s data quality functionality with our current CRM.
Due to the success of the pilot project, other pieces of work that perhaps wouldn’t previously have been considered are now candidates for development with Talend. One such example is the de-duplication of CRM data prior to migration to our new Microsoft Dynamics CRM.
While AgeUK has data feed integration requirements that are more complex than those of most charities, we have developed a framework that is sufficiently generic and complete to be applicable to similar organisations.