Big Data Architect/Consultant 2004 – 2016
CURRICULUM VITAE
Mr. Yuvaraj Mani Email: yuvam@live.com
Mobile: 07576426338
PROFESSIONAL SUMMARY
• A Microsoft Certified Professional with around 13+ years of total IT experience in the design, modeling, development, implementation and support of SQL Server 2000/2005/2008/2008 R2/2012 and Oracle 9i/10g.
• Experienced in the Insurance, Manufacturing, Retail, Banking, Travel, Law and Education industries.
• 13+ years of experience in data extraction, transformation and loading (ETL/ELT) across Big Data and data warehousing, using SSIS.
• Have created distributed storage on HDFS and Amazon S3 for data locality and distributed computing.
• Have implemented MapReduce and Spark Streaming for batch and stream processing on the YARN architecture.
• 2+ years of development experience in Big Data/Hadoop using the Hadoop ecosystem: HDFS, MapReduce, YARN, Hive, Hive UDFs, Beeline (HS2), Sqoop, Drill, HBase, Oozie, Spark Streaming, Python, Scala, Spork and Spark SQL.
• Machine learning algorithms using Spark MLlib and Mahout.
• Have worked on the NoSQL databases MongoDB, Cassandra and HBase.
• Have used Hive extensively for data warehousing applications.
• Have used Amazon Web Services for file storage.
• Extensive use of Spark SQL and Scala for processing real-time and mid-sized data.
• Have used MapReduce for batch processing of very large data volumes.
• Extensive use of MapR 3.1/4.0.1/5.1, Intel CDH 5.0 and CDH 4.0 clusters.
• Extensive knowledge of RDBMS and SQL (Oracle, MySQL, Teradata and SQLite).
• Have used the Cloudera Hadoop distribution (CDH) extensively for Big Data processing.
• Have used Pig for ETL built on MapReduce.
• Have used Sqoop to transfer data between Hadoop and relational database systems.
• Exposure to UNIX and Linux (Ubuntu) environments.
• Good knowledge of ETL, Teradata and other Big Data tools such as Splunk.
• Expertise in TES, the Tidal Scheduler (Cisco Enterprise Scheduler), and cron scheduling.
• Have used Team Foundation Server and Visual SourceSafe for version control.
PROJECT EXPERIENCE HIGHLIGHTS
TUI Group, Surbiton/Crawley Big Data Consultant Sep 2016 – Dec 2016
CDH; Tableau 9.3; SQL Server 2014; Hadoop 1.2; Spark 1.3; YARN; HDFS; Apache Sqoop; Spark SQL; Cassandra; Spork; Oozie; Scala; Python
• Have analyzed the business requirements, performed impact analysis on the existing workflow, and was involved in the architectural workflow of the project in Big Data technology (a Hadoop environment).
• Created the functional and technical specification documents from the business requirements for every new release.
• Have used Cloudera's open-source platform (CDH) for designing and processing the big data.
• Developed new rules, both software and hardware, per the business requirements, implemented with Hadoop, Hadoop ecosystem tools, Java and Unix shell scripting as detailed in the design document.
• Have implemented distributed storage using HDFS.
• Have extracted data from the different source systems into HDFS and implemented incremental imports using Sqoop.
• Have implemented distributed computing with Spark Streaming in Python and Scala for faster processing of real-time data (see the sketch after this list).
• Have improved performance by implementing in-memory Spark Streaming processing.
• Have used Spork (Pig on Spark) for ETL processing and Spark SQL for SQL querying.
• Have created Unix shell scripts and scheduled the jobs with cron and Oozie.
• Have monitored job processing through the Spark web UI.
• Have used Agile methodology and Scrum meetings to deliver the project.
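For illustration, a minimal sketch of the DStream-style Spark Streaming job described above, using the Spark 1.x Python API the project lists. The socket source, batch interval and comma-separated record layout are assumptions, not the project's actual feed.

    # Illustrative only: a minimal Spark 1.x DStream job. The socket source,
    # 10-second batch interval and comma-separated record layout are all
    # placeholders standing in for the project's real feed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="StreamingSketch")
    ssc = StreamingContext(sc, batchDuration=10)      # micro-batches every 10 s

    lines = ssc.socketTextStream("localhost", 9999)   # placeholder source
    events = lines.map(lambda line: line.split(","))

    # Cache the parsed DStream so downstream operations reuse each
    # micro-batch in memory rather than recomputing it (the "in-memory"
    # point above).
    events.cache()

    # Count events per key within each micro-batch and log the result.
    counts = events.map(lambda fields: (fields[0], 1)) \
                   .reduceByKey(lambda a, b: a + b)
    counts.pprint()

    ssc.start()
    ssc.awaitTermination()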
A2Dominion Housing LTD, London Big Data Consultant/Data Scientist Feb 2016 – May 2016
CDH; Hadoop 1.2; MapReduce; Hive; Pig; Sqoop; Tableau 9.2; SQL Server 2014; Spark 1.3; Oozie; Data Lake; Data Vault; Cassandra
• Have analyzed the business requirements and designed the architecture.
• Have used Cloudera's open-source platform (CDH) for designing and processing the big data.
• Have created the data transfer between the RDBMS and HDFS using Sqoop.
• Have implemented incremental imports with Sqoop and scheduled the job using Oozie.
• Have imported data from different source systems in formats such as JSON and Parquet.
• Have created distributed storage on Hadoop HDFS for data locality under the YARN architecture.
• Have implemented distributed computing with MapReduce in Java.
• Have created Hive tables, dynamic partitions and buckets for data analysis (see the DDL sketch after this list).
• Have built ETL batch processing on MapReduce via Pig, and stream processing with Spark 1.3.
• Have monitored MapReduce performance through the web UI and traced the job counters.
• Have optimized Hive and written core Java code for faster data processing.
• Have used Unix shell scripts to schedule the cron jobs alongside Oozie.
• Have created visualization charts and dashboards.
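The dynamic-partition and bucketing bullet above follows a standard Hive DDL pattern, sketched here through Spark 1.3's HiveContext; the table, columns and bucket count are hypothetical, not the project's real schema.

    # Illustrative Hive DDL for dynamic partitioning and bucketing, issued
    # through Spark 1.3's HiveContext. All object names are hypothetical.
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="HiveDdlSketch")
    hive = HiveContext(sc)

    # Dynamic partitioning must be switched on before the insert below.
    hive.sql("SET hive.exec.dynamic.partition = true")
    hive.sql("SET hive.exec.dynamic.partition.mode = nonstrict")
    hive.sql("SET hive.enforce.bucketing = true")

    # Partition by ingest date (pruning); bucket by tenant id (joins/sampling).
    hive.sql("""
        CREATE TABLE IF NOT EXISTS tenancy_events (
            event_id  BIGINT,
            tenant_id INT,
            amount    DOUBLE
        )
        PARTITIONED BY (ingest_date STRING)
        CLUSTERED BY (tenant_id) INTO 16 BUCKETS
        STORED AS ORC
    """)

    # Dynamic-partition insert: Hive routes each row to its ingest_date
    # partition; the partition column comes last in the SELECT list.
    hive.sql("""
        INSERT OVERWRITE TABLE tenancy_events PARTITION (ingest_date)
        SELECT event_id, tenant_id, amount, ingest_date
        FROM staging_events
    """)

Partitioning by ingest date lets queries prune their scans to the dates they touch, while bucketing on a join key gives Hive evenly sized files for joins and sampling.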
STUDYGROUP INTERNATIONAL LTD Big Data Consultant Apr 2015 – Dec 2015
CDH; Hadoop 1.2; Oracle 10g; SQL Server 2012; Power BI; Apache Sqoop; Apache Pig; Apache Hive; Core Java; Cassandra; Spark 1.3; Apache Flume
• Have used Cloudera's open-source platform (CDH) for designing and processing the big data.
• Have analyzed the source systems and created the system design based on the business requirements.
• Have implemented batch processing with the MapReduce paradigm in Java.
• Have implemented distributed storage and data locality using HDFS on the YARN architecture.
• Have moved data between the RDBMS and Hadoop using Sqoop.
• Have implemented incremental loads based on the latest date and created the supporting metadata.
• Have accelerated batch workloads with Spark, using RDDs for faster processing (see the sketch after this list).
• Have used Spork to implement the high-level MapReduce paradigm.
• Have created tables and used Spark SQL for SQL querying.
• Have monitored stream processing through the Spark web UI.
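A minimal sketch of the RDD batch pattern and Spark SQL querying referenced above, in Spark 1.3-era APIs; the HDFS path and three-field record layout are invented for illustration.

    # Illustrative RDD batch job plus Spark SQL query (Spark 1.3-era APIs).
    # Paths and the record layout are assumptions, not the project's own.
    from pyspark import SparkContext
    from pyspark.sql import SQLContext, Row

    sc = SparkContext(appName="BatchSketch")
    sqlCtx = SQLContext(sc)

    raw = sc.textFile("hdfs:///data/enrolments/*.csv")   # placeholder path

    # Parse, drop malformed rows, and shape each record as a Row.
    rows = (raw.map(lambda line: line.split(","))
               .filter(lambda f: len(f) == 3)
               .map(lambda f: Row(student=f[0], course=f[1], fee=float(f[2]))))

    # Register the RDD as a temporary table so it can be queried with SQL.
    df = sqlCtx.createDataFrame(rows)
    df.registerTempTable("enrolments")

    totals = sqlCtx.sql(
        "SELECT course, SUM(fee) AS total_fee FROM enrolments GROUP BY course")
    totals.show()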
EVERSHEDS LONDON LLP [FINANCE] Big Data Consultant Dec 2014 – Mar 2015
CDH; Hadoop 1.2; Oracle 10g; Power BI; Apache Sqoop; YARN; Apache Hive; Apache Pig
• Have analyzed the existing system architecture and created the new architecture.
• Have moved data between the RDBMS and Hadoop using Sqoop.
• Have implemented incremental loads based on the latest date and created the supporting metadata.
• Have implemented batch processing with the MapReduce paradigm in Java.
• Have implemented distributed storage and data locality using HDFS on the YARN architecture.
• Have implemented ETL processing using Pig.
• Have created tables, dynamic partitions and bucketing to improve performance.
• Have optimized Hive to improve performance.
• Have automated the Sqoop data extraction with a Unix cron job (see the sketch after this list).
• Have proposed solutions to improve performance.
• Have monitored the MapReduce counters to guide performance improvements.
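The cron-automated incremental Sqoop extraction described above typically looks like the sketch below, here wrapped in a small Python runner so cron can invoke it; every identifier (JDBC URL, table, check column, bookmark file, target directory) is a placeholder.

    # Sketch of a cron-driven incremental Sqoop import. A saved Sqoop job
    # would normally track --last-value in its own metastore; the file
    # bookmark here just keeps the sketch self-contained.
    import subprocess

    LAST_VALUE_FILE = "/var/lib/ingest/gl_ledger.last_value"  # hypothetical

    with open(LAST_VALUE_FILE) as fh:
        last_value = fh.read().strip()

    subprocess.run([
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@dbhost:1521:ORCL",
        "--table", "GL_LEDGER",
        "--incremental", "lastmodified",   # only rows changed since last run
        "--check-column", "UPDATED_AT",
        "--last-value", last_value,
        "--target-dir", "/data/raw/gl_ledger",
        "--append",
    ], check=True)
    # A real run would also write the new high-water mark back to the bookmark.

A crontab entry such as 0 2 * * * python /opt/ingest/sqoop_incremental.py (path hypothetical) would then run the import nightly at 02:00.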
DTZ LONDON Big Data Consultant Sep 2014 – Oct 2014
Hortonworks; Hadoop 1.2; Oracle 10g; Power BI; Apache Sqoop; YARN; Apache Hive; Apache Pig
• Have used the Hortonworks open-source platform (HDP) for designing and processing the big data.
• Have implemented batch processing with the MapReduce paradigm in Java.
• Have implemented distributed storage and data locality using HDFS on the YARN architecture.
• Have implemented distributed computing with MapReduce algorithms to transform the data.
• Have implemented ETL processing with Pig on MapReduce.
• Have automated the Sqoop data extraction with a Unix cron job.
ELEKTRON TECHNOLOGY UK LTD Big Data Consultant Oct 2013 – June 2014
CDH; Hadoop 1.2; Oracle 10g; Power BI; Apache Sqoop; YARN; Apache Hive; Apache Pig
Roles and Responsibilities:
• Have analyzed the existing system architecture and created the new architecture.
• Have moved data between the RDBMS and Hadoop using Sqoop.
• Have implemented incremental loads based on the latest date and created the supporting metadata.
• Have implemented batch processing with the MapReduce paradigm in Java.
• Have implemented distributed storage and data locality using HDFS on the YARN architecture.
• Have implemented ETL processing using Pig.
• Have created tables, dynamic partitions and bucketing to improve performance.
• Have optimized Hive to improve performance.
• Have automated the Sqoop data extraction with a Unix cron job.
• Have proposed solutions to improve performance.
• Have monitored the MapReduce counters to guide performance improvements.
WetherSpoon UK LTD DW & BI Consultant Mar 2011 – Sep 2013
SQL Server 2008 R2/2012 SSIS/SSRS; Oracle 10g; Aztec; Resource Link
Roles and Responsibilities:
• Responsible for the design, development and implementation of ETL mappings using SSIS.
• Have used T-SQL stored procedures, views and user-defined functions effectively within SSIS.
• Have handled very large volumes of data using SSIS.
• Have created CTEs and temp tables in T-SQL for ETL loads in SSIS (see the sketch after this list).
• Designed SSIS packages to transfer data between servers and load data into databases.
• Used the .NET Provider DataReader Source to read from RDBMS sources.
• Worked with the Web Services control-flow task and XML Source to read XML data.
• Involved in the database migration from SQL Server 2000 to 2008 R2.
• Performed unit testing and prepared maintenance plans.
• Developed financial reports such as GL Summary and P&L Account Summary using SSRS, delivering more value to the business.
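As a sketch of the CTE-plus-temp-table staging pattern mentioned above: in the real packages this T-SQL ran inside SSIS, but it is shown here issued from Python via pyodbc purely for illustration, with every object name hypothetical.

    # Sketch of the CTE + temp-table staging pattern; in practice this T-SQL
    # ran inside SSIS packages. All object names here are hypothetical.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=dwhost;DATABASE=SalesDW;"
        "Trusted_Connection=yes")
    cur = conn.cursor()

    # Aggregate raw sales in a CTE and stage the result in a session-scoped
    # temp table (visible to later statements on this same connection).
    cur.execute("""
        WITH daily_sales AS (
            SELECT StoreId,
                   CAST(SoldAt AS DATE) AS SaleDate,
                   SUM(Amount)          AS Total
            FROM dbo.SalesRaw
            GROUP BY StoreId, CAST(SoldAt AS DATE)
        )
        SELECT * INTO #staged_sales FROM daily_sales;
    """)

    # Load the staged rows into the warehouse fact table.
    cur.execute("""
        INSERT INTO dbo.FactDailySales (StoreId, SaleDate, Total)
        SELECT StoreId, SaleDate, Total FROM #staged_sales;
    """)
    conn.commit()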
BUPA ETL & BI Developer Oct 2010 – Feb 2011
SQL Server 2008 SSIS; Teradata

Covidien UK LTD System Analyst July 2010 – Oct 2010
SQL Server 2008 SSIS/SSAS
Mphasis, an HP Company April 2006 – April 2010
Delivered business solutions for customers:
General Motors Apr 2006 – Apr 2010
SQL Server 2005 SSIS
Environment: SQL Server 2008, BIDS, Integration Services (SSIS), Analysis Services (SSAS), Windows Server 2003.
Ramco Systems Mar 2005 – April 2006
Galenicare, Pharmacist Group, Switzerland
SQL Server 2000, DTS
Environment: Windows 2000, SQL Server 2000, Visio, Query Analyzer, SQL Profiler
Fusiontec Software Pvt. Ltd, Chennai June 2002 – Mar 2005
ConAgra Foods, USA
Environment: Windows 2000, SQL Server 2000, Visio, Query Analyzer, SQL Profiler
TECHNICAL SKILLS
Operating Systems - Windows 2000/NT/2003/2008 Server
RDBMS - MS SQL Server 2008/2005/2000, MS Access, Oracle 8i/9i/10g
Database Tools - SQL Query Analyzer, SQL Enterprise Manager, SQL Server Management Studio (2005), Query Editor, Reporting Server, MS Access, PL/SQL Developer, Toad
ETL Tools - Data Transformation Services (DTS), MS SQL Server Integration Services (SSIS)
Reporting Tools - Crystal Reports XI/X, MS SQL Server Reporting Services (SSRS), Cognos 10.1/10.2
OLAP Tools - SSAS, Transformer 10.2, IBM Cognos Designer
EDUCATION
• Master of Computer Applications (MCA) from the University of Madras, Chennai, one of India's reputed universities.
CERTIFICATION
Microsoft Certified Professional:
• Implementing a Data Warehouse with Microsoft SQL Server 2012
• Querying Microsoft SQL Server 2012