Sunil
Mobile: 9986 573 250 E-Mail: bigdata.sunil@gmail.com
PROFESSIONAL SNAPSHOT
• 3+ years of hands-on experience in Big Data technologies [Spark Core, Spark
Streaming, Spark SQL, Hadoop administration, cluster configuration, Java MapReduce
programming, YARN, Hive, Presto, Elasticsearch, Flume, Logstash, Sqoop, shell script
automation].
• Extensive practical experience in installing, monitoring and maintaining multi-node Hadoop
clusters using the Cloudera distribution (CDH4 & CDH5) and Apache Hadoop, as well as Presto
(in-memory SQL query engine), Spark and Elasticsearch clusters
• Experience in extracting sentiment & text-analytics features such as POS tagging, NER and
sentiment polarity using the Stanford NLP library, custom NLP and SentiWordNet
• 3+ years of IT experience in Java, J2EE, Struts and PrimeFaces.
• Extensive experience developing distributed enterprise applications using Big Data technologies
• Oracle Certified Professional, Java SE 6 Programmer.
• Strong troubleshooting, maintenance and quality-control skills.
• Good logical and analytical skills with the ability to work under pressure.
• Highly motivated, quick learner, team player with good technical and analytical skills.
• Motivated to take on independent responsibility, with the ability to contribute as a
productive team member
PROFESSIONAL BACKGROUND
• Currently working as Tech Lead at Infinite Computer Solutions, Bangalore, from April 2014
to date.
• Previously worked for Tata Consultancy Services Ltd, Bangalore, from August 2010 to October
2013.
• Previously worked for Tech Mahindra India, Bangalore, from December 2009 to August 2010.
• Previously worked as a Professor/Tutor at SITAMS (non-IT) from September 2006 to
December 2009.
SKILL SET
Frameworks : Spark Streaming & SQL, Java MapReduce, YARN,
Hive, Presto, Elasticsearch
Operating Systems : Ubuntu, Redhat, CentOS, Windows
Hadoop Distributions : Apache Hadoop 2.X, CDH3, CDH4, Cloudera Manager 4.x
Web Server : Apache Tomcat 7.x
IDE : Eclipse Mars
Database : MySQL, DB2, Hive, Spark SQL, Presto Connector
Programming Languages : Core Java, J2EE
PROJECT DETAILS:
Project Name Social Media Campaign Management
Client Name EA Group – Center of Excellence Team
Role Architecture, Design and Implementation
Description The current scope of SA Campaign Management covers template/campaign
generation and dashboard viewing. Its core features are user registration,
template/campaign creation and editing, and dashboard viewing, as listed
below.
a) User Registration for Running Campaigns: The registration
process captures the user's information.
b) Sentiment Analysis of Campaigns: A background process
captures live Twitter streaming data and calculates the
sentiment of each relevant tweet over a campaign
duration configured by the end user through the web
application.
c) SA Dashboard Viewing: The analysis results can be viewed
as different graphs categorized by gender, location,
device type etc.
Period September 2015 – Till Date
Major Tasks - End-to-end project architecture modeling.
- Big Data solution design.
- End-to-end development, automation, deployment & maintenance
Technologies Spark Streaming, Spark SQL, YARN, Presto, Hive, Shell Scripting,
Hadoop 2.6, MySQL, Redis Cache, Spring, Tomcat Server
Project Name(s) Verizon Call Drop Analysis, Decision Tree Build & Log Viewer
Client Name Verizon Wireless
Role Hadoop Technical Lead
Description
Log files generated by select mobile phones across the United States provide
key information about the telecom network. Each log file, generated
daily, exceeds 10 GB in size. These log files are analyzed in a Hadoop cluster
against the key parameters provided, and the captured data is moved to a
SQL/NoSQL system for data visualization. The unstructured data is extracted,
transformed and loaded into the SQL/NoSQL system using Hadoop as the
backend.
Requirements for each project:
Call Drop Analysis:
Query SQL/Hive for call drops and build various visualization dashboard charts in
Tableau.
Decision Tree Build: Built a decision tree for each call drop to identify its root cause.
Log Viewer: Web UI for viewing the entire log file data by KPI groups.
Period April 2014 to Sep 2015
Major Tasks
- End-to-end project design modeling.
- Big Data solution design.
- Requirement gathering, forecasting estimates
- Manage analytics team communication, deliverables, resource management &
planning
Technologies
Database Hive, MySQL, Presto Connector
BI Tools Tableau 8.2
IDE Eclipse
Hadoop CDH3, Cloudera Manager 4.x
Project Name Onvia Recommendation Engine
Client Name Onvia
Role Hadoop Developer
Description
Onvia tracks, analyzes and reports the spending of tens of thousands of federal,
state and local government agencies, giving companies a single source for
conducting open, intelligent and efficient business with government. The project's
goal was to load all customer click and like data onto the HDFS platform and
run MapReduce jobs to structure and aggregate customer-related data. This data
is then fed to the Mahout recommendation engine to predict the user's choice
of project.
Period July 2013 – October 2013
Major Tasks
- End-to-end project design modeling.
- Big Data solution design.
- End-to-end development and deployment
Technologies
Database MySQL 5.x
IDE Eclipse
Hadoop CDH3, Cloudera Manager 4.x
Project Name GTEM Applications
Client Name TCS
Role Design and Implementation
Description
GTEM is used to manage Trade Finance Facilities (TFF) for correspondent banks and
serves as a central clearinghouse to allocate limits to booking units. GTEM also
provides access to worldwide limit availability and pricing information.
Access GTEM to:
• Submit transaction requests for allocations
• Obtain reports on portfolio data and pricing information
• Find links to key resources and information about the allocation process.
Period August 2010 – June 2013
Major Tasks
- Leading the team
- Brainstorming, designing and developing different reusable
components / products.
- End-to-end documentation of the developed components
Technologies
Core Java, JSP, Servlets, Struts Framework, JavaScript, Ajax, WebLogic Server
and Oracle Database
Project Name GMRE Network Analyzer Tool
Client Name Tech Mahindra
Role Data Modeler
Description
GMRE Network Analyzer is a non-service-impacting tool that collects
network data of GMRE nodes from GMPLS networks. The tool supports data
collection and processing for multiple networks, and is based on a
client-server architecture. The client (GUI & CLI) can run on
Windows & Linux environments, and the server (Data Collecting Server, or DCS)
on Linux & HP-UX environments. The DCS periodically collects the GMRE nodes'
logs and snapshot files from the network, per the schedule configured
by the user, and stores them on a centralized server. The collected data is
parsed and loaded into the DCS database. The DCS ensures consistency of the
periodically collected data from the network, so that data at any instant
of time is available from all the network elements.
Period December 2009 to August 2010