Pallavi
Pallavimahajan2k8@gmail.com
(610)-653-9371
PROFESSIONAL SUMMARY:
• Over 8 years of experience in analysis, design, development, maintenance and testing of big data, database and
web applications.
• Worked in the Hadoop ecosystem with unstructured, semi-structured and structured data of around 10 TB in
size.
• Extracted data from database servers into HDFS using Sqoop to populate Hive external tables.
• Used Flume to ingest files from source directories into HDFS, applying format transformations in flight.
• Developed Hive (version 0.14) scripts for analytical requirements, performing ad hoc analysis and storing the
results in Hive tables for visualization in Tableau dashboards.
• Very good understanding of partitioning and bucketing in Hive; designed and worked with both managed and
external tables to optimize performance (see the Hive layout sketch after this list).
• Applied file-format, compression and vectorization techniques to optimize the performance of Hive queries.
• Experience with SequenceFile, ORC, Avro and Parquet file formats.
• Applied multiple transformations to raw data in HDFS using Pig scripts.
• Used regular expressions to parse input files in both Hive and Pig.
• Developed Oozie workflows to schedule and orchestrate the ETL process.
• Solid understanding of MapReduce 1 (JobTracker) setup and the YARN framework.
• Good experience using Apache Spark for ETL in both Python and Scala.
• Used Spark RDDs, pair RDDs and Spark SQL to load and save data to HDFS and Hive tables in different
formats, applying a range of operations on them (see the RDD sketch after this list).
• Deep knowledge of the transformations and actions available on Spark RDDs.
• Deep understanding of Spark's internal architecture and the techniques used to improve the performance of
Spark operations.
• Very good understanding of Spark Streaming (DStreams) and Spark's machine learning algorithms (see the
streaming sketch after this list).
• Basic working knowledge of GraphX in Spark.
• Worked on a 25-node Hadoop cluster running CDH5 and a 5-node Hadoop cluster on Amazon Web Services
EC2, with SSH connectivity based on public/private key pairs.
• Experience in creating Tableau worksheets and dashboards using Tableau Desktop version 9.2.
• Ability to create interactive dashboards using parameters and filters.
• Good experience building multiple types of charts in Tableau using dimensions, measures and calculated
fields.
• Strong experience with SQL Server 2012/2008(R2), creating, managing and maintaining database
applications using stored procedures, T-SQL, performance tuning and query optimization.
• Highly proficient in T-SQL for developing complex stored procedures, triggers, tables, user-defined
functions, views, indexes, user profiles, relational database models, data integrity constraints, queries and SQL joins.
• Experienced in creating, configuring and fine-tuning ETL workflows between homogeneous and
heterogeneous systems using SSIS in MS SQL Server 2012/2008(R2).
• Expertise in creating SSRS reports, including parameterized, pivot and tabular reports and charts, to client
requirements.
• Experienced in the analysis, design, development, testing and implementation of Crystal Reports 8.5/9/10/XI
R2/2008.
• Extensive knowledge of C# and VB programming, with regular working experience in Microsoft Visual Studio.
• Good experience developing Java applications using JSP, servlets and core Java.
• Excellent communication and interpersonal skills, with the ability to develop creative solutions for challenging
client needs.
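
To make the Hive layout points above concrete, the following Scala/Spark sketch contrasts an external table over raw files with a managed, partitioned, bucketed ORC table. The table names, columns and HDFS path are illustrative placeholders, not details from any actual project:

    import org.apache.spark.sql.SparkSession

    // Sketch of external vs. managed tables with partitioning and bucketing.
    // Table names, columns and the HDFS path are illustrative placeholders.
    object HiveLayoutSketch extends App {
      val spark = SparkSession.builder()
        .appName("hive-layout-sketch")
        .enableHiveSupport()
        .getOrCreate()

      // External table: Hive owns only the metadata; dropping the table
      // leaves the raw files in HDFS untouched.
      spark.sql(
        """CREATE EXTERNAL TABLE IF NOT EXISTS claims_raw (
          |  claim_id BIGINT, patient_id BIGINT, amount DOUBLE, load_date STRING)
          |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
          |LOCATION '/data/raw/claims'""".stripMargin)

      // Managed table: partitioned so queries can prune by load_date, bucketed
      // to speed joins on patient_id, stored as ORC for vectorized reads.
      spark.table("claims_raw")
        .write
        .partitionBy("load_date")
        .bucketBy(16, "patient_id")
        .format("orc")
        .saveAsTable("claims_orc")

      spark.stop()
    }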
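A minimal illustration of the RDD, pair-RDD and Spark SQL workflow described above; the tab-separated layout (id, department, amount) and all paths are assumed for the example:

    import org.apache.spark.sql.SparkSession

    // RDD -> pair RDD -> Spark SQL flow. The tab-separated layout
    // (id, department, amount) and all paths are assumed for the example.
    object RddEtlSketch extends App {
      val spark = SparkSession.builder().appName("rdd-etl-sketch").getOrCreate()
      val sc = spark.sparkContext

      // Transformations (map, filter) are lazy; an action triggers the job.
      val fields = sc.textFile("hdfs:///data/raw/events")
        .map(_.split("\t"))
        .filter(_.length == 3)

      // Pair RDD keyed by department, aggregated with reduceByKey.
      val totals = fields
        .map(f => (f(1), f(2).toDouble))
        .reduceByKey(_ + _)

      // Hand the result to Spark SQL and save as Parquet for Hive to consume.
      import spark.implicits._
      totals.toDF("department", "total_amount")
        .write.mode("overwrite")
        .parquet("hdfs:///data/curated/dept_totals")

      spark.stop()
    }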
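And a bare-bones DStream sketch of the Spark Streaming work mentioned above; the socket source, host and port are placeholders chosen only to keep the example self-contained:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Word counts over 10-second micro-batches from a socket source.
    // Host and port are placeholders to keep the sketch self-contained.
    object StreamingSketch extends App {
      val conf = new SparkConf().setAppName("dstream-sketch")
      val ssc = new StreamingContext(conf, Seconds(10))

      val counts = ssc.socketTextStream("localhost", 9999)
        .flatMap(_.split("\\s+"))
        .map(word => (word, 1))
        .reduceByKey(_ + _)

      counts.print() // output operation; triggers each micro-batch

      ssc.start()
      ssc.awaitTermination()
    }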
TECHNICAL SKILLS
Big Data Ecosystems : Hadoop, MapReduce, HDFS, Apache Spark, Hive, Pig, Sqoop, Flume, Oozie
Programming Languages : Scala, VB.NET, C#.NET, Java, C/C++
Declarative Languages : T-SQL, ANSI SQL
Scripting Languages : Python, JavaScript, XML, HTML
Visualization : Tableau 8.3/9.2/10.0
Databases : SQL Server 2012/2008(R2)/2005, NoSQL, MySQL
Reporting Services : SQL Server Reporting Services (SSRS) 2012/2008(R2), Crystal Reports 13/10/8.5
ETL : SQL Server Integration Services (SSIS) 2012/2008(R2)
Analytical Services : SQL Server Analysis Services (SSAS) 2008(R2)
Platforms : Windows (XP/2010), Linux
Streaming Services : Spark Streaming
Machine Learning : Spark Machine Learning Algorithms
CERTIFICATIONS
Cloudera Hadoop and Spark Developer Certification (CCA-175)
http://certification.cloudera.com/verify
License No. 100-016-880
Hortonworks Hadoop Developer Certification (HDPCD)
http://bcert.me/scfmjojg
Microsoft Certified Solutions Associate (MCSA)
• Querying Microsoft SQL Server 2012(Microsoft certified Professional)
• Administering Microsoft SQL Server 2012 Databases
• Implementing a Data Warehouse with Microsoft SQL Server 2012
EDUCATION
Bachelor of Engineering (B.Tech) in Computer Science and Engineering from Beant College of Engineering and
Technology, Punjab, India.
PROFESSIONAL EXPERIENCE
Life Care Centers of America (LCCA), Hunt Valley, MD
NTT Data Net Solutions Insight April 2015 to Present
Hadoop Developer
Insight is business intelligence software delivered through dashboards that let users visualize, monitor, and analyze
information, surfacing both clinical and financial key performance indicators and alerts. Insight is an excellent tool
for identifying areas to address to reduce hospital readmissions: it supplies the data needed for analysis and
provides a feedback loop to measure the success of quality improvement activities. Insight supports analysis with its
ability to display and drill down to pertinent data by:
1. Changing to a different view, such as a comparison or consolidation of multiple sites, one facility, a department, or
a unit.
2. Filtering and sorting data by one or more factors such as station, facility, and payer.
3. Drilling down to the source data, such as a resident, progress note, or invoice.
The project runs on a 25-node cluster.
RESPONSIBILITIES:
• Gathered requirements from the business team, analyzed them and prepared functional design
documents.
• Imported and exported data between multiple database servers and HDFS using Sqoop for fast data transfer
(a Spark-based analogue of this ingestion is sketched after this list).
• Developed ETL transformations using Pig to parse the raw data, and stored the refined data in the Hive
warehouse directory.
• Created multiple Hive tables and stored data in them using different file formats.
• Created Oozie workflows to schedule the overall flow of data among the different processes.
• Created Tableau worksheets and dashboards against the Cloudera Hive server.
• Unit-tested the dashboards for data quality and performance.
• Managed issues raised by QA, assigned work items to the team and took up critical items.
• Coordinated daily with the onshore team and helped clarify the business requirements.
• Reviewed code to ensure quality standards.
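
As a rough Spark-based stand-in for the Sqoop imports above (Sqoop itself ran as command-line jobs), the same database-to-Hive movement might look like the sketch below; the connection URL, table name and credentials are placeholders, not the project's actual configuration:

    import org.apache.spark.sql.SparkSession

    // Spark JDBC stand-in for the Sqoop import step: pull a table over JDBC
    // and land it in the Hive warehouse. URL, table name and credentials are
    // placeholders, not the project's actual configuration.
    object JdbcIngestSketch extends App {
      val spark = SparkSession.builder()
        .appName("jdbc-ingest-sketch")
        .enableHiveSupport()
        .getOrCreate()

      val src = spark.read.format("jdbc")
        .option("url", "jdbc:sqlserver://dbhost:1433;databaseName=insight")
        .option("dbtable", "dbo.readmissions")
        .option("user", "etl_user")
        .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
        .load()

      // Store as ORC so the downstream Pig/Hive steps read a columnar format.
      src.write.mode("overwrite").format("orc").saveAsTable("staging_readmissions")

      spark.stop()
    }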
Technologies used:
Oozie, Pig, HDFS, Sqoop, Hive, Tableau 8.3 & 9.2
NTT DATA Inc., Hunt Valley, MD
Performance Metrics May 2014 to March 2015
Hadoop Developer
Performance Metrics calculates employee performance on the basis of task completion. It records the number of
times a task is reworked and compares the estimated time to complete a task against the actual time an employee
takes. All of this data is collected, aggregated and analyzed in a Hadoop cluster on an hourly basis. The system also
analyzes employee needs, demands, achievements, certifications and basic information annually, and rewards &
recognition and motivational factors periodically. The project runs on a 10-node cluster.
Responsibilities:
• Supported code/design analysis and project planning.
• Ingested unstructured and semi-structured data into HDFS using Flume (version 0.9.4).
• Created Pig scripts to preprocess data with regular expressions and filter out malformed records (a Spark
analogue is sketched after this list).
• Developed Hive scripts for ad hoc and daily analysis.
• Maintained the end-to-end process with an Oozie workflow scheduled to run hourly.
• Used Sqoop to export data from HDFS and Hive into MySQL.
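
The Pig preprocessing step dropped records that did not match the expected layout; a Spark/Scala analogue of that regex filter is sketched below. The tab-separated record format is assumed for illustration, not the project's actual schema:

    import org.apache.spark.sql.SparkSession

    // Spark analogue of the Pig regex-preprocessing step: keep only lines that
    // match an assumed "date<TAB>employee_id<TAB>task_id<TAB>minutes" layout.
    object RegexFilterSketch extends App {
      val spark = SparkSession.builder().appName("regex-filter-sketch").getOrCreate()

      // Hypothetical record pattern; malformed lines are filtered out.
      val recordPattern = """\d{4}-\d{2}-\d{2}\t\w+\t\w+\t\d+"""

      val clean = spark.sparkContext
        .textFile("hdfs:///data/flume/tasks")        // raw Flume landing directory
        .filter(_.matches(recordPattern))

      clean.saveAsTextFile("hdfs:///data/clean/tasks") // input for the Hive scripts
      spark.stop()
    }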
Technologies used:
HDFS, Sqoop, Hive, Pig, Oozie, Flume
Family Healthcare of Ellensburg, Barton Healthcare, Redmond, WA
NTT DATA Health Care Solution Division (HSD) Clinical eAssignment (following CCHIT) August 2012 to April 2014
SQL Server Developer
eAssignment is a messaging feature for the healthcare domain that allows facility staff to assign tasks to one another
electronically. The project includes features for composing, assigning and forwarding tasks, Reply and Reply All
attributes, and generating task reports as needed, per CCHIT.
RESPONSIBILITIES:
• Created databases, tables, clustered/non-clustered indexes, unique/check constraints, views, stored procedures
and triggers.
• Wrote efficient stored procedures for the optimal performance of the system.
• Monitored performance and optimized SQL queries for maximum efficiency.
• Generated custom and parameterized reports using SSRS.
• Actively involved in developing complex SSRS reports involving sub-reports, matrix/tabular reports, charts and
graphs.
• Applied advanced, extensible reporting techniques using Reporting Services (SSRS).
• Involved in the complete SSIS life cycle: creating, building, deploying and executing packages in both the
development and production environments.
• Testing and integrating the developed code.
• Participated in business team meetings, representing the technical team and advising on technical
feasibility.
• Designed the architecture for new requirements and enhancements.
• Managed issues raised by QA, assigned work items to the team and took up critical items.
• Reviewed code deliveries and managed ClearCase for defect tracking.
Technologies used:
Windows, SQL Server 2008(R2)/2012, SSRS 2008(R2)/2012, SSIS 2008(R2)/2012, Visual Studio 2008/2013, C#.
NTT Data – Healthcare Technologies, New Delhi, INDIA
NTT DATA Net Solutions – RAM & CLINICALS December 2010 to July 2012
SQL Server and .NET Developer
NTT DATA Net Solutions, next-generation financial and clinical software, combines the rich functionality long-term
care providers demand with the strength of web technology. NetSolutions offers long-term and post-acute care
providers maximum choice in hosting, customization, and reporting. NetSolutions is an integrated system for
electronic medical records, billing, point-of-care, and business/clinical intelligence. The project comprised a full
re-engineering of a thick client into a more scalable client-server application for managing the healthcare software
solution. It was built with .NET technologies (VB.NET) and SQL Server, and uses Crystal Reports for billing and
reporting.
RESPONSIBILITIES:
• Developed and delivered product enhancements to clients.
• Converted VB.NET code into stored procedures for performance, using SQL Server 2008 R2.
• Maintained the application by resolving issues in the .NET code and the stored procedures we
created.
• Analyzed critical stored procedures and fixed bugs.
• Performed query optimization and performance tuning.
• Rebuilt indexes and tables as part of performance tuning.
• Created indexed views and appropriate indexes to reduce the running time of complex queries.
• Produced designs and wrote technical design documents, test plans and system interface documents.
• Developed and debugged database stored procedures.
• Performed coding, unit testing and code reviews.
• Tested and integrated the developed code.
• Resolved production issues, customer issues and ad hoc requests.
• Designed the architecture for new requirements and enhancements.
• Managed issues raised by QA, assigned work items to the team and took up critical items.
• Worked with the business team to clarify requirements and certain business functionalities.
• Reviewed code deliveries and managed ClearCase for defect tracking.
• Participated in integration testing, reviewed integration defects and suggested release plans for the
defects.
Technologies used:
Windows, SQL Server 2008(R2)/2012, Visual Studio 2008, C#, Crystal Reports 8.5/10/13.
Zenith Computers, Chandigarh, India
Hardware Management Application November 2008 to September 2010
JAVA Developer
This project developed a business application for Zenith Computers. Its main aim was to automate the manual
processing of stock maintenance, financial transactions, employee transactions, and related workflows, automating
the computer systems dealer's transactions as a whole.
Responsibilities:
• Designed and analyzed the business requirements model.
• Developed code using core Java (version 5.0), J2EE and JSP in a multi-tier architecture.
• Designed and developed the MySQL database architecture for stock, financial and employee information
maintenance.
• Performed unit testing and peer-to-peer testing for code quality.
• Provided maintenance support for smooth running of the application.
Technologies used:
Core Java, J2EE, JSP, servlets, MySQL