info@digitalnest.in Digital Nest 8088998664
http://www.digitalnest.in/r-programming-for-data-science-course-hyderabad-india/
1. Introduction
Reading data into a statistical system for analysis and exporting the results to another system
for report writing can be frustrating tasks that take far longer than the statistical analysis
itself, even though most readers will find the latter much more appealing.
This manual describes the import and export facilities available either in R itself or through
packages available from CRAN or elsewhere.
Unless otherwise indicated, everything described in this manual is (at least in principle) available
on all platforms running R.
In general, statistical systems like R are not particularly well suited to manipulations of
large-scale data. Some other systems are better than R at that, and part of the thrust of
this manual is to suggest that, rather than duplicating the functionality in R, we can make another
system do the work! (For example, Therneau and Grambsch (2000) indicated that they preferred
to do some data manipulation in SAS and then use the package survival
(https://CRAN.R-project.org/package=survival) in S for the analysis.) Database management systems are often very
suitable for manipulating and retrieving data: several packages to interact with DBMSs are
discussed here.
There are packages that allow functionality developed in languages such as Java, perl and
python to be integrated directly with R code, making use of the facilities in these languages even
more appropriate. (See the rJava package (https://CRAN.R-project.org/package=rJava)
from CRAN and the SJava, RSPerl and RSPython packages from the Omegahat project,
http://www.omegahat.net.)
It should also be remembered that R, like S, comes from the Unix tradition of small re-usable
tools, and it can be rewarding to use tools such as awk and perl to manipulate the data before
import or after export. The case study in Becker, Chambers & Wilks (1988, Chapter 9) is an
example, where Unix tools were used to check and manipulate the data before input to
S. The traditional Unix tools are now much more widely available, including for Windows.
This manual was first written in 2000, and the number of R packages has increased
a hundredfold since. For specialized data formats, it is worth checking whether a suitable package
already exists.
1.1 Imports
The easiest form of data to import into R is a simple text file, which is often acceptable for
small or medium-scale problems. The main function to import from a text file is scan, and
this underlies most of the more convenient functions discussed in Chapter 2 [Spreadsheet-like
data].
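As a minimal sketch of text-file import, the example below writes a small whitespace-separated file to a temporary location and reads it back twice: once with scan and once with read.table, one of the higher-level functions built on scan. The file contents and the column names (id, height, weight) are invented for illustration.

```r
# Write a small whitespace-separated text file to a temporary location.
# The data and column names are made up for this example.
path <- tempfile(fileext = ".txt")
writeLines(c("id height weight",
             "1 1.72 68.5",
             "2 1.65 59.0",
             "3 1.80 81.2"), path)

# scan() reads the fields into a list; 'what' gives the type of each field
# and skip = 1 passes over the header line.
fields <- scan(path,
               what = list(id = integer(), height = numeric(), weight = numeric()),
               skip = 1, quiet = TRUE)

# read.table() reads the same file and returns a data frame directly,
# taking the column names from the header line.
df <- read.table(path, header = TRUE)

unlink(path)  # remove the temporary file
```

With scan the caller states the type of every field, whereas read.table infers column types from the data, which is usually more convenient for spreadsheet-like files.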
However, all statistical consultants are familiar with being presented by a client with
a USB stick (formerly a floppy disk or a CD-R) of data in some proprietary binary format,
for example "an Excel spreadsheet" or "an SPSS file". Often the easiest thing to do is to use
the original application to export the data as a text file (and consultants will
have copies of the most common applications on their computers for this purpose). However,
this is not always possible, and Chapter 3 [Import from other statistical systems]
discusses the facilities available to access these files directly from R. For Excel spreadsheets,
the available methods are summarized in Chapter 9 [Reading Excel Spreadsheets].
In some cases, the data have been stored in a binary form for compactness and speed of access.
One application of this that we have seen many times is imaging data, which is normally stored
as a stream of bytes as represented in memory, possibly preceded by a header. Such data formats
are discussed in Chapter 5 [Binary Files] and Section 7.5 [Binary Connections].
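To illustrate the "header followed by a stream of bytes" layout, here is a small sketch using the base R functions writeBin and readBin. The file layout (two 4-byte integers giving rows and columns, then 8-byte doubles in little-endian order) is an assumption made up for this example, not a real imaging format.

```r
path <- tempfile(fileext = ".bin")

# A tiny 2 x 3 "image" with made-up pixel intensities.
img <- matrix(c(0, 0.25, 0.5, 0.75, 1, 0.5), nrow = 2, ncol = 3)

# Write: a two-integer header (rows, columns), then the pixel values
# as 8-byte doubles in column-major order.
con <- file(path, "wb")
writeBin(dim(img), con, size = 4L, endian = "little")
writeBin(as.vector(img), con, size = 8L, endian = "little")
close(con)

# Read: the header first, then exactly rows * columns values.
con <- file(path, "rb")
hdr <- readBin(con, what = "integer", n = 2L, size = 4L, endian = "little")
vals <- readBin(con, what = "numeric", n = hdr[1] * hdr[2],
                size = 8L, endian = "little")
close(con)
unlink(path)  # remove the temporary file

# Rebuild the matrix from the header dimensions.
img2 <- matrix(vals, nrow = hdr[1], ncol = hdr[2])
```

Stating size and endian explicitly on both sides keeps the file readable on a platform whose native byte order differs from the one that wrote it.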
For much larger databases, it is common to manage the data using a database management
system (DBMS). It is again possible to use the DBMS to extract a simple file, but
for many of these DBMSs the extraction operation can be carried out directly from an R package: see
Chapter 4 [Relational Databases]. Importing data over network connections is discussed
in Chapter 8 [Network Interfaces].
1.1.1 Encodings
Unless the file to import is entirely in ASCII, it is usually necessary to know how
it has been encoded. For text files, a good way to find out something about its structure is the file

R programming analysis
