Introduction to Analytics with Azure Notebooks and Python for Data Science and Business Intelligence. This is one part of a full-day workshop on moving from BI to Analytics.
6. Using the Jupyter Notebook, you can author documents that
combine your code with comments, equations, images,
video, and visualizations.
The documents can be shared with others on GitHub,
Dropbox, and the Jupyter Notebook Viewer.
7. Uses include:
Data cleaning and transformation
Numerical simulation
Statistical modeling
Machine learning
Data visualization, and much more.
8. Leverage big data tools
Use big data tools such as Apache Spark from Python, R, and Scala.
Explore the data with pandas, scikit-learn, ggplot2, dplyr,
etc.
9. Notebook mode supports literate computing and
reproducible sessions
Code chunks alongside the results and additional
comments
19. What Is Python?
• Powerful, interpreted language
• IPython notebook is easy to use
• Created by Guido van Rossum
• Scripting language
• Indentation for statement grouping
• Fast
• High-level data types
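A minimal sketch of two of the points above: indentation groups statements, and high-level types such as dictionaries and lists are built in. The weekday temperatures are made-up sample values, not data from the workshop.

```python
# Hypothetical sample data: a dict is one of Python's built-in high-level types
temperatures = {"Mon": 18, "Tue": 24, "Wed": 21}

warm_days = []
for day, temp in temperatures.items():
    # Everything indented under the for line belongs to the loop body
    if temp > 20:
        # Indented one level further: runs only when the condition holds
        warm_days.append(day)

# Back at the left margin: runs once, after the loop has finished
print(warm_days)
```

No braces or end keywords are needed; the indentation level alone decides which statements belong to the loop and to the if block.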
20. Python vs. R
• Both of these languages are free and very
popular for data analysis. There are some
differences, however.
• R has a long, trusted history, and a lot of
support in the data industry.
21. Python vs. R
• Python is easier to master than R,
especially if you have previously learned
an object-oriented programming language
like Java or C++.
• It is more of a general-purpose
programming language
22. Choosing Between Python and R
• Python doesn’t have as many specialised
statistics packages as R
• Python has a lot of tools such as pandas,
NumPy, SciPy, and seaborn
• Personal preference: maths and stats
folks tend to prefer R, computer scientists
tend to prefer Python
23. Python Data Structures
• It’s important to familiarise yourself with
the common data structures in Python in
order to use them appropriately.
• Lists – One of the most versatile data
structures in Python.
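As a minimal sketch of why lists are considered versatile, the example below creates a list, reads from it by index and by slice, and then changes it in place. The values are made up for illustration.

```python
# Hypothetical sample values; a list is written as comma-separated
# values inside square brackets
squares = [0, 1, 4, 9, 16, 25]

first = squares[0]       # indexing starts at 0
middle = squares[2:4]    # slicing returns a new list

squares[0] = -1          # lists are mutable: an element can be reassigned
squares.append(36)       # and the list can grow

print(first)
print(middle)
print(squares)
```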
24. Python Data Structures
● Strings - Strings can be defined using
single ( ' ), double ( " ) or triple ( ''' )
quotation marks. Strings enclosed in triple
quotes ( ''' ) can span multiple lines and
are used frequently in docstrings (Python’s
way of documenting functions). The backslash ( \ ) is used as
an escape character. Please note that
Python strings are immutable, so you can’t
change part of a string in place.
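The points on this slide can be sketched as follows. The variable names and the greet function are hypothetical and exist only for illustration.

```python
single = 'hello'
double = "world"
multi = """This string
spans two lines."""
escaped = 'It\'s escaped'   # the backslash escapes the inner quote


def greet(name):
    """Return a greeting for name. This triple-quoted string is the docstring."""
    return "Hello, " + name


# Strings are immutable: assigning to a position raises TypeError
try:
    single[0] = 'J'
    changed = True
except TypeError:
    changed = False

# To "change" a string, build a new one instead
new_string = 'J' + single[1:]

print(changed)
print(new_string)
```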
25. Python Data Structures
● Tuples - represented by a number of values separated by commas.
● A tuple is an immutable sequence of Python objects. Tuples are sequences, just like lists.
There are two differences between them: a tuple cannot be changed once created, whereas a list can,
and tuples are written with parentheses, whereas lists use square brackets.
● Creating a tuple is as simple as writing comma-separated values. Optionally,
you can enclose these comma-separated values in parentheses.
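A short sketch of the tuple behaviour described above, using made-up values: a tuple can be written with or without parentheses, and unlike a list it rejects in-place changes.

```python
point = (3, 4)           # tuple written with parentheses
also_a_tuple = 1, 2, 3   # parentheses are optional

numbers = [1, 2, 3]      # a list, for comparison
numbers[0] = 10          # allowed: lists are mutable

# Assigning to a tuple position raises TypeError
try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Tuples support indexing and unpacking, like other sequences
x, y = point

print(type(also_a_tuple).__name__)
print(tuple_changed)
print(x, y)
```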
26. User-Defined Functions
Here are some basic guidelines to follow when defining a function in Python:
● Function blocks begin with the keyword def followed by the function name and
parentheses ( ( ) ).
● Any input parameters or arguments should be placed within these parentheses. You
can also define parameters inside these parentheses.
● The first statement of a function can be an optional statement - the documentation
string of the function or docstring.
● The code block within every function starts with a colon (:) and is indented.
● The statement return [expression] exits a function, optionally passing back an
expression to the caller. A return statement with no arguments is the same as return
None.
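The guidelines above can be sketched with two small functions. The names times and log_message, and the default value for y, are hypothetical choices made for this example.

```python
def times(x, y=2):
    """Return x multiplied by y (y defaults to 2)."""
    return x * y


def log_message(message):
    """Print a message and return nothing."""
    print(message)
    return  # a bare return is the same as return None


result = times(5)               # uses the default y=2
explicit = times(5, 3)          # overrides the default
nothing = log_message("done")   # the bare return hands back None

print(result, explicit, nothing)
print(times.__doc__)            # the docstring is available at runtime
```

Each function starts with def, takes its parameters inside the parentheses, has a docstring as its first statement, and has an indented body following the colon.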
27. Installing Python
• Generally, installing Python on your
system is very easy. The majority of
Linux and UNIX distributions include a
recent version of Python. Some
Windows computers also come with
Python pre-installed. If you don’t have
Python already, installation is
straightforward on almost all platforms.
28. Who Can Use Python?
Because Python is easy to learn and logic-based, it isn’t reserved for
programmers and data scientists in the way that some programming languages can feel.
As a result, it is increasingly being adopted by non-programmers and
‘regular’ users with little to no experience.
The profile of the ‘typical’ Python user, and therefore of the typical coder or programmer,
is evolving because of Python’s accessibility.
Python is also evolving very quickly within data science circles. There are so many data
science tools in the Python ecosystem now that a large amount of the work being carried
out in data science is being done using Python.