18. Why Python
Simple, clean, easy to learn
‘Close to the metal’
‘Can keep it in my head’
Cross-platform and open source
Vibrant and diverse community support
Become dangerous in a weekend
And useful in a week
19. Why Python
Hello World in Java
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World");
    }
}
Hello World in Python
print('Hello, World')
20. Why Python
Reverse String in Java
import java.util.Scanner;
class ReverseofaString
{
    public static void main(String[] arg)
    {
        ReverseofaString rev = new ReverseofaString();
        Scanner sc = new Scanner(System.in);
        System.out.print("Enter a string : ");
        String str = sc.nextLine();
        System.out.println("Reverse of a String is : " + rev.reverse(str));
    }
    static String reverse(String s)
    {
        String rev = "";
        for (int j = s.length(); j > 0; --j)
        {
            rev = rev + (s.charAt(j - 1));
        }
        return rev;
    }
}
Reverse String in Python
def reverse(s):
    reverse_string = ''
    for c in s:
        reverse_string = c + reverse_string
    return reverse_string

word = input('Enter a string')
print(reverse(word))
21. Why Python
Reverse String in Java
import java.util.Scanner;
class ReverseofaString
{
    public static void main(String[] arg)
    {
        ReverseofaString rev = new ReverseofaString();
        Scanner sc = new Scanner(System.in);
        System.out.print("Enter a string : ");
        String str = sc.nextLine();
        System.out.println("Reverse of a String is : " + rev.reverse(str));
    }
    static String reverse(String s)
    {
        String rev = "";
        for (int j = s.length(); j > 0; --j)
        {
            rev = rev + (s.charAt(j - 1));
        }
        return rev;
    }
}
Reverse String in Python
word = input('Enter a string')
print("".join(reversed(word)))
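A third option, not shown on the slides, is slice notation with a negative step. The sketch below puts the three approaches side by side; the function names are illustrative only.

```python
def reverse_loop(s):
    """Build the reversed string one character at a time."""
    result = ''
    for c in s:
        result = c + result
    return result


def reverse_join(s):
    """Join the characters yielded by the built-in reversed()."""
    return ''.join(reversed(s))


def reverse_slice(s):
    """Slice the whole string with a step of -1."""
    return s[::-1]


if __name__ == '__main__':
    word = 'Hello, World'
    for fn in (reverse_loop, reverse_join, reverse_slice):
        print(fn.__name__, fn(word))
```

All three return the same result; the slice form is usually the shortest to write.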
29. “The Jupyter Notebook is an open-source web
application that allows you to create and share
documents that contain live code, equations,
visualizations and narrative text.”
Interactive computing environment
Based on IPython
30. IPython enhances the default Python REPL
Automatic Indentation
Syntax highlighting
Tab Completion
And more!
42. “pandas is an open source, BSD-licensed library providing
high-performance, easy-to-use data structures and data
analysis tools for the Python programming language.”
A more intellectually palatable API on top of NumPy
No more ‘big globs of numbers’ to worry about
43. >>> import numpy as np
>>> import pandas as pd
>>> df = pd.read_csv('dow_jones_index.csv')
>>> df
    quarter  stock       date    open    high     low   close     volume
0         1     AA   1/7/2011  $15.82  $16.72  $15.78  $16.42  239655616
1         1     AA  1/14/2011  $16.71  $16.71  $15.64  $15.97  242963398
2         1     AA  1/21/2011  $16.19  $16.38  $15.60  $15.79  138428495
3         1     AA  1/28/2011  $15.87  $16.63  $15.82  $16.13  151379173
4         1     AA   2/4/2011  $16.18  $17.39  $16.18  $17.14  154387761
5         1     AA  2/11/2011  $17.33  $17.48  $16.97  $17.37  114691279
6         1     AA  2/18/2011  $17.39  $17.68  $17.28  $17.28   80023895
7         1     AA  2/25/2011  $16.98  $17.15  $15.96  $16.68  132981863
8         1     AA   3/4/2011  $16.81  $16.94  $16.13  $16.58  109493077
9         1     AA  3/11/2011  $16.58  $16.75  $15.42  $16.03  114332562
10        1     AA  3/18/2011  $15.95  $16.33  $15.43  $16.11  130374108
>>> df.columns
Index(['quarter', 'stock', 'date', 'open', 'high', 'low', 'close', 'volume',
       'percent_change_price', 'percent_change_volume_over_last_wk', 'previous_weeks_volume',
       'next_weeks_open', 'next_weeks_close', 'percent_change_next_weeks_price',
       'days_to_next_dividend', 'percent_return_next_dividend'], dtype='object')
44. >>> df['stock']
>>> df.columns[1:8]
>>> v = df.loc[:, df.columns[1:8]].copy()
>>> v
    stock       date    open    high     low   close     volume
0      AA   1/7/2011  $15.82  $16.72  $15.78  $16.42  239655616
1      AA  1/14/2011  $16.71  $16.71  $15.64  $15.97  242963398
2      AA  1/21/2011  $16.19  $16.38  $15.60  $15.79  138428495
3      AA  1/28/2011  $15.87  $16.63  $15.82  $16.13  151379173
4      AA   2/4/2011  $16.18  $17.39  $16.18  $17.14  154387761
5      AA  2/11/2011  $17.33  $17.48  $16.97  $17.37  114691279
6      AA  2/18/2011  $17.39  $17.68  $17.28  $17.28   80023895
7      AA  2/25/2011  $16.98  $17.15  $15.96  $16.68  132981863
8      AA   3/4/2011  $16.81  $16.94  $16.13  $16.58  109493077
9      AA  3/11/2011  $16.58  $16.75  $15.42  $16.03  114332562
10     AA  3/18/2011  $15.95  $16.33  $15.43  $16.11  130374108
>>> v.volume.max()
1453438639
>>> v.close[0]
'$16.42'
45. >>> for column in v.columns[2:6]:  # open, high, low, close
...     # Drop the leading '$' and convert the rest to a float
...     v[column] = v[column].apply(lambda x: float(x[1:]))
...
>>> v
   stock       date   open   high    low  close     volume
0     AA   1/7/2011  15.82  16.72  15.78  16.42  239655616
1     AA  1/14/2011  16.71  16.71  15.64  15.97  242963398
2     AA  1/21/2011  16.19  16.38  15.60  15.79  138428495
3     AA  1/28/2011  15.87  16.63  15.82  16.13  151379173
4     AA   2/4/2011  16.18  17.39  16.18  17.14  154387761
5     AA  2/11/2011  17.33  17.48  16.97  17.37  114691279
6     AA  2/18/2011  17.39  17.68  17.28  17.28   80023895
7     AA  2/25/2011  16.98  17.15  15.96  16.68  132981863
8     AA   3/4/2011  16.81  16.94  16.13  16.58  109493077
9     AA  3/11/2011  16.58  16.75  15.42  16.03  114332562
>>> v[v.stock == 'DIS']
>>> v[v.stock == 'DIS']['close']
>>> close_index = v[v.stock == 'DIS']['close'].idxmax()
>>> v.loc[close_index, 'volume']
53096584
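The same clean-up can also be written with pandas' vectorized string methods instead of `apply`. The sketch below is one possible variant, not the presenter's code: it builds a tiny frame by hand from the first three rows shown above, so it runs without the CSV file.

```python
import pandas as pd

# Three rows copied from the table above; the real file has many more.
v = pd.DataFrame({
    'stock': ['AA', 'AA', 'AA'],
    'date': ['1/7/2011', '1/14/2011', '1/21/2011'],
    'open': ['$15.82', '$16.71', '$16.19'],
    'high': ['$16.72', '$16.71', '$16.38'],
    'low': ['$15.78', '$15.64', '$15.60'],
    'close': ['$16.42', '$15.97', '$15.79'],
    'volume': [239655616, 242963398, 138428495],
})

price_columns = ['open', 'high', 'low', 'close']
for column in price_columns:
    # Drop the leading '$' and convert the remaining text to float.
    v[column] = v[column].str.lstrip('$').astype(float)

# Row label of the highest close, then the volume traded that week.
best = v['close'].idxmax()
print(v.loc[best, 'volume'])
```

Once the price columns are numeric, comparisons such as `idxmax()` work on values rather than on strings.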
46. “Matplotlib is a Python 2D plotting library which produces
publication quality figures in a variety of hardcopy formats
and interactive environments across platforms.”
Visualizations
47. >>> import numpy as np
>>> x = np.linspace(0, 2 * np.pi, 361)
>>> y = np.sin(x)
>>> import matplotlib.pyplot as plt
>>> plt.plot(x, y)
>>> y2 = np.cos(x)
>>> plt.plot(x, y)
>>> plt.plot(x, y2, color='r')
48. >>> plt.figure(figsize=(3, 6)) # height is 2x the width
>>> plt.subplot(2, 1, 1) # 2 rows, 1 column, position 1
>>> plt.plot(x, y)
>>> plt.subplot(2, 1, 2) # position 2
>>> plt.plot(x, y2, color='r')
49. fns = [np.sin, np.cos, lambda x: x ** 2, lambda x: np.sin(x) ** 2, lambda x: np.cos(x) ** 2, np.log]
colors = list('rgbcmk')
markers = list('.ov+xd')
data = zip(fns, colors, markers)
plt.figure(figsize=(30, 20))
for i, (fn, color, marker) in enumerate(data):
    plt.subplot(2, 3, i + 1) # 1-3 on first row, 4-6 on second
    plt.plot(x[np.arange(0, 360, 6)], fn(x[np.arange(0, 360, 6)]), color=color, marker=marker)
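The slides use the `plt.subplot` state-machine style. A rough equivalent with Matplotlib's object-oriented interface is sketched below. The `Agg` backend and the in-memory PNG buffer are assumptions made so the example runs without a display; they are not part of the talk.

```python
import io

import matplotlib
matplotlib.use('Agg')  # assumption: render off-screen, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 361)
curves = [('sin', np.sin, 'b'), ('cos', np.cos, 'r')]

# One row per curve, sharing the x axis.
fig, axes = plt.subplots(len(curves), 1, figsize=(3, 6), sharex=True)
for ax, (name, fn, color) in zip(axes, curves):
    ax.plot(x, fn(x), color=color)
    ax.set_title(name)

fig.tight_layout()

# Write the figure to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format='png')
```

Holding explicit `fig` and `ax` objects tends to scale better than `plt.subplot` once a figure has many panels.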
52. scikit-learn is an open source Python machine learning package with
implementations of many popular machine learning algorithms.
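>>> import numpy as np
>>> from sklearn import linear_model
>>> from sklearn.datasets import make_blobs
>>> from sklearn.naive_bayes import GaussianNB
>>> (X_train, X_test), (y_train, y_test) = get_data()  # get_data() is a placeholder loader
>>> reg = linear_model.LinearRegression()
>>> reg.fit(X_train, y_train)
>>> pred = reg.predict(X_test)
>>> points, targets = make_blobs()
>>> clf = GaussianNB()
>>> clf.fit(points, targets)
>>> clf.predict(np.array([5, -10]).reshape(1, -1))
array([0])
>>> clf.predict(np.array([10, 10]).reshape(1, -1))
array([1])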
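The classifier session above can be turned into a self-contained script along these lines. This is only a sketch: the cluster centers, sample count, and random seeds are arbitrary choices made for illustration, and `train_test_split` stands in for the slide's undefined `get_data()` helper.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Two well-separated synthetic clusters; centers and seed are arbitrary.
points, targets = make_blobs(
    n_samples=200, centers=[(5, -10), (10, 10)], cluster_std=1.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    points, targets, test_size=0.25, random_state=0
)

clf = GaussianNB()
clf.fit(X_train, y_train)

# score() reports mean accuracy on the held-out points.
accuracy = clf.score(X_test, y_test)
print(accuracy)
print(clf.predict([[5, -10], [10, 10]]))
```

Every scikit-learn estimator follows the same `fit` / `predict` pattern, so swapping `GaussianNB` for another classifier changes only one line.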
53. SymPy is a Python library for symbolic mathematics. It aims to
become a full-featured computer algebra system (CAS) while
keeping the code as simple as possible in order to be
comprehensible and easily extensible.
>>> from sympy import symbols, diff, Derivative, init_printing
>>> (x, y) = symbols('x y')
>>> z = x ** 2 + y
>>> z
x**2 + y
>>> z.subs([(x, 3), (y, 4)])
13
>>> init_printing()
>>> z
>>> diff(4 * x ** 3 + 2 * x ** 2 - 5, x)
>>> Derivative(2 * x ** 2, x)
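The difference between `diff` and `Derivative` is easier to see in a script than on a slide. A minimal sketch, with variable names chosen here for illustration:

```python
from sympy import Derivative, diff, symbols

x, y = symbols('x y')
z = x ** 2 + y

# subs() replaces symbols with values: 3**2 + 4.
value = z.subs([(x, 3), (y, 4)])

# diff() differentiates immediately.
evaluated = diff(4 * x ** 3 + 2 * x ** 2 - 5, x)

# Derivative() stays unevaluated until doit() is called.
pending = Derivative(2 * x ** 2, x)
finished = pending.doit()

print(value, evaluated, pending, finished)
```

An unevaluated `Derivative` is handy when the goal is to display or manipulate the expression before computing it.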
54. Statsmodels is a Python module that provides classes and
functions for the estimation of many different statistical
models, as well as for conducting statistical tests, and statistical
data exploration.
>>> import statsmodels.api as sm
>>> print(sm.OLS(y, X).fit().summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.215
Model:                            OLS   Adj. R-squared:                  0.198
Method:                 Least Squares   F-statistic:                     13.25
Date:                Mon, 24 Jun 2019   Prob (F-statistic):           8.15e-06
Time:                        17:30:48   Log-Likelihood:                -15.067
No. Observations:                 100   AIC:                             36.13
Df Residuals:                      97   BIC:                             43.95
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                   coef    std err          t      P>|t|     [0.025     0.975]
------------------------------------------------------------------------------
const            1.5386      0.077     19.869      0.000      1.385      1.692
x1              -0.0672      0.101     -0.664      0.508     -0.268      0.134
x2               0.5048      0.099      5.090      0.000      0.308      0.702
==============================================================================
Omnibus:                       24.327   Durbin-Watson:                   2.228
Prob(Omnibus):                  0.000   Jarque-Bera (JB):                6.273
Skew:                           0.253   Prob(JB):                       0.0434
Kurtosis:                       1.883   Cond. No.                         5.44
==============================================================================
55. • The de facto data science language
• Open source implementation of the S language
• Does one thing and does it well
• Very quirky
• High-level and fast
• Can rival native languages
• Like a Pythonic R (kinda sorta)
• Relatively new