The document surveys Python machine learning libraries and tools, including Orange, MDP, PyMC, PyML, hcluster, NLTK, mlpy, LIBSVM, PyEvolve, FANN, Theano, PyBrain, Shogun, and ffnet. For each library it lists the homepage, dependencies, installation and source options, and key developers. It also discusses machine learning in Python in general terms, noting the large amount of activity but also uneven documentation quality and inconsistent packaging.
81. Dimensionality reduction

import mdp
x = mdp.numx_rand.random((100, 25))  # 25 variables, 100 observations
y = mdp.pca(x)
z = mdp.fastica(x, dtype='float32')
82. Nodes: training and usage

n = mdp.nodes.PCANode()
n.train(x)             # learn the principal components of x
n.stop_training()
print(n.output_dim)
print(n.explained_variance)
z = n.execute(y)       # project y on the PCs learned in training
83. Inverting the flow

print(n.is_invertible())  # True for the PCA node
print(n.inverse(z))       # get y back
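The train/execute/inverse cycle above can be sketched in plain NumPy, without MDP installed. This is an illustration of what a PCA node does, not MDP's implementation: "training" estimates the principal components, "execution" projects onto them, and because the component matrix is orthogonal, transposing it inverts the projection.

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.rand(100, 25)              # 100 observations, 25 variables

# "Training": center the data and take the eigenvectors of its covariance.
mean = x.mean(axis=0)
xc = x - mean
cov = xc.T @ xc / (len(x) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]  # sort components by explained variance
components = eigvecs[:, order]

# "Execution": project onto the principal components.
z = xc @ components

# "Inversion": with all components kept, the component matrix is
# orthogonal, so its transpose maps the projection back to the data.
x_back = z @ components.T + mean
print(np.allclose(x_back, x))      # True up to floating-point error
```

Keeping fewer components (as `PCANode(output_dim=...)` would) makes the inverse a least-squares reconstruction rather than an exact one.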
101. Model module, part 1

import numpy
import pymc

x = numpy.array([-.86, -.3, -.05, .73])
alpha = pymc.Normal('alpha', mu=0, tau=.01)
beta = pymc.Normal('beta', mu=0, tau=.01)

@pymc.deterministic
def theta(a=alpha, b=beta):
    return pymc.invlogit(a + b * x)
102. Model module, part 2

# Binomial likelihood for the data
d = pymc.Binomial('d',
                  n=numpy.ones(4, dtype=int) * 5,
                  p=theta,
                  value=numpy.array([0., 1., 3., 5.]),
                  observed=True)
103. Sampling from a distribution

import pymc
import model

S = pymc.MCMC(model, db='pickle')
S.sample(iter=10000, burn=5000, thin=2)
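What the deterministic node in the model computes can be checked in plain NumPy: for fixed values of alpha and beta, theta is the inverse logit of a linear predictor, giving a success probability for each dose. The values of `a` and `b` below are arbitrary stand-ins for posterior draws, chosen only for illustration.

```python
import numpy as np

def invlogit(u):
    # inverse-logit (logistic) function, mapping the real line to (0, 1)
    return 1.0 / (1.0 + np.exp(-u))

x = np.array([-.86, -.3, -.05, .73])
a, b = 0.8, 7.7                      # hypothetical parameter values
theta = invlogit(a + b * x)

# Each theta[i] lies in (0, 1) and serves as the success probability
# of the Binomial likelihood with n = 5 trials at dose x[i].
print(theta)
```

With a positive slope `b`, theta increases monotonically with the dose `x`, which is the behavior the MCMC sampler explores over many draws of alpha and beta.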
159. “You are a creative genius. Your creative genius is so accomplished that it appears, to you and to others, as effortless. Yet it far outstrips the most valiant efforts of today's fastest supercomputers. To invoke it, you need only open your eyes.” – Donald D. Hoffman