22. Quick Sort
Implicit conversion
type Segment = (List[Int], List[Int], List[Int])
implicit class ListWithPartition(list: List[Int]) {
  // Split the list into the elements less than, equal to, and greater than the pivot p.
  def partitionBy(p: Int): Segment = {
    val idenElem = (List[Int](), List[Int](), List[Int]())
    def partition(result: Segment, x: Int): Segment = {
      val (left, mid, right) = result
      if (x < p) (x :: left, mid, right)
      else if (x == p) (left, x :: mid, right)
      else (left, mid, x :: right)
    }
    list.foldLeft(idenElem)(partition)
  }
}
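The slide stops at the partition step. A minimal sketch of how a full quick sort might be built on top of partitionBy follows; the wrapper object QuickSortSketch and the function quickSort are hypothetical names that do not appear on the slide, and taking the head as pivot is an assumption.

```scala
object QuickSortSketch {
  type Segment = (List[Int], List[Int], List[Int])

  // Same idea as the slide: an implicit class adds partitionBy to List[Int].
  implicit class ListWithPartition(list: List[Int]) {
    def partitionBy(p: Int): Segment =
      list.foldLeft((List.empty[Int], List.empty[Int], List.empty[Int])) {
        case ((left, mid, right), x) =>
          if (x < p) (x :: left, mid, right)
          else if (x == p) (left, x :: mid, right)
          else (left, mid, x :: right)
      }
  }

  // Hypothetical helper: use the head as pivot, then sort both sides recursively.
  // mid always contains the pivot, so left and right are strictly shorter than list.
  def quickSort(list: List[Int]): List[Int] = list match {
    case Nil => Nil
    case pivot :: _ =>
      val (left, mid, right) = list.partitionBy(pivot)
      quickSort(left) ::: mid ::: quickSort(right)
  }
}
```

For example, QuickSortSketch.quickSort(List(3, 1, 2, 3, 0)) walks the list once per recursion level and concatenates the three segments.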
23. Side Effects
Value vs. address (reference)
class Pair[A](var x: A, var y: A) {
  def modifyX(x: A) = this.x = x
  def modifyY(y: A) = this.y = y
}
var pair = new Pair(1, 2)
var pair1 = new Pair(pair, pair)
var pair2 = new Pair(pair, new Pair(1, 2))
pair.modifyX(3)
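To make the side effect visible, the sketch below repeats the slide's setup and then reads the nested pairs. The wrapper object SideEffectDemo and the parameter name newX are assumptions added for illustration.

```scala
object SideEffectDemo {
  class Pair[A](var x: A, var y: A) {
    def modifyX(newX: A): Unit = this.x = newX
  }

  val pair = new Pair(1, 2)
  val pair1 = new Pair(pair, pair)            // both fields refer to the same object
  val pair2 = new Pair(pair, new Pair(1, 2))  // second field is an independent copy

  // Mutating pair is observable through every reference that points at it.
  pair.modifyX(3)
}
```

After the mutation, pair1.x.x and pair1.y.x both see the new value because they share one object, while pair2.y.x keeps the old value because it holds a separate Pair.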
34. Lazy Evaluation
lazy val x = 3 + 3
def number = {println("OK"); 3 + 3}
class LazyValue(expr: => Int) {
  var evaluated: Boolean = false
  var value: Int = -1
  def get: Int = {
    // Evaluate the by-name argument only on first access, then cache it.
    if (!evaluated) {
      value = expr
      evaluated = true
    }
    value
  }
}
Call By Name
val lazyValue = new LazyValue(number)
println(lazyValue.get)
println(lazyValue.get)
Thinking in Java
Map can be implemented with the Decorator pattern
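The point of LazyValue is that the by-name argument runs at most once. The sketch below checks this with a counter instead of println; the object name LazyDemo and the counter evaluations are assumptions for illustration.

```scala
object LazyDemo {
  // Counts how many times the expression is actually evaluated.
  var evaluations = 0
  def number: Int = { evaluations += 1; 3 + 3 }

  class LazyValue(expr: => Int) {
    private var evaluated = false
    private var value = -1
    def get: Int = {
      if (!evaluated) {
        value = expr        // the by-name argument is evaluated here, not at construction
        evaluated = true
      }
      value
    }
  }

  val lazyValue = new LazyValue(number)   // nothing is evaluated yet
  val evaluationsBeforeGet = evaluations
  val first = lazyValue.get               // triggers the single evaluation
  val second = lazyValue.get              // served from the cached value
}
```

Scala's built-in lazy val gives the same evaluate-once behavior without the hand-written flag.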
35. Higher-Order Functions
Transformation
map(f: T => U): A[U]
filter(f: T => Boolean): A[T]
flatMap(f: T => A[U]): A[U]
groupBy(f: T => K): A[(K, List[T])]
sortBy(f: T => K): A[T]
Action
count: Int
force: A[T]
reduce(f: (T, T) => T): T
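The split between transformations and actions can be sketched with a collection view, under the assumption that the slide's A stands for a lazy collection: transformations only describe work, and an action such as reduce runs it. The object name HigherOrderDemo and the counter calls are hypothetical.

```scala
object HigherOrderDemo {
  var calls = 0
  val numbers = List(1, 2, 3, 4, 5, 6)

  // Transformations on a view are recorded, not executed.
  val pending = numbers.view
    .map { n => calls += 1; n * n }
    .filter(_ % 2 == 0)
  val callsBeforeAction = calls

  // An action forces the pipeline: 4 + 16 + 36.
  val total = pending.reduce(_ + _)
}
```

On a strict List the same map and filter calls would run immediately; the view is what defers them until the action.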
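37. Some Syntactic Sugar
class Sugar(i: Int) {
  def unary_- = -i
  def apply(expr: => Unit) = for (j <- 1 to i) expr
  def +(that: Int) = i + that
  def +:(that: Int) = i + that
}
The goal is to build good DSLs
and to carry functional programming habits forward.
Please use these with care.
val sugar = new Sugar(2)
-sugar                  // prefix: sugar.unary_-
sugar(println("aha"))   // method name omitted: sugar.apply(...)
sugar + 5               // infix: sugar.+(5)
5 +: sugar              // right-associative: sugar.+:(5)
Operator precedence by first character, from lowest to highest:
all letters
|
^
&
= !
< >
:
+ -
* / %
all other special characters
Note: an operator whose name ends in ':' is right-associative.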
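A runnable sketch of the four forms of sugar, assuming the right-associative call is written 5 +: sugar (the method whose name ends in a colon). The object name SugarDemo and the counter runs are hypothetical.

```scala
object SugarDemo {
  class Sugar(i: Int) {
    def unary_- : Int = -i                                   // enables -sugar
    def apply(expr: => Unit): Unit = for (_ <- 1 to i) expr  // enables sugar { ... }
    def +(that: Int): Int = i + that                         // enables sugar + 5
    def +:(that: Int): Int = i + that                        // enables 5 +: sugar
  }

  val sugar = new Sugar(2)

  val negated = -sugar         // prefix form of sugar.unary_-
  var runs = 0
  sugar { runs += 1 }          // sugar.apply(...): the block runs i times
  val infix = sugar + 5        // sugar.+(5)
  val rightAssoc = 5 +: sugar  // sugar.+:(5), because the name ends in ':'
}
```

The by-name parameter of apply is what lets the block be re-evaluated on every iteration rather than once at the call site.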
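39. Trait & Mix-in
A mix-in is a means of multiple inheritance. Like an interface, it limits the complexity of multiple inheritance by restricting the second and later parents, but unlike an interface it carries default implementations.
1. Ordinary inheritance remains single inheritance.
2. The second and any further parents must be traits.
3. A trait cannot be instantiated on its own.
A trait in Scala can be mixed in at compile time (in a class declaration) or, as it is often put, at runtime (when an instance is created with new ... with ...).
Suppose we want to describe a bird that can sing and can run; since it is a bird, it can of course fly.
abstract class Bird(kind: String) {
  val name: String
  def singMyName = println(s"$name is singing")
  val capability: Int
  def run = println(s"I can run $capability meters!!!")
  def fly = println(s"flying of kind: $kind")
}
But clearly a person can also run and sing... and a person can program as well.
(I have nothing against birds, but if you ever meet one that can program, please let me know.)
Inheritance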
40.
trait Runnable {
  val capability: Int
  def run = println(s"I can run $capability meters!!!")
}
trait Singer {
  val name: String
  def singMyName = println(s"$name is singing")
}
abstract class Bird(kind: String) {
  def fly = println(s"flying of kind: $kind")
}
Inheritance
41.
class Nightingale extends Bird("Nightingale") with Singer with Runnable {
  val capability = 20
  val name = "poly"
}
val myTinyBird = new Nightingale
myTinyBird.fly
myTinyBird.singMyName
myTinyBird.run
class Coder(language: String) {
  val capability = 10
  val name = "Handemelindo"
  def code = println(s"coding in $language")
}
// Traits mixed in at the point of instantiation.
val me = new Coder("Scala") with Runnable with Singer
me.code
me.singMyName
me.run
Inheritance
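The two ways of mixing in traits can be put side by side in one self-contained sketch. It follows the slides, except that the methods return strings instead of printing so the results can be inspected; the wrapper object MixinDemo is a hypothetical name.

```scala
object MixinDemo {
  trait Runnable {
    val capability: Int
    def run: String = s"I can run $capability meters!!!"
  }
  trait Singer {
    val name: String
    def singMyName: String = s"$name is singing"
  }
  abstract class Bird(kind: String) {
    def fly: String = s"flying of kind: $kind"
  }

  // Mixed in at class declaration.
  class Nightingale extends Bird("Nightingale") with Singer with Runnable {
    val capability = 20
    val name = "poly"
  }

  class Coder(language: String) {
    val capability = 10
    val name = "Handemelindo"
    def code: String = s"coding in $language"
  }

  val bird = new Nightingale
  // Mixed in at instantiation: Coder's concrete vals satisfy the traits' abstract ones.
  val me = new Coder("Scala") with Runnable with Singer
}
```

Coder never mentions Runnable or Singer, yet the instance me gains both behaviors because its existing members already match what the traits require.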
43. Some Companions
Case Classes and ADTs
abstract class Tree
case class Leaf(info: String) extends Tree
case class Node(left: Tree, right: Tree) extends Tree
def traverse(tree: Tree): Unit = {
  tree match {
    case Leaf(info) => println(info)
    case Node(left, right) => {
      traverse(left)
      traverse(right)
    }
  }
}
val tree: Tree = new Node(new Node(new Leaf("1"), new Leaf("2")), new Leaf("3"))
traverse(tree)
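A small variation on the traversal, sketched under two assumptions that are not on the slide: the base class is declared sealed so the compiler can check that the match is exhaustive, and a hypothetical function leaves collects the leaf values instead of printing them.

```scala
object TreeDemo {
  sealed abstract class Tree
  case class Leaf(info: String) extends Tree
  case class Node(left: Tree, right: Tree) extends Tree

  // Pattern matching deconstructs each case class; results are combined left to right.
  def leaves(tree: Tree): List[String] = tree match {
    case Leaf(info)        => List(info)
    case Node(left, right) => leaves(left) ::: leaves(right)
  }

  // Case classes provide apply, so `new` is optional.
  val tree: Tree = Node(Node(Leaf("1"), Leaf("2")), Leaf("3"))
}
```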
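45.
[Figure: the levels of values, types, and kinds, after "Generics of a Higher Kind" - Martin Odersky]
Value: 1, (1, 2), [1, 2, 3]
Proper type (kind *): Any, Int, Pair[Int, Int], List[Int]
Type constructor: List (kind * => *), Pair (kind * => * => *)
Figure labels: type constructor, kind, subtype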
46.
type Int :: *
type String :: *
type (Int => String) :: *
type List[Int] :: *
type List :: ?
type Function1 :: ??
Let's do some abstraction exercises:
type List :: * => *
type Function1 :: * => * => *    (Function1[-T, +R])
def id(x: Int) = x
type Id[A] = A
def id(f: Int => Int, x: Int) = f(x)
type id[A[_], B] = A[B]
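The parallel between value-level functions and type-level functions can be checked in code. In the sketch below, Apply is a hypothetical renaming of the slide's id[A[_], B], and the object name KindDemo is also an assumption.

```scala
import scala.language.higherKinds

object KindDemo {
  // Kind * => *: takes a proper type and returns it unchanged.
  type Id[A] = A
  // Kind (* => *) => * => *: applies a type constructor to a proper type.
  type Apply[F[_], B] = F[B]

  val n: Id[Int] = 42                       // Id[Int] is just Int
  val xs: Apply[List, Int] = List(1, 2, 3)  // Apply[List, Int] is just List[Int]
}
```

Passing a proper type such as Int where F[_] is expected is rejected by the compiler, which is the kind system doing for types what the type system does for values.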
47.
Suppose our program has to return a result of this shape:
(Set(x,x,x,x,x), List(x,x,x,x,x,x,x,x,x,x))
(* -> *) -> (* -> *) -> *
type Pair[K[_], V[_]] = (K[A], V[A]) forSome { type A }
val pair: Pair[Set, List] = (Set("42"), List(52))
val pair: Pair[Set, List] = (Set(42), List(52))
Let's do some abstraction exercises.
49.
trait Monoid[A] {
  val zero: A
  def append(x: A, y: A): A
}
object IntNum extends Monoid[Int] {
  val zero = 0
  def append(x: Int, y: Int) = x + y
}
object DoubleNum extends Monoid[Double] {
  val zero = 0d
  def append(x: Double, y: Double) = x + y
}
// The monoid instance is passed explicitly as a second argument list.
def sum[A](nums: List[A])(tc: Monoid[A]) =
  nums.foldLeft(tc.zero)(tc.append)
sum(List(1, 2, 3, 5, 8, 13))(IntNum)
sum(List(3.14, 1.68, 2.72))(DoubleNum)
Abstracting over morphisms
50.
trait Monoid[A] {
  val zero: A
  def append(x: A, y: A): A
}
implicit object IntNum extends Monoid[Int] {
  val zero = 0
  def append(x: Int, y: Int) = x + y
}
implicit object DoubleNum extends Monoid[Double] {
  val zero = 0d
  def append(x: Double, y: Double) = x + y
}
// The compiler now finds the matching implicit instance by type.
def sum[A](nums: List[A])(implicit tc: Monoid[A]) =
  nums.foldLeft(tc.zero)(tc.append)
sum(List(1, 2, 3, 5, 8, 13))
sum(List(3.14, 1.68, 2.72))
Type Class
1. Separation of abstraction
2. Composable
3. Overridable
4. Type-safe
val list = List(1,3,234,56,5346,34)
list.sorted
sorted[B >: A](implicit ord: math.Ordering[B])
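The "composable" and "overridable" properties can be illustrated with a sketch: adding an instance for a new type needs no change to sum, and an explicit argument overrides the implicit one. StringConcat, IntProduct, and the wrapper object TypeClassDemo are hypothetical additions, not part of the slides.

```scala
object TypeClassDemo {
  trait Monoid[A] {
    def zero: A
    def append(x: A, y: A): A
  }

  implicit object IntSum extends Monoid[Int] {
    val zero = 0
    def append(x: Int, y: Int): Int = x + y
  }
  // Supporting a new type only requires a new instance.
  implicit object StringConcat extends Monoid[String] {
    val zero = ""
    def append(x: String, y: String): String = x + y
  }
  // Not implicit: used only when passed explicitly.
  object IntProduct extends Monoid[Int] {
    val zero = 1
    def append(x: Int, y: Int): Int = x * y
  }

  def sum[A](nums: List[A])(implicit tc: Monoid[A]): A =
    nums.foldLeft(tc.zero)(tc.append)

  val total = sum(List(1, 2, 3, 5, 8, 13))                // resolves IntSum
  val joined = sum(List("a", "b", "c"))                   // resolves StringConcat
  val product = sum(List(1, 2, 3, 5, 8, 13))(IntProduct)  // explicit override
}
```

Calling sum on a List[Boolean] would fail to compile because no Monoid[Boolean] is in scope, which is the type-safety item on the list.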
51. Contravariance and Covariance
List[+T]
class Person(name: String) {
  def shut = println(s"I am $name")
}
class Coder(language: String, name: String) extends Person(name) {
  def code = println(s"Coding in $language")
}
val persons: List[Coder] = List(new Coder("Java", "Jeff"),
  new Coder("Haskell", "Harry"))
def traverse(persons: List[Person]) = persons.foreach(_.shut)
// Accepted because List is covariant: List[Coder] is a subtype of List[Person].
traverse(persons)
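The slide shows covariance only. A sketch covering both directions follows, using Function1[-T, +R] for the contravariant case; the object name VarianceDemo, the public name field, and the two describe functions are assumptions for illustration.

```scala
object VarianceDemo {
  class Person(val name: String)
  class Coder(val language: String, name: String) extends Person(name)

  val coders: List[Coder] = List(new Coder("Java", "Jeff"), new Coder("Haskell", "Harry"))

  // Covariance (List[+T]): a List[Coder] can be used where a List[Person] is expected.
  val persons: List[Person] = coders

  // Contravariance (Function1[-T, +R]): a function that accepts any Person
  // can be used where a function on Coder is expected.
  val describePerson: Person => String = p => s"I am ${p.name}"
  val describeCoder: Coder => String = describePerson

  val descriptions = coders.map(describeCoder)
}
```

The reverse assignments (List[Person] to List[Coder], or Coder => String to Person => String) are rejected by the compiler.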
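59. Monad
A monoid in the category of endofunctors
Recall the identity element of a monoid
Recall the fold function
What is the identity element for endofunctors?
What is the associative operation for endofunctors?
unit x >>= f ≡ f x
m >>= unit ≡ m
(m >>= f) >>= g ≡ m >>= (λx . f x >>= g)
Identity: lifts a value into a computational context
Associativity: combines simple computations into complex ones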
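The three laws above can be checked on concrete values, taking Option as the monad, Some as unit, and flatMap as >>=. This is a spot check for one input, not a proof; the functions f and g and the object name MonadLawsDemo are hypothetical.

```scala
object MonadLawsDemo {
  def unit[A](a: A): Option[A] = Some(a)

  // Two arbitrary Kleisli arrows Int => Option[Int].
  val f: Int => Option[Int] = x => if (x > 0) Some(x * 2) else None
  val g: Int => Option[Int] = x => Some(x + 1)
  val m: Option[Int] = Some(4)

  // unit x >>= f  ≡  f x
  val leftIdentity = unit(4).flatMap(f) == f(4)
  // m >>= unit  ≡  m
  val rightIdentity = m.flatMap(x => unit(x)) == m
  // (m >>= f) >>= g  ≡  m >>= (λx . f x >>= g)
  val associativity = m.flatMap(f).flatMap(g) == m.flatMap(x => f(x).flatMap(g))
}
```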
60. Some Common Monads
Option
Option, also called Maybe, represents a computation that may fail.
It is represented by Some(value) or None.
Some(x) flatMap (f: A => Option[B]) = f(x)
None flatMap (f: A => Option[B]) = None
unit = Some
val maybe: Option[Int] = Some(4)
val none: Option[Int] = None
def calculate(maybe: Option[Int]): Option[Int] = for {
  value <- maybe
} yield value + 5
calculate(maybe)
calculate(none)
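A sketch of what the for comprehension does with Option, including a chained case where one step fails. The names calculateDesugared, safeDivide, and OptionDemo are hypothetical and not taken from the slide.

```scala
object OptionDemo {
  def calculate(maybe: Option[Int]): Option[Int] = for {
    value <- maybe
  } yield value + 5

  // A single generator followed by yield is sugar for map.
  def calculateDesugared(maybe: Option[Int]): Option[Int] =
    maybe.map(value => value + 5)

  // A computation that may fail.
  def safeDivide(a: Int, b: Int): Option[Int] =
    if (b == 0) None else Some(a / b)

  // Several generators desugar to flatMap; the first None short-circuits the rest.
  val chained: Option[Int] = for {
    x <- safeDivide(10, 2)
    y <- safeDivide(x, 0)
  } yield x + y
}
```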
61. Some Common Monads
List
A collection is itself a proper type; it represents nondeterminism.
unit = List
val list1 = List(2, 4, 6, 8)
val list2 = List(1, 3, 5, 7)
for {
  value1 <- list1
  value2 <- list2
} yield value1 + value2
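The comprehension over two lists is sugar for flatMap followed by map, and it produces every combination of the two inputs. The sketch below shows both spellings; the object name ListMonadDemo and the val names are assumptions.

```scala
object ListMonadDemo {
  val list1 = List(2, 4, 6, 8)
  val list2 = List(1, 3, 5, 7)

  val viaFor = for {
    value1 <- list1
    value2 <- list2
  } yield value1 + value2

  // What the compiler generates: flatMap for the outer generator, map for the last one.
  val viaFlatMap = list1.flatMap(value1 => list2.map(value2 => value1 + value2))
}
```

Each element of list1 is paired with all four elements of list2, so the result has 16 entries in outer-then-inner order.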
69.
RDD
Transformation
map(f: T => U)
filter(f: T => Boolean)
flatMap(f: T => Seq[U])
sample(fraction: Float)
groupByKey()
reduceByKey(f: (V, V) => V)
mapValues(f: V => W)
union()
join()
cogroup()
crossProduct()
sort(c: Comparator[K])
partitionBy(p: Partitioner[K])
Action
count()
collect()
reduce(f: (T, T) => T)
lookup(k: K)
save(path: String)
take(n: Int)
70. Word Count
[(K1, V1)] -> [(K2, [V2])] -> [(K2, V3)]
RDD
lines = spark.textFile("hdfs://...")
words = lines.flatMap(_.split("\\s+"))
wordCounts = words.map((_, 1))
result = wordCounts.reduceByKey(_ + _)
result.save("hdfs://…")
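The same pipeline can be tried without a cluster. The sketch below is a local analogue on plain Scala collections, not Spark code: groupBy plus a per-key sum stands in for reduceByKey, and the input lines and the object name WordCountSketch are made up for illustration.

```scala
object WordCountSketch {
  val lines = List("to be or not to be", "to do is to be")

  // [(K1, V1)] -> words: split every line on whitespace.
  val words = lines.flatMap(_.split("\\s+"))
  // Emit (word, 1) pairs, as in the map phase.
  val pairs = words.map((_, 1))
  // Group by key and sum the values: the local stand-in for reduceByKey(_ + _).
  val counts: Map[String, Int] =
    pairs.groupBy(_._1).map { case (word, ps) => (word, ps.map(_._2).sum) }
}
```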
79. MLlib
Classification
SVM with SGD
Naive Bayes (NB)
Various decision trees
LabeledPoint(Double, Vector)
val data = sc.textFile("….")
// Each line: label followed by space-separated features.
val parsedData = data.map { line =>
  val parts = line.split(' ')
  LabeledPoint(parts(0).toDouble, parts.tail.map(x => x.toDouble).toArray)
}
val numIterations = 20
val model = SVMWithSGD.train(parsedData, numIterations)
val labelAndPreds = parsedData.map { point =>
  val prediction = model.predict(point.features)
  (point.label, prediction)
}
// Training error: fraction of points whose prediction differs from the label.
val trainErr = labelAndPreds.filter(r => r._1 != r._2).count.toDouble / parsedData.count
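The training-error arithmetic on the last line can be tried locally. The sketch below uses a plain List of made-up (label, prediction) pairs in place of an RDD; the object name TrainErrSketch and the data are assumptions.

```scala
object TrainErrSketch {
  // Hypothetical (label, prediction) pairs standing in for labelAndPreds.
  val labelAndPreds = List((1.0, 1.0), (0.0, 1.0), (1.0, 1.0), (0.0, 0.0))

  // Fraction of points whose prediction differs from the label.
  val trainErr =
    labelAndPreds.count { case (label, pred) => label != pred }.toDouble / labelAndPreds.size
}
```

One of the four pairs is misclassified, so the error is one quarter.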
80. MLlib
Regression
Logistic regression
Ridge regression and lasso regression
LabeledPoint(Double <- Vector)
val data = sc.textFile("….")
// Each line: label, then a comma, then space-separated features.
val parsedData = data.map { line =>
  val parts = line.split(',')
  LabeledPoint(parts(0).toDouble, parts(1).split(' ').map(x => x.toDouble).toArray)
}
val numIterations = 20
val model = LinearRegressionWithSGD.train(parsedData, numIterations)
val valuesAndPreds = parsedData.map { point =>
  val prediction = model.predict(point.features)
  (point.label, prediction)
}
// Mean squared error between labels and predictions.
val MSE = valuesAndPreds.map { case (v, p) =>
  math.pow((v - p), 2) }.reduce(_ + _) / valuesAndPreds.count
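The mean squared error on the last two lines is the average of the squared differences between label and prediction. A local sketch on a plain List follows; the data and the object name MseSketch are made up.

```scala
object MseSketch {
  // Hypothetical (value, prediction) pairs standing in for valuesAndPreds.
  val valuesAndPreds = List((3.0, 2.0), (1.0, 1.0), (4.0, 6.0))

  // Squared errors are 1, 0, and 4; their mean is 5 / 3.
  val mse = valuesAndPreds.map { case (v, p) => math.pow(v - p, 2) }.sum / valuesAndPreds.size
}
```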
81. MLlib
Clustering
k-means and its variant k-means++
Vector
val data = sc.textFile("….")
// Each line: space-separated coordinates of one point.
val parsedData = data.map( _.split(' ').map(_.toDouble))
val numIterations = 20
val numClusters = 2
val clusters = KMeans.train(parsedData, numClusters, numIterations)
// Within-set sum of squared errors.
val WSSSE = clusters.computeCost(parsedData)
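WSSSE is the within-set sum of squared errors: for every point, the squared distance to its nearest cluster center, summed over all points. The sketch below computes it locally under that definition; it is not MLlib's implementation, and the names WssseSketch, squaredDistance, wssse, and the sample data are hypothetical.

```scala
object WssseSketch {
  type Point = Array[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // For each point take the distance to the closest center, then add everything up.
  def wssse(points: Seq[Point], centers: Seq[Point]): Double =
    points.map(p => centers.map(c => squaredDistance(p, c)).min).sum

  val points = Seq(Array(0.0, 0.0), Array(0.0, 2.0), Array(10.0, 10.0))
  val centers = Seq(Array(0.0, 1.0), Array(10.0, 10.0))

  // Contributions are 1, 1, and 0.
  val cost = wssse(points, centers)
}
```

A lower value means points sit closer to their assigned centers, which is why it is commonly used to compare choices of the number of clusters.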
82. MLlib
Collaborative Filtering
ALS with support for explicit and implicit feedback
Rating(Int, Int, Double)
val data = sc.textFile("….")
// Each line: user,item,rate
val ratings = data.map(_.split(',') match {
  case Array(user, item, rate) => Rating(user.toInt, item.toInt, rate.toDouble)
})
val numIterations = 20
// Arguments: ratings, rank = 1, iterations, lambda = 0.01
val model = ALS.train(ratings, 1, numIterations, 0.01)
val usersProducts = ratings.map { case Rating(user, product, rate) => (user, product) }
val predictions = model.predict(usersProducts).map {
  case Rating(user, product, rate) => ((user, product), rate)
}
// Join actual and predicted ratings on the (user, product) key.
val ratesAndPreds = ratings.map {
  case Rating(user, product, rate) => ((user, product), rate)
}.join(predictions)
val MSE = ratesAndPreds.map {
  case ((user, product), (r1, r2)) => math.pow((r1 - r2), 2)
}.reduce(_ + _) / ratesAndPreds.count