- Supervised Learning
- Linear Regression
- Classification
- Logistic Regression
- Support Vector Machine: handles non-linear classification through kernel functions
- Unsupervised Learning
- Lower dimension representation
- Principal Component Analysis
- Sparse representation
- K-Means
- Gaussian Mixture Models
- Independent representation
- Principal Component Analysis
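K-Means is listed under sparse representation because it can encode each point as a one-hot vector: a single 1 at the index of the nearest cluster center and 0 everywhere else. The following is a minimal sketch of that idea on scalar data, not a production implementation. The function names `kmeans_1d` and `one_hot`, the toy data, and the starting centers are all assumptions made for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd's algorithm on scalar data; returns the final cluster centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            clusters[nearest].append(x)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[k]
            for k, c in enumerate(clusters)
        ]
    return centers


def one_hot(x, centers):
    """Encode x as a vector with a single 1 at the index of its nearest center."""
    nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
    return [1 if k == nearest else 0 for k in range(len(centers))]


# Toy data, made up for illustration: two well-separated groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers = kmeans_1d(points, centers=[0.0, 6.0])
codes = [one_hot(x, centers) for x in points]
```

Each entry of `codes` has exactly one non-zero element, which is what makes the representation sparse.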
Cost Function
- Regularization
- Maximum Likelihood
- KL divergence
- cross-entropy
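The last three items are closely related. For discrete distributions p and q, cross-entropy decomposes as H(p, q) = H(p) + KL(p || q), so with p fixed, minimizing cross-entropy is the same as minimizing KL divergence. Maximum likelihood fits the same picture when p is the empirical data distribution and q is the model. The sketch below checks the identity numerically. The two distributions are arbitrary example values, and the code assumes q is non-zero wherever p is non-zero.

```python
import math


def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy H(p, q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Arbitrary example distributions over three outcomes.
p = [0.5, 0.25, 0.25]
q = [0.4, 0.4, 0.2]

# Difference between the two sides of H(p, q) = H(p) + KL(p || q);
# it should vanish up to floating-point rounding.
gap = cross_entropy(p, q) - (entropy(p) + kl_divergence(p, q))
```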
Graph Processing
- frameworks
- PageRank: Google's ranking algorithm on a directed graph
- Pregel
- Giraph
- GraphLab
- GraphX
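To make the PageRank entry concrete, here is a minimal power-iteration sketch in plain Python. It is not how any of the frameworks above implement it. The function name `pagerank`, the damping factor 0.85, the iteration count, and the four-vertex example graph are assumptions chosen for illustration.

```python
def pagerank(edges, num_vertices, damping=0.85, iterations=50):
    """Power iteration over a directed edge list of (src, dst) pairs.

    Vertices are assumed to be numbered 0 .. num_vertices - 1.
    """
    out_degree = [0] * num_vertices
    for src, _ in edges:
        out_degree[src] += 1

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iterations):
        # Rank held by vertices without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in range(num_vertices) if out_degree[v] == 0)
        base = (1.0 - damping) / num_vertices + damping * dangling / num_vertices
        new_rank = [base] * num_vertices
        # Each vertex passes its rank evenly along its outgoing edges.
        for src, dst in edges:
            new_rank[dst] += damping * rank[src] / out_degree[src]
        rank = new_rank
    return rank


# Example graph: a cycle 0 -> 1 -> 2 -> 0, plus vertex 3 pointing at 2.
edges = [(0, 1), (1, 2), (2, 0), (3, 2)]
ranks = pagerank(edges, num_vertices=4)
```

The ranks sum to 1. Vertex 2 ends up with the highest rank because it receives links from both vertex 1 and vertex 3, while vertex 3 has no incoming links and keeps only the baseline share.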
GraphX
GraphX represents a graph as an RDD of vertices and an RDD of edges
- Connected Components:
org.apache.spark.graphx.lib.ConnectedComponents
- Triangle Counting:
org.apache.spark.graphx.lib.TriangleCount
- Shortest Paths:
org.apache.spark.graphx.lib.ShortestPaths
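The three library routines listed above can be illustrated without Spark. The sketch below works on a plain vertex list and edge list, loosely mirroring the vertex and edge collections GraphX uses. It is a single-machine illustration of what each algorithm computes, not the GraphX API. All function names and the example graph are assumptions, edges are treated as undirected, and self-loops are assumed absent.

```python
from collections import deque


def build_adjacency(vertices, edges):
    """Undirected adjacency sets built from a vertex list and an edge list."""
    adjacency = {v: set() for v in vertices}
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    return adjacency


def connected_components(vertices, edges):
    """Label each vertex with the smallest vertex id in its component."""
    adjacency = build_adjacency(vertices, edges)
    label = {}
    for start in sorted(vertices):
        if start in label:
            continue
        # Breadth-first search from the smallest unlabeled vertex.
        label[start] = start
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in label:
                    label[w] = start
                    queue.append(w)
    return label


def triangle_count(vertices, edges):
    """Count, for each vertex, the triangles that pass through it."""
    adjacency = build_adjacency(vertices, edges)
    counts = {}
    for v in vertices:
        # Every triangle through v is seen once from each of its two other corners.
        shared = sum(len(adjacency[v] & adjacency[w]) for w in adjacency[v])
        counts[v] = shared // 2
    return counts


def hop_distances(vertices, edges, landmark):
    """Hop-count shortest paths from a landmark vertex to every reachable vertex."""
    adjacency = build_adjacency(vertices, edges)
    distance = {landmark: 0}
    queue = deque([landmark])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in distance:
                distance[w] = distance[v] + 1
                queue.append(w)
    return distance


# Example graph: a triangle 1-2-3 with a tail 3-4, and a separate edge 5-6.
vertices = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (5, 6)]

components = connected_components(vertices, edges)
triangles = triangle_count(vertices, edges)
hops = hop_distances(vertices, edges, landmark=1)
```

Vertices 5 and 6 do not appear in `hops` because they cannot be reached from the landmark.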