Gangmax Blog

Spark Example

From here.

This is the simplest example of running Spark. In my case, I didn’t install Spark by following the instructions in the post. Instead, I used “sdkman”, which is much simpler. Once Spark is ready, run the commands below.

// Execute "spark-shell" to start the REPL.
// Check the Spark version. For me it's "2.3.0".
sc.version
// Load a file into the spark-shell as an RDD of lines, using the SparkContext (sc):
val Data = sc.textFile("/Users/maxhuang/.sdkman/candidates/spark/current/README.md")
// Split each line into separate words:
val tokens = Data.flatMap(s => s.split(" "))
// Pair each word with a count of 1:
val tokens_1 = tokens.map(s => (s, 1))
// Compute the frequency of each word by summing the 1s for that word:
val sum_each = tokens_1.reduceByKey((a, b) => a + b)
// Let’s check the output:
sum_each.collect()
// Save the output to the local file system (Spark writes a directory of part files at this path):
sum_each.saveAsTextFile("/Users/maxhuang/spart_out")
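
To see what each step produces without starting Spark, here is a minimal sketch of the same pipeline written against plain Scala collections. It is not the Spark API: `WordCountSketch`, `wordCount` and `sampleLines` are hypothetical names made up for illustration, the two sample lines stand in for the README, and `groupBy` plus a sum is used as a stand-in for `reduceByKey`.

```scala
// Plain-Scala sketch of the word-count steps, with no Spark dependency.
// WordCountSketch, wordCount and sampleLines are hypothetical names for illustration.
object WordCountSketch {
  // Stand-in for the lines that sc.textFile would read from README.md.
  val sampleLines: Seq[String] = Seq("spark is fast", "spark is simple")

  def wordCount(lines: Seq[String]): Map[String, Int] = {
    // Split each line into words (the role flatMap plays on the RDD).
    val tokens = lines.flatMap(line => line.split(" "))
    // Pair each word with 1 (the role map plays on the RDD).
    val pairs = tokens.map(word => (word, 1))
    // Group the pairs by word and sum the 1s; this mimics reduceByKey((a, b) => a + b).
    pairs.groupBy(_._1).map { case (word, ps) => (word, ps.map(_._2).sum) }
  }

  def main(args: Array[String]): Unit = {
    // Sort by word so the printed order is stable.
    wordCount(sampleLines).toSeq.sortBy(_._1).foreach(println)
    // prints:
    // (fast,1)
    // (is,2)
    // (simple,1)
    // (spark,2)
  }
}
```

One thing this makes easy to check: `split(" ")` returns an empty string for a blank line or between consecutive spaces, so the counts for a real file such as the README will likely include an entry for the empty word. Filtering with something like `tokens.filter(_.nonEmpty)` before counting would drop it.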
