This is one of the simplest examples to run Spark. In my case, I didn't install Spark by following the instructions in the post. Instead, I used "sdkman", which is much simpler. Once Spark was ready, I ran the commands below.
```scala
// Execute "spark-shell" to start the REPL.

// Check the Spark version. For me it's "2.3.0".
sc.version

// Load a file into the Scala shell with the help of the SparkContext:
var data = sc.textFile("/Users/maxhuang/.sdkman/candidates/spark/current/README.md")

// Split each line into tokens of separate words:
var tokens = data.flatMap(s => s.split(" "))

// Pair each word with a count of 1:
var tokens_1 = tokens.map(s => (s, 1))

// Calculate the frequency of each word by summing the ones for that word:
var sum_each = tokens_1.reduceByKey((a, b) => a + b)

// Check the output:
sum_each.collect()

// Save the output to a file in the local file system:
sum_each.saveAsTextFile("/Users/maxhuang/spart_out")
```
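To see what each transformation is doing without a running cluster, the same word-count pipeline can be sketched with plain Scala collections. This is only an illustration: `WordCountSketch`, `wordCount`, and the sample `lines` are made-up names standing in for the README file, and `groupBy` plus a sum plays the role that `reduceByKey` plays on an RDD.

```scala
// A minimal, cluster-free sketch of the word-count pipeline above,
// using plain Scala collections instead of Spark RDDs.
object WordCountSketch {
  def wordCount(lines: Seq[String]): Map[String, Int] = {
    // Split each line into tokens, like RDD.flatMap:
    val tokens = lines.flatMap(_.split(" "))
    // Pair each word with a count of 1, like RDD.map:
    val pairs = tokens.map(w => (w, 1))
    // Sum the counts per word; groupBy + sum stands in for reduceByKey:
    pairs.groupBy(_._1).map { case (w, ps) => (w, ps.map(_._2).sum) }
  }

  def main(args: Array[String]): Unit = {
    // Sample input standing in for the README contents:
    val lines = Seq("apache spark", "spark shell")
    println(wordCount(lines))
  }
}
```

Running it prints each distinct word with its frequency; on the sample input, "spark" appears twice and the other words once.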