Spark Streaming :- Word Counts for Text Files in a Folder



In this blog I will write a Spark Streaming program that reads text files from a folder at a regular interval and prints the word counts.

The program is as follows.

Note :- I use Maven in this project (you can find the Maven dependency in my previous post).
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object teststreaming {

  def main(args: Array[String]) {

    // Needed on Windows so Hadoop can locate winutils.exe
    System.setProperty("hadoop.home.dir", "c://winutil//")
    val conf = new SparkConf().setAppName("Application").setMaster("local[2]")
    // Process the monitored folder in 30-second batches
    val ssc = new StreamingContext(conf, Seconds(30))
    // Watch the folder for newly arriving text files
    val input = ssc.textFileStream("file:///C://Users//HA848869//Desktop//sparkdata//")
    // Split each line into words, pair each word with 1, and sum the counts
    val words = input.flatMap(_.split(" "))
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)
    // Print the word counts of each batch to the console
    counts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
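One thing to keep in mind when testing: `textFileStream` only notices files that appear in the monitored directory *after* the streaming context starts, and each file should appear atomically (a half-written file may be read too early). A common trick is to write the file somewhere else and then move it into the folder. Below is a minimal sketch of such a helper; the `DropTestFile` object name is just for illustration, and the move is atomic only when the staging file and the watched folder are on the same filesystem.

```scala
import java.nio.file.{Files, Paths, StandardCopyOption}

object DropTestFile {
  // Write the text to a staging file first, then rename it into the
  // watched folder so the streaming job sees a complete file at once.
  def dropFile(watchedDir: String, name: String, text: String): Unit = {
    val tmp = Files.createTempFile("staging-", ".txt")
    Files.write(tmp, text.getBytes("UTF-8"))
    Files.move(tmp, Paths.get(watchedDir, name), StandardCopyOption.ATOMIC_MOVE)
  }
}
```

For example, `DropTestFile.dropFile("C:/Users/HA848869/Desktop/sparkdata", "test1.txt", "hello hello world")` would make the file visible to the stream, and the next 30-second batch would print the counts for its words.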


