Spark: Maximum salary by city

Hi,

 

In this blog post I will write a program to find the maximum salary of employees by city.

 

Input (each line is: id, name, city, country, salary):

003 Amit Delhi India 12000
004 Anil Delhi India 15000
005 Deepak Delhi India 34000
006 Fahed Agra India 45000
007 Ravi Patna India 98777
008 Avinash Punjab India 120000
009 Saajan Punjab India 54000
001 Harit Delhi India 20000
002 Hardy Agra India 20000

 

Our program looks like this:

import org.apache.spark._

object salmax extends App {

  // On Windows, point Hadoop at the directory containing winutils.exe
  System.setProperty("hadoop.home.dir", "c:/winutil/")

  val conf = new SparkConf().setMaster("local[2]").setAppName("testfilter")
  val sc = new SparkContext(conf)

  // Each input line is: id name city country salary.
  // Key each record by city with (name, salary) as the value,
  // then group all employees of the same city together.
  val rdd2 = sc.textFile("file:///D:/sparkprog/inputdata/maxsalary")
    .map(_.split(" "))
    .map(x => (x(2), (x(1), x(4).toDouble)))
    .groupByKey()

  // Bring the grouped data to the driver and, for each city,
  // keep only the employee(s) earning the maximum salary.
  for (i <- rdd2.collect()) {
    val maxSalary = i._2.map(_._2).max
    println((i._1, i._2.filter(_._2 == maxSalary).toList))
  }
}

 

 

Our output will be:

(Delhi,List((Deepak,34000.0)))
(Patna,List((Ravi,98777.0)))
(Punjab,List((Avinash,120000.0)))
(Agra,List((Fahed,45000.0)))
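As a side note, groupByKey ships every (name, salary) pair of a city to one place before the maximum is taken. A leaner way is to let reduceByKey keep only the current top earner per city as it goes. Here is a minimal sketch of that variant (it assumes the same sc and input file as above; the name maxByCity is my own, and on a salary tie it keeps just one employee):

// Alternative sketch: reduceByKey keeps only the highest-paid pair per
// city, so the full per-city group never has to be shuffled or stored.
val maxByCity = sc.textFile("file:///D:/sparkprog/inputdata/maxsalary")
  .map(_.split(" "))
  .map(x => (x(2), (x(1), x(4).toDouble)))           // (city, (name, salary))
  .reduceByKey((a, b) => if (a._2 >= b._2) a else b) // keep the higher salary

maxByCity.collect().foreach(println)                 // e.g. (Delhi,(Deepak,34000.0))

The top salary per city (though without the employee name) can also be computed with the DataFrame API; a minimal sketch, assuming Spark 1.3+ so that toDF on an RDD of tuples is available:

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

val salaries = sc.textFile("file:///D:/sparkprog/inputdata/maxsalary")
  .map(_.split(" "))
  .map(x => (x(2), x(1), x(4).toDouble))
  .toDF("city", "name", "salary")

// Maximum salary per city; note this aggregation drops the employee name
salaries.groupBy("city").max("salary").show()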

 

 

Hope you guys understood the program. If you have any doubts, please comment. Like and share!

Thanks.

Cheers

