Group-by aggregation in Spark to sum marks

How to use groupBy in Spark to sum values.

Input:

id,name,subject,marks

1,Shiva,Electrical,90

2,Ram,Electrical,85

3,Mohan,Electrical,95

4,Raju,Electrical,96

1,Shiva,Computers,91

2,Ram,Computers,80

3,Mohan,Computers,97

4,Raju,Computers,90


Code:

Suppose the DataFrame is named df. To get each student's total, group by id and name and sum the marks across all subjects.

result_df = df.groupBy("id","name").sum("marks")

result_df.show()


Output:

id  name   sum(marks)

3   Mohan  192

1   Shiva  181

4   Raju   186

2   Ram    165






 
