How to work with joins in Spark
Spark supports several join types. This post covers the following:
Inner join
Left join
Right join
Full outer join
Inputs
Table: marks
id  name   marks
1   Shiva  90
2   Ram    85
3   Mohan  95
4   Raju   96
Table: location
id  location
1   Hyderabad
2   Chennai
5   Banglore
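# Read both CSV files into DataFrames. `spark` is the SparkSession,
# which already exists in the pyspark shell and in most notebooks.
marks = spark.read.option("delimiter", ",").option("header", "true").option("inferSchema", "true").csv("C:\\demo\\files\\marks_data.csv")
location = spark.read.option("delimiter", ",").option("header", "true").option("inferSchema", "true").csv("C:\\demo\\files\\location.csv")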
Inner Join:
joined_df = marks.join(location,location["id"] == marks["id"],"inner")
joined_df.show()
Left Join:
joined_df = marks.join(location,location["id"] == marks["id"],"left")
joined_df.show()
Right Join:
joined_df = marks.join(location,location["id"] == marks["id"],"right")
joined_df.show()
Full Outer Join:
joined_df = marks.join(location,location["id"] == marks["id"],"full_outer")
joined_df.show()
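To check which rows each of the join types covered above should return for the sample tables, here is a plain-Python sketch of the join semantics. It does not use Spark: the function join_rows is a hypothetical helper, the data is hard-coded from the two input tables, and it assumes ids are unique within each table. It also collapses the two id columns into one, whereas the expression-based Spark joins above keep both. None plays the role of Spark's null.

```python
# Sample rows copied from the two input tables, keyed by id.
marks = {1: ("Shiva", 90), 2: ("Ram", 85), 3: ("Mohan", 95), 4: ("Raju", 96)}
location = {1: "Hyderabad", 2: "Chennai", 5: "Banglore"}


def join_rows(how):
    """Return (id, name, marks, location) tuples; None marks a missing side."""
    if how == "inner":
        ids = marks.keys() & location.keys()   # ids present in both tables
    elif how == "left":
        ids = marks.keys()                     # every id from the left table
    elif how == "right":
        ids = location.keys()                  # every id from the right table
    elif how == "full_outer":
        ids = marks.keys() | location.keys()   # ids present in either table
    else:
        raise ValueError(f"unsupported join type: {how}")

    rows = []
    for key in sorted(ids):
        name, score = marks.get(key, (None, None))
        rows.append((key, name, score, location.get(key)))
    return rows


for how in ("inner", "left", "right", "full_outer"):
    print(how)
    for row in join_rows(how):
        print("  ", row)
```

For this data the inner join keeps ids 1 and 2, the left join keeps all four rows from marks, the right join keeps all three rows from location, and the full outer join returns five rows (ids 1 to 5).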