In Spark Structured Streaming, joins are supported only in certain scenarios. The main ones are given below:
Stream with Stream:
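i. If both dfs are streams, the result is also a stream. An inner join is supported; a watermark is optional but recommended so that Spark can drop old state. Left and right outer joins are supported only when a watermark and a time-range condition are specified, because Spark has to know when a row can no longer find a match before it emits that row with NULLs. A full outer join was not supported in older releases; since Spark 3.1 it is supported under the same watermark and time-constraint conditions. In every outer case the unmatched rows are emitted with a delay, because one stream has to wait until late data from the other can no longer arrive.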
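A minimal PySpark sketch of a stream-stream left outer join with watermarks and a time-range condition. The two built-in `rate` sources, the column names (`impression_ad_id`, `click_ad_id`, and so on), the watermark delays and the one-minute window are all illustrative assumptions, not values from the question:

```python
# Join condition: equality on the key plus a time range, which lets Spark
# decide when an impression can no longer be matched by a click.
JOIN_CONDITION = """
    click_ad_id = impression_ad_id AND
    click_time >= impression_time AND
    click_time <= impression_time + interval 1 minute
"""


def build_stream_stream_left_join(spark):
    """Return a streaming DataFrame: impressions LEFT OUTER JOIN clicks."""
    # Imported here so the module can be loaded without a Spark installation.
    from pyspark.sql import functions as F

    # Hypothetical input streams built from the "rate" test source,
    # which produces the columns `timestamp` and `value`.
    impressions = (
        spark.readStream.format("rate").option("rowsPerSecond", 5).load()
        .select(
            F.col("value").alias("impression_ad_id"),
            F.col("timestamp").alias("impression_time"),
        )
        .withWatermark("impression_time", "10 seconds")
    )
    clicks = (
        spark.readStream.format("rate").option("rowsPerSecond", 5).load()
        .select(
            F.col("value").alias("click_ad_id"),
            F.col("timestamp").alias("click_time"),
        )
        .withWatermark("click_time", "20 seconds")
    )

    # Unmatched impressions are emitted with NULL click columns only after
    # the watermark guarantees that no matching click can still arrive.
    return impressions.join(clicks, F.expr(JOIN_CONDITION), "leftOuter")
```

To run it, pass an existing SparkSession and start the query in append mode, for example `build_stream_stream_left_join(spark).writeStream.format("console").outputMode("append").start()`.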
Static and Stream:
ii. If one df is static and the other is a stream, an inner join is supported. A left join is supported only when the left table is the stream, and a right join only when the right table is the stream. The reason is that an outer join must emit every row of the preserved side, including rows that have no match. When the preserved side is the static table, Spark can never be sure that a static row is unmatched, because a matching row may still arrive on the stream.
iii. A full join between static and stream is not supported for the same reason: it would have to preserve unmatched rows of the static side, and since the stream may still deliver matching rows later, the result could never be finalized.
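The rules above can be summarized in a small helper. This is only a sketch: `is_join_supported` is a hypothetical function, not part of Spark, it assumes Spark 3.1 or later for the stream-stream full join, and it ignores the extra conditions (watermarks, time constraints) that outer stream-stream joins need:

```python
def is_join_supported(left_is_stream, right_is_stream, how):
    """Return True if Structured Streaming can run this join type."""
    how = how.lower()
    if how not in {"inner", "left", "right", "full"}:
        raise ValueError(f"unknown join type: {how}")

    # Static-static is an ordinary batch join: everything is allowed.
    if not left_is_stream and not right_is_stream:
        return True

    # Stream-stream: all four types, with watermark and time constraints
    # required for the outer variants (full outer assumes Spark 3.1+).
    if left_is_stream and right_is_stream:
        return True

    # Static-stream: the preserved side of an outer join must be the stream.
    if how == "inner":
        return True
    if how == "left":
        return left_is_stream
    if how == "right":
        return right_is_stream
    return False  # full join would have to preserve the static side too
```

With real DataFrames the two flags can be taken from the `isStreaming` property, for example `is_join_supported(left_df.isStreaming, right_df.isStreaming, "left")`.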