Spark SQL supports stream-batch, batch-batch, and stream-stream JOINs. The semantics are the same as those of a traditional batch JOIN.

Syntax

tableReference [, tableReference ]* | tableexpression
[ joinType ] JOIN tableexpression [ joinCondition ];
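
As an illustration of the syntax above, the following is a minimal sketch in Python of a stream-static inner JOIN issued through spark.sql. The view names (orders, customers), the column names, and the helper functions are hypothetical and chosen only for this example; the sketch assumes an existing SparkSession is passed in and uses the built-in rate source as a stand-in for a real stream.

```python
# Hypothetical query that follows the grammar above:
# tableReference [ joinType ] JOIN tableexpression [ joinCondition ]
JOIN_QUERY = """
SELECT o.order_id, o.order_time, c.name
FROM orders AS o
INNER JOIN customers AS c
ON o.customer_id = c.customer_id
"""


def register_views(spark):
    """Register one static and one streaming temp view (names are assumptions)."""
    # Static side: a small in-memory table.
    spark.createDataFrame(
        [(0, "alice"), (1, "bob")], ["customer_id", "name"]
    ).createOrReplaceTempView("customers")

    # Streaming side: the rate source emits (timestamp, value) rows.
    (
        spark.readStream.format("rate")
        .option("rowsPerSecond", 5)
        .load()
        .selectExpr(
            "value AS order_id",
            "value % 2 AS customer_id",
            "timestamp AS order_time",
        )
        .createOrReplaceTempView("orders")
    )


def run_join(spark):
    """Return the streaming DataFrame produced by the JOIN query."""
    register_views(spark)
    return spark.sql(JOIN_QUERY)
```

The result of run_join is itself a streaming DataFrame, so it still has to be started with a sink, for example run_join(spark).writeStream.format("console").start().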

Constraints

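Some JOIN types are not supported when a JOIN involves streaming data. For details, see the official Spark documentation. The following table summarizes the common cases: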
Left table | Right table | Join type | Supported
-----------|-------------|-----------|----------
Stream | Static | Inner | Supported, not stateful
Stream | Static | Left Outer | Supported, not stateful
Stream | Static | Right Outer | Not supported
Stream | Static | Full Outer | Not supported
Static | Stream | Inner | Supported, not stateful
Static | Stream | Left Outer | Not supported
Static | Stream | Right Outer | Supported, not stateful
Static | Stream | Full Outer | Not supported
Stream | Stream | Inner | Supported; optionally specify a watermark on both sides plus time constraints for state cleanup
Stream | Stream | Left Outer | Conditionally supported; a watermark on the right side plus time constraints is required for correct results, and a watermark on the left side is optional for full state cleanup
Stream | Stream | Right Outer | Conditionally supported; a watermark on the left side plus time constraints is required for correct results, and a watermark on the right side is optional for full state cleanup
Stream | Stream | Full Outer | Not supported
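
The stream-stream Left Outer case is the one most easily misconfigured, so a minimal sketch in Python may help. It declares watermarks on both inputs and adds a time-range constraint to the join condition, as the constraints above require. The column names (adId, clickAdId, impressionTime, clickTime), the delay values, and both helper functions are assumptions made for this example, and the rate source stands in for real input streams.

```python
def time_bounded_condition(left_key, right_key, left_time, right_time, max_delay):
    """Build an equi-join condition plus a time-range constraint.

    max_delay is an interval literal body such as "1 hour" (assumed format).
    """
    return (
        f"{right_key} = {left_key} AND "
        f"{right_time} >= {left_time} AND "
        f"{right_time} <= {left_time} + interval {max_delay}"
    )


def left_outer_stream_join(spark):
    """Left outer join of two streams; requires pyspark and a SparkSession."""
    from pyspark.sql.functions import expr

    def rate_stream():
        # The rate source emits (timestamp, value) rows.
        return spark.readStream.format("rate").option("rowsPerSecond", 5).load()

    # Left side: the watermark is optional, but it allows full state cleanup.
    impressions = (
        rate_stream()
        .selectExpr("value AS adId", "timestamp AS impressionTime")
        .withWatermark("impressionTime", "2 hours")
    )

    # Right side: the watermark is required for correct left outer results.
    clicks = (
        rate_stream()
        .selectExpr("value AS clickAdId", "timestamp AS clickTime")
        .withWatermark("clickTime", "3 hours")
    )

    condition = time_bounded_condition(
        "adId", "clickAdId", "impressionTime", "clickTime", "1 hour"
    )
    return impressions.join(clicks, expr(condition), "leftOuter")
```

For a Right Outer JOIN the roles are mirrored: the watermark on the left side becomes the required one and the join type string changes to "rightOuter".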