HiveContext vs Spark SQLContext

HiveContext: org.apache.spark.sql.hive.HiveContext

SQLContext: org.apache.spark.sql.SQLContext
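
A minimal Spark 1.x sketch showing how each context is constructed; the app name and master URL are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.hive.HiveContext

    val conf = new SparkConf().setAppName("context-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Plain SQLContext: basic Spark SQL, no Hive metastore access.
    val sqlContext = new SQLContext(sc)

    // HiveContext: built on top of SQLContext, backed by the Hive metastore.
    val hiveContext = new HiveContext(sc)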

You need HiveContext rather than the plain SQLContext if you want to:

  • Use Hive metastore tables and views

  • Launch the Thrift server

  • Run window functions (rank, dense_rank, lag, lead); see the sketch after this list

  • Use Hive UDFs

  • Persist table schemas for Spark SQL

  • Enable the Thrift JDBC service for Spark SQL
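
For example, in Spark 1.x window functions only work through HiveContext. The sketch below reuses the hiveContext from above; "sales" is a hypothetical metastore table with columns dept, name, and amount:

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.{lag, rank}
    import hiveContext.implicits._

    val sales = hiveContext.table("sales")

    // Rank rows within each department by amount, descending,
    // and carry along the previous row's amount.
    val byDept = Window.partitionBy("dept").orderBy($"amount".desc)

    sales.select($"dept", $"name", $"amount",
        rank().over(byDept).as("rnk"),
        lag($"amount", 1).over(byDept).as("prev_amount"))
      .show()

Running the same query against a plain SQLContext in Spark 1.x fails, because the window-function support comes from the Hive integration.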

HiveContext is also more battle-tested: it provides a superset of the functionality of the basic SQLContext.

Spark 2 introduces window functions natively and targets compliance with the ANSI SQL:2003 standard.
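
In Spark 2, SparkSession subsumes both contexts, and window functions run on the native parser with no Hive dependency. A short sketch (app name and master are placeholders):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("native-windows")
      .master("local[*]")
      .getOrCreate()

    spark.range(1, 6).createOrReplaceTempView("nums")

    // lag() works out of the box; no HiveContext or Hive classes involved.
    spark.sql("""
      SELECT id, lag(id, 1) OVER (ORDER BY id) AS prev_id
      FROM nums
    """).show()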