This post connects Spark to a Hive database through the metastore, so the first step is to start the metastore service:
hive --service metastore
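Before running the Spark job, it can help to confirm that the metastore is actually accepting connections. The following is a minimal sketch, assuming the metastore listens on localhost at 9083 (the default Thrift port, unless your hive-site.xml says otherwise). The helper name `metastore_reachable` is hypothetical and is not part of Hive or Spark.

```python
import socket


def metastore_reachable(host="localhost", port=9083, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    The host and port defaults are assumptions (a local metastore on
    the default port); pass your own values if they differ.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolvable hosts.
        return False
```

Calling `metastore_reachable()` right after starting the service only shows that something is listening on that port. It does not prove that the listener is a Hive metastore.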
In addition, copy the three files core-site.xml, hdfs-site.xml, and hive-site.xml into the spark/conf folder.
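The copy step can also be scripted. This is only a sketch: the function name `copy_configs` is hypothetical, and the directories you pass in depend entirely on where Hadoop, Hive, and Spark are installed on your machine.

```python
import shutil
from pathlib import Path

# The three files named in this post.
CONFIG_FILES = ("core-site.xml", "hdfs-site.xml", "hive-site.xml")


def copy_configs(source_dirs, spark_conf_dir, names=CONFIG_FILES):
    """Copy each config file from the first source dir that contains it.

    Returns the list of file names that were actually copied, so the
    caller can see whether anything was missing.
    """
    target = Path(spark_conf_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        for src_dir in source_dirs:
            src = Path(src_dir) / name
            if src.is_file():
                shutil.copy2(src, target / name)
                copied.append(name)
                break
    return copied
```

For example, `copy_configs(["/opt/hadoop/etc/hadoop", "/opt/hive/conf"], "/opt/spark/conf")` would copy whatever it finds. Those three paths are assumed install locations, not ones taken from this post.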
hive-site.xml must contain the metastore address, which is set through the hive.metastore.uris property.
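To check that a hive-site.xml actually carries the metastore address, the property can be read back with the standard library. This is a minimal sketch: `read_hive_property` is a hypothetical helper, and the sample content, including the URI `thrift://localhost:9083`, is an assumed local setup rather than the configuration used in this post.

```python
import xml.etree.ElementTree as ET


def read_hive_property(xml_text, name):
    """Return the <value> of the <property> called `name`, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Illustrative hive-site.xml content; the URI is an assumed local default.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
</configuration>
"""

print(read_hive_property(SAMPLE, "hive.metastore.uris"))
```

To check the real file, read its contents and pass the text to `read_hive_property`. A return value of `None` means the property is missing from that file.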
Spark code:
# -*- coding: utf-8 -*-
from pyspark.sql import SparkSession

# Build a local SparkSession with Hive support enabled
spark = (
    SparkSession.builder
    .appName('spark read hive')
    .master('local')
    .enableHiveSupport()
    .getOrCreate()
)

# List the databases visible through the metastore
hive_data = spark.sql("show databases")
hive_data.show()

# Read from a Hive table
read_data = spark.sql("select * from database.table limit 10")
read_data.show()

# Write to a Hive table. append: add rows to the table; overwrite: clear the table and rewrite it
read_data.write.format("hive").mode("append").saveAsTable('database.table2')
In this post Spark connects to a Hive instance on the local machine. For a remote connection, see the post "Spark remote read/write of Hive databases".