實現(xiàn)spark遠程連接hive數(shù)據(jù)庫,需要將服務(wù)端mysql數(shù)據(jù)庫里的hive數(shù)據(jù)表DBS和SDS里的localhost改為可以訪問到的IP地址或域名。

image.png
更改方式參考:hive修改localhost
本機spark訪問服務(wù)端應(yīng)采用域名的方式。
配置hostname
作者采用dbhive作為統(tǒng)一訪問hostname
服務(wù)端修改/etc/hosts:
nano /etc/hosts
# 10.211.55.2是服務(wù)端內(nèi)網(wǎng)地址
10.211.55.2 dbhive
本機修改/etc/hosts:
nano /etc/hosts
# 119.23.5.xx是服務(wù)端外網(wǎng)地址
119.23.5.xx dbhive
服務(wù)端配置
hadoop下的core-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://0.0.0.0:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/Users/zhi/Documents/app/hadoopdata</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
- fs.defaultFS: 注意將ip設(shè)為0.0.0.0,允許外部訪問。
hadoop下的hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.datanode.hostname</name>
<value>dbhive</value>
</property>
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>true</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>true</value>
</property>
</configuration>
hive下的hive-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://0.0.0.0:9083</value>
<description>Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>localhost</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hivedata?characterEncoding=UTF-8</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>228228</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>hive.server2.thrift.client.user</name>
<value>hive</value>
<description>Username to use against thrift client</description>
</property>
<property>
<name>hive.server2.thrift.client.password</name>
<value>hive228228</value>
<description>Password to use against thrift client</description>
</property>
</configuration>
注意將hive.metastore.uris的ip段設(shè)為0.0.0.0。
本機spark配置文件:
在spark/conf下有三個文件core-site.xml、hdfs-site.xml、hive-site.xml(從服務(wù)端拷貝過來)。
core-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://dbhive:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/Users/zhi/Documents/app/hadoopdata</value>
</property>
</configuration>
- fs.defaultFS: 服務(wù)端的hdfs地址;
- hadoop.tmp.dir: 和服務(wù)端保持一致。
hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.datanode.hostname</name>
<value>dbhive</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>true</value>
</property>
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>true</value>
</property>
</configuration>
- dfs.datanode.hostname: 設(shè)置datanode的hostname;
- dfs.datanode.use.datanode.hostname: datanode通訊采用域名方式;
- dfs.client.use.datanode.hostname: 客戶端訪問datanode采用域名方式。
hive-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.uris</name>
<value>thrift://dbhive:9083</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://dbhive:3306/hivedata?characterEncoding=UTF-8</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
</configuration>
- hive.metastore.uris: 訪問metastore采用域名方式;
- javax.jdo.option.ConnectionURL: 服務(wù)器mysql地址;
- javax.jdo.option.ConnectionUserName: 數(shù)據(jù)庫用戶;
- javax.jdo.option.ConnectionPassword: 數(shù)據(jù)庫密碼。
之后本機spark連接遠程的metastore就可以讀寫了。
配置遠程訪問最重要的是要配置hadoop下的hdfs-site.xml,設(shè)置dfs.datanode.hostname域名,以及開啟dfs.client.use.datanode.hostname和dfs.datanode.use.datanode.hostname開啟域名訪問。hdfs和metastore地址設(shè)為0.0.0.0.