Integrating Spring with Hadoop and HBase

Hadoop is an essential system in big-data environments. A Hadoop cluster lets you share server resources fully, and it has been used for offline (batch) processing for many years.

Spring Hadoop simplifies working with Apache Hadoop by providing a unified configuration model and easy-to-use APIs for HDFS, MapReduce, Pig, and Hive. It also integrates with other Spring ecosystem projects such as Spring Integration and Spring Batch.

Official documentation and API reference for Spring Hadoop 2.5:

spring-hadoop documentation

spring-hadoop API

Spring Hadoop

  1. Add the repository and configure the dependencies
<repositories>
    <repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
      <releases>
        <enabled>true</enabled>
      </releases>
      <snapshots>
        <enabled>false</enabled>
      </snapshots>
    </repository>
  </repositories>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <hadoop.version>2.6.0-cdh5.7.0</hadoop.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
      <scope>provided</scope>
    </dependency>
    <!-- Dependency for User-Agent parsing -->
    <dependency>
      <groupId>com.kumkee</groupId>
      <artifactId>UserAgentParser</artifactId>
      <version>0.0.1</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.10</version>
      <scope>test</scope>
    </dependency>
    <!-- Spring Hadoop dependency -->
    <dependency>
      <groupId>org.springframework.data</groupId>
      <artifactId>spring-data-hadoop</artifactId>
      <version>2.5.0.RELEASE</version>
    </dependency>
  </dependencies>
  2. Add the Hadoop configuration to the Spring configuration file
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:hdp="http://www.springframework.org/schema/hadoop"
      xmlns:context="http://www.springframework.org/schema/context"
      xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
       http://www.springframework.org/schema/hadoop
       http://www.springframework.org/schema/hadoop/spring-hadoop.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd">
   ......
   
   <!-- Load the properties file -->
   <context:property-placeholder location="application.properties"/>
   <hdp:configuration id="hadoopConfiguration">
       <!-- URL of the Hadoop server -->
       fs.defaultFS=${spring.hadoop.fsUri}
   </hdp:configuration>
   <!-- Wire up the FileSystem bean and the user to operate as -->
   <hdp:file-system id="fileSystem" configuration-ref="hadoopConfiguration" user="root"/>
</beans>

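All that is needed here is the Hadoop server's URL and the declaration that loads the properties file.

Next, create a properties file named application.properties and add the Hadoop settings to it. The configuration above reads the key spring.hadoop.fsUri from this file.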
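As a rough sketch of what that file might hold, the snippet below parses a one-line application.properties and sanity-checks the value before it is handed to Hadoop. Only the key spring.hadoop.fsUri comes from the configuration above; the host namenode-host, the port 8020, and the class and method names are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Properties;

public class HadoopPropertiesCheck {

    // Hypothetical file content; replace host and port with your own NameNode address.
    public static final String SAMPLE = "spring.hadoop.fsUri=hdfs://namenode-host:8020\n";

    /** Parses properties text and returns the URI stored under spring.hadoop.fsUri. */
    public static URI readFsUri(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty("spring.hadoop.fsUri");
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("spring.hadoop.fsUri is not set");
        }
        URI uri = URI.create(value.trim());
        // This sketch only accepts hdfs:// URIs; relax the check if you use another scheme.
        if (!"hdfs".equals(uri.getScheme())) {
            throw new IllegalArgumentException("expected an hdfs:// URI but got: " + value);
        }
        return uri;
    }

    public static void main(String[] args) {
        URI uri = readFsUri(SAMPLE);
        System.out.println(uri.getHost() + ":" + uri.getPort());
    }
}
```

A check like this is optional; its only purpose is to fail early with a readable message when the key is missing or mistyped.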

  3. Test
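import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class SpringHadoopApp {

    private ApplicationContext ctx;
    private FileSystem fileSystem;

    @Before
    public void setUp() {
        // Load the Spring context and look up the FileSystem bean declared in the XML
        ctx = new ClassPathXmlApplicationContext("applicationContext.xml");
        fileSystem = (FileSystem) ctx.getBean("fileSystem");
    }

    @After
    public void tearDown() throws IOException {
        ctx = null;
        fileSystem.close();
    }

    /**
     * Creates a directory on HDFS.
     * @throws Exception
     */
    @Test
    public void testMkdirs() throws Exception {
        fileSystem.mkdirs(new Path("/SpringHDFS/"));
    }
}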
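
Beyond creating a directory, the same FileSystem bean can also write and read files. The helper below is a minimal sketch of that, assuming hadoop-client is on the classpath; the class name HdfsTextFiles is hypothetical, and it relies only on FileSystem.create, FileSystem.open, and IOUtils.copyBytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsTextFiles {

    /** Writes text to the given path as UTF-8, overwriting any existing file. */
    public static void write(FileSystem fs, Path path, String text) throws IOException {
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Reads the whole file at the given path and decodes it as UTF-8. */
    public static String read(FileSystem fs, Path path) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (FSDataInputStream in = fs.open(path)) {
            // 'false' leaves closing the streams to the try-with-resources block
            IOUtils.copyBytes(in, buffer, 4096, false);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Inside the test class above, this could be called as HdfsTextFiles.write(fileSystem, new Path("/SpringHDFS/hello.txt"), "hello") followed by HdfsTextFiles.read(fileSystem, new Path("/SpringHDFS/hello.txt")); the file name hello.txt is only an example.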

Alternatively, you can configure Hadoop by loading its own configuration files directly:
copy <HADOOP_DIR>/etc/hadoop/core-site.xml and <HADOOP_DIR>/etc/hadoop/hdfs-site.xml into the project and use them for the configuration.
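
For reference, loading the copied files with plain Hadoop code looks roughly like the sketch below. It assumes both files sit at the root of the classpath (for example under src/main/resources), and the class name SiteFileConfiguration is hypothetical.

```java
import org.apache.hadoop.conf.Configuration;

public class SiteFileConfiguration {

    /** Builds a Configuration that layers the copied site files over Hadoop's built-in defaults. */
    public static Configuration load() {
        // new Configuration() already looks for core-site.xml on the classpath;
        // it is named again here only to make the intent explicit.
        Configuration conf = new Configuration();
        conf.addResource("core-site.xml");
        conf.addResource("hdfs-site.xml");
        return conf;
    }

    public static void main(String[] args) {
        // Prints the file system URI that is in effect after loading the files
        System.out.println(load().get("fs.defaultFS"));
    }
}
```

If fs.defaultFS still prints a local file:// URI, the site files were most likely not found on the classpath.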

Spring Data HBase

  1. Add the dependencies
<dependency> 
        <groupId>org.apache.hadoop</groupId> 
        <artifactId>hadoop-auth</artifactId> 
    </dependency> 
    <dependency> 
        <groupId>org.apache.hbase</groupId> 
        <artifactId>hbase-client</artifactId> 
        <version>1.2.3</version> 
        <scope>compile</scope> 
        <exclusions> 
            <exclusion> 
                <groupId>log4j</groupId> 
                <artifactId>log4j</artifactId> 
            </exclusion> 
            <exclusion> 
                <groupId>org.slf4j</groupId> 
                <artifactId>slf4j-log4j12</artifactId> 
            </exclusion> 
        </exclusions> 
    </dependency> 
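  2. Copy the HBase configuration file and set up applicationContext.xml

Copy HBase's configuration file hbase-site.xml into the resources directory, then create a new Spring configuration file named applicationContext.xml.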

<?xml version="1.0" encoding="UTF-8"?> 
<beans xmlns="http://www.springframework.org/schema/beans" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
       xmlns:context="http://www.springframework.org/schema/context" 
       xmlns:hdp="http://www.springframework.org/schema/hadoop" 
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd 
    http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd 
    http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd"> 
 
    <context:annotation-config/> 
    <context:component-scan base-package="com.sample.hbase"/> 
    <hdp:configuration resources="hbase-site.xml"/> 
    <hdp:hbase-configuration configuration-ref="hadoopConfiguration"/> 
    <bean id="hbaseTemplate" class="org.springframework.data.hadoop.hbase.HbaseTemplate"> 
        <property name="configuration" ref="hbaseConfiguration"/> 
    </bean> 
</beans> 

This configures the HbaseTemplate bean and the location of the HBase configuration file.

  3. Test
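import java.util.List;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.hadoop.hbase.HbaseTemplate;
import org.springframework.data.hadoop.hbase.RowMapper;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = {"classpath*:applicationContext.xml"})
public class BaseTest {

    @Autowired
    private HbaseTemplate template;

    @Test
    public void testFind() {
        // Scan the cf:name column of the "user" table and map each row to a String
        List<String> rows = template.find("user", "cf", "name", new RowMapper<String>() {
            @Override
            public String mapRow(Result result, int i) throws Exception {
                return result.toString();
            }
        });
        Assert.assertNotNull(rows);
    }

    @Test
    public void testPut() {
        // Write the value "Alice" to cf:name in the row with key "xiaogao"
        template.put("user", "xiaogao", "cf", "name", Bytes.toBytes("Alice"));
    }
}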
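
The mapper in testFind returns result.toString(), which is mainly useful for debugging. A sketch of a mapper that extracts the actual cf:name cell value might look like the following; the class name NameRowMapper is hypothetical, while the column family and qualifier are the ones used in the test above.

```java
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
import org.springframework.data.hadoop.hbase.RowMapper;

public class NameRowMapper implements RowMapper<String> {

    private static final byte[] FAMILY = Bytes.toBytes("cf");
    private static final byte[] QUALIFIER = Bytes.toBytes("name");

    /** Returns the cf:name value of the row as a String, or null when the cell is absent. */
    @Override
    public String mapRow(Result result, int rowNum) throws Exception {
        byte[] value = result.getValue(FAMILY, QUALIFIER);
        return value == null ? null : Bytes.toString(value);
    }
}
```

With the row written by testPut, a single-row lookup could then be written as template.get("user", "xiaogao", new NameRowMapper()).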