Configuring Apache Hive Connections for Big Data

Prerequisites:

Apache Hive 0.14.0 is installed and configured to connect to the big data source.

When you require data analytics for your organization, you can use OneData to exchange master data with big data sources such as Apache Hadoop. OneData establishes a JDBC connection to Apache Hive, which acts as the intermediate system that fetches data from the big data source.

To configure an Apache Hive connection

1. On the Menu toolbar, select Administer > System > Connection Manager.

2. In the Connection Type drop-down list, select JDBC.

3. Do one of the following:

Click Add Connection to add a new connection.

Click the Edit icon to edit an existing connection.

4. Configure the connection details, using the following table as a guide:

Property	Description
Connection Name	Mandatory. Enter a unique connection name of 100 characters or fewer. The name can include spaces.
Description	Optional. Description of how the connection is used.
Connection Type	Displays the connection type selected in the previous screen.

5. Configure the connection parameters, using the following table as a guide:

Property	Description
Database	Mandatory. Select Hive.
Database Version	Optional. Database version.
Connection Type	Mandatory. Select Direct Connection.
Application Server	Mandatory. Select Other.
Application Server Version	Optional. Application server version.
Connection String/Data Source Name	Specify the connection string in the following format: jdbc:hive2://Server Name:Port Number/ For example: jdbc:hive2://10.60.2.37:10000/
Driver Class	Enter the following driver class: org.apache.hive.jdbc.HiveDriver
User-ID	Mandatory. User ID to connect to Hive.
Password	Mandatory. Password of Hive user ID.
Schema Name (If different from User-ID)	Not required.
Target Server Name	Optional. For information purposes only. Enter the name of the target server.
Associated Hook	Optional. The hook to be executed at connection logon.

6. Click Save to save the new connection. Click Test Connection to verify the connection details.

Note: OneData automatically tests the connection when you save the connection details. You can also test the connection from the main Connection Manager screen.
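
If the connection test fails, it can help to try the same connection string, driver class, and credentials outside OneData. The following Java sketch is one way to do that. The class and method names (HiveJdbcSketch, buildUrl, open) are hypothetical and are not part of OneData. The sketch assumes that the Hive JDBC driver JAR is on the classpath whenever open is called.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper for checking the connection details outside OneData.
class HiveJdbcSketch {

    // Driver class from the connection parameters table.
    static final String DRIVER_CLASS = "org.apache.hive.jdbc.HiveDriver";

    // Builds a connection string in the format jdbc:hive2://Server Name:Port Number/
    static String buildUrl(String serverName, int port) {
        return "jdbc:hive2://" + serverName + ":" + port + "/";
    }

    // Opens a connection with the given User-ID and Password.
    // Assumption: the Hive JDBC driver is available on the classpath.
    static Connection open(String serverName, int port, String userId, String password)
            throws ClassNotFoundException, SQLException {
        Class.forName(DRIVER_CLASS);
        return DriverManager.getConnection(buildUrl(serverName, port), userId, password);
    }

    public static void main(String[] args) {
        // Prints the connection string only; no network access happens here.
        System.out.println(buildUrl("10.60.2.37", 10000));
    }
}
```

Calling open with the same User-ID and Password that you entered in OneData should either return a connection or throw an exception that describes the failure.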

7. In Hive, do the following:

a. Navigate to the location of the hive-site.xml file and add the following parameters:

<property>
<name>hive.support.concurrency</name>
<value>true</value>
</property>
<property>
<name>hive.enforce.bucketing</name>
<value>true</value>
</property>
<property>
<name>hive.exec.dynamic.partition.mode</name>
<value>nonstrict</value>
</property>
<property>
<name>hive.txn.manager</name>
<value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
<name>hive.compactor.initiator.on</name>
<value>false</value>
</property>
<property>
<name>hive.compactor.worker.threads</name>
<value>10</value>
</property>
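
After you edit the file, you may want to confirm that all six parameters are present with the values listed above. The following Java sketch shows one possible check. The class HiveSiteCheckSketch and its methods are hypothetical and are not part of OneData or Hive. The sketch assumes that the contents of hive-site.xml are passed in as a string and that the property elements sit inside a single root element, which in a standard Hive installation is named configuration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical checker, not part of OneData or Hive.
class HiveSiteCheckSketch {

    // Parameter names and values listed in the step above.
    static Map<String, String> expected() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("hive.support.concurrency", "true");
        m.put("hive.enforce.bucketing", "true");
        m.put("hive.exec.dynamic.partition.mode", "nonstrict");
        m.put("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
        m.put("hive.compactor.initiator.on", "false");
        m.put("hive.compactor.worker.threads", "10");
        return m;
    }

    // Reads every <property> element into a name-to-value map.
    static Map<String, String> read(String xml) {
        try {
            NodeList props = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getElementsByTagName("property");
            Map<String, String> found = new LinkedHashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                found.put(text(p, "name"), text(p, "value"));
            }
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse hive-site.xml content", e);
        }
    }

    // Returns the trimmed text of the first child element with the given tag, or "".
    static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    // Returns the names of expected parameters that are missing or set differently.
    static List<String> mismatches(String xml) {
        Map<String, String> found = read(xml);
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, String> e : expected().entrySet()) {
            if (!e.getValue().equals(found.get(e.getKey()))) {
                bad.add(e.getKey());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Sample input with only one of the six parameters set.
        String xml = "<configuration>"
                + "<property><name>hive.support.concurrency</name><value>true</value></property>"
                + "</configuration>";
        System.out.println("Missing or different: " + mismatches(xml));
    }
}
```

An empty list from mismatches indicates that the file contains all six parameters with the listed values.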

b. If you want OneData to perform ACID (atomicity, consistency, isolation, durability) transactions (insert, update, and delete) on a table, set the table property transactional=true on that table.
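
As an illustration of the table property, the following Java sketch builds the HiveQL statements that set transactional=true. The class HiveAcidDdlSketch, the table name, and the columns id and name are hypothetical. In Hive 0.14, a transactional table must also be bucketed and stored in ORC format, so the CREATE TABLE statement includes both clauses. The ALTER TABLE form assumes that the existing table already meets those requirements.

```java
// Hypothetical helper that builds HiveQL for transactional tables.
class HiveAcidDdlSketch {

    // Statement for a new table. The columns id and name are placeholders.
    // Hive 0.14 requires transactional tables to be bucketed and stored as ORC.
    static String createTable(String table, String bucketColumn, int buckets) {
        return "CREATE TABLE " + table + " (id INT, name STRING)"
                + " CLUSTERED BY (" + bucketColumn + ") INTO " + buckets + " BUCKETS"
                + " STORED AS ORC"
                + " TBLPROPERTIES ('transactional'='true')";
    }

    // Statement for an existing table that is already bucketed and stored as ORC.
    static String enableOnExisting(String table) {
        return "ALTER TABLE " + table + " SET TBLPROPERTIES ('transactional'='true')";
    }

    public static void main(String[] args) {
        System.out.println(createTable("customer_master", "id", 4));
        System.out.println(enableOnExisting("customer_master"));
    }
}
```

You can run the resulting statements from any Hive client, for example over a JDBC connection that uses the connection string and driver class described in this procedure.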