Loading data in Hive
Loading data to an internal table :

From the local file system :

LOAD DATA LOCAL INPATH '/home/user/users_table.txt' INTO TABLE IT;
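A minimal end-to-end sketch of the local load above; the table name `IT` comes from the post, but the column names and types here are illustrative assumptions, not from the original:

```sql
-- Hypothetical table whose layout matches a comma-delimited text file
CREATE TABLE IT (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

-- LOCAL copies the file from the client machine into the table's warehouse
-- directory; adding OVERWRITE replaces existing rows instead of appending
LOAD DATA LOCAL INPATH '/home/user/users_table.txt' OVERWRITE INTO TABLE IT;
```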
From HDFS :

LOAD DATA INPATH '/home/user/users_table.txt' INTO TABLE IT;
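Note that without the LOCAL keyword, Hive moves (rather than copies) the HDFS file into the table's directory, so the source path no longer exists after the load. A sketch of staging the file in HDFS first (the staging path here is an assumption for illustration):

```shell
# Stage the file in HDFS; once LOAD DATA INPATH runs against this path,
# the file is *moved* into the Hive warehouse directory and disappears
# from this staging location
hdfs dfs -put /home/user/users_table.txt /user/hive/staging/users_table.txt
```

The subsequent load would then reference the staged path, e.g. `LOAD DATA INPATH '/user/hive/staging/users_table.txt' INTO TABLE IT;`.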
Loading data to an external table :

CREATE EXTERNAL TABLE TABLE1(
firstname VARCHAR(64),
lastname VARCHAR(64),
address STRING,
country VARCHAR(64),
city VARCHAR(64),
state VARCHAR(64)
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/data/staging/';
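Because the table is EXTERNAL, Hive does not take ownership of the files: anything already sitting under the LOCATION is immediately queryable, and dropping the table removes only the metadata. A short sketch:

```sql
-- Files already present under /user/data/staging/ are queryable at once
SELECT firstname, lastname FROM TABLE1 LIMIT 10;

-- Dropping an EXTERNAL table removes only the Hive metadata;
-- the data files stay in /user/data/staging/
DROP TABLE TABLE1;
```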
Loading data using Sqoop :

sqoop import --connect jdbc:teradata://192.168.25.25/Database=retail \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --username dbc --password dbc --table SOURCE_TBL \
  --target-dir /user/hive/base_table -m 1
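Once the import lands in /user/hive/base_table, one way to expose it to Hive is an external table over that directory. This is a sketch; the column list is an assumption and must match whatever SOURCE_TBL actually contains:

```sql
-- Illustrative schema; Sqoop writes comma-delimited text by default
CREATE EXTERNAL TABLE base_table (
  id            INT,
  name          STRING,
  modified_date STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/user/hive/base_table';
```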
Incremental load using Sqoop :

sqoop import --connect jdbc:teradata://192.168.25.25/Database=retail \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --username dbc --password dbc --table SOURCE_TBL \
  --target-dir /user/hive/incremental_table -m 1 \
  --check-column modified_date --incremental lastmodified \
  --last-value '2017-01-01'
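After an incremental import, the delta usually has to be reconciled with the base data. One common approach is `sqoop merge`, which keeps the newest row per key. This is a hedged sketch: the merge key `id` and the codegen jar/class names are assumptions (Sqoop generates the jar during a prior import or via `sqoop codegen`):

```shell
# Hypothetical reconciliation: fold the incremental delta into the base
# data set, keeping the newest record for each value of the merge key
sqoop merge \
  --new-data /user/hive/incremental_table \
  --onto /user/hive/base_table \
  --target-dir /user/hive/merged_table \
  --jar-file SOURCE_TBL.jar \
  --class-name SOURCE_TBL \
  --merge-key id
```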
Using a query on the source table :

Note the quoting: the query must be wrapped in double quotes so the inner date literal keeps its single quotes, and `$CONDITIONS` is escaped so the shell does not expand it.

sqoop import --connect jdbc:teradata://192.168.25.25/Database=retail \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --username dbc --password dbc \
  --target-dir /user/hive/incremental_table -m 1 \
  --query "SELECT * FROM SOURCE_TBL WHERE modified_date > '2017-01-01' AND \$CONDITIONS"
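With a single mapper (`-m 1`) the `$CONDITIONS` placeholder is effectively `1=1`, but to parallelize a free-form query Sqoop needs `--split-by` to partition the work; each mapper's split predicate is substituted for `$CONDITIONS`. A sketch, where the split column `id` is an assumption:

```shell
# Parallel free-form query import: Sqoop divides the range of --split-by
# values across the mappers and injects each range via $CONDITIONS
sqoop import --connect jdbc:teradata://192.168.25.25/Database=retail \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --username dbc --password dbc \
  --target-dir /user/hive/incremental_table \
  --query "SELECT * FROM SOURCE_TBL WHERE modified_date > '2017-01-01' AND \$CONDITIONS" \
  --split-by id -m 4
```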