Sqoop
Sqoop is a tool in the Hadoop ecosystem that pulls data from RDBMS tables into HDFS (import)
and pushes data from HDFS back into RDBMS tables (export).
Sqoop in data lakes:
For incremental loads, if a Sqoop job fails partway through, the records that were loaded incorrectly need to be deleted (for example with an awk filter) and the job rerun, after first taking a backup of the file.

By default, sqoop-export appends new rows to a table; each input record is transformed into an INSERT statement that adds a row to the target database table. If your table has constraints (e.g., a primary key column whose values must be unique) and already contains data, take care to avoid inserting records that violate these constraints. The export process will fail if an INSERT statement fails. This mode is mainly intended for exporting records to a new, empty table intended to accept these results.
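The recovery flow above can be sketched roughly as follows (the id check column, the last value 1000, and the file paths are assumptions for illustration, not from the original): an incremental import records the last value it loaded, and after a failure the partially loaded rows can be filtered out with awk before rerunning.

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
--incremental append --check-column id --last-value 1000

# Back up the partially loaded file, then keep only rows whose id (field 1
# of a comma-delimited file) is at or below the last known-good value:
$ hdfs dfs -get /user/hive/warehouse/employees/part-m-00000 employees.bak
$ awk -F',' '$1 <= 1000' employees.bak > employees.clean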
# Import the EMPLOYEES table using 8 parallel map tasks:
$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
-m 8
# Import with tab-separated fields, newline-terminated records, and values
# optionally enclosed in double quotes:
$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
--fields-terminated-by '\t' --lines-terminated-by '\n' \
--optionally-enclosed-by '\"'
# Import the table directly into Hive:
$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
--hive-import
# Import only the rows matching a WHERE condition:
$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
--where "start_date > '2010-01-01'"
# Split the import workload across mappers on the dept_id column:
$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
--split-by dept_id
# Import every table in the corp database:
$ sqoop import-all-tables --connect jdbc:mysql://db.foo.com/corp
# Create a Hive table (emps) matching the layout of an existing RDBMS table,
# without importing any data:
$ sqoop create-hive-table --connect jdbc:mysql://db.example.com/corp \
--table employees --hive-table emps
# Evaluate a SQL query against the database and print the results:
$ sqoop eval --connect jdbc:mysql://db.example.com/corp \
--query "SELECT * FROM employees LIMIT 10"
# eval also executes DML statements such as INSERT:
$ sqoop eval --connect jdbc:mysql://db.example.com/corp \
-e "INSERT INTO foo VALUES(42, 'bar')"
# Export HDFS data from /results/bar_data into the bar table:
$ sqoop export --connect jdbc:mysql://db.example.com/foo --table bar \
--export-dir /results/bar_data
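To avoid the constraint violations described earlier when the target table already contains data, export can issue UPDATE statements instead of INSERTs via --update-key; a minimal sketch, assuming bar has an id key column:

# Update existing rows in bar matched on the id column instead of inserting:
$ sqoop export --connect jdbc:mysql://db.example.com/foo --table bar \
--export-dir /results/bar_data --update-key id

Adding --update-mode allowinsert makes the export insert rows whose key does not match an existing row, rather than skipping them.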