TPump in Teradata
TPump is a Teradata data-loading utility that helps maintain (insert, update, delete, and atomic upsert) the data in Teradata Database.
It allows near-real-time data to be achieved in the data warehouse.
Teradata TPump updates information in real time, acquiring data from the client system with low processor utilization.
Instead of updating Teradata Database overnight or in batches throughout the day, it applies changes continuously, and it uses row hash locks rather than table-level locks.
This allows queries to run while Teradata TPump is running.
Setup is simple and hassle-free: it does not require staging of data, intermediary files, or any special hardware.
Operation is efficient and time-saving: jobs can continue running in spite of database restarts, dirty data, and network slowdowns, and they restart without any intervention.
How TPump Works:
A single invocation of Teradata TPump can execute one or more distinct Teradata TPump tasks in series, along with any Teradata TPump support commands.
A Teradata TPump task acquires data from client files and applies it to target tables using INSERT, UPDATE, or DELETE statements that specify the full primary index of the tables. Data is fetched from the client and sent as transaction rows to Teradata Database, where it is immediately applied to the target tables.
Each Teradata TPump task can acquire data from one or many client files with the same or different layouts. From each source record, one or more INSERT, UPDATE, or DELETE statements can be generated and directed to any target table.
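To make the "one source record, several statements" idea concrete, here is a minimal Python sketch. It is only an illustration of the concept, not TPump script syntax and not how TPump is implemented. The table names (sales_txn, cust_balance), the column names, and the helper functions are all hypothetical.

```python
# Illustrative only: mimics the idea of applying several DML statements
# per source record. Table, column, and function names are hypothetical.

# Each template identifies its row by cust_id, standing in for the
# "full primary index" that TPump statements must specify.
DML_TEMPLATES = [
    "INSERT INTO sales_txn (cust_id, amount) VALUES (:cust_id, :amount)",
    "UPDATE cust_balance SET balance = balance + :amount WHERE cust_id = :cust_id",
]


def statements_for(record):
    """Return one (sql, params) pair per template for a single source record."""
    return [(sql, dict(record)) for sql in DML_TEMPLATES]


def build_transactions(records):
    """Expand every source record into its statements, keeping input order."""
    transactions = []
    for record in records:
        transactions.extend(statements_for(record))
    return transactions


records = [
    {"cust_id": 101, "amount": 25.0},
    {"cust_id": 102, "amount": 7.5},
]
transactions = build_transactions(records)

for sql, params in transactions:
    print(sql, params)
```

Here each of the two source records yields an INSERT for one table and an UPDATE for another, which is the same shape as a TPump task directing statements from one record to different target tables.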
Benefits of TPump:
The following concepts help in understanding Teradata TPump.
• The language of Teradata TPump commands and statements is used to describe the task that needs to be accomplished.
• Teradata TPump examines all the commands and statements for a task, from the BEGIN LOAD command through the END LOAD command, before actually executing the task.
• After all commands and statements in a given task have been processed and validated by Teradata TPump, the task is executed.
• Teradata TPump optionally supports data serialization for a given row, which guarantees that if a row insert is immediately followed by a row update, the insert is processed first. This is done by hashing records to a given session.
• It supports bulletproof restartability using time-based checkpoints. Frequent checkpoints make restarting easier, but at the expense of checkpointing overhead.
• It supports upsert logic similar to MultiLoad.
• It supports insert/update/delete statements in multiple-record requests.
• It uses macros to minimize network overhead.
• It supports interpretive, record-manipulating, and restart features similar to MultiLoad.
• It supports conditional apply logic similar to MultiLoad.
• It supports error treatment options similar to MultiLoad.
• It runs as a single process.
• It supports Teradata Database internationalization features such as kanji character sets.
• Up to 2430 operations can be packed into a single request for network efficiency. This limit of 2430 may change, as the overall limit for a request is one megabyte. Teradata TPump assumes that every statement will be a one- or two- (for fallback) step request.
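The serialization and packing points in the list above can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not TPump's actual algorithm. CRC32 stands in for whatever hash the utility really uses, and the names session_for, route, pack, and pack_factor are hypothetical.

```python
import zlib
from collections import defaultdict


def session_for(key, num_sessions):
    """Map a key to a session index with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_sessions


def route(statements, num_sessions):
    """Group (key, sql) pairs by session, preserving arrival order per session."""
    queues = defaultdict(list)
    for key, sql in statements:
        queues[session_for(key, num_sessions)].append((key, sql))
    return queues


def pack(queue, pack_factor):
    """Split one session's queue into requests of at most pack_factor statements."""
    return [queue[i:i + pack_factor] for i in range(0, len(queue), pack_factor)]


# An insert followed by an update for the same key must stay in order.
statements = [
    (101, "INSERT row 101"),
    (202, "INSERT row 202"),
    (101, "UPDATE row 101"),
    (303, "INSERT row 303"),
    (202, "UPDATE row 202"),
]

queues = route(statements, num_sessions=2)
requests = {session: pack(queue, pack_factor=2) for session, queue in queues.items()}

for session, reqs in sorted(requests.items()):
    print("session", session, reqs)
```

Because every statement for a given key hashes to the same session and each session keeps arrival order, the insert for a row always goes out before its update. Packing then only decides how many statements travel in each request on that session.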
Operating Modes
Teradata TPump runs in the following operating modes:
1) Interactive
• Interactive processing requires more or less continuous participation of the user.
2) Batch
• Batch programs process data in discrete groups of previously scheduled operations, mostly in a separate operation, rather than interactively or in real time.
Limitations:
The maximum row size of data plus indicators is 64K.
The SELECT command, concatenation, and aggregate functions are not allowed.
With one TPump file, there is a limit of four IMPORT commands.
For dates before 1900 or after 1999, the year portion of the date must be represented by four numerals (yyyy).
Unlike MultiLoad and FastLoad, which use access logging, TPump does not use access logging, for performance reasons.
Specify values for the partitioning column set when performing Teradata TPump deletes and updates, to avoid lock contention problems that can degrade performance. Also avoid updating the primary index and partitioning columns with Teradata TPump, as this degrades performance.
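As a small illustration of the four-digit-year rule in the limitations above, here is a Python sketch that could be used when preparing an input file. The yyyy-mm-dd layout and both function names are assumptions made for this example; the actual format depends on how the date field is defined in the job.

```python
from datetime import date


def needs_four_digit_year(d):
    """True for dates before 1900 or after 1999, per the rule above."""
    return d.year < 1900 or d.year > 1999


def format_for_load(d):
    """Always write a four-digit year (assumed layout: yyyy-mm-dd)."""
    return f"{d.year:04d}-{d.month:02d}-{d.day:02d}"


for d in (date(1899, 12, 31), date(1950, 1, 1), date(2024, 3, 5)):
    print(format_for_load(d), needs_four_digit_year(d))
```

Writing four digits for every date, not only the ones outside 1900 to 1999, keeps the input file unambiguous and means the check never has to be applied by hand.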