Teradata Architecture

Symmetric multiprocessing (SMP) - A single node that contains multiple CPUs sharing a memory pool.

Massively parallel processing (MPP) - Multiple SMP nodes working together comprise a larger configuration. The nodes are connected using the BYNET, which allows multiple virtual processors on multiple system nodes to communicate with each other.

Shared-nothing architecture (MPP) - Each vproc (Access Module Processors and Parsing Engines are virtual processors) is responsible for its own portion of the database and does not share common components with other vprocs. Each AMP manages its own dedicated memory space and the data on its own vdisk; these are not shared with other AMPs. Each AMP uses system resources independently of the other AMPs, so they can all work in parallel for high overall system performance.
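To make the shared-nothing idea concrete, here is a minimal Python sketch, not Teradata code. The `Amp` class, the `amount` column and the thread pool are all hypothetical stand-ins: each simulated AMP scans only the rows it owns, and only the small partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


class Amp:
    """Hypothetical stand-in for one AMP: it owns its rows and nothing else."""

    def __init__(self, amp_id, rows):
        self.amp_id = amp_id
        self._rows = list(rows)  # private storage, never handed to another AMP

    def local_sum(self, column):
        # Each AMP scans only its own rows.
        return sum(row[column] for row in self._rows)


# Illustrative data, already spread across three simulated AMPs.
amps = [
    Amp(0, [{"amount": 10}, {"amount": 20}]),
    Amp(1, [{"amount": 5}]),
    Amp(2, [{"amount": 7}, {"amount": 8}]),
]

# All AMPs work at the same time; only the partial sums are combined.
with ThreadPoolExecutor(max_workers=len(amps)) as pool:
    partial_sums = list(pool.map(lambda amp: amp.local_sum("amount"), amps))

total = sum(partial_sums)
print(partial_sums, total)
```

Because no AMP reads another AMP's rows, adding more AMPs (and spreading the rows across them) shortens each local scan without adding contention over shared data.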

A node is made up of various hardware and software components.

A clique is a set of Teradata nodes that share a common set of disk arrays. Cabling a subset of nodes to the same disk arrays creates a clique.

A disk array is a configuration of disk drives that uses specialized controllers to manage and distribute data and parity across the disks while providing fast access and data integrity.

RAID 5 - Data and parity protection striped across multiple disks.

RAID 1 - Each disk has a physical mirror replicating the data.
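The difference between the two protection schemes can be sketched in a few lines of Python. This is an illustration of the general RAID idea (XOR parity for RAID 5, a full copy for RAID 1), not of how a disk array controller is implemented; the block contents and the `xor_blocks` helper are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# RAID 5 idea: data blocks on separate disks plus one parity block.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the disk that held block 1, then rebuild it
# from the surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])

# RAID 1 idea: every write goes to a primary disk and to its mirror.
primary = b"DDDD"
mirror = bytes(primary)
print(mirror == primary)
```

RAID 5 spends one block of parity per stripe and needs a computation to rebuild lost data, while RAID 1 doubles the disk space but recovers by simply reading the mirror.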

Teradata Storage Process

The Parsing Engine interprets the SQL command and converts the data record from the host into an AMP message
The BYNET distributes the row to the appropriate AMP
The AMP formats the row and writes it to its associated disks
The disk holds the row for subsequent access
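The four storage steps above can be sketched as a small Python simulation. Everything here is hypothetical: the function and class names are invented for illustration, and routing by an MD5 hash of the key modulo the AMP count is an assumed stand-in for however the system actually picks the "appropriate AMP".

```python
import hashlib


def parse_insert(customer_id, name):
    """Parsing Engine step: turn the host record into a message for an AMP."""
    return {"key": customer_id, "row": {"customer_id": customer_id, "name": name}}


def route(message, amp_count):
    """BYNET step: choose the owning AMP (assumed here: hash of the key modulo AMP count)."""
    digest = hashlib.md5(str(message["key"]).encode()).digest()
    return int.from_bytes(digest[:4], "big") % amp_count


class Amp:
    """Hypothetical AMP with its own simulated disk."""

    def __init__(self):
        self.disk = {}

    def write(self, message):
        # AMP step: format the row and write it to this AMP's disk.
        self.disk[message["key"]] = tuple(message["row"].values())


amps = [Amp() for _ in range(4)]
for customer_id, name in [(1, "Ann"), (2, "Bob"), (3, "Cy")]:
    message = parse_insert(customer_id, name)
    target = route(message, len(amps))
    amps[target].write(message)

# The disks now hold the rows for later access.
stored = sum(len(amp.disk) for amp in amps)
print(stored)
```

The same key always routes to the same AMP, which is what lets a later retrieval go straight to the AMP that holds the row.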

Teradata Retrieval Process

The Parsing Engine dispatches a request to retrieve one or more rows
The BYNET ensures that appropriate AMP(s) are activated
The AMPs locate and retrieve the desired rows in parallel and sort, aggregate, or format them if needed
The BYNET returns the retrieved rows to the Parsing Engine
The Parsing Engine returns row(s) to requesting client application
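The retrieval steps above can be sketched in the same spirit. This is a toy model with invented names and data: each simulated AMP filters and sorts its own rows in parallel, and `heapq.merge` stands in for the merging of per-AMP answer sets on the way back to the Parsing Engine.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Illustrative rows, already spread across three simulated AMPs.
amp_rows = [
    [{"id": 4, "city": "Oslo"}, {"id": 1, "city": "Rome"}],
    [{"id": 3, "city": "Oslo"}],
    [{"id": 2, "city": "Lima"}, {"id": 5, "city": "Oslo"}],
]


def amp_retrieve(rows, city):
    """AMP step: locate matching rows locally and sort them by id."""
    return sorted((r for r in rows if r["city"] == city), key=lambda r: r["id"])


def run_query(city):
    # Parsing Engine step: dispatch the same request to every AMP.
    with ThreadPoolExecutor(max_workers=len(amp_rows)) as pool:
        partials = list(pool.map(lambda rows: amp_retrieve(rows, city), amp_rows))
    # BYNET step: merge the already-sorted per-AMP answer sets into one.
    merged = heapq.merge(*partials, key=lambda r: r["id"])
    # Parsing Engine step: hand the final answer set back to the client.
    return list(merged)


answer = run_query("Oslo")
print([row["id"] for row in answer])
```

Since each AMP returns its rows already sorted, the merge only has to interleave the partial results rather than re-sort the whole answer set.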

The BYNET is responsible for

Point-to-point communications between nodes and virtual processors
Merging answer sets back to the PE
Making Teradata parallelism possible

The Parsing Engine is responsible for

Managing individual sessions (up to 120)
Parsing and optimizing SQL requests
Dispatching the optimized plan to the AMPs
Sending the answer set response back to the requesting client

The AMP is responsible for

Storing and retrieving rows to and from the disks
Lock Management
Sorting rows and aggregating columns
Join Processing
Output conversions and formatting
Creating answer sets for clients
Disk space management and accounting
