Teradata really moveable space

Because of its particular features and ‘always parallel’ implementation, Teradata must have a very good distribution amongst AMP.

This will enhance performances, query time execution, and space.

In this example, I’m going to explain how you can dramatically affect your space management with a bad distribution

Here is the postulate (I make it simple purposely in order things to be clear);

  • Your system has 10 AMP
  • One database on your system is 100Gb large (100 Gb of maxperm).
  • There is one table in the database

That means, the system automatically dispatches the space, dividing 100Gb by 10 AMP, so 10Gb by AMP

Now if you choose bad distribution for your table, by choosing badly your PI (Primary Index) that is not reflecting a good selectivity, you can have the following figure

 

AMP_repartition

In theory, on 100Gb, only 20Gb are used and you can pick back 80Gb of free space.

In theory, yes you can! But practically, you can not

Just try to perform against your database a move space for 80Gb, and you will receive an error message saying you can not perform this operation…

Why? Because if you try to remove 80GB amongst 10AMP, the system tries to remove equally 80/10 so 8Gb on each AMP. It will work for AMP 1 to 9, but not for AMP 10!!! Data are stored on this AMP.

The real space you can only pick back is the following

  • You take the most loaded AMP, take the Maxperm (10Gb) minus currentperm for this AMP (9Gb), so 1Gb
  • You multiply this 1Gb by the number of AMP (10) and you obtain the quantity of space you can really pick back, that means, 10 X 1 = 10Gb

If you compare this 10Gb to 80Gb theoretically moveable, what a difference, right?

70GB will stay free, but never really usable, and never moveable…

So here is a query you can perform against your system to know, the amount of space you can really move:

SELECT
TRIM(C.databasename) AS DATABASENAME,
C.VPROC_COUNT AS VPROC_COUNT,
C.BIGGER_SPACE AS AMP_WITH_MORE_USED_SPACE,
C.MAXPERM AS MAXPERM_BY_AMP,

CASE
  WHEN C.BIGGER_SPACE = 0 THEN 0
  WHEN C.BIGGER_SPACE <> 0 AND C.MAXPERM <>0 THEN
  (C.MAXPERM - C.BIGGER_SPACE) * VPROC_COUNT
  ELSE 0
END
AS REUSABLE_SPACE

FROM
(
SELECT
DatabaseName,
COUNT(VProc) ,
CAST(MAX(CurrentPerm) AS BIGINT),
CAST(MAX(MaxPerm) AS BIGINT)
FROM DBC.DiskSPace
GROUP BY 1
) C  (DatabaseName, VPROC_COUNT, BIGGER_SPACE, MAXPERM)

WHERE
  REUSABLE_SPACE > 1000000000
  AND DatabaseName NOT IN
  ('DBC', 'Crashdumps', 'viewpoint', 'SystemFe', 'SYSBAR')

ORDER BY
REUSABLE_SPACE DESC