
Saturday, January 10, 2015

[Teradata] Temporal table example

I wanted to show you an example of the TEMPORAL feature in Teradata. It's a pretty neat feature.

Basically, it allows us to keep a history of changes where required. For example, the FULL REFRESH dimensions will change (based on the customer's feedback) to CDC (change data capture), which is normally a pretty complex process.

With the TEMPORAL feature, coupled with UPSERTS (MERGE), it becomes quite simple.


First - TEMPORAL. Basically, under the covers Teradata manages a "PERIOD" data type on the table - instead of a process having to manage the classic START_DT and END_DT logic. In fact, you don't even need to know about the column.

Also - when a user does a select * from the table, they will only get the CURRENT rows returned.

Please see this example to help.

Example FULL REFRESH Table that will move to this process:

CREATE MULTISET TABLE EDW_STAGE.tempo_test_auto_purchase_pack_size ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      auto_purchase_pack_size_id INTEGER,
      duration_mins INTEGER,
      pack_size INTEGER,
      display_order INTEGER,
      is_active CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      is_default CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      created_dt TIMESTAMP(6),
      created_by_user_id VARCHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC,
      last_update_dt TIMESTAMP(6),
      last_updated_by_user_id VARCHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC,
      is_deleted CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      source_sys_id SMALLINT,
      company_brand_cd VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC,
       duration PERIOD(TIMESTAMP(6) WITH TIME ZONE) NOT NULL
         AS TRANSACTIONTIME
      
      )
 PRIMARY INDEX ( auto_purchase_pack_size_id );
 
NOTE the column called duration - it will NOT have values inserted... Teradata manages this.
I will do an initial insert of 3 rows, and they look like this when we select them:

select * from tempo_test_auto_purchase_pack_size;

 
Note the 2nd row (ID of 3), where the column duration_mins is 40.

We issue an update (or merge) to this table:
update tempo_test_auto_purchase_pack_size set duration_mins = 95 where auto_purchase_pack_size_id = 3;

Now look at the select * from tempo_test_auto_purchase_pack_size;

 
Note that only the row with the new value (the LATEST row) is returned... BUT if we want to see history:

  TRANSACTIONTIME AS OF TIMESTAMP '2015-01-10 11:24:10' select * from tempo_test_auto_purchase_pack_size;


 
And finally - to see them all:

 NONSEQUENCED TRANSACTIONTIME select * from tempo_test_auto_purchase_pack_size;


 
Note that with this query you can see the duration column that the temporal feature maintains.
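Putting it together with MERGE, the CDC load becomes a single upsert; Teradata maintains the duration history automatically. A sketch against the table above (the staging table name stg_auto_purchase_pack_size is an assumption, and only a few columns are shown - the rest are handled the same way):

```sql
MERGE INTO tempo_test_auto_purchase_pack_size tgt
USING stg_auto_purchase_pack_size src  -- assumed staging table
  ON tgt.auto_purchase_pack_size_id = src.auto_purchase_pack_size_id
WHEN MATCHED THEN UPDATE SET
     duration_mins  = src.duration_mins,
     pack_size      = src.pack_size,
     last_update_dt = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN INSERT (
     auto_purchase_pack_size_id, duration_mins, pack_size)
VALUES (src.auto_purchase_pack_size_id, src.duration_mins, src.pack_size);
```

Note that the ON clause equates the primary index column, which Teradata requires for MERGE; the duration column is never referenced.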


Hope this helps.

Wednesday, December 31, 2014

[SQL Server] Change Schema Name Of Table In SQL Server

I want to change the schema name of the table Employees in my database. Currently the schema name of the Employees table is dbo, and I want to change it to exe. How can I do it?

example:

FROM
dbo.Employees
TO
exe.Employees

I tried this query:
ALTER SCHEMA exe TRANSFER dbo.Employees

But this gives me an error:
Error:
Cannot alter the schema 'exe', because it does not exist or you do not have permission.
What did I miss or do wrong?

Solution:

Create Schema :
IF (NOT EXISTS (SELECT * FROM sys.schemas WHERE name = 'exe')) 
BEGIN
    EXEC ('CREATE SCHEMA [exe] AUTHORIZATION [dbo]')
END
 
ALTER Schema :
ALTER SCHEMA exe 
    TRANSFER dbo.Employees
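To confirm the transfer afterwards, a quick check against the catalog views (table name from the question):

```sql
SELECT s.name AS schema_name, t.name AS table_name
FROM sys.tables t
JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE t.name = 'Employees';
```

This should now return exe as the schema_name.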

 

Tuesday, December 9, 2014

[DI] ETL versus ELT

When does ETL win?

  • Ordered transformations not well suited to set processing.
  • Integration of third party software tools best managed by Informatica outside of the RDBMS (e.g., name and address standardization utilities).
  • Maximize in-memory execution for multiple step transformations that do not require access to large volumes of historical or lookup data (note: caching plays a role).
  • Streaming data loads using message-based feeds with "real-time" data acquisition.

When does ELT win?

  • Leverage of high performance DW platform for execution reduces capacity requirements on ETL servers - this is especially useful when peak requirements for data integration are in a different window than peak requirements for data warehouse analytics.
  • Significantly reduce data retrieval overhead for transformations that require access to historical data or large cardinality lookup data already in the data warehouse.
  • Batch or mini-batch loads with reasonably large data sets, especially with pre-existing indices that may be leveraged for processing.
  • Optimize performance for large scale operations that are well suited for set operations such as complex joins and large cardinality aggregations.

[DI] A Taxonomy of Data Integration Techniques

 
There are three main approaches:


1. ETL Approach: (1) Extract from the source systems, (2) Transform inside the Informatica engine on integration engine servers, and (3) Load into target tables in the data warehouse.

2. ELT Approach: (1) Extract from the source systems, (2) Load into staging tables inside the data warehouse RDBMS servers, and (3) Transform inside the RDBMS engine using generated SQL, with a final insert into the target tables in the data warehouse.

3. Hybrid ETLT Approach: (1) Extract from the source systems, (2) Transform inside the Informatica engine on integration engine servers, (3) Load into staging tables in the data warehouse, and (4) apply further Transformations inside the RDBMS engine using generated SQL, with a final insert into the target tables in the data warehouse.
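To make the ELT "Transform inside the RDBMS" step concrete, here is a sketch of the kind of set-based SQL an ELT tool would generate (stg_orders, dim_customer, and fact_orders are assumed names, not from any particular tool):

```sql
-- ELT step 3: transform from staging into the target with one set operation,
-- leveraging the warehouse engine for the join and aggregation
INSERT INTO fact_orders (order_id, customer_key, order_amt)
SELECT s.order_id,
       d.customer_key,
       SUM(s.line_amt)
FROM   stg_orders s
JOIN   dim_customer d
  ON   d.customer_id = s.customer_id
GROUP BY s.order_id, d.customer_key;
```

This is the pattern that benefits from pre-existing indices and large-cardinality lookups already living in the warehouse.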


Tuesday, February 19, 2013

[Informatica] Partitioning option license required to run sessions with user-defined partition points


Quote: TM_6281: Partitioning option license required to run sessions with user-defined partition points.

"When I de-select the 'Sorted Input' property, it runs fine, but it spends needless time caching my entire table to file. Does anyone know why this might be happening and/or a possible resolution?"

To run a PowerCenter session with a sorted Aggregator transformation, set the server parameter AggSupprtWithNoPartLic to "Yes" as follows:

Unix: Using a text editor, open the PowerCenter server configuration (pmserver.cfg) file.
Add the following entry to the end of the file:
AggSupprtWithNoPartLic = Yes
Re-start the PowerCenter server (pmserver).
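The Unix edit can also be scripted; a minimal sketch (the pmserver.cfg path below is a placeholder - point it at your actual install):

```shell
# Append the undocumented server parameter to pmserver.cfg
# (CFG is a placeholder path; adjust for your environment)
CFG=./pmserver.cfg
echo 'AggSupprtWithNoPartLic = Yes' >> "$CFG"
# Show the entry to confirm it was written; then restart pmserver manually
grep 'AggSupprtWithNoPartLic' "$CFG"
```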

Windows: To set this parameter on Windows, refer to article 11486.
(article 11486) To add a PowerCenter Server parameter (that is not in the server configuration dialog) on Windows, enter it in the registry as follows:

Click Start, click Run, type regedit, and click OK.
Go to the following registry key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\PowerMart\Parameters\Configuration
On the Edit menu, point to New, and then click String Value. Enter the String Value "SERVER_PARAMETER". Go to Edit > Modify, enter "VALUE", and then click OK. Re-start the PowerCenter server (Informatica Service). Where SERVER_PARAMETER is the name of the parameter and VALUE is the setting of the parameter.

Saturday, November 3, 2012

[SQL Server] Code to Enable TDE on a Database

-- The master key must be in the master database.
USE master;
GO

-- Create the master key.
CREATE MASTER KEY ENCRYPTION
BY PASSWORD = 'MyStrongPassword';
GO

-- Create a certificate.
CREATE CERTIFICATE MySQLCert
WITH SUBJECT = 'MyDatabase DEK';
GO

-- Use the database to enable TDE.
USE MyDatabase;
GO

-- Associate the certificate to MyDatabase.
CREATE DATABASE ENCRYPTION KEY
WITH ALGORITHM = AES_128
ENCRYPTION BY SERVER CERTIFICATE MySQLCert;
GO

-- Encrypt the database.
ALTER DATABASE MyDatabase
SET ENCRYPTION ON;
GO
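One step worth doing immediately after this: back up the certificate and its private key, because a TDE-encrypted database cannot be restored on another server without them. A sketch (file paths and password are placeholders):

```sql
-- Back up the TDE certificate and private key to files
-- (paths and password below are placeholders)
USE master;
GO
BACKUP CERTIFICATE MySQLCert
TO FILE = 'C:\Backup\MySQLCert.cer'
WITH PRIVATE KEY (
    FILE = 'C:\Backup\MySQLCert.pvk',
    ENCRYPTION BY PASSWORD = 'AnotherStrongPassword'
);
GO
```

Store these files somewhere other than the server holding the encrypted database.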

Saturday, August 25, 2012

[T-SQL] Database backup procedure - advanced

I got this script from one of the blogs:

if exists (select * from sysobjects where id = object_id(N'[dbo].[spBackupAllDatabases]') and OBJECTPROPERTY(id, N'IsProcedure') = 1)
drop procedure [dbo].[spBackupAllDatabases]
go

Create Procedure spBackupAllDatabases
@Path varchar(128) ,
@Type varchar(4) -- Full / Log
as
/*
Backup file format
_Full_yyyymmdd_hhmmss.bak
_Log_yyyymmdd_hhmmss.bak

exec spBackupAllDatabases 'c:\SQLBackups\', 'Full'
*/
/*
drop table DatabaseBackup

Create table DatabaseBackup
(
Name varchar(128) primary key nonclustered ,
BackupFlagFull varchar(1) not null check (BackupFlagFull in ('Y','N')) ,
BackupFlagLog varchar(1) not null check (BackupFlagLog in ('Y','N')) ,
RetentionPeriodFull datetime not null ,
RetentionPeriodLog datetime not null
)
*/
set nocount on
declare @sql varchar(1000)

-- Get all database names
create table #DBName
(
ID int identity (1,1) ,
Name varchar(128) not null ,
RetentionPeriod datetime null
)

insert #DBName
(Name)
select name
from master..sysdatabases

-- Include any new databases in the backup
insert DatabaseBackup
(
Name ,
BackupFlagFull ,
BackupFlagLog ,
RetentionPeriodFull ,
RetentionPeriodLog
)
select #DBName.Name ,
'Y' ,
'N' ,
'7 jan 1900' , -- default one week
'1 jan 1900'
from #DBName
left outer join DatabaseBackup
on DatabaseBackup.Name = #DBName.Name
where DatabaseBackup.Name is null
and lower(#DBName.Name) <> 'tempdb'

-- Remove any non-existent databases
delete DatabaseBackup
where not exists
(
select *
from #DBName
where #DBName.Name = DatabaseBackup.Name
)

delete #DBName

create table #ExistingBackups
(
Name varchar(128) ,
ID int identity (1,1)
)

-- loop through databases
declare @Name varchar(128) ,
@RetentionPeriod datetime ,
@LastBackupToKeep varchar(8) ,
@ID int ,
@MaxID int

insert #DBName
(Name, RetentionPeriod)
select Name, case when @Type = 'Full' then RetentionPeriodFull else RetentionPeriodLog end
from DatabaseBackup
where (@Type = 'Full' and BackupFlagFull = 'Y')
or (@Type = 'Log' and BackupFlagLog = 'Y')

select @MaxID = max(ID) ,
@ID = 0
from #DBName

while @ID < @MaxID
begin
-- get next database to backup
select @ID = min(ID) from #DBName where ID > @ID

select @Name = Name ,
@RetentionPeriod = RetentionPeriod
from #DBName
where ID = @ID

-- Delete old backups
delete #ExistingBackups
select @sql = 'dir /B ' + @Path
select @sql = @sql + @Name + '_' + @Type + '*.*'

insert #ExistingBackups exec master..xp_cmdshell @sql

if exists (select * from #ExistingBackups where Name like '%File Not Found%')
delete #ExistingBackups

select @LastBackupToKeep = convert(varchar(8),getdate() - @RetentionPeriod,112)
delete #ExistingBackups where Name > @Name + '_' + @Type + '_' + @LastBackupToKeep

declare @eID int ,
@eMaxID int ,
@eName varchar(128)

-- loop round all the out of date backups
select @eID = 0 ,
@eMaxID = coalesce(max(ID), 0)
from #ExistingBackups

while @eID < @eMaxID
begin
select @eID = min(ID) from #ExistingBackups where ID > @eID
select @eName = Name from #ExistingBackups where ID = @eID

select @sql = 'del ' + @Path + @eName
exec master..xp_cmdshell @sql
end
delete #ExistingBackups

-- now do the backup
select @sql = @Path + @Name + '_' + @Type + '_'
+ convert(varchar(8),getdate(),112) + '_'
+ replace(convert(varchar(8),getdate(),108),':','') + '.bak'
if @Type = 'Full'
backup database @Name
to disk = @sql
else
backup log @Name
to disk = @sql
end

go