Muhammad Abbas - Data Engineering Architect: January 2011

Friday, January 28, 2011

[Informatica] Software Architecture illustrated

Informatica >> Beginners >> Informatica System Architecture

Informatica Software Architecture illustrated
Informatica ETL product, known as Informatica Power Center consists of 3 main components.

1. Informatica PowerCenter Client Tools:
These are the development tools installed at developer end. These tools enable a developer to

•Define transformation process, known as mapping. (Designer)
•Define run-time properties for a mapping, known as sessions (Workflow Manager)
•Monitor execution of sessions (Workflow Monitor)
•Manage repository, useful for administrators (Repository Manager)
•Report Metadata (Metadata Reporter)

2. Informatica PowerCenter Repository:
Repository is the heart of Informatica tools. Repository is a kind of data inventory where all the data related to mappings, sources, targets etc is kept. This is the place where all the metadata for your application is stored. All the client tools and Informatica Server fetch data from Repository. Informatica client and server without repository is same as a PC without memory/harddisk, which has got the ability to process data but has no data to process. This can be treated as backend of Informatica.

3. Informatica PowerCenter Server:
Server is the place, where all the executions take place. Server makes physical connections to sources/targets, fetches data, applies the transformations mentioned in the mapping and loads the data in the target system.

[SQLServer] Scaling Up Your Data Warehouse with SQL Server 2008 R2

Abstract from the Microsoft Official site:

SQL Server 2008 introduced many new functional and performance improvements for data warehousing, and SQL Server 2008 R2 includes all these and more. This paper discusses how to use SQL Server 2008 R2 to get great performance as your data warehouse scales up. We present lessons learned during extensive internal data warehouse testing on a 64-core HP Integrity Superdome during the development of the SQL Server 2008 release, and via production experience with large-scale SQL Server customers. Our testing indicates that many customers can expect their performance to nearly double on the same hardware they are currently using, merely by upgrading to SQL Server 2008 R2 from SQL Server 2005 or earlier, and compressing their fact tables. We cover techniques to improve manageability and performance at high-scale, encompassing data loading (extract, transform, load), query processing, partitioning, index maintenance, indexed view (aggregate) management, and backup and restore.

Tuesday, January 18, 2011

[T-SQL] DB_ID() and DB_NAME() - PURPOSE

-- Get Current DatabaseID
SELECT DB_ID() DatabaseID;
-- Get Current DatabaseName
SELECT DB_NAME() DatabaseName;
-- Get DatabaseName from DatabaseID
SELECT DB_NAME(4) DatabaseID;
-- Get DatabaseID from DatabaseName
SELECT DB_ID('tempdb') DatabaseName;
-- Get all DatabaseName and DBID
SELECT name,database_id
FROM sys.databases;

[T-SQL] Different methods to find leap year

Method 1 : General method

declare @year int
set @year=2000

select
case
when @year%400=0 then 1
when @year%100=0 then 0
when @year%4=0 then 1
else 0
end as is_leap_year
Method 2 : Find if the day difference between Jan 01 of this year to Jan 01 of next year is 366

declare @year int
set @year=2000

select
case
when datediff(day,DATEADD(year,@year-1900,0),DATEADD(year,@year-1900+1,0))=366 then 1
else 0
end as is_leap_year
Method 3 : Find if the day difference between Feb 01 to Mar 01 of this year is 29

select
case
when datediff(day,dateadd(month,1,DATEADD(year,@year-1900,0)),dateadd(month,2,DATEADD(year,@year-1900,0)))=29 then 1
else 0
end as is_leap_year
Method 4 : Find if the 60th day of this year falls in February last day (Feb 29)

declare @year int
set @year=2000

select
case
when day(dateadd(day,59,DATEADD(year,@year-1900,0)))=29 then 1
else 0
end as is_leap_year

Tuesday, January 4, 2011

[T-SQL] Delete vs Truncate

Here is good comparision between delete and truncate

DELETE:

DELETE supports a WHERE clause
DELETE removes rows from a table, row-by-row
Because DELETE moves row-by-row, it acquires a row-level lock
Depending upon the recovery model of the database, DELETE is a fully-logged operation.
Because DELETE moves row-by-row, it can fire off triggers

TRUNCATE:

TRUNCATE does not support a WHERE clause
TRUNCATE works by directly removing the individual data pages of a table
TRUNCATE directly occupies a table-level lock.
(Because a lock is acquired, and because TRUNCATE can also participate in a transaction, it has to be a logged operation)
TRUNCATE is, therefore, a minimally-logged operation; again, this depends upon the recovery model of the database
Triggers are not fired when TRUNCATE is used (because individual row deletions are not logged)

[T-SQL] Data Cache in Memory

Here is the query by which you can see data cache in memory:

USE AdventureWorks
GO
SELECT COUNT(*) AS cached_pages_count,
name AS BaseTableName, IndexName,
IndexTypeDesc
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN
(
SELECT s_obj.name, s_obj.index_id,
s_obj.allocation_unit_id, s_obj.OBJECT_ID,
i.name IndexName, i.type_desc IndexTypeDesc
FROM
(
SELECT OBJECT_NAME(OBJECT_ID) AS name,
index_id ,allocation_unit_id, OBJECT_ID
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.hobt_id
AND (au.TYPE = 1 OR au.TYPE = 3)
UNION ALL
SELECT OBJECT_NAME(OBJECT_ID) AS name,
index_id, allocation_unit_id, OBJECT_ID
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.partition_id
AND au.TYPE = 2
) AS s_obj
LEFT JOIN sys.indexes i ON i.index_id = s_obj.index_id
AND i.OBJECT_ID = s_obj.OBJECT_ID ) AS obj
ON bd.allocation_unit_id = obj.allocation_unit_id
WHERE database_id = DB_ID()
GROUP BY name, index_id, IndexName, IndexTypeDesc
ORDER BY cached_pages_count DESC;
GO

[T-SQL] Plan Cache in Memory

Here is the query by which you can see plan cache in memory:

USE AdventureWorks
GO
SELECT [text], cp.size_in_bytes, plan_handle
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE cp.cacheobjtype = N'Compiled Plan'
ORDER BY cp.size_in_bytes DESC
GO

[T-SQL] DBCC LOGINFO

Do you know that there are multiple small log files in the large log file (LDF). So now your next question would be how can we know that.

Veri simple method

DBCC LOGINFO

You would find the file size and other relevant information in the resultant.

Muhammad Abbas - Data Engineering Architect

Pages

Search This Blog