Performance Improvement Techniques in DataStage

May 15, 2008 · Filed Under DataStage Articles · 20 Comments 

I have come up with two sets of superb documents on performance tuning:

 

1) DataStage Enterprise Edition (PX – Parallel Extender)

2) DataStage Server Edition

 

These documents are the result of experience gained across numerous successful deployments. Together they form a practical handbook of performance improvement techniques that can be used for ETL architecture, job design and development, as well as for review, analysis and performance optimization.

Read more

Information Server Overview

May 13, 2008 · Filed Under DataStage Articles · Comment 

This is a sequel to DataStage Overview: Information Server Overview.
Information Server is one of the most significant IBM software releases of recent times: a revolutionary new software platform that helps organizations derive more value from the complex, heterogeneous information spread across their systems. This PPT deck provides an overview of Information Server.
Read more

Process Management in DataStage

May 12, 2008 · Filed Under DataStage Articles · 4 Comments 

Purpose:

  • How to avoid using “kill -9” (SIGKILL).
  • How to release a stuck job from DataStage Director.
  • How to kill orphaned / runaway processes using DS Administrator.
  • How to use the kill command without “-9”.
  • How to clean up zombie / orphaned phantom processes.
  • How to increase the number of jobs / processes that can run in DataStage.
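
As a rough illustration of the "kill without -9" point, here is a minimal Python sketch, assuming a Unix host and a process ID you have already identified as a runaway process. Plain kill sends SIGTERM, which a process can catch to clean up after itself; SIGKILL cannot be caught. The helper names is_running and terminate_gracefully are made up for this example and are not part of DataStage.

```python
import os
import signal
import time


def is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def terminate_gracefully(pid: int, timeout: float = 10.0, poll: float = 0.2) -> bool:
    """Send SIGTERM (what plain `kill` sends) and wait for the process to exit.

    Returns True if the process is gone within `timeout` seconds, else False.
    This sketch deliberately never escalates to SIGKILL.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return True  # already gone
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not is_running(pid):
            return True
        time.sleep(poll)
    return not is_running(pid)
```

If terminate_gracefully returns False, the process ignored or could not handle SIGTERM, and that is the point at which to investigate further rather than reaching for -9 by reflex. Note that an exited child that has not been reaped by its parent (a zombie) still counts as existing for the signal 0 probe.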

What are Phantom processes?
Read more

DataStage Overview | Tutorial | Essentials

May 4, 2008 · Filed Under DataStage Articles, Tech Articles · 30 Comments 

I finally found enough time to put the “DataStage Overview” PPT deck online. It is a series of simple and informative slides about this great BI tool. This PPT deck provides an overview of DataStage and is the result of feedback and experience gained across numerous training sessions.

Read more

Surrogate Key Generation in DataStage - An elegant way

April 26, 2008 · Filed Under DataStage Articles, Tech Articles · 1 Comment 

An elegant and fast way to generate surrogate keys in a parallel job!

This is a hot topic, discussed and attempted by most ETL architects, designers and developers. This article looks at an elegant way to generate surrogate keys in a DataStage parallel job, without the overhead of creating multiple jobs or maintaining a state file. It leans slightly towards advanced or power users, as it involves creating a parallel routine using the DataStage Development Kit (Job Control Interfaces). But the strategy is definitely simple and elegant: you can do it in one job and maintain the surrogate key in a centralised and editable location, an environment variable defined in Administrator. That also gives you wings to use it across the project in different jobs.
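
To make the general idea concrete, here is a small Python sketch of one common way to hand out collision-free keys across parallel partitions. It is not the parallel routine from the article: it only assumes that each partition knows its own number, the partition count, and the last key used (for example, a value read from a central setting). The function name surrogate_keys and the sample numbers are invented for illustration.

```python
from itertools import islice
from typing import Iterator


def surrogate_keys(last_key: int, partition: int, num_partitions: int) -> Iterator[int]:
    """Yield keys for one partition, interleaved so partitions never collide.

    Partition p produces last_key + 1 + p, then steps by num_partitions.
    """
    if not 0 <= partition < num_partitions:
        raise ValueError("partition must be in range 0..num_partitions-1")
    key = last_key + 1 + partition
    while True:
        yield key
        key += num_partitions


# Example: three partitions processing 3, 2 and 4 rows, starting after key 1000.
rows_per_partition = [3, 2, 4]
keys = [
    list(islice(surrogate_keys(1000, p, len(rows_per_partition)), n))
    for p, n in enumerate(rows_per_partition)
]
# The highest key handed out becomes the new "last key" to store centrally.
new_last_key = max(k for part in keys for k in part)
```

With uneven row counts per partition this scheme leaves gaps in the key sequence, which is usually acceptable for surrogate keys since they only need to be unique.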
Read more
