1. How did you handle reject data?
Ans: Typically a Reject link is defined and the rejected data is written to a reject file or table for later review. A Reject link has to be defined for every Output link from which you wish to collect rejected data. Rejected data is typically bad data, such as duplicate primary keys or null values where data is expected.
2. If you worked with DS 6.0 and later versions, what are Link-Partitioner and Link-Collector used for?
Ans: Link Partitioner - used for partitioning the data across multiple streams.
Link Collector - used for collecting the partitioned data back into a single stream.
3. What are Routines, where/how are they written, and have you written any routines before?
Ans: Routines are stored in the Routines branch of the DataStage Repository, where you can create, view, or edit them. The following are the different types of routines:
1) Transform functions
2) Before/after job subroutines
3) Job control routines
4. What are Oconv() and Iconv() functions and where are they used?
Ans: Iconv() - converts a string to an internal storage format.
Oconv() - converts an expression to an output format.
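To make the idea concrete, here is a minimal Python sketch (not DataStage BASIC) of the date case: UniVerse/DataStage internal dates count days from 31 December 1967 (day 0), so an Iconv-style conversion turns an external date string into that day count, and an Oconv-style conversion turns it back.

```python
from datetime import date, timedelta

# Illustrative stand-ins for Iconv()/Oconv(); the real functions live in
# DataStage BASIC, not Python. Internal date 0 = 31 December 1967.
EPOCH = date(1967, 12, 31)

def iconv_date(s: str) -> int:
    """Convert an external YYYY-MM-DD string to the internal day count."""
    return (date.fromisoformat(s) - EPOCH).days

def oconv_date(n: int) -> str:
    """Convert an internal day count back to an external YYYY-MM-DD string."""
    return (EPOCH + timedelta(days=n)).isoformat()

print(iconv_date("1967-12-31"))  # 0
print(oconv_date(0))             # 1967-12-31
```

The same pattern applies to the other Iconv/Oconv conversion codes (times, money, masked numbers): one internal representation, many external formats.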
5. How did you connect to DB2 in your last project?
Ans: Using DB2 ODBC drivers.
6. Explain METASTAGE?
Ans: MetaStage is used to handle metadata, which is very useful for data lineage and data analysis later on. Metadata defines the type of data we are handling. These data definitions are stored in the repository and can be accessed through MetaStage.
7. Do you know about the INTEGRITY/QUALITY stage?
Ans: QualityStage can be integrated with DataStage. QualityStage provides stages such as Investigate, Match, and Survive for data-quality work; to integrate it with DataStage, the QualityStage plug-in is needed.
8. Explain the differences between Oracle 8i/9i?
Ans: Oracle 9i introduced features not available in 8i, such as the MERGE statement (combined insert/update in one operation) and ANSI-standard join syntax (JOIN ... ON).
9. How do you merge two files in DS?
Ans: Either use a copy/concatenate command as a before-job subroutine if the metadata of the two files is the same, or create a job to merge the two files into one if the metadata differs.
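For the same-metadata case, the OS-level concatenation the answer describes amounts to the following sketch (file names are hypothetical; in a real job this would be an ExecSH/ExecDOS before-job subroutine rather than Python):

```python
# Illustrative sketch: concatenate delimited files whose column layout
# (metadata) is identical, as an OS-level copy/cat command would do.
def concat_files(paths, out_path):
    with open(out_path, "w") as out:
        for p in paths:
            with open(p) as f:
                out.writelines(f)
```

When the metadata differs, a job that maps both inputs onto a shared output format is needed instead.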
10. What is DS Designer used for?
Ans: You use the Designer to build jobs by creating a visual design that models the flow and transformation of data from the data source through to the target warehouse. The Designer graphical interface lets you select stage icons, drop them onto the Designer work area, and add links.
11. What is DS Administrator used for?
Ans: The Administrator enables you to set up DataStage users, control the purging of the Repository, and, if National Language Support (NLS) is enabled, install and manage maps and locales.
12. What is DS Director used for?
Ans: DataStage Director is used to run, validate, and monitor jobs and to view their logs. You can open the Director from within the Designer itself.
13. What is DS Manager used for? [Merged with the Designer in 8.x versions]
Ans: The Manager is a graphical tool that enables you to view and manage the contents of the DataStage Repository.
14. What are Static Hash files and Dynamic Hash files?
Ans: A static hashed file has a fixed modulus (number of groups) chosen at creation time, while a dynamic hashed file resizes automatically as data is added. In general we use Type-30 dynamic hashed files. The data file has a default size limit of 2 GB, and the overflow file is used if the data exceeds that size.
15. What is the Hashed File stage and what is it used for?
Ans: It is used for look-ups, acting like a reference table. It is also used in place of ODBC/OCI tables for better look-up performance.
16. How are the Dimension tables designed?
Ans:
1) Find where the data for this dimension is located.
2) Figure out how to extract this data.
3) Determine how to maintain changes to this dimension.
4) Change the fact table and DW population routines.
17. What is the importance of a Surrogate Key in data warehousing?
Ans: A surrogate key is the primary key of a dimension table. Its main benefit is independence from the underlying database: the surrogate key is not affected by changes going on in the source system.
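As a minimal sketch (hypothetical helper, not part of DataStage), surrogate keys can be assigned as monotonically increasing integers keyed by the natural key, so the warehouse key never changes even if the source system changes how it identifies rows:

```python
# Illustrative sketch: assigning surrogate keys to dimension rows.
# The natural (business) keys may change in the source system, but the
# surrogate key assigned here never does.
surrogate_map = {}  # natural key -> surrogate key
next_key = 1

def get_surrogate(natural_key):
    global next_key
    if natural_key not in surrogate_map:
        surrogate_map[natural_key] = next_key
        next_key += 1
    return surrogate_map[natural_key]

print(get_surrogate("CUST-001"))  # 1
print(get_surrogate("CUST-002"))  # 2
print(get_surrogate("CUST-001"))  # 1 (same row, same key)
```

In a real warehouse the mapping lives in the dimension table itself, with the sequence typically generated by the database or by the ETL tool.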
18. What does a Config File in Parallel Extender consist of?
Ans: The configuration file consists of the following:
a) The number of processes or nodes.
b) The actual disk storage locations.
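A minimal single-node configuration file (node name and paths are hypothetical) looks roughly like this:

```
{
    node "node1"
    {
        fastname "etl_server"
        pools ""
        resource disk "/data/ds/disk0" {pools ""}
        resource scratchdisk "/data/ds/scratch0" {pools ""}
    }
}
```

Adding further `node` blocks increases the degree of parallelism without any change to the job design.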
19. From how many places can you call Routines?
Ans: Routines can be called from four places:
(i) Transforms, e.g.
(A) Date transformations
(B) String/upcase transformations
(ii) Before & after subroutines
(iii) XML transformations
(iv) Web-based transformations
20. How did you handle an 'Aborted' sequencer?
Ans: In almost all cases we have to manually delete the data inserted by the aborted run from the database, fix the job, and then run the job again.
21. Is it possible to calculate a hash total for an EBCDIC file and have the hash total stored as EBCDIC using DataStage?
Ans: Currently, the total is converted to ASCII, even though the individual records are stored as EBCDIC.
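Why the encoding of the total matters can be seen in a small Python sketch (assuming cp037, one common EBCDIC code page): the same record produces different additive totals depending on whether its bytes are ASCII or EBCDIC.

```python
# Illustrative sketch: the same text yields different byte values, and
# therefore different additive hash totals, under ASCII vs EBCDIC
# (Python's cp037 codec implements one common EBCDIC code page).
record = "ACME,100"
ascii_total = sum(record.encode("ascii"))
ebcdic_total = sum(record.encode("cp037"))
print(ascii_total != ebcdic_total)  # True
```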
22. Compare and Contrast ODBC and Plug-In stages?
Ans: ODBC:
a) Poorer performance.
b) Can be used for a variety of databases.
c) Can handle stored procedures.
Plug-In:
a) Good performance.
b) Database-specific (only one database).
23. What is the functionality of Link Partitioner and Link Collector?
Ans: The Link Partitioner partitions the data across the nodes defined in the configuration file.
The Link Collector collects the partitioned data back into a single stream.
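The pair can be sketched in Python (round-robin is assumed here; DataStage also offers other partitioning algorithms):

```python
# Illustrative sketch: round-robin partitioning (Link Partitioner) and
# merging the partitions back into one stream (Link Collector).
def partition(rows, n):
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def collect(parts):
    return [row for part in parts for row in part]
```

Note that the collected order need not match the input order; only the set of rows is preserved.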
24. Containers: Usage and Types?
Ans: A container is a collection of stages grouped together for reusability. There are 2 types of containers: a) Local Container: job-specific. b) Shared Container: can be used in any job within a project.
25. Explain Dimension Modelling types along with their significance
Ans: Data modelling is broadly classified into 2 types: a) E-R diagrams (Entity-Relationship), used for OLTP design. b) Dimensional modelling, used for data warehouse design, commonly implemented as star or snowflake schemas.
26. Did you parameterize the job or hard-code the values in the jobs?
Ans: Always parameterize the job. The values come either from Job Properties or from a 'Parameter Manager', a third-party tool. You should never hard-code parameters in your jobs.
27. How did you connect with DB2 in your last project?
Ans: Most of the time the data was sent to us in the form of flat files; the data was dumped and delivered to us. In cases where we needed to connect to DB2 directly, for example for look-ups, we used ODBC drivers.
28. What are the often-used stages or stages you worked with in your last project?
Ans: Transformer, ORAOCI8/9, ODBC, Link-Partitioner, Link-Collector, Hashed File, Aggregator, and Sort.