Thursday, July 11, 2013

Is Modify Stage can be a alternate for Transformer



The Modify stage is a processing stage. It can have a single input link and
a single output link. It can also be used to modify schema, Keep or Drop the input columns.If you are using transformer just to drop or change schema of columns then it is better we use modify stage.

Below is the syntax of each of the operation possible in modify stage and below mentioned syntax can be used in specification box of modify stage..

'keep field1, field2, ... fieldn; '
'drop field1, field2, ... fieldn;' 

 ‘newField1=oldField1; newField2=oldField2;...newFieldn=oldFieldn;’

a_1 = a
Date field conversions:
dateField = date_from_days_since[date](int32Field)
dateField = date_from_julian_day(uint32Field)
dateField = date_from_string[date_format | date_uformat] (stringField)
dateField = date_from_timestamp(tsField
int8Field = month_day_from_date(dateField)
int8Field = weekday_from_date[originDay](dateField)
int16Field = year_day_from_date(dateField)
int32Field = days_since_from_date[source_date]
uint32Field = julian_day_from_date(dateField)
int8Field = month_from_date(dateField)
dateField = next_weekday_from_date[day](dateField)
dateField= previous_weekday_from_date[day]
stringField = string_from_date [date_format | ufornat] (dateField)
tsField = timestamp_from_date[time](dateField)
int16Field = year_from_date(dateField)
int8Field=year_week_from_date(dateField)

Decimal Field Conversions:
decimal from decimal decimalField = decimal_from_decimal[r_type](decimalField)
dfloat from dfloat dfloatField = mantissa_from_dfloat(dfloatField)
int32 from decimal int32Field = int32_from_decimal[r_type, fix_zero](decimalField)
int64 from decimal int64Field = int64_from_decimal[r_type, fix_zero](decimalField)
string from decimal stringField = string_from_decimal[fix_zero][suppress_zero](decimalField)
uint64 from decimal uint64Field = uint64_from_decimal[r_type, fix_zero](decimalField)

string Conversions:

hue:string[10] = string_trim['Z', end, begin](color)
stringField=substring(string, starting_position, length)
decimalField = decimal_from_string(stringField) Converts strings to decimals.
stringField = string_from_decimal[fix_zero] [suppress_zero] (decimalField) Converts decimals to strings
dateField = date_from_string [date_format | date_uformat] (stringField)
stringField = string_from_date [date_format | date_uformat] (dateField)
int32Field=string_length(stringField)stringField=lowercase_string (stringField)
stringField=uppercase_string (stringField)
stringField = string_from_time [time_format | time_uformat ] (timeField)
int8Field = hours_from_time(timeField) hours from time
int32Field = microseconds_from_time(timeField) microseconds from time
int8Field = minutes_from_time(timeField) minutes from time
dfloatField = seconds_from_time(timeField) seconds from time
dfloatField = midnight_seconds_from_time(timeField) seconds-from-midnight from time
stringField = string_from_time [time_format | time_uformat] (timeField)


Wednesday, July 10, 2013

Quick view of Funnel Stage

Funnel Stage:

Funnel stage is processing stage is Datastage, which is used to combine more than one file into single file. It can support n #.of Input links and one output link.But prerequisite for it is that each and every file in source should have same Metadata. Here metadata means Data types and Column names too.

Funnel Stage provides 3 funnel types in combining the files.Below same has been described.
v     Sequence Funnel.
v     Continuous Funnel.
v     Sort Funnel.

Below is the Stage Page of Funnel Stage.





Continuous Funnel:
            In this type of funneling, all the input rows sent to output as they arrived for processing.

Below example explains in detail.
Below are 3 input files data.

File 1:
EID
ENAME
DEPT NO
MGR_ID
102
Joe
10
105
103
Latha
10
102
101
Leela
20
101

File 2:
EID
ENAME
DEPT NO
MGR_ID
110
Vidhya
30
109
134
Rama
20
101
112
Neethu
10
111

File 3:
EID
ENAME
DEPT NO
MGR_ID
456
Yogesh
10
324
345
Jeevan
20
101
909
Varu
10
101


Below can be the output of Continuous Funnle(you can’t guarantee the Order).
Output:

EID
ENAME
DEPT NO
MGR_ID
102
Joe
10
105
103
Latha
10
102
101
Leela
20
101
456
Yogesh
10
324
134
Rama
20
101
112
Neethu
10
111
110
Vidhya
30
109
345
Jeevan
20
101
909
Varu
10
101


Sequence Funnel:
Sequence Funnel copies all records from the first input data set to the output data  set, then all the records from the second input data set, and so on.

Output would be below for above input.

EID
ENAME
DEPT NO
MGR_ID
102
Joe
10
105
103
Latha
10
102
101
Leela
20
101
110
Vidhya
30
109
134
Rama
20
101
112
Neethu
10
111
456
Yogesh
10
324
345
Jeevan
20
101
909
Varu
10
101


Sort Funnel:
            Sort Funnel combines the input records in the order defined by the Value of one or more key columns and the order of the output records is determined by these sorting keys.         
Typically all input data sets for a sort funnel operation are hash-partitioned before they’re sorted (choosing the auto partitioning method will ensure that this is done). Hash partitioning guarantees that all records with the same key column values are located in the same partition and so are processed on the same node. If sorting and partitioning are carried out on separate stages before the Funnel stage, this partitioning must be
Preserved.
The sort funnel operation allows you to set one primary key and multiple
Secondary keys. The Funnel stage first examines the primary key in each input record. For multiple records with the same primary key value, it then examines secondary keys to determine the order of records it will output.

Output would be below for above 3 input files.

101
Leela
20
101
102
Joe
10
105
103
Latha
10
102
110
Vidhya
30
109
112
Neethu
10
111
134
Rama
20
101
345
Jeevan
20
101
456
Yogesh
10
324
909
Varu
10
101


 

Datastage Doctrina Copyright © 2011 -- Template created by O Pregador -- Powered by Blogger

Receive all updates, tips and tricks via Facebook. Just Click the Like Button Below