Sunday 29 April 2018

Syntel interview - ETL Developer


Date - 14/04/2018
Post - ETL Developer(Informatica)

There were 2 rounds of interview.
L1 and L2, followed by an HR discussion.

L1 round - Few of the technical Questions asked -
1. Draw the Informatica architecture and explain the same.
2. Transaction Control - explain with an example.
3. Can we control the way targets are loaded in a mapping, and how? --> Question was regarding the target load plan option present in Informatica.
4. What are nodes and grids, and why do we use a grid?
5. 2 sources are present, say A and B, having employee_id as a common attribute. The scenario is to obtain the below 3 results -
Target1 - records matching in both of them.
Target2 - all those records which are present in A and not present in B
Target3 - all those records which are present in B and not present in A
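The expected Informatica answer is usually a Joiner with a full outer join feeding a Router; the underlying set logic can be sketched quickly in shell with `comm` (the file names and sample ids below are made up for illustration):

```shell
#!/bin/sh
# Set-logic sketch of the 3-target split using comm on sorted id files.
printf '1\n2\n3\n' > /tmp/src_a.txt    # employee ids in source A (sorted)
printf '2\n3\n4\n' > /tmp/src_b.txt    # employee ids in source B (sorted)

comm -12 /tmp/src_a.txt /tmp/src_b.txt > /tmp/match_both.txt  # Target1: in both
comm -23 /tmp/src_a.txt /tmp/src_b.txt > /tmp/only_a.txt      # Target2: only in A
comm -13 /tmp/src_a.txt /tmp/src_b.txt > /tmp/only_b.txt      # Target3: only in B

cat /tmp/match_both.txt   # 2 3
cat /tmp/only_a.txt       # 1
cat /tmp/only_b.txt       # 4
```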
6. Write a shell script which will perform the below file validations -
1. Check the file name
2. Check the file extension
3. Check the file delimiter
4. Check the file header
Send a mail if any of the above checks fails.
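A minimal sketch of such a validation script is below; the expected name, extension, delimiter, header, sample file and mail recipient are all assumptions for illustration, and the mail command available depends on the server:

```shell
#!/bin/sh
# File-validation sketch: name, extension, delimiter and header checks.
printf 'id,name,amount\n1,A,100\n' > /tmp/sales_20180414.csv   # sample input

FILE="${1:-/tmp/sales_20180414.csv}"
EXPECTED_NAME="sales"               # expected file-name prefix (assumed)
EXPECTED_EXT="csv"                  # expected extension (assumed)
DELIM=","                           # expected delimiter (assumed)
EXPECTED_HEADER="id,name,amount"    # expected header row (assumed)
ERR=""

base=$(basename "$FILE")
case "$base" in
  "$EXPECTED_NAME"*) : ;;                                        # 1. file name
  *) ERR="$ERR bad-name" ;;
esac
[ "${base##*.}" = "$EXPECTED_EXT" ]           || ERR="$ERR bad-ext"     # 2. extension
head -1 "$FILE" | grep -q "$DELIM"            || ERR="$ERR bad-delim"   # 3. delimiter
[ "$(head -1 "$FILE")" = "$EXPECTED_HEADER" ] || ERR="$ERR bad-header"  # 4. header

if [ -n "$ERR" ]; then
  # mailx is one common mailer; sendmail or mutt are alternatives
  echo "Validation failed:$ERR for $FILE" | mailx -s "File validation failed" support@example.com
  exit 1
fi
echo "PASS: $FILE"
```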

7. Few questions on grep and cut commands.
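The grep/cut questions were of the usual one-liner kind; a few typical forms are below (the sample file and patterns are illustrative only):

```shell
#!/bin/sh
# Common grep/cut one-liners on a small sample CSV.
printf 'emp_id,name,dept\n101,Asha,ETL\n102,Ravi,DBA\n' > /tmp/emp.csv

grep 'ETL' /tmp/emp.csv            # lines containing a pattern
grep -i 'etl' /tmp/emp.csv         # case-insensitive match
grep -c 'ETL' /tmp/emp.csv         # count of matching lines
grep -v 'ETL' /tmp/emp.csv         # lines NOT matching the pattern

cut -d',' -f2 /tmp/emp.csv         # 2nd field of a comma-delimited file
cut -c1-3 /tmp/emp.csv             # characters 1-3 of every line
```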


L2 was more of a managerial round.
Basic questions, not much technical.

Hope these questions help you to be prepared. Do let me know if you need answers to these questions.

😊😊


Viewics INC (A ROCHE COMPANY) Interview


Interview date - 28/04/2018
Post - ETL Developer
Location - beside Rahul theatre, Shivajinagar




Appeared for a technical test -
1 hour test.
The paper comprised questions on Informatica scenarios, SQL analytical queries, Unix shell scripting and DWH concepts.
Sharing below the questions asked. Will soon post the answers to them.

Informatica Scenario Questions-

1. Design a mapping to create 2 target tables - one with unique records and the other with duplicates.

Input   Target1 (unique)   Target2 (duplicates)
A       A                  C
B       B                  C
C       D                  E
C                          E
D
E
E
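The usual Informatica answer is an Aggregator counting rows per key plus a Router (count = 1 vs count > 1); on a flat file the same split is a sort/uniq one-liner (GNU uniq assumed for the -D option):

```shell
#!/bin/sh
# Unique vs duplicate split with sort/uniq.
printf 'A\nB\nC\nC\nD\nE\nE\n' > /tmp/rows.txt

sort /tmp/rows.txt | uniq -u > /tmp/uniq_rows.txt   # rows occurring once: A B D
sort /tmp/rows.txt | uniq -D > /tmp/dup_rows.txt    # all duplicated rows: C C E E

cat /tmp/uniq_rows.txt
cat /tmp/dup_rows.txt
```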

2. Design a mapping to get the salary of the next employee in the current record. If no next employee is present, set Next_Salary as NULL.

Input              Output
Id  Salary         Id  Salary  Next_Salary
10  10000          10  10000   20000
20  20000          20  20000   30000
30  30000          30  30000   NULL
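In the mapping this is usually done by sorting and carrying the previous row's values in Expression variable ports; the "next row's salary" logic itself can be sketched in awk (file layout assumed space-delimited):

```shell
#!/bin/sh
# "Next row's salary": emit each row together with the following row's salary.
printf '10 10000\n20 20000\n30 30000\n' > /tmp/sal.txt

awk 'NR > 1 { print prev_id, prev_sal, $2 }   # previous row + current salary
     { prev_id = $1; prev_sal = $2 }          # remember current row
     END { print prev_id, prev_sal, "NULL" }  # last row has no next salary
    ' /tmp/sal.txt > /tmp/next_sal.txt
cat /tmp/next_sal.txt
```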

3. Design a mapping to generate a Fibonacci series. Source is an id column from 1-2000.

Source
1
2
3
.
.
.
2000

Target
1
2
3
5
8
13
.
.
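In the mapping the running pair would typically be held in Expression variable ports; the series itself is easy to sketch in shell (the 2000 cap follows the question's source ids, and starting the series at 1, 2 follows the sample target):

```shell
#!/bin/sh
# Fibonacci series up to 2000, starting 1, 2 as in the sample target.
a=1; b=2
{
  printf '%s\n%s\n' "$a" "$b"
  while :; do
    c=$((a + b))                  # next term is the sum of the previous two
    [ "$c" -gt 2000 ] && break    # stop once the cap is exceeded
    printf '%s\n' "$c"
    a=$b; b=$c
  done
} > /tmp/fib.txt
cat /tmp/fib.txt
```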

4. Design a mapping for the below -
There are only 3 product types, but we don't know how many product ids are available for each product type. Create the mapping so that the output has only 3 records from Audio, 3 from Video and 3 from Movie. In case any of the product types has fewer than 3 records, records from either of the other product types can be sent to the output instead. The output must have only 9 records.


Product_id, product_type
10, Video
10, Audio
20, Audio
30, Audio
40, Audio
50, Audio
10, Movie
20, Movie
30, Movie
40, Movie
50, Movie
60, Movie
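In the mapping this is usually a sorted input with a per-type counter in an Expression feeding a Router; the cap-and-backfill logic can be sketched with awk (the two-pass approach and file name are my own illustration):

```shell
#!/bin/sh
# Cap each product_type at 3 rows, then backfill from the leftovers to 9.
cat > /tmp/products.csv <<'EOF'
10,Video
10,Audio
20,Audio
30,Audio
40,Audio
50,Audio
10,Movie
20,Movie
30,Movie
40,Movie
50,Movie
60,Movie
EOF

awk -F',' 'seen[$2]++ < 3'  /tmp/products.csv > /tmp/picked.txt   # up to 3 per type
awk -F',' 'seen[$2]++ >= 3' /tmp/products.csv > /tmp/rest.txt     # leftovers
head -n $((9 - $(wc -l < /tmp/picked.txt))) /tmp/rest.txt >> /tmp/picked.txt
cat /tmp/picked.txt   # exactly 9 rows
```

Here Video has only 1 row, so the first pass yields 7 rows and the backfill takes 2 more from the leftover Audio/Movie rows.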


5. Design a mapping for the below -

Input
id     val
1      a,b,c
2      pq,m,n
3      asz,ro,liqt
Output
id     val
1      a
1      b
1      c
2      pq
2      m
2      n
3      asz
3      ro
3      liqt
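The mapping answer is usually a Normalizer, or an Expression that splits the delimited string; the row explosion itself is a short awk sketch (space-delimited sample file assumed):

```shell
#!/bin/sh
# Explode a comma-separated val column into one row per value.
printf '1 a,b,c\n2 pq,m,n\n3 asz,ro,liqt\n' > /tmp/vals.txt

awk '{ n = split($2, parts, ",")        # split val on commas
       for (i = 1; i <= n; i++)
           print $1, parts[i] }' /tmp/vals.txt > /tmp/vals_out.txt
cat /tmp/vals_out.txt
```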


Shell scripting Questions-
1. How to list all the file names present in a folder into a file?
2. How to search which file names have a particular string or pattern?
3. What is nohup in Unix?
4. A file contains name, age and salary. How to find the sorted distinct names and their count from that file?
5. How to get the 10th-20th characters from a file?
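One-liner sketches for the five questions are below; the file and column layout (space-delimited: name age salary) are assumptions:

```shell
#!/bin/sh
# Answer sketches for the five shell questions.
printf 'ravi 30 50000\nasha 28 60000\nravi 31 55000\n' > /tmp/emp_data.txt

ls /tmp > /tmp/all_files.txt                       # 1. file names into a file
grep -l 'ravi' /tmp/emp_data.txt                   # 2. files containing a pattern
# 3. nohup keeps a command running after logout:   nohup ./long_job.sh &
cut -d' ' -f1 /tmp/emp_data.txt | sort -u          # 4. sorted distinct names...
cut -d' ' -f1 /tmp/emp_data.txt | sort -u | wc -l  #    ...and their count
cut -c10-20 /tmp/emp_data.txt                      # 5. characters 10-20 per line
```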


DWH concepts questions -
1. What are junk dimensions?
2. What are degenerate dimensions?
3. Difference between star and snowflake schema.


Hope it helps everyone.😊😊


Saturday 21 April 2018

Target Update Override - Ways to Update the target without any primary key defined

When we have to perform an update on the target table, there are 2 ways to perform it:

  1. Using an UPDATE STRATEGY transformation (DD_UPDATE)
  1. Configuring the session and setting the property to "Treat all rows as UPDATE"

But both of these approaches work only when there is a primary key defined in the target table.
The session will fail when no key is defined in the target.

Alternatively, if you want to perform TARGET updates with a condition on COLUMNs not included in the PRIMARY KEY of the target table, it is not feasible without changing the keys on the target table - i.e. if the PRIMARY KEY is defined on COLUMN_A in TABLE_A and you want to update TABLE_A with conditions on or including COLUMN_B.


In such cases the UPDATE_OVERRIDE property of TARGET Object can be useful.

  • Where to locate the UPDATE_OVERRIDE property?

  1. Go to the Mapping
  1. Double-click on the concerned TARGET, or edit the Target
  1. Click on the Properties tab
  1. The second Transformation Attribute is the property we are looking for

  • Syntax for the UPDATE_OVERRIDE SQL:

UPDATE   <TARGET TABLE>
SET      <COLUMN NAME to be updated> = :TU.<TARGET COLUMN PORT NAME (as in Designer)>
         , [other columns to be updated]
WHERE    <COLUMN NAME to be treated as KEY> = :TU.<corresponding TARGET COLUMN PORT NAME (as in Designer)>
AND      [other conditions]

  • Example: consider you want to update the post and update date in the EMPLOYEE table based on the name of the employee, which is not the primary key in the target.

UPDATE    EMPLOYEE
SET            POST = :TU.POST
                , UPDATE_DATE = :TU.UPDATE_DATE
WHERE EMPLOYEE_NAME = :TU.EMPL_NAME


Following is an example showing the problem statement and step-by-step instructions for the two suggested solution approaches.

Problem Scenario:

  1. The target table has no primary key defined


  
  1. The Informatica target object has no keys - since the database table has no keys, the imported object in Informatica will not have any key in the target definition



  1. Original Mapping: the mapping created using this very Target definition



  1. Defining update logic: configuring the session for UPDATE logic
  1. Configure the session to Treat All Source Rows as "UPDATE"



  1. Configure the target property in the session to Update as Update - optional




  1. The error you will get during execution - the session will fail




Solution:


Approach 1: Using Update in the Target

  1. Go to properties tab of Target in Mapping:



  1. Edit the "Update Override" attribute - generate the SQL and add a WHERE clause with the column name on which you want to update:




  1. Please note that you need to remove the column used in the WHERE clause from the SET clause list
  1. Go to the Workflow, refresh the changes in the mapping and save them
  1. Execute the Workflow
  1. Now it has successfully updated the rows:



Approach 2: Forcing Keys in the Target object definition in designer

  1. Edit the Target in the target designer




  1. Define your primary key column. Please note that in this case the database need not have keys defined on the same table:



  1. Save the changes 
  1. Execute the workflow 
  1. The workflow will update the records in the target table:

Thursday 12 April 2018

Concurrent Workflow Execution

A concurrent workflow is a workflow that can run as multiple instances concurrently. A workflow instance is a representation of a workflow. We can configure two types of concurrent workflows. 
1. Allow concurrent workflows with the same instance name. Configure one workflow instance to run multiple times concurrently. Each instance has the same source, target, and variables parameters. The Integration Service identifies each instance by the run ID.
2. Configure unique workflow instances to run concurrently. Define each workflow instance name and configure a workflow parameter file for the instance. You can define different sources, targets, and variables in the parameter file.

Concurrent Workflows Configuration

For the demonstration, let's consider a scenario where we need to load daily transaction data from the North America and Europe regions. These two files are expected to be available around the same time.

Here we will create one workflow to load the sales transaction data, and the same workflow will be used to load both files, executing concurrently.
Once the workflow is created, enable concurrent execution as shown in the below image.
[Image: Informatica concurrent workflow configuration]
Now click on "Configure Concurrent Execution" and give the properties as in the below image. Provide two different parameter files, which contain the source file information of the corresponding region.
[Image: Informatica PowerCenter concurrent workflow parameter files]
With that, the concurrent workflow configuration is done. Now, to trigger the workflow, start it using the "Start Workflow Advanced" option as shown below.
[Image: Informatica PowerCenter concurrent workflow run]
Choose the workflow instance name from the pop-up window and click OK to run the selected workflow instance.
[Image: Informatica PowerCenter concurrent workflow triggering]
From the Workflow Monitor you can see the running instances of the workflow. As shown in the below image, the workflow run instances are running concurrently.
[Image: Informatica PowerCenter concurrent workflow running]

Note - In a production environment we execute workflows from batch or shell scripts. You can also trigger multiple instances of the workflow from the pmcmd command if the workflow is enabled for concurrent execution. In this case you can leave the property (a) in step 3 selected.
Below is the command with which you can start and name a particular instance of the same workflow:
pmcmd %workflow_name% %informatica_folder_name% -paramfile %paramfilepathandname% -rin %instance_name%
Using this command you can run as many instances of a single workflow as you need, each with a unique instance name appended to the original workflow name.
This will help you create minimal code for maximum functionality.
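For reference, a fully spelled-out pmcmd startworkflow call might look like the sketch below; the service, domain, folder, credentials and workflow names are all placeholders, not real environment values, and pmcmd itself is only available on an Informatica server (so here the command string is only built and echoed):

```shell
#!/bin/sh
# Building a pmcmd startworkflow command with a named run instance.
INSTANCE="wkf_daily_sales_NA"               # unique run-instance name (placeholder)
PARAMFILE="/infa/params/param_NA.txt"       # region-specific parameter file (placeholder)

CMD="pmcmd startworkflow -sv INT_SVC -d DOMAIN -u admin -p admin -f SALES_FOLDER -paramfile $PARAMFILE -rin $INSTANCE wkf_daily_sales"
echo "$CMD"   # repeat with a different -paramfile/-rin per region
```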

Thursday 5 April 2018

Email Task In Informatica - Configuration With Example


An Email task is used to send emails while running the workflow. Using an Email task we can get failure and success notifications.


Use Email tasks in any of the following locations:


Session properties. You can configure the session to send email when the session completes or fails.


Workflow properties. You can configure the workflow to send email when the workflow is interrupted.


Workflows or worklets. You can include an Email task anywhere in the workflow or worklet to send email based on a condition you define.


To explain this, let's take the example of a workflow with two sessions: Session1 and Session2.
a) Go to Create Task, create an Email Task and name it Test_email.

b) Also enter the following details:
  • Email User Name : 
  • Email subject   :
  • Email text:

For Email User Name you can either hard-code an email id or give a workflow variable. To give a workflow variable, go to the workflow variables and create a new one to hold the email address, e.g. $$Email_Failure.

 

c) You can give a value to the workflow variable using a parameter file.
Emails in the Parameter file
[Sandesh.Workflow:wkfTest]
$$Email_Failure=xyz@mail.com


d) Also make the Email task reusable so that we can use it in multiple sessions. To make it reusable, create it using the Task Developer.

e) To attach files through email, use the below format tags in the Email Text:

  • %a<filename> Attach a file; the absolute path needs to be given inside the angle brackets
  • %s Session name
  • %e Session status
  • %b Session start time
  • %c Session completion time
  • %i Session elapsed time
  • %l Total records loaded
  • %r Total records rejected
  • %t Target table details
  • %m Name of the mapping used in the session
  • %n Name of the folder containing the session
  • %d Name of the repository containing the session
  • %g Attach the session log to the message


1) To add an Email Task at session level
Go to each session, and at the Components level give the On Failure Email as Test_email; also give Type as Reusable and save. This will trigger the email on failure.

2) To add an Email Task at workflow level
Go to the workflow and in the Suspension Email enter the Email task name; also enable Suspend on Error and save.