Informatica ETL Training
Informatica ETL Training Introduction:
Informatica ETL Training covers Extraction, Transformation and Load. Extraction means reading the data, transformation means modifying the data, and load means inserting or updating the data in the target. There are different ETL tools available in the market, such as Informatica PowerCenter, DataStage, Ab Initio, Pentaho and SSIS. All these tools integrate data from source tables into the warehouse.
Informatica ETL training by Global Online Trainings is delivered on a virtual interactive platform with flexible hours, so working professionals can take this course alongside their regular service or business. ETL Testing online training is rendered by the best subject matter experts, and the tutorials prepared by these industry-allied tutors are kept up to date with the latest industry developments.
Prerequisites for Informatica ETL training:
To learn Informatica ETL, you should have basic knowledge of:
- Oracle 10g, SQL and PL/SQL,
- Unix shell scripting, Business Intelligence and Data Warehousing,
- ERwin, SQL Server 2000/2008, DB2, DB2 SQL and Oracle SQL databases.
Informatica ETL Training Course Content
Overview of Informatica ETL Training:
To understand Informatica ETL training, first know what a database is: a collection of data. In technical terms, almost all databases today are relational database management systems (RDBMS). Common RDBMS products are Oracle, DB2 from IBM, SQL Server, and SYBASE from SAP. All these databases store data in the form of tables.
In order to store or retrieve data in a database, we need a language: SQL. SQL (Structured Query Language) is the language that helps us read data from, and update data in, a database.
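The read/update pattern described above can be sketched with Python's built-in sqlite3 module. The table and column names here (customers, name, city) are illustrative assumptions, not part of any particular source system.

```python
import sqlite3

# Minimal sketch: reading and updating data with SQL.
# All names here are hypothetical sample data.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Alice", "Hyderabad"), (2, "Bob", "Chennai")],
)

# Read data (SELECT).
rows = cur.execute("SELECT name, city FROM customers ORDER BY id").fetchall()
print(rows)  # [('Alice', 'Hyderabad'), ('Bob', 'Chennai')]

# Update data (UPDATE).
cur.execute("UPDATE customers SET city = ? WHERE name = ?", ("Bangalore", "Bob"))
conn.commit()
```

The same SELECT/UPDATE statements work against any RDBMS; only the connection library changes.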
A data warehouse is also a collection of data, but one that collects data from different sources. It is used for business analysis, and here too the data is stored in the form of tables.
Online analytical processing (OLAP) systems are basically used for analysis, while online transaction processing (OLTP) systems are used for day-to-day customer operations. Applications of OLTP include e-commerce (Amazon, eBay), internet banking and retail.
Informatica helps create clean, integrated data and offers products and solutions for solving different kinds of data problems. Informatica is a very powerful ETL tool in the market. Why Informatica compared to other tools? It is simple, powerful, and capable of doing all ETL work with efficient results.
ETL stands for Extract, Transform and Load: data is read from heterogeneous source systems, loaded into staging tables, and then, after the transformations are complete, written into the data warehouse. This complete process of extracting the data, transforming the data and loading the data into the warehouse is called Extract, Transform and Load.
It is not a one-off process in data warehousing, because the operational and source systems keep on changing; it is a regular process in which data is repeatedly fetched from the source systems and loaded into the warehouse after the transformations are complete.
This is an important aspect of warehousing: if the source data from the operational systems is not extracted, cleansed and integrated into a proper format in the warehouse, it will be difficult to perform query processing efficiently, which is the ultimate purpose of data warehousing.
ETL has three phases:
- Data Extraction
- Data Transformation
- Data Load of the extracted and transformed data into the warehouse.
In the data extraction phase covered in this Informatica ETL training, data from various sources is extracted into the staging area as the first step in data warehousing. There are a couple of data extraction strategies:
- The first is full extraction, where all the data from the operational and source systems gets extracted into the staging area.
- This generally happens in two scenarios: the initial load, when the data warehouse is being populated for the first time, or any scenario where there is no strategy for identifying changed records.
- In that case we extract the full data and do all the transformations, identifying all the changed or modified records in the staging area.
- The next strategy is partial extraction with update notification: the source systems notify us which data has been updated, which has been deleted and which data is new (this is also called delta data), and in this strategy we extract only that data.
- This is easy and quick compared to full extraction. The last strategy is partial extraction without update notification: here we do not extract the full data set from the source system, but extract the data based on certain keys or certain strategies.
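The full versus partial extraction strategies above can be sketched as follows. This is a simplified illustration: the in-memory rows and the `updated_at` audit column are assumptions standing in for a real source system.

```python
from datetime import datetime

# Hypothetical source-system rows; "updated_at" is an assumed audit column.
source_rows = [
    {"id": 1, "name": "Alice", "updated_at": datetime(2024, 1, 5)},
    {"id": 2, "name": "Bob",   "updated_at": datetime(2024, 3, 1)},
]

def full_extraction(rows):
    # Full extraction: pull every row into the staging area.
    return list(rows)

def partial_extraction(rows, last_run):
    # Partial extraction without update notification:
    # pull only rows changed since the previous ETL run,
    # using the audit timestamp as the extraction key.
    return [r for r in rows if r["updated_at"] > last_run]

staged_full = full_extraction(source_rows)
staged_delta = partial_extraction(source_rows, last_run=datetime(2024, 2, 1))
print(len(staged_full), len(staged_delta))  # 2 1
```

With update notification, the filtering step would be unnecessary: the source system itself would hand over only the delta rows.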
The data extracted into the staging server from the source systems is in a raw format and we cannot use it as it is; it has to be cleansed, mapped to the requirements and transformed before it is finally ready for loading into the data warehouse. At this stage all the transformations, all the cleansing and all the mappings happen.
- The first basic transformation task is selection: in this step we select the data that is actually required to be loaded into the data warehouse, or that is meant to be transformed.
- The next step is matching: here we look up the data from various lookup files and match the data that needs to be transformed. Then comes data cleansing or enrichment, since the data in our source systems is not clean.
- It is not standardized either, because we are fetching the data from more than one source system, so it has to be standardized or normalized; hence we do data cleansing and enrichment.
- The last task is consolidation or summarization: as mentioned earlier, we consolidate and aggregate the data from the source systems, because we do not want to load the same data from the source systems into the warehouse twice.
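The selection, cleansing and summarization tasks above can be sketched in a few lines. The staged rows, the inconsistent country values and the aggregation rule are all illustrative assumptions.

```python
# Hypothetical staged rows with inconsistent formatting.
staged = [
    {"id": 1, "country": " india ", "amount": 100},
    {"id": 2, "country": "INDIA",   "amount": 250},
    {"id": 3, "country": "usa",     "amount": None},  # missing mandatory value
]

# Selection: keep only the rows that are fit to be transformed and loaded.
selected = [r for r in staged if r["amount"] is not None]

# Cleansing / standardization: normalize the country values
# so data from different source systems looks the same.
for r in selected:
    r["country"] = r["country"].strip().upper()

# Consolidation / summarization: aggregate amounts per country
# before loading into the warehouse.
summary = {}
for r in selected:
    summary[r["country"]] = summary.get(r["country"], 0) + r["amount"]

print(summary)  # {'INDIA': 350}
```

In a real mapping, each of these steps would be a transformation object (filter, expression, aggregator) rather than hand-written Python, but the logic is the same.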
Loading the data into the warehouse from the staging server means taking the data prepared in the transformation phase and loading it into the data warehouse.
Types of Loading:
Initial Load: When we load the data for the very first time, we do not care about identifying new or modified records; we generally take the whole data set from the staging server and load it into the data warehouse. This is a one-off process, generally done when the data warehouse is being populated for the very first time.
Incremental Load: Applying all the ongoing changes from the source systems to the data warehouse periodically. Here we load only the records that have either changed or been newly inserted in the source systems.
Full Refresh: This is basically erasing the contents of one or more tables completely and loading fresh data for those tables, i.e. loading all the data extracted from the source systems. This completes the loading strategies.
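The three loading types can be sketched as follows, with the warehouse table modeled as a simple dict keyed by business id (an assumption made purely for illustration).

```python
# Assumed warehouse table, keyed by business id.
warehouse = {}

def initial_load(staged):
    # Initial load: take the whole staged data set, no change detection.
    warehouse.clear()
    warehouse.update({r["id"]: r for r in staged})

def incremental_load(changed):
    # Incremental load: apply only new or modified records (an upsert).
    for r in changed:
        warehouse[r["id"]] = r

def full_refresh(staged):
    # Full refresh: erase the table completely, then reload everything.
    warehouse.clear()
    warehouse.update({r["id"]: r for r in staged})

initial_load([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
incremental_load([{"id": 2, "name": "Bobby"}, {"id": 3, "name": "Carol"}])
print(sorted(warehouse))  # [1, 2, 3]
```

Note how the incremental load updates record 2 in place and inserts record 3, while a full refresh would simply replay the whole extract.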
Role of ETL testing in Informatica ETL training:
ETL testing in Informatica ETL training means validating the data between the OLTP system and the warehouse. Once the data is loaded into the warehouse, reports are generated out of the warehouse.
The generated reports can be bar or pie charts, and generating reports from the warehouse is known as BI (Business Intelligence). Just as there are different ETL tools, there are different BI tools: BO, OBIEE and COGNOS.
Basically, ETL testing is divided into:
- Smoke testing
- Functional testing
- Integration testing
- Performance testing
- Regression testing
Smoke testing: Whenever a migration happens, we check whether all the programs were deployed to the environment. We validate that the tables are present in the data warehouse database and also validate the ETL programs.
Functional testing: This plays the major role in ETL testing. Here we validate the record counts between two tables, perform duplicate validation on the target table, and do null value validation and column-level validation, using a minus query or file comparison depending on the case. Functional testing also covers referential integrity validation.
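The functional checks above (count, duplicate, null and minus validation) can be sketched as SQL queries run through sqlite3. The `src`/`tgt` tables and their columns are assumed sample data, not a real source or warehouse schema.

```python
import sqlite3

# Assumed source and target tables for illustrating the checks.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE src (id INTEGER, email TEXT)")
cur.execute("CREATE TABLE tgt (id INTEGER, email TEXT)")
cur.executemany("INSERT INTO src VALUES (?, ?)", [(1, "a@x.com"), (2, "b@x.com")])
cur.executemany("INSERT INTO tgt VALUES (?, ?)", [(1, "a@x.com"), (2, "b@x.com")])

# Count validation: source and target row counts must match.
src_count = cur.execute("SELECT COUNT(*) FROM src").fetchone()[0]
tgt_count = cur.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]

# Duplicate validation: no business key should repeat in the target.
dups = cur.execute(
    "SELECT id FROM tgt GROUP BY id HAVING COUNT(*) > 1"
).fetchall()

# Null value validation: mandatory columns must not be null.
nulls = cur.execute(
    "SELECT COUNT(*) FROM tgt WHERE email IS NULL"
).fetchone()[0]

# Column-level (minus) validation: rows in source missing from target.
minus = cur.execute(
    "SELECT id, email FROM src EXCEPT SELECT id, email FROM tgt"
).fetchall()

print(src_count == tgt_count, dups, nulls, minus)
```

On Oracle the last check would use `MINUS` instead of `EXCEPT`; the idea of comparing the two result sets column by column is the same.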
Integration testing or system testing: In system testing we validate the job schedules and the email notifications.
Performance testing: Here we basically check whether all the ETL programs are performing according to expectations.
Regression testing: Regression testing covers enhanced or changed tables, so it also plays a major role in ETL testing.
ETL Testing Work Flow Process:
The ETL testing process in Informatica ETL training is similar to other testing processes. It is done in two ways: manual testing or automated data warehouse testing. The main difference from other kinds of testing is in identifying the source data and the data model.
Here we explain the step-by-step ETL testing process. The first step is estimation: the number of tables, the complexity of the business rules, the volume of data to be moved from source to staging, and the performance of the tables.
Next is planning, which covers what is in scope and out of scope, the dependencies, the risks and the mitigation plans.
Test design depends on the mapping document, which is shared by the business team or the development team. It also involves SQL scripts and test scenarios.
Execution is the most important step in ETL testing; it involves monitoring jobs, defect retesting and regression testing.
Finally comes closure: the closure activity, as usual, is giving the sign-off and promoting the code to the next phase.
Features and Capabilities of ETL tools:
- Data connectivity with the source and target systems: an ETL tool should allow us to connect to both the source and the target data systems efficiently and seamlessly.
- Scalability and performance: we should not have to redesign a job to make it perform better whenever the data volume increases.
- An ETL tool should have pre-built transformation components and connectors which allow transformations to be done easily and quickly.
- An ETL tool should have data profiling and data cleansing components, which allow standardizing and cleaning the data.
- It should also be able to perform logging and exception handling easily.
- It should have robust administration features, which allow us to schedule jobs.
- It should connect easily with web services.
- It should be capable of performing operations in batch as well as in real time.
There are a number of ETL tools, like Informatica DVO and Informatica PowerCenter. DVO means Data Validation Option; from Informatica's point of view, DVO is an automated ETL testing tool. Informatica DVO supports SQL views and lookups, connects to different databases and compares the actual data from each table. Data Validation Option also supports some command-line utilities. In short, DVO is an automated testing tool.
Informatica PowerCenter is also an ETL tool. The simplest mapping in Informatica PowerCenter is known as a sample mapping: we have a source, which is in the OLTP system, and a target, which is present in the data warehouse.
We extract the data from the source and load it into the target; in between, the source-to-target transformation applies different operations to the data being transferred.