ETL stands for Extract-Transform-Load and it is a process of how data is loaded from the source system to the target System.
For example, there is a retail store which has different departments like sales, marketing, logistics etc. Each of them is handling the customer information independently, and the way they store that data is quite different. The sales department have stored it by customer’s name, while marketing department by customer id.
Now if they want to check the history of the customer and want to know what the different products he/she bought owing to different marketing campaigns; it would be very tedious.
The solution is to use a Data warehouse to store information from different sources in a uniform structure using ETL.
The following diagram gives you the ROAD MAP of the ETL process.
ETL Testing : It is a process to validate the data in the source and the target system based on the business requirement.
ETL Testing Process:
Similar to other Testing Process, ETL also go through different phases. The different phases of ETL testing process is as follows :
1. Identifying data sources and requirements.
2. Data acquisition.
3. Implement business logic and dimensional Modelling.
4. Build and populate data.
5. Build Reports.
To understand more on the ETL scenario, please find the below diagram:
The reason because the ETL testing is slightly different from the normal manual testing. Here the Test Engineer should have the following skills to test the business expectations.
1) Ability to write the SQL (Oracle/Microsoft/Mysql) queries.
2) Understanding the SQL Joins.
3) Understanding of the Data warehousing Concepts.
4) Data Analyzing capability.
5) Knowledge on the ETL Tools eg. Informatica, DataStage, SSIS etc.
6) Knowledge of UNIX.
7) understanding of job Scheduling tools ex: Control M.
8) Understanding the End-End data flow through the source and the Target Systems.
9) Ability to understand and create the test Data.
10) The Testing Fundamentals.
ETL Testing assures that information is not just loaded correctly but it is exactly aggregated and properly used for the function. So it is important.
Actually there is a wrong concept prevails among the people like if they have tested the database they’ve tested the data warehouse. But it is not correct.
OLTP (on-line transaction processing) testing is database testing. But OLAP (on-line analytical processing) system testing applies to the data warehouse, and more specifically, ETL testing. The difference is this: Database testing compares data from source to target tables. Data warehouse (or ETL) testing traces the accuracy of information and date present throughout the data warehouse.