DWH Informatica Session (1).pdf | Data Warehouse | Oracle Database
Short Description
By. Data Warehousing & Informatica. Deepthi.G. AGENDA. INTRODUCTION TO DATAWAREHOUSING DATAWAREHOUSING ARCHITECTURE ...
Description
By
Data Warehousing & Informatica
Deepthi.G
AGENDA
INTRODUCTION TO DATAWAREHOUSING
DATAWAREHOUSING DATAWAREHOUSING ARCHITECTURE
STEPS FOR BULIDING DATAWAREHOUSE
TYPES OF SCHEMAS
CONCEPTS OF DATAWAREHOUSING
LIST OF TOOLS
AGENDA
INTRODUCTION TO DATAWAREHOUSING
DATAWAREHOUSING DATAWAREHOUSING ARCHITECTURE
STEPS FOR BULIDING DATAWAREHOUSE
TYPES OF SCHEMAS
CONCEPTS OF DATAWAREHOUSING
LIST OF TOOLS
INTRODUCTION TO INFORMATICA
INFORMATICA ARCHITECTURE
COMPONENTS OF INFORMATICA
WORKING WITH INFORMATICA
INSTALLATION AND CONFIGURATION
AS PER W.H.INMON DATWAREHOUSING IS
Subject-Oriented
Integrated
Time-Variant
Non-volatile
The other name of data warehousing is Decision support system (DSS)
Subject Oriented Analysis Process Oriented
Subject Oriented
Entry Sales Rep Quantity Sold Prod Number Date Customer Name Product Description Unit Price Mail Address
Transactional Storage
Sales Sales Customers Customers Products Products
Data Warehouse Storage
Integration of Data
App l. A - M, F App l. B - 1, 0 App l. C - X, Y
M, F
Unit of Attributes
App l. A - pi pel ine c m. App l. B - pi pel ine inc hes App l. C - pi peli ne m cf
pipeline cm
Physical Attributes
App l. A - b alan ce d ec(13,2) App l. B - bal ance PIC 9(9)V99 App l. C - bal ance fl oat
balance dec(13, 2)
Naming Conventions
App l. A - bal -on -han d App l. B - c ur ren t_b alan ce App l. C - b alance
balance
Data Consistency
App l. A - dat e (Julian) App l. B - d ate (yymmd d) App l. C - dat e (absol ut e)
date (Julian)
Encoding
Transactional Storage
Data Warehouse Storage
Volatility of Data Non-Volatile
Volatile
Insert
Change
Access
Delete Insert
Load
Change Access Record-by-Record Data Manipulation
Transactional Storage
Mass Load / Access of Data
Data Warehouse Storage
Time Variant Data Analysis
Current Data
Historical Data Sales ( Region , Year - Year 97 - 1st Qtr) 20 15 Sales ( in lakhs 10 )
East West North
5 0 January February
March
Year97
Transactional Storage
Data Warehouse Storage
Data warehouses store large volumes of data which are frequently used by DSS
It is maintained separately from the organization’s operational databases
Data warehouses are relatively static with only infrequent updates
A data warehouse is a stand-alone repository of information, integrated from several, possibly heterogeneous operational databases
Is the enabling technology that facilitates improved business decision-making
It’s a process, not a product
A technique for assembling and managing a wide variety of data from multiple operational systems for decision support and analytical processing
It’s a journey not an destination……
Data Warehouse Architecture Source
Staging Area
Data Warehouse
Data Mart
Analysis
Oracle Metadata
Reporting
Teradata Raw Data
Summary Data
DB2 Data Mining
SQL Server
Source: It’s Database where data is extracted Ex : Oracle Teradata Sybase DB2
Staging area:
It’s a temporary storage area used for the process of data
Meta Data:
Data about the data.
Or
Description of the data.
Data Mart :
A Data mart is nothing but a Data warehouse but for specific domain
A Data mart can be divided into two types: Independent Data mart Dependent Data mart
Steps For Building A Data warehouse
Identify key business drivers, sponsorship, risks . Survey information needs and identify desired functionality and define functional requirements for initial subject area. Architect long-term, data warehousing architecture Evaluate and Finalize DW tool & technology Conduct Proof-of-Concept Design target data base schema Build data mapping, extract, transformation, cleansing and aggregation/summarization rules Build initial data mart, using exact subset of enterprise data warehousing architecture and expand to enterprise architecture over subsequent phases Maintain and administer data warehouse
Snow Flake Schema
Same use star flake schema but the cube will have at least one dimension with two/more levels under at least Two hierarchy.
List Of Tools ETL TOOLS
Informatica,Ascential Data , IBM Visual Warehouse Oracle warehouse Builder
OLAP SERVER
Oracle Express Server, Hyperion Essbase, IBM DB OLAP Server, Microsoft SQ Server OLAP Services, Se HOLOS, SAS/MDDB .
OLAP TOOLS
Oracle Express Suite, Bu Objects, Web Intelligence, Cognos Powerplay/Improm
KALIDO, MicroStrategy, B Query, MetaCube . O
f
INTRODUCTION TO INFORMATICA
It is an ETL TOOL.
Extracting of data from sources
Performing the Transformations
Loading the data in to target
INFORMATICA ARCHITECTURE Informatica Repository manager
Source
Repository Server
Informatica server
validation session
Designer
Repository Admin console
Workflow Manager
Target
Status
Workflow Monitor
Components of Informatica
REPOSITORY MANAGER
DESIGNER
SERVER MANAGER
REPOSITORY MANAGER
REPOSITORY SECURITY
FOLDER MANAGEMENT
METADATA REPORTING
REPOSITORY MAINTENANCE
ANALYSIS WINDOW NAVIGATOR WINDOW
DEPENDENCY WINDOW
OUTPUT WINDOW
REPOSITORY SECURITY ♦
CREATE USERS
♦
CREATE GROUPS
♦
ASSIGN PRIVILEGES
♦
MOVE USERS INTO GROUPS
♦
ASSIGN ADDITIONAL PRIVILEGES
TO USERS
REPOSITORY SECURITY ♦
LOCK TYPES ( READ, WRITE, EXEC, FETCH, SAVE )
♦
OBJECT LOCKS ( FOLDERS, SOURCE DEF., TARGET DEF. )
♦
VIEW LOCKS ( EDIT| SHOW LOCKS )
♦
UNLOCKING OBJECTS
FOLDER MANAGEMENT ♦
FOLDER ATTRIBUTES * OWNER * PERMISSIONS * SHARED * SHORTCUT * VERSIONS
DESIGNER
SOURCE ANALYZER TO CREATE SOURCE DEFINITIONS
WAREHOUSE DESIGNER TO CREATE TARGET DEFINITIONS
TRANSFORMATION DEVELOPER TO CREATE REUSABLE TRANSFORMATIONS
MAPPLET DESIGNER TO CREATE REUSABLE MAPPINGS
MAPPING DESIGNER TO CREATE SOURCE TO TARGET MAPPINGS
Designer
Mapping = Source +Transformation+Target
Transformation : 2 Types
Active Transformation Passive Transformation
ACTIVE TRANSFORMATION
PASSIVE TRANSFORMATION
Sorter
Lookup
Rank
Expression
Router
Stored Procedure
Normalizer
Sequence generator
Source Qualifier
External Procedure
Joiner
XML Source Qualifier
Aggregator Advance external Procedure Update Strategy Custom Transformation Transformation control Union
SERVER MANAGER
CONFIGURE SERVER
CREATE SESSION
START SESSION
MONITOR SESSION
VIEW LOGS
CORRECT SESSION PROBLEMS
Important Bottlenecks:
TEST SQL QUERY CHECK SESSION LOG FOR ERRORS CHECK PERFORMANCE DETAILS REDUCE NUMBER OF RECORDS PROCESSED INDEX THE SOURCE REPLACE DEFAULT QUERY WITH AN OPTIMIZED QUERY DROP INDEXES BEFORE LOADING CONSIDER INCEASING COMMIT LEVEL.
INSTALLATION AND CONFIGURATION
SYSTEM REQUIREMENTS :
OPERATING SYSTEM ( WINDOW 95/98/ NT 4.0 )
DISK SPACE ( 120 MB )
RAM ( 32 MB)
CONNECTIVITY ( MERANT ODBC 3.50 )
NETWORK SUPPORT ( TCP/IP OR IPX/SPX )
View more...
Comments