Sunday, 11 January 2015

DATAWAREHOUSE CONCEPTS- TOP 10 TOPICS OF 2014

This post is just a recap of the most discussed topic in Dwhlaureate for the year 2014.


  1. VIRTUAL COLUMN IN ORACLE 11G 
    Virtual Column is one of the new features introduced in Oracle 11g….. 

  1. OBIEE 11.1.1.7- New Features 
    OBIEE 11.1.1.7 has been released by Oracle and they included more than two hundred enhancements, designed to add new functionality for improved performance and deliver an enhanced end user experience….. 

  1. Hierarchical Retrieval In ORACLE or Hierarchical Queries 
    There is provision to display the reporting pattern in an organization….. 

  1. Invisible Indexes - Oracle 11g New Feature 
    Before 11g, 
    Oracle didn’t have the facility to make an Index Invisible or visible…… 

  1. “ VoltDB ” - NewSQL DBMS For High Velocity Applic... 
    VoltDB is a high performance scalable RDBMS and 
    NewSQL Database primarily for Big Data, high velocity OLTP and Real-time analytics designed by…… 

  1. Introduction To NewSQL 
    NEWSQL is a new class of Databases products that offers high performance and scalability like 
    NoSQL for online transaction processing and also 

  1. What is Massively Parallel Processing (MPP) With the advent of Big Data sets mainstream technologies like Massively Parallel Processing (MPP) systems is experiencing vital growth…. 
  1. Oracle Database 12C New Feature-Fetch& Offset 
    In this post we will discuss about a new feature that was introduced in 
    Oracle 12c to limit the rows returned in an easier and effective manner compared to methods …. 

  2. How to Unlock the Locked Table in ORACLE 
    Oracle puts locks while performing any DDL or DML operation on oracle tables. When table locks is present on any tables in Oracle we cannot run DDL on those tables. 

  1. Informatica Scenario: How to Load Data in Cyclic Order 
    In this 
    scenario we will load the records present in the DEP table to Flat files in a cyclic order such that the first record will go to TargetFile01 ,2nd to TargetFile 02 ,3rd to Target File03 and 4th record again to TargetFile01. 


Friday, 12 December 2014

Informatica PowerCenter Session Partitioning-Type of Informatica Partitions


Informatica provides you the option of enhancing the performance of the Informatica session by the The PowerCenter® Partitioning Option. After tuning all the performance bottlenecks we can further improve the performance by addition partitions.




We can either go for Dynamic partitioning (number of partition passed as parameter) or Non-dynamic partition (number of partition are fixed while coding). Apart from used for optimizing the session, Informatica partition become useful in situations where we need to load huge volume of data or when we are using Informatica source which already has partitions defined, and using those partitions will allow to improve the session performance.
The partition attributes include setting the partition point, the number of partitions, and the partition types.

Partition Point:
There can be one or more pipelines inside a mapping. Adding a partition point will divide this pipeline into many pipeline stages. Informatica will create one partition by default for every pipeline stage. As we increase the partition points it increases the number of threads. Informatica has mainly three types of threads –Reader, Writer and Transformation Thread.

The number of partitions can be set at any partition point. We can define up to 64 partitions at any partition point in a pipeline. When you increase the number of partitions, you increase the number of processing threads, which can improve session performance. However, if you create a large number of partitions or partition points in a session that processes large amounts of data, you can overload the system.



You cannot create partition points for the following transformations:
• Source definition
• Sequence Generator
XMLParser
• XML target
• Unconnected transformations

The partition type controls how the Integration Service distributes data among partitions at partition points. The Integration Service creates a default partition type at each partition point. 

Type of partitions are :
1. Database partitioning,
2. Hash auto-keys
3. Hash user keys
4. Key range
5. Pass-through
6. Round-robin.

Database Partitioning
For Source Database Partitioning, Informatica will check the database system for the partition information if any and fetches data from corresponding node in the database into the session partitions. When you use Target database partitioning, the Integration Service loads data into corresponding database partition nodes.
Use database partitioning for Oracle and IBM DB2 sources and IBM DB2 targets. 

Pass through
Using Pass through partition will not affect the distribution of data across partitions instead it will run in single pipeline.which is by default for all your sessions. The Integration Service processes data without redistributing rows among partitions. Hence all rows in a single partition stay in the partition after crossing a pass-through partition point.



Key range
Used when we want to partition the data based on upper and lower limit. The Integration Service will distribute the rows of data based on a port or set of ports that we define as the partition key. For each port, we define a range of values.Based on the range that we define the rows are send to different partitions



Round robin partition is used to when we want to distributes rows of data evenly to all partitions

Hash auto-keys: The Integration Service uses a hash function to group rows of data among partitions. The Integration Service groups the data based on a partition key.

Hash user keys: The Integration Service uses a hash function to group rows of data among partitions. We define the number of ports to generate the partition key.

Related Posts Plugin for WordPress, Blogger...

ShareThis