Saturday, 1 October 2016

Oracle Indexes Performance and Creation Guidelines

These guidelines will help you create and manage indexes and help improving the performance by correct usage of indexes.

DON’T ADD INDEXES WORTHLESSLY:
Addition of indexes increases performance but also ingest disk space.Based on the performance improvement add as many indexes as required sensibly.

MARK INDEXES AS UNUSABLE OR INVISIBLE RATHER THAN DROPPING
Before dropping an index think over marking the indexes as unusable and invisible. This give us an extra option to check for any performance issues before dropping the index. If there are any performance issues we can revert back by rebuilding or re-enable the index without requiring the data definition language (DDL) creation statement.

You can read more about Invisible Indexes here:

It’s better to drop the indexes that are not used by any database objects as it would free up the physical space and improve the performance.

INDEXING METHODOLOGY:
Indexing the columns that are used in queries executed against a table will help improve the performance.

CREATE PRIMARY /UNIQUE CONSTARINTS:
Build primary constraints on all tables and unique constraints wherever applicable. This will automatically create a B-tree index if the columns are not already indexed.

USING SEPARATE TABLESPACE FOR INDEXES
Using distinct table space helps in managing indexes separately from tables. Table and index data may have different storage and/or backup and recovery requirements.

USE BITMAP INDEXES IN DATAWAREHOUSE ENVIRONMENT
Bitmap indexes are used for complex queries in a data warehouse environment to prevent spending long time to access and retrieve answers for the queries. B-Tree index technique is used for high cardinality column and Bitmap Indexes have predominantly been used for low cardinality columns.

Bitmap indexes achieve important functions in answering data warehouse’s queries because they have capability to perform operations at the index level before fetching data

To learn more about Bitmap & B-tree indexes check our previous post


USE APPROPRIATE NAMING STANDARDS
Correct naming standards would help in the maintenance and troubleshooting easier.

If you like this post, please share it on google by clicking on the Google +1 button.


Saturday, 17 September 2016

How to Create/Change/Set Databases in Hive?

As discussed in previous posts, HIVE makes it easier for developers to port SQL-based applications to Hadoop, compared with other Hadoop languages and tools.


Hive is most suited for data warehouse applications where data is static and fast response time is not required and record-level inserts, updates, and deletes are not required

Creating a Database
The simplest syntax for creating a database in hive is shown in the following example:
Go to the Hive Shell by giving the command sudo hive and enter the command


CREATE DATABASE <DATA BASE NAME>
EXAMPLE
HIVE> CREATE DATABASE HR_STAGING;

HIVE> CREATE DATABASE IF NOT EXISTS HR_STAGING;


We can suppress the warning if the database hr_staging already exists in the hive database by using IF NOT EXISTS. The general syntax for creating the database in Hive is given below. The keyword ‘SCHEMA’ can be used instead of ‘DATABASE’ while creating database.



CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] DATABASE_NAME
  [COMMENT DATABASE_COMMENT]
  [LOCATION HDFS_PATH]
  [WITH DBPROPERTIES (PROPERTY_NAME=PROPERTY_VALUE, ...)]


The CREATE DATABASE command creates the database under HDFS at the default location: /user/hive/warehouse.
Hive creates a directory for each database. Tables in that database will be stored in sub directories of the database directory. The exception is tables in the default database, which doesn’t have its own directory.

Syntax to see the databases that already exists in hive. 

  

HIVE> SHOW DATABASES;
HR_STAGING

HIVE> CREATE DATABASE EMP_STAGING ;
HIVE> SHOW DATABASES;
HR_STAGING
EMP_STAGING

HIVE> SHOW DATABASES LIKE 'H.*';
HR_STAGING


Using a Database
The USE command sets a database as your working database, similar to changing working directories in a file system:

SYNTAX:
USE <DATABASE_NAME>
HIVE> USE HR_STAGING;
HIVE > USE DEFAULT;


Dropping Database in Hive
Syntax to drop a database:

HIVE> DROP DATABASE IF EXISTS HR_STAGING;


Hive won’t allow to drop the database if they contain tables. In such case we have to either drop the table first or append the CASCADE keyword to the command, which will cause the Hive to drop the tables in the database first.


DROP (DATABASE|SCHEMA) [IF EXISTS] DATABASE_NAME
[RESTRICT|CASCADE];

HIVE> DROP DATABASE IF EXISTS HR_STAGING CASCADE;



Alter Database in Hive
You can set key-value pairs in the DBPROPERTIES associated with a database using the ALTER DATABASE command. No other metadata about the database can be changed, including its name and directory location:

ALTER (DATABASE|SCHEMA) DATABASE_NAME
SET DBPROPERTIES (PROPERTY_NAME=PROPERTY_VALUE, ...); 

ALTER (DATABASE|SCHEMA) DATABASE_NAME
SET OWNER [USER|ROLE] USER_OR_ROLE;

HIVE> ALTER DATABASE HR_STAGING
SET DBPROPERTIES ('EDITED-BY' = 'XXXX');


If you like this post, please share it on google by clicking on the Google +1 button.


Related Posts Plugin for WordPress, Blogger...

ShareThis