image
Suhaila

1

Common Operations In OLAP Cube


Can someone explain the data tables and common operations in OLAP cube, what are they used for and when to be used?



- Suhaila
image

Bjergsen

3

Operations

Drill / roll up

The roll-up operation (also called drill-up or aggregation operation) performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by climbing down a concept hierarchy

Drill / Roll Down

The roll down operation (also called drill down) is the reverse of roll up. It navigates from less detailed data to more detailed data. It can be realized by either stepping down a concept hierarchy for a dimension or introducing additional dimensions.

Slicing

Slice performs a selection on one dimension of the given cube, thus resulting in a subcube.

Dicing

The dice operation defines a subcube by performing a selection on two or more dimensions

Other OLAP operations


Some more OLAP operations include:

SCOPING: Restricting the view of database objects to a specified subset is called scoping. Scoping will allow users to recieve and update some data values they wish to recieve and update.

SCREENING: Screening is performed against the data or members of a dimension in order to restrict the set of data retrieved.

DRILL ACROSS: Accesses more than one fact table that is linked by common dimensions. COmbiens cubes that share one or more dimensions.

DRILL THROUGH: Drill down to the bottom level of a data cube down to its back end relational tables.

In Summary:

•Concept hierarchies organize the values of attributes or dimensions into abstraction levels. They are useful in mining at multiple abstraction levels. •Typical OLAP operations include roll-up, and drill-( down, across, through), slice-and-dice, and pivot ( rotate), as well as some statistical operations. •OLAP operations can be implemented efficiently using the data cube structure.


The tables involved in the OLAP Cube are the dimensions table and fact tables .

A dimension is a structure that categorizes facts and measures in order to enable users to answer business questions. Commonly used dimensions are people, products, place and time.

In a data warehouse, dimensions provide structured labeling information to otherwise unordered numeric measures. The dimension is a data set composed of individual, non-overlapping data elements. The primary functions of dimensions are threefold: to provide filtering, grouping and labelling.

These functions are often described as "slice and dice". Slicing refers to filtering data. Dicing refers to grouping data. A common data warehouse example involves sales as the measure, with customer and product as dimensions. In each sale a customer buys a product. The data can be sliced by removing all customers except for a group under study, and then diced by grouping by product.

A dimensional data element is similar to a categorical variable in statistics.

Typically dimensions in a data warehouse are organized internally into one or more hierarchies. "Date" is a common dimension, with several possible hierarchies: "Days (are grouped into) Months (which are grouped into) Years", "Days (are grouped into) Weeks (which are grouped into) Years" "Days (are grouped into) Months (which are grouped into) Quarters (which are grouped into) Years" etc.

In data warehousing, a dimension table is one of the set of companion tables to a fact table.

The fact table contains business facts (or measures), and foreign keys which refer to candidate keys (normally primary keys) in the dimension tables.

Contrary to fact tables, dimension tables contain descriptive attributes (or fields) that are typically textual fields (or discrete numbers that behave like text). These attributes are designed to serve two critical purposes: query constraining and/or filtering, and query result set labeling.

Dimension attributes should be: Verbose (labels consisting of full words) Descriptive Complete (having no missing values) Discretely valued (having only one value per dimension table row) Quality assured (having no misspellings or impossible values)

Dimension table rows are uniquely identified by a single key field. It is recommended that the key field be a simple integer because a key value is meaningless, used only for joining fields between the fact and dimension tables. Dimension tables often use primary keys that are also surrogate keys. Surrogate keys are often auto-generated (e.g. a Sybase or SQL Server "identity column", a PostgreSQL or Informix serial, an Oracle SEQUENCE or a column defined with AUTO_INCREMENT in MySQL).

The use of surrogate dimension keys brings several advantages, including: Performance. Join processing is made much more efficient by using a single field (the surrogate key) Buffering from operational key management practices. This prevents situations where removed data rows might reappear when their natural keys get reused or reassigned after a long period of dormancy Mapping to integrate disparate sources Handling unknown or not-applicable connections Tracking changes in dimension attribute values

Although surrogate key use places a burden put on the ETL system, pipeline processing can be improved, and ETL tools have built-in improved surrogate key processing.

The goal of a dimension table is to create standardized, conformed dimensions that can be shared across the enterprise's data warehouse environment, and enable joining to multiple fact tables representing various business processes.

Conformed dimensions are important to the enterprise nature of DW/BI systems because they promote: Consistency. Every fact table is filtered consistently, so that query answers are labeled consistently. Integration. Queries can drill into different process fact tables separately for each individual fact table, then join the results on common dimension attributes. Reduced development time to market. The common dimensions are available without recreating them.

Over time, the attributes of a given row in a dimension table may change. For example, the shipping address for a company may change. Kimball refers to this phenomenon as Slowly Changing Dimensions. Strategies for dealing with this kind of change are divided into three categories: Type One. Simply overwrite the old value(s). Type Two. Add a new row containing the new value(s), and distinguish between the rows using Tuple-versioning techniques. Type Three. Add a new attribute to the existing row.


In data warehousing, a Fact table consists of the measurements, metrics or facts of a business process. It is located at the center of a star schema or a snowflake schema surrounded by dimension tables. Where multiple fact tables are used, these are arranged as a fact constellation schema. A fact table typically has two types of columns: those that contain facts and those that are a foreign key to dimension tables. The primary key of a fact table is usually a composite key that is made up of all of its foreign keys. Fact tables contain the content of the data warehouse and store different types of measures like additive, non additive, and semi additive measures.

Fact tables provide the (usually) additive values that act as independent variables by which dimensional attributes are analyzed. Fact tables are often defined by their grain. The grain of a fact table represents the most atomic level by which the facts may be defined. The grain of a SALES fact table might be stated as "Sales volume by Day by Product by Store". Each record in this fact table is therefore uniquely defined by a day, product and store. Other dimensions might be members of this fact table (such as location/region) but these add nothing to the uniqueness of the fact records. These "affiliate dimensions" allow for additional slices of the independent facts but generally provide insights at a higher level of aggregation (a region contains many stores).



- Bjergsen
image

Lili96

0

OLAP Operations

OLAP provides a user-friendly environment for interactive data analysis. A number of OLAP data cube operations exist to materialize different views of data, allowing interactive querying and analysis of the data.

The most popular end user operations on dimensional data are:

Roll up

The roll-up operation (also called drill-up or aggregation operation) performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by climbing down a concept hierarchy, i.e. dimension reduction.

Roll Down

The roll down operation (also called drill down) is the reverse of roll up. It navigates from less detailed data to more detailed data. It can be realized by either stepping down a concept hierarchy for a dimension or introducing additional dimensions.

Slicing

Slice performs a selection on one dimension of the given cube, thus resulting in a subcube.

Dicing

The dice operation defines a subcube by performing a selection on two or more dimensions.

Pivot

Pivot otheriwise known as Rotate changes the dimensional orientation of the cube, i.e. rotates the data axes to view the data from different perspectives. Pivot groups data with different dimensions.

Other OLAP operations

Some more OLAP operations include:

SCOPING: Restricting the view of database objects to a specified subset is called scoping. Scoping will allow users to recieve and update some data values they wish to recieve and update.

SCREENING: Screening is performed against the data or members of a dimension in order to restrict the set of data retrieved.

DRILL ACROSS: Accesses more than one fact table that is linked by common dimensions. COmbiens cubes that share one or more dimensions.

DRILL THROUGH: Drill down to the bottom level of a data cube down to its back end relational tables.

In Summary: Concept hierarchies organize the values of attributes or dimensions into abstraction levels. They are useful in mining at multiple abstraction levels. Typical OLAP operations include roll-up, and drill-( down, across, through), slice-and-dice, and pivot ( rotate), as well as some statistical operations. OLAP operations can be implemented efficiently using the data cube structure.

you may refers to http://athena.ecs.csus.edu/~olap/olap/OLAPoperations.php for more examples



- Lili96
image

JiaQii

1

OLAP OPERATIONS
OLAP provides a user-friendly environment for interactive data analysis. A number of OLAP data cube operations exist to materialize different views of data, allowing interactive querying and analysis of the data.

The most popular end user operations on dimensional data are:

1) Roll up
The roll-up operation (also called drill-up or aggregation operation) performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by climbing down a concept hierarchy, i.e. dimension reduction. Let me explain roll up with an example:
Consider the following cube illustrating temperature of certain days recorded weekly: enter image description here Assume we want to set up levels (hot(80-85), mild(70-75), cold(64-69)) in temperature from the above cube. To do this we have to group columns and add up the values according to the concept hierarchy. This operation is called roll-up. By doing this we obtain the following cube: enter image description here The concept hierarchy can be defined as hot-->day-->week. The roll-up operation groups the data by levels of temperature.

2) Roll Down
The roll down operation (also called drill down) is the reverse of roll up. It navigates from less detailed data to more detailed data. It can be realized by either stepping down a concept hierarchy for a dimension or introducing additional dimensions. Performing roll down operation on the same cube mentioned above: enter image description here The result of a drill-down operation performed on the central cube by stepping down a concept hierarchy for temperature can be defined as day<--week<--cool. Drill-down occurs by descending the time hierarchy from the level of week to the more detailed level of day. Also new dimensions can be added to the cube, because drill-down adds more detail to the given data.

3) Slicing
Slice performs a selection on one dimension of the given cube, thus resulting in a subcube. For example, in the cube example above, if we make the selection, temperature=cool we will obtain the following cube: enter image description here

5) Dicing
The dice operation defines a subcube by performing a selection on two or more dimensions. For example, applying the selection (time = day 3 OR time = day 4) AND (temperature = cool OR temperature = hot) to the original cube we get the following subcube (still two-dimensional): enter image description here

6) Pivot
Pivot otheriwise known as Rotate changes the dimensional orientation of the cube, i.e. rotates the data axes to view the data from different perspectives. Pivot groups data with different dimensions. The below cubes shows 2D represntation of Pivot. enter image description here

Other OLAP operations
Some more OLAP operations include:
1) SCOPING: Restricting the view of database objects to a specified subset is called scoping. Scoping will allow users to recieve and update some data values they wish to recieve and update.
2) SCREENING: Screening is performed against the data or members of a dimension in order to restrict the set of data retrieved.
3)DRILL ACROSS: Accesses more than one fact table that is linked by common dimensions. COmbiens cubes that share one or more dimensions.
4) DRILL THROUGH: Drill down to the bottom level of a data cube down to its back end relational tables.


In Summary:
--- Concept hierarchies organize the values of attributes or dimensions into abstraction levels. They are useful in mining at multiple abstraction levels.
--- Typical OLAP operations include roll-up, and drill-( down, across, through), slice-and-dice, and pivot ( rotate), as well as some statistical operations.
--- OLAP operations can be implemented efficiently using the data cube structure.



- JiaQii
image

AlvinYap

0

OLAP operations are foundation for most UI and functionality used by Data Visualization tools. The DV user (sometimes called analyst) navigates through the DataCube and its DataViews for a particular subset of the data, changing the data’s orientations and defining analytical calculations. The user-initiated process of navigating by calling for page displays interactively, through the specification of slices via rotations and drill down/up is sometimes called “slice and dice”. Common operations include slice and dice, drill down, roll up, and pivot:

Slice:

A slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset. enter image description here

Dice:

The dice operation is a slice on more than two dimensions of a data cube (or more than two consecutive slices). enter image description here

Drill Down/Up:

Drilling down or up is a specific analytical technique whereby the user navigates among levels of data ranging from the most summarized (up) to the most detailed (down). enter image description here

Roll-up:

(Aggregate, Consolidate) A roll-up involves computing all of the data relationships for one or more dimensions. To do this, a computational relationship or formula might be defined. enter image description here

Pivot:

This operation is also called rotate operation. It rotates the data in order to provide an alternative presentation of data – the report or page display takes a different dimensional orientation. enter image description here



- AlvinYap
image

HuiLing

2

OLAP OPERATIONS

Roll-up: Takes the current aggregation level of fact values and does a further aggregation on one or more of the dimensions. Equivalent to doing GROUP BY to this dimension by using attribute hierarchy. Decreases a number of dimensions - removes row headers.

SELECT [attribute list], SUM [attribute names]
FROM [table list]
WHERE [condition list]
GROUP BY [grouping list];

Drill-down: Opposite of roll-up. Summarizes data at a lower level of a dimension hierarchy, thereby viewing data in a more specialized level within a dimension. Increases a number of dimensions - adds new headers

rollupanddrilldown

Slice: Performs a selection on one dimension of the given cube, resulting in a sub-cube. Reduces the dimensionality of the cubes. Sets one or more dimensions to specific values and keeps a subset of dimensions for selected values.

Dice: Define a sub-cube by performing a selection of one or more dimensions. Refers to range select condition on one dimension, or to select condition on more than one dimension. Reduces the number of member values of one or more dimensions.

diceandslice

Pivot (or rotate): Rotates the data axis to view the data from different perspectives. Groups data with different dimensions. pivot

Drill-across: Accesses more than one fact table that is linked by common dimensions. Combines cubes that share one or more dimensions.

Drill-through: Drill down to the bottom level of a data cube down to its back-end relational tables.



- HuiLing
Cancel