fact table
A fact table is the central table in a star schema of a data warehouse. A fact table stores quantitative information for analysis and is often denormalized.A fact table works with dimension tables. A fact table holds the data to be analyzed, and a dimension table stores data about the ways in which the data in the fact table can be analyzed. Thus, the fact table consists of two types of columns. The foreign keys column allows joins with dimension tables, and the measures columns contain the data that is being analyzed.
Suppose that a company sells products to customers. Every sale is a fact that happens, and the fact table is used to record these facts. For example:
Time ID | Product ID | Customer ID | Unit Sold |
4 | 17 | 2 | 1 |
8 | 21 | 3 | 2 |
8 | 4 | 1 | 1 |
Now we can add a dimension table about customers:
Customer ID
|
Name
|
Gender
|
Income
|
Education
|
Region
|
1 | Brian Edge | M | 2 | 3 | 4 |
2 | Fred Smith | M | 3 | 5 | 1 |
3 | Sally Jones | F | 1 | 7 | 3 |
In this example, the customer ID column in the fact table is the foreign key that joins with the dimension table. By following the links, you can see that row 2 of the fact table records the fact that customer 3, Sally Jones, bought two items on day 8. The company would also have a product table and a time table to determine what Sally bought and exactly when.
When building fact tables, there are physical and data limits. The ultimate size of the object as well as access paths should be considered. Adding indexes can help with both. However, from a logical design perspective, there should be no restrictions. Tables should be built based on current and future requirements, ensuring that there is as much flexibility as possible built into the design to allow for future enhancements without having to rebuild the data.
==============================
dimension table
In data warehousing, a dimension is a collection of reference information about a measurable event. These events are known as facts and are stored in a fact table. Dimensions categorize and describe data warehouse facts and measures in ways that support meaningful answers to business questions. They form the very core of dimensional modeling.
A data warehouse organizes descriptive attributes as columns in dimension tables. For example, a customer dimension’s attributes could include first and last name, birth date, gender, etc., or a website dimension would include site name and URL attributes.
A dimension table has a primary key column that uniquely identifies each dimension record (row). The dimension table is associated with a fact table using this key. Data in the fact table can be filtered and grouped (“sliced and diced”) by various combinations of attributes. For example, a Login fact with Customer, Website, and Date dimensions can be queried for “number of males age 19-25 who logged in to funsportsite.com more than once during the last week of September 2010, grouped by day.”
Dimension tables are referenced by fact tables using keys. When creating a dimension table in a data warehouse, a system-generated key is used to uniquely identify a row in the dimension. This key is also known as a surrogate key. The surrogate key is used as the primary key in the dimension table. The surrogate key is placed in the fact table and a foreign key is defined between the two tables. When the data is joined, it does so just as any other join within the database.
Like fact tables, dimension tables are often highly de-normalized, because these structures are not built to manage transactions they are built to enable users to analyze data as easily as possible.
No comments:
Post a Comment