Saturday, 27 September 2014

Data Vault - Data warehouse modeling

In last couple of months I am obsess with the Data Vault (DV) modeling technique, to design enterprise data warehouse (EDW). The inventor od DV is Dan Linstedt (www.danlinstedt.com). 

Below are some comments sourced from the Lucrum Business Intelligence (www.lucruminc.com), I think they are very good summary of the DV modelling.   
__________
.... The data vault is not a product.  It is not a magic pill that makes all your IT ills go away. It is a comprehensive approach to addressing real world issues with existing implementations.  It brings real flexibility and adaptability to the implementation and brings reliability and dependability to the business.  And with a team that understands the power of the data vault, you are now able to take your Business Intelligence environment upon which the tools that do the flash and splash can be sourced from. ....
__________        

... Modelling Rules for Each Part of the DV Entity:
FOR HUBS:
  • Hubs keys cannot migrate into other hubs (no parent/child like HUBS).
  • Hubs must be connected through links.
  • More than two hubs can be connected through links.
  • Surrogate keys may be used.
  • Business keys are 1 to 1 relationship with surrogate keys.
  • Hubs primary keys always migrate outward.
  • Hub business keys and primary keys never change.
  • If a hub has two or more satellites, then a point-in-time table can be built for ease of joins.
  • An ‘UNKNOWN’ business key record can be inserted into Hub that can be used to tie other data in links and sats that has no business keys in source. This kind of data is usually a bad/incomplete source data.
FOR LINKS:
  • Links can be connected to other links.
  • Links must have atleast two hubs associated with them in order to instantiated.
  • Surrogate keys may be used.
  • The combination of surrogate business keys made a unique key.
  • Does not contain descriptive data.
  • Does not contain begin and end dates.
FOR SATELLITES:
  • Satellites may be connected to hubs or links.
  • Have 1 and only 1 parent table.
  • Satellites always contain either a load date-time stamp, or a numeric reference to a stand-alone load date-time sequence table.
  • Primary key is a combination of ‘surrogate key’ from either hub or link and the load datetime stamp.
  • Surrogate keys may not be used.
  • Must have a Load End Date to indicate when the CHANGE to the data set has occurred.
  • Satellites are always delta driven. Duplicate rows should not appear.
  • Data is separated into satellite structures based on 1) type of information 2) rate of change.....



PS. The images was sourced from this web site  http://thefuturevalueofbusiness.com/index.php?s=Data+Vault , however this web site in not any more active. 


WhereScape RED – Different Dimension Types (Article from Aug’ 2008)


Very old and very good post by WhereScape Team (Raphael K.) describing different dimensions RED supports.
 
"RED operates with dimension types that are recognized by modern data warehouse theory. When a developer starts working on dimension tables he gets to chose from four options for the default generation of the dimension table and its update procedure via the following dialog box: 



Article URL http://blog.wherescape.com/technology/dimension-types-what-are-the-different-dimension-types-and-how-they-are-implemented-in-wherescape-red/

VCE Results Over the Year, Australia – Victoria, High School

In this blog we would like the share the data set, we collected over multiple public sources – related to VCE Results.
What is VCE? - The VCE is Victorian Certificate of Education, is the credential awarded to secondary school students who successfully complete high school level studies (Year 11 and 12 or equivalent) in the Australian state of Victoria.
The data set contains:
  • School information
  • VCE completion rated,
  • Median VCE student score and
  • Percentages of students with VCE score of 40
Below are the URLs to point to the Public Tableau reports, visualising the data set:

Raw Data Set



High School Ranked by VCE Score



High School Map



Dashboard - Shows VCE High School Results Over the Years 
There are three sheets: (No.1) School ranking, (No.2) School map and (No.3) School VCE information over the years. 
Please note! Report (No.1) and (No.2) are used as a filters.


References and Tech. Information 
(*) The data sources VCE Results files is sourced from this URL http://www.vcaa.vic.edu.au/Pages/vce/statistics/schoolstats/index.aspx
(*) Post Codes location, geo mapping was sourced from this web site http://www.corra.com.au/australian-postcode-location-data/
(*) Ranking Website (used for data validation) 
 (**) Private - http://www.bettereducation.com.au/Results/vce.aspx 
 (**)Public - http://www.bettereducation.com.au/Results/vcePublicSchoolResults.aspx  
(*) School Locations (Victoria) http://data.gov.au/dataset/school-locations-victoria (Listing of all school locations in Victoria. Includes primary and secondary schools, government and non-government)
(*) Data Base For the data we used MySQL - version 5.6. 
(*) For data visualisation we used Tableau Public - version 8.2. 
(*) Hardware/OS Apple iMac, OS X - version 10.9.4

iPhone 6 Australian Plan Prices

Australia, 13-Sep-2015 


If you are planning to buy iPhone 6s, I just updated Tableau workbook with new pre-order plans from iPhone 6s too. 

The workbook show the data plans for iPhone 6 and 6s on the launch day. It is very interesting to see how plans are changed duding just two years. As you know, now plans are offering more data than ever. 
  
Below is the Tableau Public visualisations. 

Please note, before final decision check the provider web site to see if a deal is still active or was been modified.



  • iPhone 6 [$]/[GB] ratio as per plans