Sunday, August 11, 2013

Business Intelligence

I am going to set up the Automart database as a Data warehouse. (I am using Automart because at home CommunityAssist is in a process of transformation and is too different from the version on your servers to be useful.) I have tried to reorganize the database into a "Star" structure consisting of a fact table surrounded by Dim or Dimension tables. Here is the diagram:

First we create the new database

Create Database AutomartDW
Go
use AutomartDW
Go

Then we create the tables

Create table DimVehicle
(
 VehicleID int primary key,
 VehicleMake nvarchar(255),
 VehicleYear nvarchar(4)
)
go
Create table DimCustomer
(
 CustomerKey int identity(1,1) primary key,
 LastName nvarchar(255),
 FirstName nvarchar(255),
 Email nvarchar(255),
 VehicleID int Foreign key references DimVehicle(VehicleID)
)

Create table DimVehicleService
(
 VehicleServiceKey  int identity(1,1) primary key,
 VehicleServiceID int,
 VehicleID int,
 LocationName nvarchar(255),
 ServiceName nvarchar(255),

)

go

Create Table DimDate
(
 DateKey int identity (1,1) primary key,
 ServiceDate Date,
 ServiceTime Time,
 sDay int,
 sMonth int,
 sYear int
)
go

Create table FactSales
(
 VehicleID int foreign Key references DimVehicle(VehicleID),
 DateKey int foreign key references DimDate(DateKey),
 VehicleServiceKey int foreign Key references DimVehicleService(VehicleServiceKey),
 CustomerKey int foreign key references DimCustomer(customerKey),
 ServicePRice money,
 DiscountPercent decimal(3,2),
 TaxPercent decimal(3,2)


)

Next we populate the tables. I am pretty sure of the Dim table inserts, but must confess a bit of uncertainty about how I populated the Fact Table

--start inserts into tables

Insert into DimVehicle(VehicleID, VehicleMake, VehicleYear)
Select vehicleID, VehicleMake, VehicleYEar from Automart.Customer.Vehicle

Insert into DimCustomer (LastName, FirstName, Email, VehicleID)
Select LastName, Firstname, Email, VehicleID
From Automart.dbo.Person p
Inner join Automart.Customer.Vehicle v
on p.Personkey=v.PersonKey
inner join Automart.Customer.RegisteredCustomer c
on c.PersonKey=p.Personkey

Insert into DimVehicleService(VehicleServiceID, VehicleID, LocationName, ServiceName)
Select vs.VehicleServiceID, VehicleID, LocationName, ServiceName
From Automart.Employee.VehicleService vs
Inner join Automart.Employee.VehicleServiceDetail vsd
on vs.VehicleServiceID=vsd.VehicleServiceID
inner join Automart.Customer.Location l
on vs.LocationID=l.LocationID
inner join Automart.Customer.AutoService a
on a.AutoServiceID=vsd.AutoServiceID

INsert into dimDate(ServiceDate, ServiceTime, sDay, sMonth, sYear)
Select ServiceDate, ServiceTime, Day(ServiceDate), Month(ServiceDate), Year(serviceDate) 
From Automart.Employee.VehicleService

Insert into FactSales(VehicleID, DateKey, VehicleServiceKey, CustomerKey, ServicePRice, DiscountPercent, TaxPercent)
Select dv.VehicleID, DateKey, VehicleServiceKey, CustomerKey, ServicePRice, DiscountPercent, TaxPErcent
From DimVehicle dv
inner Join DimCustomer dc
on dv.VehicleID = dc.VehicleID
inner join Automart.Employee.VehicleService vs
on dv.VehicleID=vs.VehicleID
inner join Automart.Employee.VehicleServiceDetail vsd
on vs.VehicleServiceID=vsd.VehicleServiceID
inner join DimVehicleService dvs
on vs.VehicleServiceID=dvs.VehicleServiceID
inner join Automart.customer.AutoService a
on vsd.AutoServiceID=a.AutoServiceID
inner join DimDate dd
on vs.ServiceDate=dd.ServiceDate 

Next we need to make sure we add the service user to the AutomartDW database

First open the SQL Server Configuration Manager and select SQL Server Analysis Services. Right Click on the properties. You will probably see something like this.

We want to change it to local services.

If it already looks like this, great. other wise change it. Oking this will cause a server restart. Say OK to the warning. Then return to Sql Server Management studio. Go to the Server Security folder and right click logins. Select New Login. In the login dialog click on search,and then advanced. Select NT Authority/Local Service as a login.

Then click OK, OK until you are back at the main dialog window. Click on the page User Mapping. Then check the box by AutomartDW and Check the roles DataReader, DataWriter

Click OK. Now we need to open Visual Studio Data tools. and Start a new Data Analysis Multidimensional cube project. Name it "AutomartAnalysisProject."

In the Solution Explorer right click on Data Sources and add a new Data source

Click OK and Next and then choose Use Service Account

Go Next and Finish. Now right click Data Source Views in the Solution Explored and Select New. Use the data source we just created.

On the next dialog of the wizard, add all the tables

Click Next and finish. You will see the designer with the tables.

Now right click on the Cubes folder and add a new cube

Use Existing tables

Click Next and then choose all the tables except sysDiagrams

click Next, Next and Finish. It wouldn't hurt at this time to build the solution. Then right click on the cube and select Process.

Click Run. You should see something like this when it is done.

Click Close, Close. Then click the Browser tab. Here you can drag in dimensions and calculations

Not very exciting, but Automart doesn't have much data. Also, I am not absolutely confident of my dimesion setup. I seemed to have missed the location. It should have been a dim table. Also I am not sure about how I populated the SaleFact table. The numbers displaying are correct, but some of the other measures produce wrong results. But this blog does give you an overview of the process.

You can still look at other measures and try to add KPIs and other aspects. Maybe sometime in the not too distant future I will do a blog where I take it from the cube and show how to apply statistics and KPIs to the results.

Saturday, August 3, 2013

Query Optimization

Overview

Query optimization is an important administrative task. But it is a difficult and subtle process. It involves extensive testing of various query structures and indexes and comparing the results.

Sql Server has a built in optimazation engine (see below) that usually but not always provides the best execution plan. You can also look at the actual execution plans and compare statistics when running variations of a query. Sql Server also provides the syntax for getting "Hints" when running queries. Finally you can use the Database Tuning Advisor to get suggestions for what indexes to create.


SQL Servers Query Optimization

Sql Server has a built in query optimization engine. Every time a query is run it goes through the following steps:

Parsing makes sure the query is valid SQL. Binding is mostly about name resolution, getting the table and column names. Optimization generates candidate execution paths and determines which has the least cost in cpu and total execution time.

Query optimization is complex and even the best optimizer doesn't get it right all the time. Still most of the time the optimizer does generate the optimal path.


Looking at a query with the execution plan and statistics

Open SQL Server Management Studio.

Start a new Query window

Select the Actual Execution Plan, and the Include Client Statistics from the toolbar

We are going to use Adventure works because it has more records. Our query will focus on the sales and sales details tables, but we will also bring in the Product name from the product table. We will use the dates and salesperson IDs for criteria.

Here is the query:

Use AdventureWorks2012

Select s.SalesOrderID, OrderDate, 
SalesOrderNumber, SalesPersonID,
ORderQty, Name, unitPrice,
 UnitPRiceDiscount
From Sales.SalesOrderHeader s
Inner Join Sales.SalesOrderDetail sd
on s.SalesOrderID=sd.SalesOrderID
inner Join Production.Product p
on p.ProductID=sd.ProductID
Where OrderDate between '2008-1-1' and '2008-1-31'
And SalesPersonID is not null

After you run this click the tab Execution plan. You will have the following output

Notice that it suggests a couple of indexes that are missing--in other words should be created, particularly on SalesOrderDate and SalesPersonID. Notice also that the majority of the cost is incurred processing the clustered indexes--which means going row by row through the table.

Next look at the statistics output

Open a second query window. We are going to create part of the suggested index

Create index ix_salesDate on Sales.SalesOrderHeader(OrderDate)

Now go back and rerun the query. Notice the query results include the new index. Now all the cost is in SalesDetail clustered index. This would suggest we should add another index.

Look at the statistics. Notice, interestingly the total cost has actually gone up and many of the indicators are worse.

This suggests that the next step would be to try an index on SalesPersonID and see if that improves the stats.


Query Hints

Query hints are a set of commands that you can add to a query to suggest an execution path. I am only going to show a couple. Query hints start with the Option keyword and have various arguments in parenthesis. The first example is a merge join which suggest executing the Joins as merges. Here is the code. The only change is in the last line.

Select s.SalesOrderID, OrderDate, SalesOrderNumber, SalesPersonID,
ORderQty, Name, unitPrice, UnitPRiceDiscount
From Sales.SalesOrderHeader s
Inner Join Sales.SalesOrderDetail sd
on s.SalesOrderID=sd.SalesOrderID
inner Join Production.Product p
on p.ProductID=sd.ProductID
Where OrderDate between '2008-1-1' and '2008-1-31'
And SalesPersonID is not null
Option (Merge Join)

Notice the change in results and statistics. Notice the change of joins to merge join and the suggestion to create and index.

Here are the statistics

Most of the other query hints are suggestions to the query optimizer. Look at http://msdn.microsoft.com/en-us/library/ms181714.aspx for a complete descriptions.


The DataBase Engine Tuning Advisor

To start the Tuning advisor go to the TOOLS menu in the Sql Server Management Studio. Connect to Localhost.

In the general tab, select "Plain Cache", and check Automart

Click the Tuning Options Tab. Leave everything as default except the time.

In Advanced Options set the max space to 4 mbs or so

Move it ahead 10 minutes or so.

Click start analysis.

The Tuning adviser has no suggestions. (Automart is too small a database to really analyze.) Here is the report:


Useful links:

Query hints

http://msdn.microsoft.com/en-us/library/ms181714.aspx

Overview of query optimization

https://www.simple-talk.com/sql/sql-training/the-sql-server-query-optimizer/
http://sqlblog.com/blogs/paul_white/archive/2012/04/28/query-optimizer-deep-dive-part-1.aspx
http://sqlblog.com/blogs/paul_white/archive/2012/04/28/query-optimizer-deep-dive-part-2.aspx

Advice

http://exacthelp.blogspot.com/2012/04/sql-server-query-optimization-tips.html
http://blogs.lessthandot.com/index.php/DataMgmt/DBAdmin/sql-server-tuning

Database tuning advisor

http://msdn.microsoft.com/en-us/library/ms174202.aspx