
SQL Server 2016 Sample Database - Wide World Importers - Restoring and adding more records

We have been using the Adventure Works database with almost all versions of SQL Server, and no question, it can be used with SQL Server 2016 as well. Everyone likes Adventure Works, but when complex testing over functionalities is required, it is considered too simple a database, and many tend to use other databases instead. One popular alternative was the Contoso database, but in most cases its size was not enough.

For SQL Server 2016, Microsoft has introduced a new sample database called Wide World Importers. This database covers many new features of SQL Server, and it allows you to load data up to the current date. Not only that, you can decide how many customer orders you need per day to increase the volume.

Here is how to add the Wide World Importers databases.

WideWorldImporters Database
Go to this link and download the WideWorldImporters-Full.bak file. Restore it using either TSQL or the GUI.

USE master
RESTORE DATABASE WideWorldImporters
FROM  DISK = N'D:\Software\SQL Server 2016\WideWorldImporters-Full.bak' 
WITH  FILE = 1,  
MOVE N'WWI_Primary' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\WideWorldImporters.mdf',  
MOVE N'WWI_UserData' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\WideWorldImporters_UserData.ndf',  
MOVE N'WWI_Log' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\WideWorldImporters.ldf',  
MOVE N'WWI_InMemory_Data_1' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\WideWorldImporters_InMemory_Data_1',  
NOUNLOAD,  STATS = 5
GO

Once restored, you should see the database in your instance. If you query the Sales.Orders table, you should see 73,595 records. These orders are populated up to 2016-05-31.
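A quick way to verify the restore is to check the row count and the latest order date (a minimal check, assuming the standard Sales.Orders schema):

USE WideWorldImporters;
GO

-- Verify the restored data: the row count and the latest populated order date
SELECT COUNT(*) OrderCount FROM Sales.Orders;           -- expecting 73,595
SELECT MAX(OrderDate) LastOrderDate FROM Sales.Orders;  -- expecting 2016-05-31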


If you want to extend the data up to the current date, you can call the DataLoadSimulation.PopulateDataToCurrentDate stored procedure with relevant parameter values as below.

DECLARE @return_value int

EXEC @return_value = [DataLoadSimulation].[PopulateDataToCurrentDate]
  @AverageNumberOfCustomerOrdersPerDay = 80,
  @SaturdayPercentageOfNormalWorkDay = 30,
  @SundayPercentageOfNormalWorkDay = 5,
  @IsSilentMode = 0,
  @AreDatesPrinted = 0

SELECT 'Return Value' = @return_value

You can simply set a larger value for @AverageNumberOfCustomerOrdersPerDay if you need more records.

WideWorldImportersDW Database
If you need the relational data warehouse related to Wide World Importers, then you need to download the WideWorldImportersDW-Full.bak file from the same page. Restore it just like the previous database, using the following script;

USE master
GO

RESTORE DATABASE WideWorldImportersDW
FROM  DISK = N'D:\Software\SQL Server 2016\WideWorldImportersDW-Full.bak' 
WITH  FILE = 1,  
MOVE N'WWI_Primary' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\WideWorldImportersDW.mdf',  
MOVE N'WWI_UserData' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\WideWorldImportersDW_UserData.ndf',  
MOVE N'WWI_Log' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\WideWorldImportersDW.ldf', 
MOVE N'WWIDW_InMemory_Data_1' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\WideWorldImportersDW_InMemory_Data_1',  
NOUNLOAD,  STATS = 5
GO

If you have repopulated the WideWorldImporters database using PopulateDataToCurrentDate, then you need to transfer the new data to the data warehouse. You do not need to load it manually; a package is given for populating the data warehouse.

Download Daily.ETL.ispac from the same page and add it to the Integration Services catalog by running the file. Once it is added, you should see the package;


Right-click on the package and execute it to update the data warehouse. Make sure the connection strings are correct as per your environment. If not, change them and execute it.


Once executed, your data warehouse is updated based on the new data added to the WideWorldImporters database.



Connecting to the Integration Services - "Access is denied."

If the required permissions are not configured for Integration Services for your account, you will experience the following error when trying to connect to Integration Services;

Connecting to the Integration Services service on the computer "" failed with the following error: "Access is denied."


This clearly indicates that you need permissions on it. How can the required permissions be added for an account?

Here are the steps for giving the required permissions;
1. Search for Component Services and open it.
2. Expand Component Services -> Computers -> My Computer -> DCOM Config.


3. Find Microsoft SQL Server Integration Services, right-click on it and open Properties. Go to the Security tab and add the accounts you need with relevant permissions for Launch and Activation Permissions, Access Permissions, and Configuration Permissions by clicking the Edit button.



Click OK to save all. You need to restart the Integration Services service for the added permissions to be applied. Once restarted, you will be able to connect to it without any issue.

Scaling out the Business Intelligence solution

In order to increase scalability and distribute the workload among multiple servers, we can use a scale-out architecture for our DW/BI solution by adding more servers to each BI component. A typical BI solution that follows Traditional Business Intelligence or Corporate Business Intelligence uses the following components that come with the Microsoft Business Intelligence platform;
  • The data warehouse with SQL Server database engine
  • The ETL solution with SQL Server Integration Services
  • The models with SQL Server Analysis Services
  • The reporting solution with SQL Server Reporting Services
Not all components can be scaled out by default. Some support it natively; some need additional custom implementations. Let's check each component and see how it can be scaled out;

The data warehouse with SQL Server database engine


If the data warehouse is extremely large, it is always better to scale it out instead of scaling it up. There are several ways of scaling out. One common way is Distributed Partitioned Views. It allows us to distribute data among multiple databases that are either in the same instance or in different instances, and have a middle layer such as a view to redirect each request to the right database.
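Here is a minimal sketch of a Distributed Partitioned View, assuming two member tables split by year; ServerB is a hypothetical linked server, and the CHECK constraint on the partitioning column lets the optimizer route queries to the right member table.

-- Local member table; the CHECK constraint limits it to 2015 rows
CREATE TABLE dbo.Sales_2015
(
 SalesId int NOT NULL
 , SalesYear int NOT NULL CHECK (SalesYear = 2015)
 , Amount money NOT NULL
 , CONSTRAINT PK_Sales_2015 PRIMARY KEY (SalesId, SalesYear)
);
GO

-- The view unions the local member table with the 2016 member table
-- held on another instance, exposed through the ServerB linked server
CREATE VIEW dbo.Sales
AS
SELECT SalesId, SalesYear, Amount FROM dbo.Sales_2015
UNION ALL
SELECT SalesId, SalesYear, Amount FROM ServerB.SalesDB.dbo.Sales_2016;
GO

A query such as SELECT * FROM dbo.Sales WHERE SalesYear = 2015 then touches only the local member table.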

A few more ways are discussed here, along with the advantages and disadvantages of each.

In addition to that, the Analytics Platform System (APS), formerly named Parallel Data Warehouse (PDW), is available for large data warehouses. It is based on a Massively Parallel Processing (MPP) architecture instead of Symmetric Multi-Processing (SMP), and it comes as an appliance. This architecture is also called a Shared-Nothing architecture, and it supports distributing data and queries to multiple compute and storage nodes.

The ETL solution with SQL Server Integration Services


There is no built-in facility to scale out Integration Services, but it can be done by distributing packages among servers manually, allowing them to be executed in parallel. However, this requires extensive custom logic within the packages, and the maintenance cost might increase.

The models with SQL Server Analysis Services


Generally, this is achieved using multiple Analysis Services query servers connected to a read-only copy of the multi-dimensional database. This requires a load-balancing middle-tier component that handles client requests and directs them to the right query server.

The same architecture can be slightly changed and implemented in different ways as per requirements. Read the following articles for more info.


The reporting solution with SQL Server Reporting Services


The typical way of scaling out Reporting Services is to configure multiple report server instances connected to the same report database. This separates the report execution and rendering workload from the database workload.

SQL Server Query Editor with Query Window

Some prefer to write TSQL statements without help from an editor, but some prefer to use an editor because it simply allows us to select tables and columns easily and construct joins, grouping, and filtering easily. I personally prefer to write statements by myself as it helps me to remember the way of writing them.

Those who like to get help from an editor for writing SELECT statements usually go to the Views node in Object Explorer and use the New View menu item for opening the editor. It allows us to construct the SELECT statement without writing it, and the statement can be used without creating the view. However, there is a way of opening the editor from a New Query window without going through Create View.

Do you know that there is a menu item called Design Query in Editor... in the Query menu? This menu item opens the editor that can be used for constructing the query and adding it to your New Query window.



Although this is not a new menu item, many do not use it; in fact, many do not know that it is available. It is useful, and you can get your statement written without spending much time.

SQL Server Management Studio July 2016 is available

The good thing is that we get continuous releases of Management Studio. Microsoft has released the newest version of SQL Server Management Studio, which is the July 2016 release. The latest version is 13.0.15500; the June 2016 release was 13.0.15000.

You can download the latest from here.

Some of the major changes are as follows (quoted from the same page);
  • Support for Azure SQL Data Warehouse in SSMS.
  • Significant updates to the SQL Server PowerShell module. This includes a new SQL PowerShell module and new CMDLETs for Always Encrypted, SQL Agent, and SQL Error Logs.
  • Support for PowerShell script generation in the Always Encrypted wizard.
  • Significantly improved connection times to Azure SQL databases.
  • New ‘Backup to URL’ dialog to support the creation of Azure storage credentials for SQL Server 2016 database backups. This provides a more streamlined experience for storing database backups in an Azure storage account.
  • New Restore dialog to streamline restoring a SQL Server 2016 database backup from the Microsoft Azure storage service.



SQL Server Backup to URL - Now easier than before

I made a post on SQL Server Backup to URL that describes how to take a backup to a URL. Earlier, this needed the credential to be created before taking the backup, but now, creating a credential using a Policy on the container and a Shared Access Signature key (see more on this with SQL Server On-Premise Database files maintained in Azure) can be done with the given GUI. This saves time, and not only that, it clears the confusion we had with the previous GUI.


This is what we had before;


Now, the interface with the latest Management Studio;


If you have already created a credential, it can be picked from the first drop-down. If not, click on the New container button and sign in to your Azure account. Once signed in, select the storage account and container.


You can generate the Shared Access Signature by clicking the Create Credential button, and complete it.


Now you can select the Azure storage container and take the backup.
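If you prefer TSQL over the GUI, the same can be done with a credential based on a Shared Access Signature; the storage account, container, and SAS token below are placeholders.

-- When using a SAS, the credential name must match the container URL
CREATE CREDENTIAL [https://mystorageaccount.blob.core.windows.net/backups]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE'
, SECRET = 'sv=2015-12-11&sr=c&sig=<your SAS token>';
GO

BACKUP DATABASE WideWorldImporters
TO URL = 'https://mystorageaccount.blob.core.windows.net/backups/WideWorldImporters.bak'
WITH STATS = 5;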

Azure SQL Database - Cannot connect to database.windows.net.

Are you experiencing this error when trying to connect to your Azure SQL Database?

TITLE: Connect to Server
------------------------------

Cannot connect to database.database.windows.net.

------------------------------
ADDITIONAL INFORMATION:

A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 53)

For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&EvtSrc=MSSQLServer&EvtID=53&LinkId=20476

------------------------------

The network path was not found


There can be various reasons for this, but other than the general technical reasons, you need to check one thing: whether you have previously connected to another database that does not exist now.

For example, you might have taken a snapshot of your database and connected to it using Management Studio. For connecting, you need to go through Options and set the database name there. Once set, SSMS keeps it, and the next time you try to log in, even though you do not notice it, it might try to connect to the snapshot you connected to before. If it exists, it will connect; else it will throw the above error without giving you many details.

So, check that before going into technical analysis.



Conformed dimensions and Stovepipe data marts

A long time back, I wrote a small note on Stovepipe data marts, and I thought to make a blog post on it again as it is still unknown to many. The simplest way of remembering what a Stovepipe Data Mart is: a data mart that can be neither extended to nor integrated into the Data Warehouse. This happens if your data mart does not contain Conformed Dimensions.


How do we know whether we have designed conformed dimensions?

Most of the entities identified during the requirement gathering process are common to the entire organization. For example, Product, Employee, and Customer are some such entities. Although these entities have many attributes, only a few attributes will be taken into account during the design of data marts. Not only that, some attributes may not be correctly identified and may be designed wrongly, even though the wrong design satisfies the current requirement to some extent. This does not lead you to conformed dimensions.

In order to make sure that a dimension is a conformed dimension, make sure that all related and known attributes are correctly added. There can be a situation where you need only a few attributes for the current data mart, but if you have identified many attributes, make sure everything is added. When you add them, make sure that the correct business logic related to them is implemented. This makes the dimension a conformed dimension. For example, if you have designed a dimension called Product for the Sales data mart, and you have added all possible attributes and implemented all possible business logic, then it is a conformed dimension, and you will be able to reuse it when you design the Marketing or Finance data marts.

Coming back to Stovepipe data marts: simply, if you do not have conformed dimensions, your data mart can be considered a Stovepipe data mart. You will not be able to integrate the data mart with the data warehouse without doing modifications to it, which will be costly, or which may be impossible considering factors like resource availability, system usage, and budget constraints. Therefore, make sure that you are not heading toward a Stovepipe data mart.

SQL Server Brain Basher of the Week #046 - Default Database Size

You may or may not have noticed the initial size of a database at creation, but there is a fixed size for both data and log files. Most of the time, developers just create the database by supplying the name, without considering other properties unless some property needs attention and a change. The same goes for the size too. Let's talk about the initial size today;

What is the default size of a SQL Server database?

Before discussing the answer, let's see how a database is created with the CREATE DATABASE command. When SQL Server has to create a database, it actually takes a copy of the Model database, as it is considered the standard template for databases. Therefore, the default size of a new database is based on the size set for the Model database.

Until SQL Server 2014, the default size of the data file was 3MB and the log file 1MB. However, this has been changed in SQL Server 2016: now it is 8MB for both files. Not only that, the File Growth setting has been changed as well.
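You can check the defaults your instance will hand out by querying the file catalog view of the Model database; size and growth are stored in 8KB pages, hence the conversion.

-- Sizes and growth settings that new databases inherit from Model
SELECT name
 , size * 8 / 1024 AS SizeMB
 , CASE WHEN is_percent_growth = 1
  THEN CONVERT(varchar(10), growth) + '%'
  ELSE CONVERT(varchar(10), growth * 8 / 1024) + 'MB' END AS Growth
FROM model.sys.database_files;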


SQL Server Brain Basher of the Week #047 - Database diagram

This is something I asked during an interview. While we were talking about system databases, I asked the interviewee how to get information related to the tables that exist in system databases. The answer was: "Simple, we can make a database diagram and use it for understanding structures and relationships". Can we really do that?

Here is the question for the week: Can we create database diagrams on system databases?

The answer is simple, but it is not the answer given by the interviewee. It is not possible. SQL Server does not allow us to generate diagrams of, or script, system tables. Even if you try to use Generate Scripts, it will not list the system tables.

See how it appears when trying to script the msdb database. As you can see, there is no option to get a script generated on system tables.
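Although diagrams and scripting are not available for system tables, their structures can still be explored through catalog views; here is a small sketch against msdb.

-- List the Microsoft-shipped tables in msdb with their column counts
SELECT t.name TableName, COUNT(c.column_id) ColumnCount
FROM msdb.sys.tables t
 INNER JOIN msdb.sys.columns c
  ON c.object_id = t.object_id
WHERE t.is_ms_shipped = 1
GROUP BY t.name
ORDER BY t.name;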


SSIS Progress / Execution Results Tab

SQL Server Integration Services is the enterprise ETL platform that supports all ETL scenarios. Covering everything from small data transfers to large, complex ETL solutions, it has become the number one ETL choice for many projects and organizations.

Just like implementations with other tools, the packages we create with Integration Services need to be troubleshot when they do not work as we expect. There are many ways of troubleshooting; however, the most common technique is debugging. Debugging is the process of finding problems that occur during the execution, either at the design stage or after deployment. We can debug SSIS packages too.

There are various ways of debugging an SSIS package during development, which will be discussed in a different post, but let's focus on one common technique. When we run a package using SQL Server Data Tools, most of the events are recorded and can be seen with one of the tabs in the design window. This tab cannot be seen when you open the package for designing, but it becomes visible as Progress when you execute the package and is shown as Execution Results when the execution completes. See the images below;


This tab is very useful as it explains how tasks and containers are involved in the execution, in a hierarchical manner. You can navigate through the tree and find the problematic one if there is an issue.


There is a slight performance impact from this collection. Since it has to collect all this information, it adds an overhead to the package execution. Even though it is a slight one, it can be disabled if you want, by toggling the Debug Progress Reporting menu item in the SSIS menu;


How to add a Watch Window in SSIS

My post SSIS Progress / Execution Results Tab talked about a few things related to debugging, and I thought to add another post on it, specifically on two windows provided by SQL Server Data Tools for observing values during debugging.

There are two windows that can be used for seeing the values of variables and parameters: the Locals and Watch windows. The Locals window allows us to see system settings and the values of variables and parameters that are currently in scope. The Watch window lets us specify a particular item, such as a variable or parameter, and track its value throughout debugging. If you have multiple variables and parameters and you need to see how they change during the execution (or debugging), the Watch window is the best. You can add multiple Watch windows for tracing multiple items.

Let's see how we can add a Watch and use it when debugging. Follow the given steps for creating a package and testing;

1. Create a SSIS project and add a package.
2. Create a variable called TestVariable. Set the type as Int32.


3. Add a Script Task. Make sure that TestVariable is added under ReadWriteVariables of the Script Task and the following code is set within the task.



public void Main()
{
    // Read the current value of the variable, increase it by one, and write it back
    int x = Convert.ToInt32(Dts.Variables["User::TestVariable"].Value);
    x = x + 1;
    Dts.Variables["User::TestVariable"].Value = x;

    Dts.TaskResult = (int)ScriptResults.Success;
}

This code increases the value of the variable by one.

4. Copy the added Script Task and paste it twice. Now you have three Script Tasks. Connect them with Success Precedence Constraints.


5. Select the first Script Task and press F9, or use the Debug -> Toggle Breakpoint menu, to add a breakpoint. Do the same for the other two tasks.


6. Now start debugging. Execution stops at the first Script Task as we have set a breakpoint. For opening the Locals window, use the Debug -> Windows -> Locals menu item.

7. Once the Locals window is opened, navigate to find TestVariable. Right-click on it and add a Watch on it.


8. You should see a Watch window now. You will see how the value changes in the Watch window as you continue with debugging.


Handout to Resolve SQL Server Connectivity Issues [Guest Post]

This post is written by Andrew Jackson.


Problem Statement
While launching Microsoft SQL Server Management Studio, users sometimes face a problem when they try to connect to the server. After clicking on the Connect button, an error message box appears, which indicates that there is a connectivity issue with the server.


End users become tense when they find it impossible to connect to the server; such errors are known as SQL Server connectivity issues. In this blog, we are going to discuss why these connectivity issues occur and what the measures to troubleshoot them are.

Causes of Connectivity Issues in SQL Server
Generally, when one fails to connect to the SQL server, the reasons may be:
  • The network of the SQL Server machine may be disconnected or slow
  • The SQL Server machine may be improperly configured
  • There may be a problem with the firewall or other network devices
  • The connection between the two servers may be broken
  • Applications related to the server may be improperly configured
  • There may be an authentication error or log-in issue on the server

Approaches To Resolve The Problem
In this section, we are going to discuss different techniques to resolve SQL Server connectivity issues. Follow the below-mentioned procedure for resolving your problem:
NOTE: After performing each step, restart your machine and then try to connect to the server. If you are successfully connected to the server, then stop the procedure there; else go to the next step.

STEP 1: Resolving Networking Issue
As discussed earlier, the network connection could be one reason for the connectivity error, because a stable network is required for establishing a remote connection. Therefore, the first approach should be to examine the network by executing the commands given in the snapshot:


STEP 2: Modify the SQL Server Configuration
Use SQL Server Configuration Manager for examining the protocols that are used for running the targeted SQL Server. The server supports the TCP, Shared Memory, and Named Pipes protocols. If the protocols are disabled, then enable them by making use of Configuration Manager. After performing the configuration, restart your SQL Server machine.
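Once you can connect, you can verify which protocol and authentication scheme each session is actually using with a standard DMV:

-- net_transport shows TCP, Shared memory, or Named pipe per connection
SELECT session_id, net_transport, auth_scheme
FROM sys.dm_exec_connections;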

STEP 3: Improving Network Devices Functioning
Network devices like firewalls, routers, gateways, etc., may be set up in such a way that they block the connection request. The firewall rules may be configured in such a way that they block SQL connection requests. Therefore, to resolve such an error, temporarily disable the firewall; with the firewall disabled, you should be able to connect to the SQL Server. Moreover, after having a successful connection, you can enable the firewall again.

STEP 4: Resolving the Client Driver Issue
This solution is applied on the client machine. Go through the following description to troubleshoot the SQL Server connectivity issue:

“Execute the command shown below to log in to another client computer via the TCP protocol. If TCP is disabled, then enable it and then use SQL Server Management Studio, OSQL, or SQLCMD for testing SQL connections.”


STEP 5: Properly Configure SQL Related Applications
Once again, reconfigure the applications related to the SQL Server and then go through the following checks:
  1. Check whether the application is running under your account or a different account. If the application is running under a different account, then configure it to run under your account.
  2. Analyze your connection string and check whether the string is compatible with your driver. If not, then make it compatible with your driver.

STEP 6: Resolve Authentication & Log-in Faults
The last step, or you can say the final approach, to resolve the connectivity problem is to troubleshoot faults that occur during the authentication procedure and during log-in. These relate to the network connection, the machine’s operating system, and the server database.
  1. Make sure that the input credentials are valid
  2. If you are using SQL authentication, then mixed authentication should be enabled (a quick check is shown below)
  3. Examine the event logs of your machine and collect some more information
  4. Analyze whether the permissions are enabled for logging into the account. If not, then enable them.
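For the second point, a quick check of the authentication mode can be done with a built-in server property:

-- Returns 1 when only Windows authentication is allowed, 0 for mixed mode
SELECT SERVERPROPERTY('IsIntegratedSecurityOnly') AS WindowsAuthOnly;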
Conclusion
After this brief discussion about how to deal with SQL Server connectivity issues, one can conclude that these connectivity issues are due to faults in the network, network devices, or improper configuration. Therefore, one can troubleshoot SQL connection error 40 and SQL error 26 by going through the mentioned steps.

Two new Free SQL Server / PowerBI / Azure related books

What is CTAS, CRTAS and CETAS? - SQL Server CREATE TABLE AS SELECT

I am sure that everyone is well aware of the SELECT INTO statement. This statement allows us to create a table using a SELECT statement, just like below;

USE WideWorldImporters;
GO

SELECT o.OrderId, SUM(ol.UnitPrice * ol.Quantity) OrderAmount
INTO dbo.OrderAmount
FROM Sales.Orders o
 INNER JOIN Sales.OrderLines ol
  ON o.OrderID = ol.OrderID
GROUP BY o.OrderId;

SELECT * FROM dbo.OrderAmount;

However, this has many limitations, specifically on the construction of the new table. We cannot add elements such as constraints and indexes when we get the table created using SELECT INTO.

SQL Server 2016 makes it possible to create the table with more control, introducing CREATE TABLE AS SELECT. But, unfortunately, this statement is available only with Azure SQL Data Warehouse and Parallel Data Warehouse. This statement is referred to as CTAS.

Here is an example;

CREATE TABLE myTable   
WITH   
  (   
    CLUSTERED COLUMNSTORE INDEX,  
    DISTRIBUTION = HASH (CustomerKey)  
  )  
AS SELECT * FROM dimCustomer; 


You can read more on this at: CREATE TABLE AS SELECT (Azure SQL Data Warehouse).

There are two more abbreviations like this: CRTAS and CETAS.

CRTAS stands for CREATE REMOTE TABLE AS SELECT, which allows creating a table on a remote SMP SQL Server and populating it with data from a table in Parallel Data Warehouse.

Read more on this at: CREATE REMOTE TABLE AS SELECT (Parallel Data Warehouse).

CETAS stands for CREATE EXTERNAL TABLE AS SELECT, which allows creating an external table and exporting data into Hadoop or Azure Blob Storage. Again, this is available only with Azure SQL Data Warehouse and Parallel Data Warehouse.

Read more on this at: CREATE EXTERNAL TABLE AS SELECT.


Install SQL Server with just three clicks - The SQL Server Basic Installer

We have been installing SQL Server for a long time, across different versions and different editions. Everyone is familiar with the installation wizard and knows that it needs to go through various steps, windows, and multiple clicks. If you need an installation done for addressing many database requirements, then of course you need to go through almost all the installation wizard pages and customize as per the requirements. But if it is for a simple database and we do not need many features installed, then it could be seen as too many clicks and too many pages.

What if we can install SQL Server with just three clicks?

Yes, now it is possible. This is called the SQL Server Basic Installer, and it allows you to get the database engine and the client connectivity SDK installed without going through many screens. This is currently available for the following editions;

Let's test this with one of the editions. Click on the link given above for the Express Edition and get the SQLServer2016-SSEI-Expr.exe file downloaded. Start the installation; this is the first screen you see;


You have three options for continuing. Since we need a simple installation, let's select the first option: Basic. This is the first click.

You cannot skip accepting the license terms; you get the usual screen and click on Accept. That is your second click.


The next screen is for specifying the installation location. You do not need to do anything because the default location is already selected. Here is your third click: click on Install to continue.


This starts the installation, different screens appear;
Downloading....


Installing...


And finally, the screen showing that the installation is completed. During the download, you will see various information on samples, forums, etc., that will be helpful for learning more about SQL Server.


This screen educates you on the connection string used for connecting to the instance, the instance ID, the default administrator, installed features, etc. It allows you to launch SQLCMD for connecting and testing the installation. In addition to that, if you want to get SQL Server Management Studio, you can click on Install Tools for getting it.

As you noticed, it is just three clicks. Try and see.

SQL Server Brain Basher of the Week #048 - Dropping Clustered Index

SQL Server mainly maintains two structures for tables: Heap and Clustered. If there is no clustered index created on the table, the table has a Heap structure; else it has a Clustered structure.

Once a Clustered Index is created, SQL Server organizes data in the form of a B-Tree, and the data pages related to the table are held at the last level, which is called the Leaf Level. Since the Clustered Index uses the data pages themselves, it is not possible to have another Clustered Index on the same table. Now the question is;

What will happen when the Clustered Index is dropped?

Many think that it is not possible because the index holds the data pages related to the table. But it is possible. If you drop the index, the table goes back to the Heap structure and will be maintained as a Heap. Here is a code sample that shows it;

-- creating a test table
CREATE TABLE dbo.TestTable
(
 TestTableId int index ix_TestTable_Id clustered
 , TestTableValue1 varchar(100) not null
 , TestTableValue2 char(5) not null index ix_TestTable_Value2 nonclustered
);
GO

-- Inserting test records
INSERT INTO dbo.TestTable 
VALUES 
(1, 'Hello world', 'ABC01')
, (2, 'Hi', 'ABC02')

-- Check the available indexes
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.TestTable'), NULL, NULL, 'DETAILED');


-- Dropping the clustered index
DROP INDEX ix_TestTable_Id ON dbo.TestTable;

-- Checking the indexes and data after dropping 
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.TestTable'), NULL, NULL, 'DETAILED');
SELECT * FROM dbo.TestTable;


Result shows asterisk when converting int to varchar

I was sent some code by an engineer saying that he gets an odd error with one of the calculations written. He gets the following error;

Conversion failed when converting the varchar value '* ' to data type int.

By looking at the error, I realized the issue: it is related to truncation. He converts a numeric value to a varchar variable, and then he uses the converted varchar value in another calculation, converting it back to int. This calculation causes the error because the previously converted value contains an asterisk, not the numeric value that was used in the initial conversion.

How can this happen? This happens when you convert character or binary expressions to an expression of a different data type. If the result cannot be accommodated by the target data type, SQL Server either truncates the value, partially displays it, or throws an error.

See this code;

DECLARE @x int = 1000;
DECLARE @y char(2);

SET @y = @x;
SELECT @y;

When the value 1000 is converted to char(2), the result becomes an asterisk because the target data type cannot hold 1000. The asterisk indicates that the value is truncated.
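To reproduce the error the engineer saw, convert the truncated value back to int:

DECLARE @x int = 1000;
DECLARE @y char(2);

SET @y = @x;  -- @y now holds '*', not '1000'
SELECT CONVERT(int, @y);
-- Conversion failed when converting the varchar value '* ' to data type int.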


Here are some known results when converting to an expression of a different data type.


What is Table Value Constructor?

I am sure that many have used this without knowing the term. However, it is always better to know the term before applying a technique, because that makes sure the implementation is done the right way.

The Table Value Constructor allows us to specify a set of row value expressions to be constructed into a table. This makes it possible to have a set of row value expressions in a single DML statement and use it in the form of a table. Yes, this is what we use with the INSERT statement when we need to insert multiple records into a table using a single INSERT statement.

The Table Value Constructor can be used with the INSERT statement, with the MERGE statement (in the USING clause), and in the definition of a derived table. Here is an example;

CREATE TABLE dbo.Customer
(
 CustomerID int identity(1,1) primary key
 , CustomerCode char(5) not null
 , Name varchar(200) not null
 , DateOfBirth date not null
);
GO

-- Inserting multiple rows with
-- Table Value Constructor
INSERT INTO dbo.Customer
 (CustomerCode, Name, DateOfBirth)
VALUES
 ('CUS01', 'Jane', '1998-01-04')
 , ('CUS02', 'Knight', '2000-05-23')
 , ('CUS03', 'Sam', '1997-2-11')

-- Using Table Value Constructor
-- with USING clause
MERGE INTO dbo.Customer t
USING (VALUES
  ('CUS01', 'Jane', '1998-02-04')
  , ('CUS04', 'Knight', '2000-05-23')) 
  AS s (CustomerCode, Name, DateOfBirth)
 ON t.CustomerCode = s.CustomerCode
WHEN MATCHED THEN
 UPDATE SET Name = s.Name, DateOfBirth = s.DateOfBirth
WHEN NOT MATCHED BY TARGET THEN
 INSERT (CustomerCode, Name, DateOfBirth)
  VALUES (s.CustomerCode, s.Name, s.DateOfBirth);

-- Using Table Value Constructor as a derived table.
SELECT Customer.CustomerCode, Customer.Name, CustomerCountry.CountryCode
FROM dbo.Customer 
 INNER JOIN (VALUES ('CUS01', 'USA'), ('CUS04', 'LK')) AS CustomerCountry (CustomerCode, CountryCode)
  ON Customer.CustomerCode = CustomerCountry.CustomerCode

SSIS - Fact Loading with Dimension Type II and Dimension Type I

Dimensional Modeling in data warehousing, using either a Star or Snowflake schema, is still the most famous structure applied and widely used, though summarized tables using in-memory and columnstore concepts are slowly taking over the traditional method. It is very common to see SSIS (SQL Server Integration Services) used for data loading, applying various patterns. Searches show different types of articles and posts, but it is rare to find a complete post that talks about loading a Fact table using SSIS, looking up Type I and Type II dimensions. That is the reason for this post: it talks about how to load a Fact table, getting keys from both Type I and Type II dimensions.

Let's make a small data warehouse with a Star Schema and populate some sample records. The following code creates a database called SalesDataWarehouse and then creates dimension tables named DimDate, DimCustomer, DimProduct, and DimEmployee. As you see, DimEmployee is a slowly changing dimension (SCD) Type II dimension.

CREATE DATABASE SalesDataWarehouse;
GO

-- Connect and create tables
USE SalesDataWarehouse;
GO

CREATE TABLE dbo.DimDate
(
 DateKey int  primary key
 , [Date] date not null
 , [Month] varchar(20) not null
 , MonthNumber smallint not null
 , QuarterName varchar(20) not null
 , QuarterNumber smallint not null
 , [Year] smallint not null
);

CREATE TABLE dbo.DimCustomer
(
 CustomerKey int identity(1,1) primary key
 , FullName varchar(100) not null
 , Town varchar(100) not null
 , SourceKey int not null
);

CREATE TABLE dbo.DimProduct
(
 ProductKey int identity(1,1) primary key
 , Product varchar(100) not null
 , Brand varchar(100) not null
 , SourceKey int not null
);

CREATE TABLE dbo.DimEmployee
(
 EmployeeKey int identity(1,1) primary key
 , Name varchar(100) not null
 , SalesTown varchar(100) not null
 , StartDate date not null
 , EndDate date  null
 , IsCurrent bit not null default (1)
 , SourceKey int not null
);

The next code is for loading some sample records. Note that DimDate is loaded from the WideWorldImportersDW database, as its date dimension table contains records for the current year. However, this can be loaded using a simple script as well.

Have a look at the records related to DimEmployee. Since it is an SCD Type II dimension, it holds history records. As you see, there are two records for James Phil, each with a validity period.

-- Inserting date records from the WideWorldImportersDW date dimension table
INSERT INTO dbo.DimDate
 (DateKey, [Date], [Month], MonthNumber, QuarterName, QuarterNumber, [Year])
SELECT CONVERT(int, CONVERT(varchar(8), [Date], 112)), [Date], [Month], [Calendar Month Number]
 , 'Quarter ' + CONVERT(varchar(2), DATEPART(Q, [Date])), DATEPART(Q, [Date]), [Calendar Year]
FROM WideWorldImportersDW.Dimension.Date;

-- Inserting records to Customer
INSERT INTO dbo.DimCustomer 
 (FullName, Town, SourceKey)
VALUES
 ('Jane Knight', ' Town 1', 100)
 , ('Jack walter', 'Town 2', 101);

-- Inserting records to Product
INSERT INTO dbo.DimProduct 
 (Product, Brand, SourceKey)
VALUES
 ('Product 1', ' Brand 1', 22)
 , ('Product 2', 'Brand 3', 23);

-- Inserting records to Employee
INSERT INTO dbo.DimEmployee
 (Name, SalesTown, StartDate, EndDate, IsCurrent, SourceKey)
VALUES
 ('Peter Kevin', 'Town 1', '2016-06-01', null, 1, 200)
 , ('James Phil', 'Town 2', '2016-06-01', '2016-06-30', 0, 201)
 , ('James Phil', 'Town 3', '2016-06-30', null, 1, 201);


The next code is for creating the Fact table. Generally, a Fact table is a partitioned table; therefore, this code creates a partition function and scheme for it. Note that the best practice is to have different filegroups for partitions, but let's just use PRIMARY (I will be discussing partition switching on fact loading in another post, extending the same code).

-- creating the partition
CREATE PARTITION FUNCTION pf_SalesDate (int)
AS RANGE RIGHT
FOR VALUES (20160101, 20160201, 20160301, 20160401, 20160501, 20160601, 20160701)
GO

-- creating the scheme
CREATE PARTITION SCHEME ps_SalesDate
AS PARTITION pf_SalesDate
TO ([PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY])
GO

CREATE TABLE FactSales
(
 DateKey int NOT NULL INDEX IX_FactSales CLUSTERED,
 ProductKey int NOT NULL,
 CustomerKey int NOT NULL,
 EmployeeKey int NOT NULL,
 SalesAmount money NOT NULL,
 SalesQuantity smallint NOT NULL
)
ON ps_SalesDate (DateKey)
GO

Let's insert some sample records into the Fact Table too.

INSERT INTO FactSales (DateKey, ProductKey, CustomerKey, EmployeeKey, SalesAmount, SalesQuantity)
VALUES
 (20160605, 1, 1, 1, 500, 2)
 , (20160615, 2, 2, 1, 500, 2)
 , (20160622, 2, 2, 2, 500, 2)
 , (20160624, 2, 1, 1, 500, 2)

Done. Now we have to create a SSIS package for loading the new records. As you noted, we have records related to June 2016, and we will be loading records related to July 2016. Let's use the following recordset for loading the Fact table.

SELECT CONVERT(date, '2016-07-01') As TransactionDate, 22 AS ProductId, 100 AS CustomerId, 200 AS EmployeeId, 500 AS SalesAmount, 2 AS SalesQuantity
UNION ALL
SELECT CONVERT(date, '2016-07-12') As TransactionDate, 23 AS ProductId, 101 AS CustomerId, 200 AS EmployeeId, 500 AS SalesAmount, 2 AS SalesQuantity
UNION ALL
SELECT CONVERT(date, '2016-07-14') As TransactionDate, 23 AS ProductId, 100 AS CustomerId, 201 AS EmployeeId, 500 AS SalesAmount, 2 AS SalesQuantity
UNION ALL
SELECT CONVERT(date, '2016-07-22') As TransactionDate, 22 AS ProductId, 100 AS CustomerId, 201 AS EmployeeId, 500 AS SalesAmount, 2 AS SalesQuantity


Let's create a SSIS package and add a connection to the data warehouse. I have a database named Sales and will be using it as the source, even though we do not actually extract records from the source. As you see, I have a Data Flow Task in the Control Flow.


Since we do not get surrogate key values from the source, we need Lookups against the dimension tables for getting the surrogate key values. As you see in the image below, I have added an OLE DB Source to connect with Sales, and four Lookups for getting surrogate key values related to Date, Customer, Employee, and Product.


Let's have a look at the source first. It is configured with the Sales database, and the query above is used for populating the required records.


There are three Type I dimensions. Let's have a look at one and see how it needs to be configured. In order to get DateKey from the DimDate dimension, we need to look up DimDate using the TransactionDate that comes from the source. We need to configure the Lookup with the DataWarehouse connection and select DimDate as the lookup table. Once the connection is set, we need to link TransactionDate coming from the source with Date in the dimension. We need to select DateKey as it is the one we need when we insert records into the Fact table. The selected DateKey is added to the flow automatically.


A similar configuration is required for DimCustomer and DimProduct for getting their surrogate key values. We need to link CustomerID of the source to SourceKey in the DimCustomer table and get the CustomerKey, and link ProductID of the source to SourceKey in the DimProduct table and get the ProductKey. However, getting EmployeeKey is different, as it is an SCD Type II dimension. Let's have a look at the configuration of the Employee Lookup.

I will take you through the steps required for configuring this. First, we need to make sure that the Cache mode is set to either Partial Cache or No Cache, because we are going to parameterize the lookup query.


Next is setting up the connection. We need to connect it to DataWarehouse and use a query for loading the dimension records. Note the way it has been written: I have used question marks for parameterizing the query, as we need to pick the right record related to the TransactionDate.


Just like with the other dimensions, we need to match two columns from the source and the dimension to find the record. However, an additional link is required for getting values for the parameters, hence the need to link TransactionDate with StartDate, though that is not the exact link we need.


You will see the reason for joining TransactionDate and StartDate in the next step. We need to adjust the lookup query now; this is done in the Advanced section. The query needs to be adjusted like below.
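Since the screenshot is not reproduced here, this is a sketch of what the adjusted query typically looks like against the DimEmployee table created above; the two question marks are the parameters that will both be mapped to TransactionDate in the next step.

SELECT EmployeeKey, SourceKey, StartDate, EndDate
FROM dbo.DimEmployee
WHERE StartDate <= ?
 AND (EndDate > ? OR EndDate IS NULL)

With this filter, the equality match on SourceKey finds the employee, and the parameters pick the version of the record that was valid on the transaction date.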


Once adjusted, we need to set the parameters like below. It allows us to select TransactionDate as a parameter value because of the join made above. Note that parameters are 0-based indexed.


All done; finally it needs the destination. Set an OLE DB Destination for loading FactSales as the last step of the Data Flow task. Once all configurations are done, you can run the package and see how FactSales has been loaded.



