MS Ignite Day #2 Recap

Day 2 was my opportunity to focus and dive into a few technologies that were really piquing my interest, namely Azure Synapse Analytics and SQL Server 2019. Ever since I heard the Synapse announcement I wanted to seek out every session on it and learn everything I possibly could. Today certainly went a long way toward helping me understand this new technology, and it also made me realize there is still so much more to learn, but there are still 3 more days left.

Everything I keep learning about Azure Synapse is certainly impressive, and I will stand by the statement that Snowflake, Google BigQuery, and Amazon Redshift had better be paying attention, because the complete integrated story, the performance benchmarks, and the cost model have closed some big gaps that previously existed with Azure Data Lake and Azure SQL DW. One of the sessions I had a chance to attend was “Democratizing the Data Lake with On-Demand Capabilities in Azure Synapse Analytics” with Josep Aguilar Saborit and Charles Feddersen, and I came away with two very important takeaways. First was the ease of querying and importing data: in a nutshell, get the data into Azure Data Lake Storage and it’s available. With that said, it seems that Microsoft and most of the industry are standardizing on Parquet as the file format of choice. If you are not familiar with Parquet, it is a columnar file format that is extremely easy to convert almost any file into, and Azure Data Factory can really help here. Parquet files are first-class citizens within Azure Synapse, and querying or importing this data is almost effortless.
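To make that concrete, here is a minimal sketch of what querying a Parquet file through the on-demand SQL endpoint can look like from PowerShell; the workspace endpoint, credentials, and storage path are placeholders, not a real environment.

    # Minimal sketch: query a Parquet file sitting in Azure Data Lake Storage
    # through the Synapse SQL on-demand endpoint (all names are placeholders).
    Import-Module SqlServer

    $query = "SELECT TOP 100 * " +
             "FROM OPENROWSET(BULK 'https://mydatalake.dfs.core.windows.net/raw/sales/2019/sales.parquet', " +
             "FORMAT = 'PARQUET') AS result"

    Invoke-Sqlcmd -ServerInstance 'myworkspace-ondemand.sql.azuresynapse.net' `
                  -Database 'master' `
                  -Username 'myuser' -Password 'myP@ssw0rd' `
                  -Query $query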

Now let’s talk a little bit about the architecture behind Synapse. While Azure Synapse can be thought of as Azure SQL DW 2.0, you need to take everything you may already know about the Azure SQL DW architecture and throw it out the window. The most important thing to remember is that the architecture behind this new service has been reinvented from the ground up with a cloud-first design. If you remember, Azure SQL DW was not a cloud-first design; it was essentially SQL Server PDW (Parallel Data Warehouse) ported into the cloud with very little architectural change.

Now let me caveat the next few statements: I still have a lot to learn about the Azure Synapse architecture, but here are a few items I have picked up on and have been very impressed by. Both the failure detection and the hotspot detection baked into the solution are quite impressive. First, understand that queries are natively treated as distributed and aligned with both hash and user partitions across the execution nodes; see the picture below (sorry, these are pictures taken during the presentations).

With failure detection being handled at the node level, any failure at a node can be automatically recovered and its work redistributed to the other active nodes. That’s right: unless for some reason you lose all of the nodes, your queries will not fail. If you are familiar with more legacy architectures, you know that this was not always the case.

Now for hotspot detection: imagine a scenario where a query task was partitioned in a way that left one node doing a skewed amount of the work. Synapse will recognize this skewed load and automatically redistribute it to other nodes that are currently underutilized. Oh, and I am not done with the amazing special sauce: if multiple nodes become overloaded and redistribution can’t be done, then additional nodes can simply be added to keep performance up.

Now let’s move on to SQL Server 2019 and the new features in Azure SQL, which I am just going to lump together. This is not a comprehensive list of all the new features, just some things I wanted to call out. Let’s start with Azure SQL serverless: if you are looking for a way to use Azure SQL without incurring a crazy cost, take a look at the serverless tier for scenarios such as Dev/Test environments (I’m sure there are other use cases, but this is what I am focusing on for now). Typically, Dev/Test environments don’t have a requirement to be up 100% of the time, so you don’t want the cost of running a database 24/7; this is where serverless comes into play, as you are only charged for the time you are actually running queries. I will go out on a limb and say this could easily be a 40-50% cost savings for existing Dev/Test environments that are running on Azure SQL Database today and incurring 24/7 costs.
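As a quick sketch of how you could stand one of these up with the Az PowerShell module; the resource group, server, and database names below are placeholders:

    # Sketch: create an Azure SQL Database in the serverless compute tier with
    # auto-pause, so a quiet Dev/Test database stops billing compute while idle.
    Connect-AzAccount

    New-AzSqlDatabase -ResourceGroupName 'rg-dev' `
                      -ServerName 'sql-dev-server' `
                      -DatabaseName 'DevTestDb' `
                      -Edition 'GeneralPurpose' `
                      -ComputeModel 'Serverless' `
                      -ComputeGeneration 'Gen5' `
                      -VCore 2 `
                      -MinimumCapacity 0.5 `
                      -AutoPauseDelayInMinutes 60   # pause after an hour of inactivity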

One more feature set I want to quickly hit on is Intelligent Query Processing. There are quite a number of areas of improvement here, so let’s just hit on a quick few. The first is memory grant feedback: if the memory grant on the initial execution was too small and caused a spill to disk, subsequent executions will have their memory grants adjusted, reducing spills and improving query execution time. Table variable deferred compilation now treats a table variable the way a temp table has always been treated, so its cardinality is no longer assumed to be a row count of 1; the actual row count is used when the statement is compiled. While I am not a huge table variable fan, if you have a workload that is very table-variable heavy, simply upgrading to SQL Server 2019 should result in performance gains.
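Keep in mind these Intelligent Query Processing behaviors light up under database compatibility level 150, so after an upgrade you still need to bump the compatibility level. A quick way to check and set it from PowerShell (server and database names are placeholders):

    # Check the current compatibility level, then move the database to 150 so
    # the Intelligent Query Processing features can kick in.
    Invoke-Sqlcmd -ServerInstance 'localhost' -Database 'master' `
        -Query "SELECT name, compatibility_level FROM sys.databases WHERE name = 'MyAppDb'"

    Invoke-Sqlcmd -ServerInstance 'localhost' -Database 'master' `
        -Query "ALTER DATABASE [MyAppDb] SET COMPATIBILITY_LEVEL = 150"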

One last note is on Azure Data Studio, as I had a chance to see Alan Yu’s session on Azure Data Studio focused on notebooks. If you are not using Azure Data Studio and notebooks, hurry up and start learning; you will thank me later. Plus, Azure Data Studio now has PowerShell notebooks and is becoming a one-stop shop for almost any database developer. Check out the Azure Data Studio November release.

Use PowerShell to automate your Azure DB creation

Why create your databases manually if you don’t have to?  If there is one thing I hate to do, it is extra work when it’s not needed.  With the magic of PowerShell and the Azure PowerShell modules you can easily automate the creation of your Azure database.  You can download the scripts I put together from here, and below is a little walk-through of the scripts.

The zip folder contains the three files you will need.

  • AzureConfig.xml (contains the username and encrypted password used to log in to Azure)
  • PasswordEncrypt.ps1 (utility to set the encrypted password in AzureConfig.xml)
  • CreateAzureDBv12.ps1 (script to create your Azure database)

So why did I create my automation this way?  Well, like I said before, I hate doing extra work, and that includes having to log in manually. But I am also security conscious, so I wanted a configuration file that holds my username and password, with the password stored as an encrypted value.

To use this, all you have to do is enter your username into the username tag in AzureConfig.xml, then launch PasswordEncrypt.ps1 and enter the password to encrypt when prompted; it will update the AzureConfig.xml file for you.  There is a default key used in PasswordEncrypt.ps1 that you are free to change, but if you do, you must change it both in PasswordEncrypt.ps1 on line 2 and in CreateAzureDBv12.ps1 on line 36.
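The actual script is in the download, but to give you an idea of the moving parts, here is a rough sketch of what that encryption step might look like; the key, file path, and XML element names here are placeholders and not necessarily how my script is laid out.

    # Rough sketch of a PasswordEncrypt.ps1-style script: prompt for a password,
    # encrypt it with a fixed AES key, and write it back into AzureConfig.xml.
    [Byte[]]$key = (1..16)   # placeholder 16-byte key; the real script keeps its default on line 2

    $secure    = Read-Host -Prompt 'Password to encrypt' -AsSecureString
    $encrypted = ConvertFrom-SecureString -SecureString $secure -Key $key

    [xml]$config = Get-Content '.\AzureConfig.xml'
    $config.config.password = $encrypted            # assumes a <config><username/><password/></config> layout
    $config.Save((Resolve-Path '.\AzureConfig.xml').Path)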

PasswordEncrypt.ps1 (screenshot)

CreateAzureDBv12.ps1 (screenshot)

And yes, as I am writing this I realize I should have put the key into the config file as well, but I will save that for the next update.

Now you are ready to use the CreateAzureDBv12 script, which has the following parameters.

  • SQLDatabaseName (the name for your database)
  • Location (the Azure location where you wish to create your database)
  • SQLServerName (the name of your SQL Server; this can be an existing or new server)
  • StartIPAddress (the starting IP address for the Azure firewall rule that allows connections to your server)
  • EndIPAddress (the ending IP address for the Azure firewall rule that allows connections to your server)

There is also a default rule created to allow Azure IP addresses.  You can comment this out if you don’t want to create it; I leave it in because almost all the databases I create are used by an Azure Web Application or Power BI.  A rough sketch of the overall flow is below.
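To tie it together, here is how the end-to-end flow can look using the current Az cmdlets; the downloadable script differs in its details, and the resource group name, key, and XML element names are placeholders.

    # Sketch of the CreateAzureDBv12 flow: log in with the stored credentials,
    # create the server, open the firewall, then create the database.
    param(
        [string]$SQLDatabaseName,
        [string]$Location,
        [string]$SQLServerName,
        [string]$StartIPAddress,
        [string]$EndIPAddress
    )

    # Rebuild the credential from AzureConfig.xml (key must match PasswordEncrypt.ps1)
    [Byte[]]$key = (1..16)
    [xml]$config = Get-Content '.\AzureConfig.xml'
    $securePass  = ConvertTo-SecureString -String $config.config.password -Key $key
    $credential  = New-Object System.Management.Automation.PSCredential ($config.config.username, $securePass)

    Connect-AzAccount -Credential $credential | Out-Null

    # Create the logical SQL server (a real script would skip this if the server already exists)
    $resourceGroup = 'rg-databases'
    New-AzSqlServer -ResourceGroupName $resourceGroup -ServerName $SQLServerName `
                    -Location $Location -SqlAdministratorCredentials $credential

    # Firewall: the caller's IP range, plus the rule that lets Azure services connect
    New-AzSqlServerFirewallRule -ResourceGroupName $resourceGroup -ServerName $SQLServerName `
                                -FirewallRuleName 'ClientRange' `
                                -StartIpAddress $StartIPAddress -EndIpAddress $EndIPAddress
    New-AzSqlServerFirewallRule -ResourceGroupName $resourceGroup -ServerName $SQLServerName `
                                -AllowAllAzureIPs      # comment out if you don't want this rule

    # Finally, the database itself
    New-AzSqlDatabase -ResourceGroupName $resourceGroup -ServerName $SQLServerName `
                      -DatabaseName $SQLDatabaseName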

Hope you enjoy these, and I’d love to hear your feedback.