What is Azure Synapse Analytics and how it differs from Data Factory

Challenges and innovations in the IT world at Advanced Technology Days

Advanced Technology Days was held in Zagreb for the 17th time! The conference has become a traditional gathering of IT enthusiasts in the SEE region, with an emphasis on new technologies and innovations in the field. This year Unitfly had two presenters: our COO talked about Azure Synapse Analytics, an Azure platform that combines enterprise data warehousing and big data analytics to provide centralized management of data lakes and warehouses. On a seemingly unrelated topic, our Software Engineer Dino Grgic presented the challenges of OCR, which you can read about here.

COO Alan Debijadi / Unitfly

We will now take an in-depth look at Azure Synapse Analytics.

Bulking up with Azure Synapse Analytics

Azure Synapse Analytics is an enterprise analytics service that offers a more efficient way to gain insights across data warehouses and big data systems. It offers the key features of multiple solutions: ETL from data warehouses, big data analytics, and reporting, as well as visualization (achieved by accessing Power BI within the service).

Difference between Synapse Analytics and Data Factory

Azure Synapse Analytics helps you turn large volumes of unstructured data into actionable insights, while Data Factory provides numerous integrations without the need to write code. The main difference between the two services is that Synapse Analytics is an analytics service, whereas Data Factory is a hybrid data integration service that simplifies ETL at scale. Data Factory integrates different data sources, while Synapse Analytics serves as a platform from which you can manage, prepare, and serve data for BI and machine learning purposes, with reporting capabilities.

Azure Data Factory offers features such as:

  • real-time integration
  • parallel processing
  • data chunker

On the other hand, Azure Synapse provides:

  • complete T-SQL-based analytics
  • deeply integrated Apache Spark
  • hybrid data integration

What does Azure Synapse Analytics do?

Ingest – all functionalities of Data Factory and more

Synapse Analytics offers all the possibilities of Data Factory such as the integration of different data sources, but with added functionalities of monitoring, management, alerting, and security in one place.

Explore and analyze – using Synapse SQL

Synapse SQL combines distributed query processing capabilities with Azure Storage to achieve high performance and scalability, offering serverless and dedicated resource models.

Serverless SQL pool

Serverless SQL pool is a query service over the data in your data lake. It enables you to access your data through these functionalities:

  • a familiar T-SQL syntax to query data in place without the need to copy or load data into a specialized store
  • integrated connectivity via the T-SQL interface that offers a wide range of business intelligence and ad-hoc querying tools, including the most popular drivers

Dedicated SQL pool (formerly SQL DW)

Dedicated SQL pool (formerly SQL DW) is a collection of analytic resources that are provisioned when using Synapse SQL. The size of a dedicated SQL pool is determined by Data Warehousing Units (DWU).

The analysis results can go to worldwide reporting databases or applications. Business analysts can then gain insights to make well-informed business decisions.

The other available services are Apache Spark and Data Explorer (still in preview).

Visualization

The main appeal of Synapse Analytics lies in the ability to do everything in one place. Thanks to the native integration with Power BI, data can be instantly visualized in the platform.

Conclusion

Azure Synapse Analytics offers a way to run the whole end-to-end process in one place, from managing and preparing data to serving it for BI and machine learning purposes. Because no additional platforms are needed to import data from different sources, it positions itself as a must-have solution for data engineers.

Create and Restore Virtual Machine Image with Azure

Here is an easy way to snapshot (“freeze”) your Azure Virtual Machine and then restore it whenever you want. This can be useful for creating virtual machines for testing purposes. With this process, you can still use your Virtual Machine image after creating the snapshot.

Create Image from Azure Virtual Machine

Prepare your Virtual Machine

This is a really important step in which you prepare your virtual machine:

  1. In the Azure portal, Connect to the virtual machine. For instructions, see How to sign into a virtual machine running Windows Server.
  2. Open a Command Prompt window as an administrator.
  3. Change the directory to %windir%\system32\sysprep and then run sysprep.exe.
  4. The System Preparation Tool dialog box appears. Do the following:
     • In System Cleanup Action, select Enter System Out-of-Box Experience (OOBE) and make sure that Generalize is checked. For more information about using Sysprep, see How to Use Sysprep: An Introduction.
     • In Shutdown Options, select Shutdown.
     • Click OK.

Sysprep shuts down the virtual machine, which changes the status of the virtual machine in the Azure portal to Stopped.
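
The dialog choices above can also be passed to Sysprep as command-line switches. The following is a minimal sketch, assuming an elevated PowerShell session inside the Windows virtual machine and the default Sysprep location:

```powershell
# Run inside the VM from an elevated PowerShell session.
# /generalize  removes machine-specific information (the Generalize checkbox)
# /oobe        starts the Out-of-Box Experience on next boot
# /shutdown    shuts the machine down when Sysprep finishes
& "$env:windir\System32\Sysprep\sysprep.exe" /generalize /oobe /shutdown
```

The effect should be the same as the dialog: the virtual machine is generalized and shut down.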

Create Snapshot

In the Azure portal, navigate to your Virtual Machine.

Then:

  1. Select Disks in the Virtual Machine menu.
  2. Select your OS disk on the right (you can also go directly to your disk from the Azure portal).
  3. Select the Create snapshot command from the action menu at the top right.
  4. Name your snapshot and select Create.

Now you have a snapshot of your virtual machine's OS disk.
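
If you prefer scripting to the portal, the snapshot can probably be created with PowerShell as well. This is a sketch under assumptions: the virtual machine uses a managed OS disk, you are already signed in with Login-AzureRmAccount, and the names (myVM in particular) are placeholders that do not come from the article. It uses the same AzureRm module as the rest of this article; the newer Az module offers the same cmdlets with an Az prefix.

```powershell
# Placeholder names (assumptions); replace with your own values.
$rgName       = "myResourceGroup"
$vmName       = "myVM"
$location     = "EastUS"
$snapshotName = "mySnapshot"

# Look up the virtual machine to find its managed OS disk.
$vm = Get-AzureRmVM -ResourceGroupName $rgName -Name $vmName

# Describe the snapshot as a copy of the OS disk.
$snapshotConfig = New-AzureRmSnapshotConfig `
    -SourceUri $vm.StorageProfile.OsDisk.ManagedDisk.Id `
    -Location $location `
    -CreateOption Copy

# Create the snapshot in the resource group.
New-AzureRmSnapshot -Snapshot $snapshotConfig -SnapshotName $snapshotName -ResourceGroupName $rgName
```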

Create Image

The best way to create an image from a snapshot is with PowerShell:

  1. Create variables

$rgName = "myResourceGroup"
$location = "EastUS"
$snapshotName = "mySnapshot"
$imageName = "myImage"

  2. Connect to Azure

Login-AzureRmAccount

  3. Get the snapshot

$snapshot = Get-AzureRmSnapshot -ResourceGroupName $rgName -SnapshotName $snapshotName

  4. Create the Image configuration

$imageConfig = New-AzureRmImageConfig -Location $location
$imageConfig = Set-AzureRmImageOsDisk -Image $imageConfig -OsState Generalized -OsType Windows -SnapshotId $snapshot.Id

  5. Create the Image

New-AzureRmImage -ImageName $imageName -ResourceGroupName $rgName -Image $imageConfig

The image named $imageName is now created and stored in Azure.

Create/Restore Virtual Machine from Image

If you want to restore an Azure Virtual Machine or create a new one from the image, select your image under All resources.

In the menu at the top right, select Create VM and follow the same instructions as when creating a new virtual machine.

That’s all. You now have your “new” virtual machine.
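
The same step can likely be scripted. The sketch below assumes an AzureRM version recent enough to support the simplified New-AzureRmVm parameter set, and that the image lives in the same resource group as the new virtual machine. All names other than myResourceGroup and myImage are hypothetical placeholders, and the cmdlet prompts for the administrator credentials of the new machine.

```powershell
# Create a new VM from the managed image created earlier.
# Network resource names are placeholders (assumptions); with the simplified
# parameter set they are created if they do not already exist.
New-AzureRmVm `
    -ResourceGroupName "myResourceGroup" `
    -Name "myVMfromImage" `
    -ImageName "myImage" `
    -Location "EastUS" `
    -VirtualNetworkName "myImageVnet" `
    -SubnetName "myImageSubnet" `
    -SecurityGroupName "myImageNSG" `
    -PublicIpAddressName "myImagePIP" `
    -OpenPorts 3389
```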

[Errors]

Azure Virtual Machine is not running / starting

If you’ve followed this article, be aware that after this process the original (generalized) virtual machine is no longer usable: it will not run or start, and you will get errors.

If you see an error like the one below, your virtual machine is not usable anymore.

Set-AzureRmVMAccessExtension : Long running operation failed with status 'Failed'.
ErrorCode: VMAgentStatusCommunicationError
ErrorMessage: VM '' has not reported status for VM agent or extensions. Please verify the VM has a running VM agent, and can establish outbound connections to Azure storage.
StartTime: 8/1/2017 12:41:40 PM
EndTime: 8/1/2017 1:07:06 PM
OperationID: 54486c40-20b6-4a97-be62-b4fcf23a68d0
Status: Failed
At line:1 char:1
+ Set-AzureRmVMAccessExtension -ResourceGroupName "" -VMName ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : CloseError: (:) [Set-AzureRmVMAccessExtension], ComputeCloudException
    + FullyQualifiedErrorId : Microsoft.Azure.Commands.Compute.SetAzureVMAccessExtensionCommand