The Cloud Way: May 2015

Thursday, 7 May 2015

Introduction to Windows Azure Cloud Services

Cloud Service: A cloud service can be thought of as a box that holds different roles, for example a web role, a worker role and so on.

If we want a mathematical representation of a cloud service, it can be written as:
Cloud service = [role 1 × 2, role 2 × 3, role 3 × 1] --------------------------- 1

where role 1, role 2 and role 3 are cloud roles in Azure and the multipliers ×2, ×3 and ×1 are the number of instances of each role.

Role Instance: A role instance is a virtual machine running an operating system such as Windows Server. Shown mathematically, a role instance can be represented as follows:

Role Instance = VM + Guest OS + {Service Bits} + {x} ------------------------------- 2

where x is the number of requests this instance can handle per second.

If we deploy the cloud service represented by equation 1 to the cloud, it will deploy six VMs: 2 instances of role 1, 3 instances of role 2 and 1 instance of role 3.

For every instance of the same role, the VMs have the same service bits.
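As a minimal sketch of equation 1, the cloud service can be modelled as a mapping from role name to instance count. The dictionary and the helper names below are hypothetical and only illustrate the counting; they are not part of any Azure API.

```python
# Hypothetical model of equation 1: role name -> number of instances.
cloud_service = {"role 1": 2, "role 2": 3, "role 3": 1}


def total_vms(service):
    """Each role instance is one VM, so the total is the sum of the counts."""
    return sum(service.values())


def expand_instances(service):
    """List every VM that a deployment of this service would create."""
    return [
        f"{role} / instance {index}"
        for role, count in service.items()
        for index in range(count)
    ]


print(total_vms(cloud_service))
for vm in expand_instances(cloud_service):
    print(vm)
```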

Why do we need more instances of our cloud service:

Each role instance can handle only x requests per second (see equation 2), so one way to handle more requests is to increase the value of x. This is referred to as scaling up, and it is done by giving the instance more CPU power, more memory and so on. But scaling up has its limits: in a hosted environment there is a fixed set of VM sizes to choose from, and the largest of them caps how far a single virtual machine can be scaled.

The other option is to scale out, in which we add more virtual machine instances, each of which can handle x requests per second.
Scale out has some advantages over scale up :

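First, scaling out is not capped by the size of a single machine; we can keep adding instances, up to whatever quota the platform allows
Second, we scale out by adding more servers, which does not affect the servers that are already working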
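The scale-out arithmetic can be sketched as follows: if one instance handles x requests per second, the number of instances needed for a target load is the target divided by x, rounded up. The function name and the example numbers are assumptions chosen for illustration, and the sketch ignores load-balancer overhead and spare capacity.

```python
import math


def instances_needed(target_rps, per_instance_rps):
    """Smallest number of instances whose combined capacity covers the target."""
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    return math.ceil(target_rps / per_instance_rps)


# Example: each instance handles x = 200 requests per second (assumed value).
print(instances_needed(950, 200))
```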

However, multiple VM instances need a load balancer in front of them to distribute the incoming requests evenly across the instances. Also, when one of the instances goes down, the other instances keep serving requests and the service remains available; in the background Azure creates a new instance to replace the one that has gone down.
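A rough sketch of this behaviour is a round-robin balancer that skips instances marked as down. This is a simplified illustration, not how the Azure load balancer is implemented; the class and method names are hypothetical.

```python
class RoundRobinBalancer:
    """Hands out instances in turn, skipping any that are marked as down."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._next = 0

    def mark_down(self, instance):
        self.healthy.discard(instance)

    def mark_up(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def route(self):
        # Try each instance at most once, starting from the current position.
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instance available")


balancer = RoundRobinBalancer(["vm0", "vm1", "vm2"])
print([balancer.route() for _ in range(4)])
balancer.mark_down("vm1")  # simulate a failed instance
print([balancer.route() for _ in range(2)])
```

With one instance down, requests keep flowing to the remaining instances, which is the reason the service stays available.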

Monday, 4 May 2015

Understanding failover in Azure

Understanding failover in Azure cloud services:

The Service Level Agreement (SLA) of Azure for cloud services is 99.95%, i.e. the compute roles hosted on Azure will have external connectivity at least 99.95% of the time. However, for this SLA figure to apply, Windows Azure requires its users to maintain at least 2 instances of the worker/web role.

This article tries to explain this requirement and give an insight into the architecture of compute roles on Windows Azure.
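To get a feel for what 99.95% means, the figure can be converted into a downtime budget. The sketch below assumes a 30-day month purely for illustration; the actual SLA terms define the measurement period and what counts as downtime. Under that assumption, 99.95% leaves about 21.6 minutes of downtime per month.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime permitted by an availability percentage.

    The 30-day default is an assumption for illustration only.
    """
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


print(round(allowed_downtime_minutes(99.95), 1))
```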

First let’s get familiar with the terms used in this article:

(i) Azure Fabric Controller is responsible for provisioning and monitoring the condition of the Azure compute instances. The Fabric Controller checks the status of the hardware and software of the host and guest machine instances. When it detects a failure, it enforces SLAs by automatically relocating the VM instances.

The Fabric Controller uses dedicated resources that are separate from Azure hosted applications. It has 100% uptime because it serves as the nucleus of the Azure system. It monitors and manages role instances across fault domains.

The Azure Fabric Controller operates as the kernel and framework for Windows Azure, as it manages all the nodes, which includes servers, load balancers, switches, routers, etc.

(ii) Fault Domain is a physical unit of failure, and is closely related to the physical infrastructure in the data centers. The scope of a single point of failure is referred to as a fault domain. In Windows Azure, a rack of servers in a datacenter can be considered a fault domain.

The Windows Azure Fabric Controller is responsible for deploying the instances of your application in different fault domains. Right now the Fabric Controller makes sure that your application (given two or more instances) uses at least 2 (two) fault domains; depending on capacity and VM availability, it may be spread across more than that. As of now developers have no direct control over how many fault domains their application will use.

(iii) Upgrade Domain defines the logical unit of deployment of an application. This concept lets Microsoft Azure control how the different instances of a compute role are upgraded, so that the service stays highly available while an application is being upgraded. To achieve this, while the instances in one upgrade domain are being upgraded, the instances in the other upgrade domains keep running. When the upgrade of all the instances in this domain is complete, the same process is repeated for the instances in the next upgrade domain, and so on until all instances of our application are upgraded.

By default an application has 5 upgrade domains, but a user can change this value to a maximum of 20 upgrade domains. Also, the upgrade domains are always spread across fault domains, and instances of a web/worker role are allocated to upgrade domains in a circular manner.
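The circular allocation can be sketched as a simple modulo assignment: instance i goes to upgrade domain i mod N. This is an illustrative model based on the description above, not the Fabric Controller's actual placement logic; the function name is hypothetical, and the lower bound of 1 is an assumption.

```python
def assign_upgrade_domains(instance_count, upgrade_domains=5):
    """Return the upgrade domain index for each instance, assigned in a circle."""
    # Default of 5 and maximum of 20 come from the text; the minimum is assumed.
    if not 1 <= upgrade_domains <= 20:
        raise ValueError("upgrade_domains must be between 1 and 20")
    return [index % upgrade_domains for index in range(instance_count)]


# Seven instances over the default five upgrade domains.
for instance, domain in enumerate(assign_upgrade_domains(7)):
    print(f"instance {instance} -> upgrade domain {domain}")
```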

Introduction:

The compute role (web/worker) instances in Windows Azure are stateless. This means that in case of a fault or during an upgrade, a compute role’s instance might stop on one physical server, and when this happens another server will pick it up. This is done by the Fabric Controller, which Windows Azure uses to manage its systems.

However, the storage in Azure does maintain state. In fact the data in our Azure storage is replicated at three places in a single datacenter. So if the code of the compute role application is written in a stateless manner, i.e. the state is saved using Azure storage services, then when an instance fails the Fabric Controller will start another instance at some other location, and that instance will just restart the interrupted transaction and keep working.
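The idea of keeping state outside the instance can be sketched with a small simulation. The `DurableStore` class below is a hypothetical in-memory stand-in for a storage service, and `process_orders` is an invented worker; neither uses a real Azure SDK. Progress is written to the store after every item, so a replacement instance can resume from where the failed one stopped.

```python
class DurableStore:
    """In-memory stand-in for an external storage service (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


def process_orders(store, orders, fail_at=None):
    """Sum the orders, recording progress in the store after each one.

    fail_at simulates the instance crashing just before that index.
    A real worker would also need the two writes below to be atomic.
    """
    start = store.get("next_index", 0)
    for index in range(start, len(orders)):
        if index == fail_at:
            raise RuntimeError("instance failed")
        store.put("total", store.get("total", 0) + orders[index])
        store.put("next_index", index + 1)
    return store.get("total", 0)


store = DurableStore()
orders = [10, 20, 30, 40]
try:
    process_orders(store, orders, fail_at=2)  # first instance crashes
except RuntimeError:
    pass
print(process_orders(store, orders))  # a new instance resumes the work
```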

Case: Only one instance of the compute role is deployed on Azure

In this case the instance of the compute role is present in only one fault domain and one upgrade domain. If that fault domain fails, the service goes down; when the Fabric Controller detects the failure it creates a new instance at some other location, but until that happens our service will be down.

The same applies when we update the application and redeploy it, because during an upgrade all instances of the application within an upgrade domain are stopped, upgraded and restarted. Since in this case we have only one upgrade domain, containing the only instance of our application, the application will be down for as long as the upgrade takes.

This is the reason why the SLA for cloud services promises 99.95% uptime only when two or more instances of the application are deployed.

Case: Two or more instances of the compute role are deployed on Azure

When two or more instances of the compute role are deployed, the Fabric Controller ensures that these instances run in two or more fault domains. Whenever an instance goes down there will be at least one other instance still running, and when the Fabric Controller notices that one of the instances of the compute role has gone down, it starts another instance on some other physical server. In this manner, the failure of one instance does not bring down the whole application.

During an upgrade of the application, only the instances of one upgrade domain are stopped and upgraded at a time; during this period the other instances keep running, so the application does not go through any downtime.
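The difference between the two cases can be illustrated with a small simulation of a rolling upgrade. It assumes the circular (modulo) assignment of instances to upgrade domains and takes one domain offline at a time; the function name is hypothetical and the model ignores fault-domain failures.

```python
def instances_available_during_upgrade(instance_count, upgrade_domains=5):
    """For each upgrade step, count the instances that keep running."""
    placement = [index % upgrade_domains for index in range(instance_count)]
    availability = []
    # Upgrade one domain at a time; everything outside it stays up.
    for domain in sorted(set(placement)):
        still_running = sum(1 for d in placement if d != domain)
        availability.append(still_running)
    return availability


print(instances_available_during_upgrade(1))  # single-instance case
print(instances_available_during_upgrade(2))  # two-instance case
```

With a single instance the only upgrade step leaves zero instances running, while with two instances at least one is running at every step.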

References:

Understanding Azure Services

Understanding how virtual machines work in Azure:

Azure Virtual Machines offers Infrastructure as a Service (IaaS), i.e. using this service we can create a virtual machine on demand, either from a standard image or from one we supply.

Structure:

The VHDs that back Azure virtual machines are stored in Azure storage blobs; this provides redundancy, which ensures that VMs keep working even in case of hardware or disk failures
Azure offers a gallery of VHDs (also called images); we can choose a VHD from the gallery of stock VHDs or upload our own custom VHD
We can also copy a VHD out of Azure and then run it locally

Scenarios to use VMs:

Creating a Dev/Test environment
Moving applications (with few dependencies on the local system) from a local datacenter to Azure
Extending our local datacenter with Azure virtual machines by using an Azure virtual network

Understanding how websites work in Azure

Scenarios to use websites:

We prefer Azure Websites over hosting a website on a VM or using a cloud service when we just want to handle the website-related tasks and leave the administration of the website to Azure
For development, Azure Websites supports .NET, PHP, Node.js, Java and Python, along with SQL Database and MySQL databases.

Understanding Azure cloud services:

This service is mostly used when we want to build scalable, reliable and low-admin apps. It is also referred to as Platform as a Service (PaaS).
When running a cloud service, the code of our application (built in C#, Java, PHP, Node.js or something else) runs on virtual machines (instances) running a version of Windows Server.
These virtual machines are managed by Azure itself, and for this reason our application should not maintain state in the web or worker role instance; instead it should use Azure data management services for this purpose.
When creating a cloud service we have two roles to choose from, both of which are hosted on Windows Server: web role and worker role.
The main difference between the two is that an instance of a web role runs IIS while an instance of a worker role does not.
In order to scale an application out or back in, we can request Azure to create more instances of either role or shut down existing instances.
As with VMs, we are charged only while the web/worker role is running.
This service should be preferred when we need more control over the platform than Azure Websites provides, but do not need full control over the operating system.

*The compute role instances in Windows Azure are stateless. To upgrade hardware, because of a fault or for any other reason, a compute role instance might stop on one physical server and another will pick it up. This is done by the Fabric Controller, whose task is to check the status of the hardware and software of the machine instances. In case of failure, it enforces SLAs by automatically relocating the VM instances.

Understanding data management in Azure:

Azure provides many different ways to store data. With any of the options provided by Azure, 3 copies of the data are kept within a datacenter, and 6 copies in case we keep our data geo-redundant.

In a virtual machine:
We can run relational systems as well as NoSQL technologies in a VM created using Azure Virtual Machines; however, with this approach we need to handle the administration of that database system ourselves. In the other cases below, Azure does this work for its users.

Azure SQL Database:

Azure SQL Database provides all of the key features of a relational database management system, including atomic transactions, concurrent data access by multiple users with data integrity, ANSI SQL queries, and a familiar programming model. Like SQL Server, SQL Database can be accessed using Entity Framework, ADO.NET, JDBC, and other familiar data access technologies.

Tables:

This follows a NoSQL approach and provides key-value based storage
Azure Tables let an application store properties of various types, such as strings, integers, and dates. An application can then retrieve a group of properties by providing a unique key for that group
It is also referred to as Azure Table storage or storage tables.
Tables are usually less expensive to use than SQL Database's relational storage.
Tables are best used when we need fast access to typed data and do not need to run complex SQL queries on this data.
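The key-value access pattern can be illustrated with a small in-memory model. In Azure Tables the unique key of an entity is made up of a partition key and a row key; the `InMemoryTable` class below is a hypothetical stand-in that mimics only that lookup and does not use the Azure storage SDK.

```python
from datetime import date


class InMemoryTable:
    """Toy key-value table: (partition key, row key) -> typed properties."""

    def __init__(self):
        self._entities = {}

    def insert(self, partition_key, row_key, **properties):
        self._entities[(partition_key, row_key)] = dict(properties)

    def get(self, partition_key, row_key):
        # Direct lookup by key; there is no SQL-style query engine here.
        return self._entities[(partition_key, row_key)]


table = InMemoryTable()
table.insert("customers", "42", name="Asha", visits=3, joined=date(2015, 5, 7))
print(table.get("customers", "42"))
```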

Blobs:

These are used to store unstructured binary data
Blobs are inexpensive, and a single blob can store up to 1 TB of data.
They are useful when an application wants to store video, massive files and other binary information.

Azure File services:

This service allows us to use the Server Message Block (SMB) protocol to share files between VMs.
In addition, the files can be accessed at the same time via a REST interface, which allows us to access the shares from on-premises when we also set up a virtual network.

Understanding Azure Marketplace:
Azure Marketplace lets us find and buy Azure applications and commercial datasets and use them as part of our own Azure applications. Potential customers can search for Azure applications that meet their needs. Customers can search for commercial datasets as well, including demographic data, financial data, geographic data, and more. When they find something they like, they can access it either from the vendor, directly through the Marketplace or Store web locations, or in some cases from the Management Portal.

References:
(i) http://azure.microsoft.com/en-in/documentation/articles/fundamentals-introduction-to-azure/