Tuesday, August 16, 2016

Azure Resource Manager Policy management: Use cases and scenarios

I was recently asked to configure certain restrictions in our enterprise Azure subscription, both to improve utilization of the subscription and to restrict our members from launching large VMs that could quickly burn through our Azure credits. The requirements are below.

  • Members must stick to the East US and Central US data centers for launching resources; all other data centers must be denied.
  • Members must tag their resources according to the guidelines below (click here to learn about tagging in Azure):
    • Environment : Production, Testing, or Development
    • CostCenter : Finance, Engineering, Accounting, etc.
    • Customer : Contoso, Fabrikam, etc.
  • Only selected Web Hosting Plan SKUs are allowed for test and development purposes.
To define such custom restrictions, we must use Azure Resource Manager Policies, an advanced feature that lets us define our own policies, which are evaluated against various fields of an incoming request and then allow or deny its execution. Every policy definition follows the if/then structure shown below.

{
  "if" : {
       <condition>
  },
  "then" : {
      "effect" : "deny | audit | append"
  }
}
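As a sketch of how the first requirement could be expressed, the following policy definition denies any resource creation outside East US and Central US. It follows the ARM policy JSON syntax; the exact location names and the scope you assign the policy to would need to match your own subscription.

{
  "if": {
    "not": {
      "field": "location",
      "in": ["eastus", "centralus"]
    }
  },
  "then": {
    "effect": "deny"
  }
}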

Tuesday, May 31, 2016

Azure App Services : Chennai Azure User Group Meetup (May - 2016)

Microsoft launched Azure App Services during Microsoft Build 2015. Azure App Services is essentially a combination of 4 different but unique offerings: Web Apps + Mobile Apps + API Apps + Logic Apps. Logic Apps was a brand-new launch, but the rest of the services existed earlier as individual offerings such as Azure Websites and Azure Mobile Services.

The re-branding and grouping activity was very much logical, as all the services complement each other very well. Developers who were using Azure Websites and Web APIs had a lot of questions about the consolidation and wanted to understand the newer capabilities that Azure App Service plans bring to the table.



Based on requests from meetup members, I hosted an exclusive meetup to address some of the common questions around Azure App Services and the best practices to follow while using them.

Agenda
  1. Understanding Azure App Services
  2. Introduction to Web Apps + Mobile Apps + API Apps + Logic Apps
  3. Similarities & Differences between Web Apps + Mobile Apps + API Apps 
  4. Understanding App Services Plans and its pricing
  5. Understanding App Service Environments
  6. Best Practices for using App Service Plans


Note: The topic of the next Chennai Azure User Group meetup is going to be Virtual Networks, so join and learn Azure!


Tuesday, April 19, 2016

Summary of Chennai Azure Global Bootcamp 2016

The Chennai Global Azure Bootcamp 2016, held on April 16th, 2016, went really well, with lots of technical sessions and hands-on labs. We received more than 1000 registrations from people with a variety of backgrounds: practicing professionals, a few from an IT Pro background, and lots of students aspiring to become cloud professionals soon.

The event started at 10.00 am and the keynote was delivered by Balaji Uppili, Chief Delivery Officer, GAVS. He gave a lightning talk on the current cloud landscape and how Azure is leading the game, and also touched briefly upon how developers must equip themselves to stay relevant in the ever-changing IT industry. Soon after the keynote, presenters started offering sessions in 2 tracks: 1. Developer and 2. IT Pro.

We had 10 fantastic speakers from Chennai and Bangalore who delivered sessions on various topics including Azure App Service, open source, and big data. I delivered a session on Azure Data Lake Analytics and Data Lake Store, services which are currently in preview; the attendees were able to recognize the value of these services and how developers can leverage them to exploit big data.

Event Highlights


  • Total Registrations for the event : 1000+ 
  • Attendees joined the session : 450
  • % of Dev/IT Pro : 75%
  • % of Student partners : 25%
  • Total no of Technical Tracks : 10
  • Hands on Lab conducted : 1

Tracks



09:00 - 09:30  Reception and registration
09:30 - 10:00  Chennai Global Azure Bootcamp: Keynote by Balaji (common session)

Time           Dev Track                                                    IT Pro Track
10:15 - 11:00  Building Mobile Apps with Visual Studio and Azure Services   Power of Open Source on Azure (Ruby on Rails)
11:00 - 11:15  Café break (common)
11:30 - 12:15  Deep into Azure Machine Learning & Predictive Analytics      Running Linux workloads on Azure
12:30 - 13:15  DevOps in Azure                                              Kick off your project for success!
13:15 - 14:00  Lunch (common)
14:00 - 14:45  Deep dive into Azure SQL                                     Introduction to Data Lake Analytics & Store
15:00 - 15:45  IoT on Azure (Part 1)                                        Azure AD with Office 365
15:45 - 16:00  Café break (common)
16:15 - 17:00  IoT on Azure (Part 2)                                        Azure Hands-on Labs
17:00 - 17:30  Azure Global Bootcamp 2016 closing


Local Sponsors

GAVS Technologies and 8KMiles Software Services were the key local sponsors who helped us execute such a large event in Chennai. In fact, this is one of the largest community-driven Azure events conducted in the city in the recent past. I'm very thankful to all the sponsors for helping us execute the event.



Conclusion

Overall, the event went on really well, and a lot of great content was delivered by our awesome experts and MVP speakers. Thanks to all the presenters as well as all attendees! Without you, there wouldn’t have been an event. Also, special thanks goes to the Global Azure Bootcamp team for organizing the global event and getting together the prizes from all the global sponsors.

I had a great time presenting and helping people out with the hands-on labs launching Windows and Linux VMs towards the end of the day. It was a great learning and fun experience. I'm planning to help coordinate the Chennai Global Azure Bootcamp event next year as well, god willing.

Until next year, adios amigos!

P.S. Please feel free to contact me with any question about Azure or general feedback on the event. You can either submit them to me in the comments on this post, via Twitter @ilyas_tweets or drop an email to me @ ilyas@bornoncloud.com.

Thanks!

Thursday, April 14, 2016

Chennai Azure Global Bootcamp 2016 – By Chennai Azure User Group

Azure Global Bootcamp, as the name suggests, is a global event organized by communities all around the globe at various places. If you have a few minutes to learn the history of this event, I recommend you head over to http://global.azurebootcamp.net/. The motto of the event is to bootstrap developers and IT Pros into the Azure cloud and teach the various services offered by Microsoft Azure. The event happens in various cities, and this year I grabbed the opportunity to host it and present a session on Azure Data Lake Analytics and Data Lake Store.

Chennai, compared to the rest of the cities in India, is somewhat offbeat when it comes to technical events and communities. It doesn't mean that Chennaiites are not enthusiastic about technology or programming, just that the activity is lower than in other Indian cities.

The Azure Bootcamp has been conducted in Chennai ever since the event started in 2013, but for various reasons it has been held at universities and colleges, mainly because of the overhead involved in arranging venues, food, and the other expenses of conducting a community event with no source of funding!

For the year 2016, I took it as a challenge and wanted to conduct the event mainly targeting the IT community rather than the student community. It's not that we don't want students to participate, but the content demanded IT professionals more than students. That said, we did receive reservations from students of various universities and colleges who wanted to participate and learn Azure.

For the 2016 Chennai edition we received more than 1000 registrations, thanks to the work put into marketing the event and reaching out to all my connects through various social media platforms. I wanted to make it a larger event with multiple tracks on different topics to cater to various kinds of audiences, including developers, IT Pros, big data enthusiasts, architects, and mid-level managers.


As this was a full-day event with multiple sessions and such a crowd, I wanted a venue that could accommodate that size, but unfortunately hiring an auditorium to host such a large event was outside our budget. So I reached out to many IT & ITES services partners in and around Chennai, especially around the IT Expressway where the majority of the IT community is located. Many IT partners came forward, but unfortunately some of them had smaller venues or were far from the location we were looking for.

At last, GAVS Technologies, a leading IT & ITES provider located at Sholinganallur, OMR, Chennai, sponsored the venue, food, and other key assistance for us to host the bootcamp at their premises. Similarly, the audio/video, mic, and speaker arrangements were sponsored by 8KMiles, my current employer; thanks to my boss Mr. Harish Ganesh, the CTO of the company.

Getting speakers for a community event like this is always challenging; however, the Chennai Microsoft User Group and the right connects in the community helped me find experts in various areas to offer sessions at the Azure Bootcamp!




We are very excited to host the event and are looking forward to seeing you there!



Tuesday, April 12, 2016

Sorting Operations in U-SQL

Just like in SQL, we can sort rowsets in U-SQL using the ORDER BY operator; the ASC and DESC keywords control whether the sort is ascending or descending, respectively. In a U-SQL SELECT statement you can order the output rows if and only if you also specify how many rows to fetch back; failing to use the FETCH FIRST clause will result in an error. The reason for this behavior is that when U-SQL scripts run in the distributed Azure Data Lake Analytics environment, sorting is a very expensive operation, so the ability to order rows has been deliberately restricted to the cases shown below.
DECLARE @Out_File string = "/Samples/Output.tsv";

@searchlog = 
EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

// List the sessions in increasing order of Duration
@output = 
    SELECT * 
    FROM @searchlog
    ORDER BY Duration ASC
    FETCH FIRST 3 ROWS;

OUTPUT @output
    TO @Out_File
    USING Outputters.Tsv();

Alternatively, you can specify the ordering in an OUTPUT statement, which does not require a FETCH FIRST clause.
@output = 
    SELECT * 
    FROM @searchlog;

OUTPUT @output 
    TO @"/Samples/Output/SearchLog_output.tsv"
    ORDER BY Duration ASC
    USING Outputters.Tsv();

Declaring Parameters in U-SQL

Declaring reusable parameters is always handy when writing lengthy U-SQL scripts. U-SQL supports the DECLARE statement for this purpose; you can use DECLARE to store values of native U-SQL data types.
DECLARE @Out_File string = "/Output/sample.txt";

@searchlog = 
EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

@output = 
    SELECT 
        Region,  
         (Duration>300 ? "long" :"short") AS DwellType
    FROM @searchlog;

OUTPUT @output
    TO @Out_File
    USING Outputters.Tsv();
U-SQL Data Types

DECLARE can be used with all the U-SQL native data types.

String types
DECLARE @text1 string = "Hello World";
DECLARE @text2 string = @"Hello World";
DECLARE @text3 char = 'a';
You can use other declared variables in a DECLARE statement to combine strings.
DECLARE @text4 string = "BEGIN" + @text1 + "END";
DECLARE @text5 string = string.Format("BEGIN{0}END", @text1);
Numeric signed
DECLARE @numeric1 sbyte = 0;
DECLARE @numeric2 short = 1;
DECLARE @numeric3 int = 2;
DECLARE @numeric4 long = 3L;
DECLARE @numeric5 float = 4.0f;
DECLARE @numeric6 double = 5.0;
Numeric unsigned
DECLARE @unumeric1 byte = 0;
DECLARE @unumeric2 ushort = 1;
DECLARE @unumeric3 uint = 2;
DECLARE @unumeric4 ulong = 3L;
Miscellaneous
DECLARE @misc1 bool = true;
DECLARE @misc2 Guid = System.Guid.Parse("BEF7A4E8-F583-4804-9711-7E608215EBA6");
DECLARE @misc4 byte [] = new byte[] { 0, 1, 2, 3, 4};
DateTime
DECLARE @d1 DateTime = System.DateTime.Parse("1979/03/31");
DECLARE @d2 DateTime = DateTime.Now;
Usage of user-defined functions in DECLARE is currently not supported. For example, the below statement will result in an error:
DECLARE @myName string = MyHelper.GetMyName();
Similarly, values cannot be assigned from a rowset. For example, the below statement will also result in an error:
DECLARE @maxval int = SELECT MAX(value) FROM data;
The above statements are not supported by U-SQL; an alternative is to compute a single-row rowset with a single column and then JOIN that tiny rowset with the other rowset to get what you need.
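A minimal sketch of that workaround, assuming the @searchlog rowset used throughout these posts (the names @maxval and MaxDuration are illustrative):

// Compute the maximum as a one-row, one-column rowset.
@maxval =
    SELECT MAX(Duration) AS MaxDuration
    FROM @searchlog;

// CROSS JOIN the one-row rowset against the main rowset
// so every row can see the computed maximum.
@output =
    SELECT s.Region, s.Duration, m.MaxDuration
    FROM @searchlog AS s
    CROSS JOIN @maxval AS m;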

Introduction to Azure API Management

APIs have been around for many decades and are an integral part of many applications, including desktop, web, and mobile solutions. Recently, the momentum behind APIs has increased tremendously: businesses want to integrate their applications with those of their vendors and partners, web companies want to enable newer business models, and governments want to expose their data to citizen developers to build new insights and solutions for the betterment of society. A secondary reason for the enormous adoption of API applications is technical evolution such as serverless application development, microservices, and containerization.

API applications are just like any other web applications, only without a user interface, and the requirements of an API solution are very similar to those of web applications: e.g. authentication and authorization, discoverability, ease of access, logging, monitoring, and cross-platform support. Any typical solution has two key parts: 1. functional components and 2. non-functional components. Functional components are the core of the problem you are solving; for example, if you are a developer at a pharma company, you might be building an API app that exposes the list of medicines your company has developed. Examples of non-functional components would be an authentication module that secures your API app against malicious users or systems, or analytics for the API.

Businesses have to make a significant investment in building the non-functional components required for a successful API strategy. Similarly, if an enterprise built internal APIs long ago using legacy technologies and is looking to expose those APIs to the external world, doing so would be extremely difficult. So Microsoft has come up with a new offering to bridge the gap between API developers and consumers.

Building Blocks of Azure API Management


Azure API Management provides powerful capabilities for businesses to offload the complex management and delivery of APIs. Businesses can bring APIs built in any language and hosted anywhere to Azure and let API Management take care of the complete lifecycle of the API. Below are the key benefits of the building blocks of Azure API Management.


Benefits of the API Gateway


  • Accepts API requests from clients and routes them to APIs hosted in Azure or anywhere else, including on-premises
  • Validates requests for security
  • Applies quotas and rate limits per client or customer
  • Converts the API response to the desired format; for example, if your API outputs XML and the desired format is JSON, Azure API Management does the conversion on the fly (see the policy sketch after this list)
  • Caches responses
  • Logging and monitoring
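As a sketch of how the quota and format-conversion capabilities above are configured, API Management expresses them as XML policy elements applied to the inbound and outbound pipelines; the attribute values here are illustrative:

<policies>
  <inbound>
    <!-- Allow at most 100 calls per subscription per 60-second window. -->
    <rate-limit calls="100" renewal-period="60" />
  </inbound>
  <outbound>
    <!-- Convert the backend's XML response body to JSON on the fly. -->
    <xml-to-json kind="direct" apply="always" consider-accept-header="false" />
  </outbound>
</policies>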

 Publisher Portal


  • Define or import API schemas.
  • Create packages or subscriptions by bundling one or more APIs.
  • Set up configuration policies, such as restricting the number of API requests from one customer in a given time window.
  • Get insights from analytics.
  • User management.

 Developer Portal

  • Automatic generation and delivery of API documentation using Swagger/Swashbuckle.
  • A web console for trying APIs on the fly.
  • Customer onboarding and security token issuance.
  • Access to analytics on customer API usage.

Monday, April 11, 2016

EXTRACTing data from Multiple Files in U-SQL

We've seen the EXTRACT operator used to extract data from a single file, but what if we want to extract data from multiple files, in the same or different locations? First, the single-file example:

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();
The below example shows how to read data from multiple files in different locations.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM 
        @"/Samples/Data/SearchLog1.tsv",
        @"/Samples/Data/SearchLog2.tsv",
        @"/Samples/Data/Innerfolder/SearchLog3.tsv"
    USING Extractors.Tsv();

Important pointers to note: 

  • Notice that the name of each input file must be specified explicitly.
  • If one of the specified input files does not exist, the script will fail to compile.
  • Though not obvious in this example, the files don't have to share any common naming or location; they do, however, need to match the schema given in the EXTRACT clause (for inputs with different schemas, see the sketch below).
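If your inputs don't share a schema, one alternative is to EXTRACT each file separately and combine the rowsets with UNION ALL. A sketch, where the file names and the two-column schema are illustrative and each file is assumed to actually contain exactly the declared columns:

@log1 =
    EXTRACT UserId int, Region string
    FROM @"/Samples/Data/SearchLog1.tsv"
    USING Extractors.Tsv();

@log2 =
    EXTRACT UserId int, Region string
    FROM @"/Samples/Data/SearchLog2.tsv"
    USING Extractors.Tsv();

// Combine the rowsets; both sides must have matching column names and types.
@all =
    SELECT * FROM @log1
    UNION ALL
    SELECT * FROM @log2;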

Transforming Rows into Columns in U-SQL

The PIVOT operator in SQL helps us turn the unique values of a specified column from single or multiple rows into multiple columns in the output. Additionally, we can perform aggregations, wherever required, on column values that are needed in the final output.

If you want similar functionality in a U-SQL query, you have to use the CROSS APPLY operator, which essentially does a similar job.

@output = 
    SELECT 
        Region, 
        Urls
    FROM @searchlog;

@output =
    SELECT 
        Region, 
        SqlArray.Create(Urls.Split(';')) AS UrlTokens
    FROM @output;

@output = 
    SELECT 
        Region, 
        Token AS Url
    FROM  @output
    CROSS APPLY EXPLODE (UrlTokens) AS r(Token);

Cross Apply Output

Region    Url
"en-us"   "www.nachos.com"
"en-us"   "www.wikipedia.com"
"en-gb"   "skiresorts.com"
"en-gb"   "ski-europe.com"
"en-gb"   "www.travelersdigest.com/ski_resorts.htm"
"en-gb"   "mayoclinic.com/health"
"en-gb"   "webmd.com/a-to-z-guides"
"en-gb"   "mybrokenleg.com"
"en-gb"   "wikipedia.com/Bone_fracture"
"en-ca"   "southparkstudios.com"
"en-ca"   "wikipedia.org/wiki/Sout_Park"
"en-ca"   "imdb.com/title/tt0121955"
"en-ca"   "simon.com/mall"
"en-us"   "cosmos.com"
"en-us"   "wikipedia.org/wiki/Cosmos:_A_Personal_Voyage"
"en-us"   "hulu.com/cosmos"
"en-fr"   "microsoft.com"
"en-fr"   "wikipedia.org/wiki/Microsoft"
"en-fr"   "xbox.com"
"en-us"   "www.amazon.com"
"en-us"   "reviews.cnet.com/wireless-headphones"
"en-us"   "store.apple.com"
"en-us"   "dominos.com"
"en-us"   "wikipedia.org/wiki/Domino's_Pizza"
"en-us"   "facebook.com/dominos"

Converting columns into Rows


Similarly, if we want to collapse a set of rows back into a single column, reverting to the previous state (refer to the situation above), we have to use the ARRAY_AGG operator. In the example below, the rowset is taken apart by CROSS APPLY and then reconstructed via ARRAY_AGG.

@a = SELECT Region, Urls FROM @searchlog;

@b =
    SELECT 
        Region, 
        SqlArray.Create(Urls.Split(';')) AS UrlTokens
    FROM @a;
@c = 
    SELECT 
        Region, 
        Token AS Url
    FROM  @b
    CROSS APPLY EXPLODE (UrlTokens) AS r(Token);
@d =
    SELECT Region,
           string.Join(";", ARRAY_AGG(Url).ToArray()) AS Urls
    FROM @c
    GROUP BY Region;

Grouping and Aggregation in U-SQL

The GROUP BY operation helps collapse multiple rows into a single row based on a criterion. When using GROUP BY, the remaining fields in the output rowset must be aggregated into some meaningful value (or discarded if no possible or meaningful aggregation can be done). Let's apply GROUP BY in an example.
@searchlog = 
EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

// find the total Duration by Region
@output = 
    SELECT 
        Region, 
        SUM(Duration) AS TotalDuration
    FROM @searchlog
    GROUP BY Region; 
Now let's look at the common use of the HAVING operator. We can use HAVING to restrict the output rowset to those rows that have aggregate values we are interested in. For example, let's find all the Regions where the total dwell time is above some value.
@searchlog = 
EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

// find all the Regions where the total dwell time is > 200
@output = 
    SELECT 
        Region, 
        SUM(Duration) AS TotalDuration
    FROM @searchlog
    GROUP BY Region
    HAVING TotalDuration > 200;
Samples of the COUNT function (counting the total number of sessions, then counting sessions by Region):
// Count the number of total sessions.
@output =
    SELECT 
        COUNT() AS NumSessions
    FROM @searchlog;
 

@output = 
    SELECT 
        COUNT() AS NumSessions, 
        Region
    FROM @searchlog
    GROUP BY Region;
Count the number of total sessions by Region and include the total, average, maximum, and minimum duration for each Region.
@output = 
    SELECT 
        COUNT() AS NumSessions, 
        Region, 
        SUM(Duration) AS TotalDuration, 
        AVG(Duration) AS AvgDwellTime, 
        MAX(Duration) AS MaxDuration, 
        MIN(Duration) AS MinDuration
    FROM @searchlog
    GROUP BY Region;
Note: Dealing with aggregations by data type. If the input data type is double, the output data types of the aggregate functions are as follows:
SUM(double) -> double
COUNT(double) -> long(int64)
If the input data type is numeric (long/int/short/byte, etc.):
SUM(type) -> long(int64)
COUNT(type) -> long(int64)
Tip: Aggregate functions can ONLY be used in a SELECT clause, not in an EXTRACT statement.

Usage of the DISTINCT qualifier with aggregates: every aggregate function can take a DISTINCT qualifier. For example:
COUNT(DISTINCT x) 
DISTINCT also works for user-defined aggregates.
MyAggregator(DISTINCT x,y,z)
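For instance, a minimal runnable sketch using COUNT(DISTINCT ...) against the @searchlog rowset from the earlier examples (the choice of the Query column is illustrative):

// Count how many distinct queries were issued per Region.
@output =
    SELECT Region,
           COUNT(DISTINCT Query) AS NumUniqueQueries
    FROM @searchlog
    GROUP BY Region;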

Sunday, April 10, 2016

Ternary Operators in U-SQL

In C#, the ternary (conditional) operator tests a condition and produces one of two values depending on the result of the comparison. In U-SQL the same effect is achieved using the conditional operator as shown below, not an IF statement, since IF is not supported in U-SQL expressions. Let's apply this to our query sample:
@searchlog = 
EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

@output = 
    SELECT 
        Region,  
         (Duration>300 ? "long" :"short") AS DwellType
    FROM @searchlog;

OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv();
Understand the importance of using parentheses with the conditional operator: always enclose the conditional operator in parentheses because of C# precedence rules. For example, take the below statement without parentheses:
c1 ? true : false AND c2 ? true : false
Without parentheses it will be evaluated as below, which is not what we want:
c1 ? true : (false AND c2 ? true : false)
To prevent this from happening, get into the habit of always using parentheses around the conditional operator, as shown below:
(c1 ? true : false) AND (c2 ? true : false)
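Putting the parenthesized form into a complete query, here is a sketch against the same @searchlog rowset (the specific conditions are illustrative):

@output = 
    SELECT Start, Region, Duration
    FROM @searchlog
    // Each conditional is parenthesized so that AND combines
    // their results, rather than swallowing the second conditional.
    WHERE (Duration > 300 ? true : false) AND (Region == "en-us" ? true : false);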

How to use C# Code behind in U-SQL

The role of the C# code-behind file is very similar to code-behind in an ASP.NET Web Forms application. The purpose of the code-behind file in a U-SQL project is to store C# code that can be used in your U-SQL script. Every U-SQL script in Visual Studio has a code-behind file associated with it. In the case of the Script.usql file we've created so far, the code-behind looks like this.


Clicking on the code behind file will show the following code.

using Microsoft.Analytics.Interfaces;
using Microsoft.Analytics.Types.Sql;
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

namespace MyUSQLApp
{
}

Now let's add a simple helper class to the code-behind so that we can call it from the script:

using Microsoft.Analytics.Interfaces;
using Microsoft.Analytics.Types.Sql;
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

namespace MyUSQLApp
{
    public static class Helpers
    {
        public static Double AddDuration(int duration)
        {
            return duration + 1.0;
        }
    }
}


To use the helper method in the script, follow the below code snippet. Note that we have to use the fully qualified name when referring to the function in the script; failing to do so will cause an error.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

@output = 
    SELECT 
        Region,
        Duration,
        MyUSQLApp.Helpers.AddDuration(Duration) AS Duration2
    FROM @searchlog;

OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv();

U-SQL Data Transformations - Part2

Create new columns

In certain scenarios, we may need to create a new column as part of a data transformation. The sample below shows how to add a new column to a rowset.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();


@output = 
    SELECT 
        Start,
        Region,
        Duration + 5.0 AS Duration2
    FROM @searchlog;

OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv(); 

Find Distinct records

The below snippet shows the application of DISTINCT to fetch only unique values.
@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();


@output = 
    SELECT DISTINCT Region FROM @searchlog;

OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv(); 

Using the IN Operator in U-SQL

Just like in SQL, the IN operator filters rows whose column value matches one of a list of values, as shown below.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();


@output = 
    SELECT Region FROM @searchlog WHERE Region IN ("USA", "India");

OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv(); 

Numbering Rows

We can use the ROW_NUMBER windowing function to assign row numbers to a dataset.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();


@output = 
    SELECT ROW_NUMBER() OVER (ORDER BY Start) AS RowNumber, 
           Start, 
           Region 
    FROM @searchlog;


OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv(); 

Saturday, April 9, 2016

Introduction to Azure Big Data Analytics Suite

At Build 2015, Microsoft announced Azure Data Lake Analytics, Data Lake Store, Data Factory, and Data Catalog to address the analytics and big data challenges of enterprises. Many wondered why "yet another big data service" when there are plenty of big data technologies and products in the market, including open source and premium offerings. Microsoft also has HDInsight in its marketplace, a cloud-hosted Apache Hadoop distribution powered by Hortonworks.

The primary reason is the complexity and learning curve involved in getting started with the existing big data tools and technologies. Below is a partial list of the most popular Apache open source technologies (for the complete list of Apache big data open source projects, click here), each of which addresses certain parts of the greater big data problem; on top of these, there are plenty of industry-specific, vertical, and domain-based big data solutions from various vendors.


At first sight this is of course confusing, especially if you are coming from the Microsoft world; you might simply skip the whole thing and drop the big data project.


The second major reason is that Microsoft didn't have a first-party big data solution to cater to its enterprise audience. As the provider of the massively successful SQL Server, Microsoft doesn't want to be left behind on big data offerings, and this makes a lot of sense as well.

Finally, Microsoft doesn't want its Visual Studio and .NET/C#/SQL developers to look to another vendor or learn new technologies to deliver big data projects. The Azure Data Lake suite is the answer for bridging the gap between the .NET world and big data.

However, the real challenge for Microsoft was deciding between building a big data product from scratch and using established open source projects. Microsoft took the latter route, embracing open source projects to power the Azure big data suite.


The Data Lake Analytics service is a new distributed analytics service built on Apache YARN that scales dynamically, so you can focus on your business goals rather than on distributed infrastructure, unlike IaaS-based solutions. Azure Data Lake Analytics natively works with data sources that live in the cloud, including Azure SQL Database, SQL Server on a VM, Azure Blobs, and Azure SQL Data Warehouse, as well as data from elsewhere; similarly, Data Lake Store, built on top of WebHDFS, can ingest and receive data from any HDFS endpoint, whether it's coming from Azure or anywhere else.



Is the Azure Data Lake Query engine powered by Hadoop?

No, Microsoft didn't build the Data Lake suite on top of Hadoop MapReduce, partly because of the overhead involved in integrating it with the vast set of existing Microsoft products. Secondly, Microsoft wanted its developers to use the same SQL + C# even for big data projects rather than burden them with another language exclusively for big data. Both C# and SQL score well in their own areas and have advantages and disadvantages when it comes to big data processing, so Microsoft combined the power of C# expressions with the elegance of SQL and formulated Unified SQL (U-SQL), an all-new query language for big data processing.


Welcome to U-SQL

U-SQL is the big data query language from Microsoft for its Data Lake suite. If you are a SQL developer or DBA, you'll notice that U-SQL queries look a lot like SQL queries. Many fundamental concepts and syntactic expressions will be very familiar to those with a background in SQL.

However, U-SQL is a unique language, and some of the expectations you might have from the SQL world do not carry over into U-SQL. For example, select in SQL is not case sensitive, but it must be written as SELECT (in upper case) in U-SQL.

In the upcoming blog posts, I'll dig deep into various U-SQL concepts with examples. Please come back for more on Azure big data analytics and U-SQL.

Good Luck!

U-SQL Data Transformations - Part1

In the last post, we looked at the detailed step-by-step process of getting started with the U-SQL Hello World project. Now that we are comfortable with basic querying and rowset creation, let's look at how we can selectively filter a dataset by applying conditions.

Selection and Filtering

The SELECT clause, as in SQL, allows you to pick the columns of interest.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

@output = SELECT Start, Region, Duration
    FROM @searchlog;

OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv();


Notice that we are creating another rowset called @output by selecting only the Start, Region, and Duration columns from @searchlog. Similarly, we can create as many filtered rowsets as we want from other rowsets.

Now let's apply a WHERE clause to further filter the @output rowset, as we would in SQL. Logical operators such as AND and OR are supported.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

@output = 
    SELECT Start, Region, Duration
    FROM @searchlog
    WHERE Region == "en-gb";

OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv();



Note: Notice the use of "==" in the example above instead of "=". This is because expressions in the SELECT statement are C# expressions.
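Because these are C# expressions, you can also call .NET methods directly in the predicate. A minimal sketch, assuming the same @searchlog rowset (the specific filter is illustrative):

@output = 
    SELECT Start, Region, Duration
    FROM @searchlog
    // String.StartsWith and the C# && operator work here because
    // the WHERE predicate is an ordinary C# boolean expression.
    WHERE Region.StartsWith("en") && Duration > 60;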

Refining RowSets

A rowset can be created from itself; this allows you to refine a rowset one statement at a time, which may be useful for debugging or simply to make your script easier to read.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

@output = 
    SELECT Start, Region, Duration
    FROM @searchlog;

@output = 
    SELECT *
    FROM @output
    WHERE Region == "en-gb";

OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv();

Filter Rowsets by Dates

Rowsets can also be filtered by date ranges, using ordinary C# DateTime comparisons in the WHERE clause.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

@output = 
    SELECT Start, Region, Duration
    FROM @searchlog;

@output = 
    SELECT *
    FROM @output
    WHERE  Start >= DateTime.Parse("2012/03/01") AND Start <= DateTime.Parse("2012/03/30");

OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv();

Logical Operators

The AND/OR/NOT operators can be combined with parentheses to form more complex expressions.

@searchlog = 
    EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

@output = 
    SELECT Start, Region, Duration
    FROM @searchlog 
    WHERE (Duration >= 2*60 AND Duration <= 5*60) OR NOT (Region == "en-gb");


OUTPUT @output
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv(); 

In addition to AND/OR/NOT, U-SQL also supports the C#-style logical operators. These are equivalent to their AND/OR/NOT counterparts; however, || and && perform short-circuiting, which AND and OR do not.


Operation                           C#
Logical AND with short-circuiting   (a && b && c)
Logical OR with short-circuiting    (a || b || c)
Logical NOT                         !a
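For illustration, the earlier logical-operators query could be rewritten with the short-circuiting forms; a sketch against the same @searchlog rowset:

@output = 
    SELECT Start, Region, Duration
    FROM @searchlog 
    // Same filter as the AND/OR/NOT version above, but && and ||
    // stop evaluating as soon as the overall result is known.
    WHERE (Duration >= 2*60 && Duration <= 5*60) || !(Region == "en-gb");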





Writing your first U-SQL Hello World


To get started with writing your first U-SQL query, open Visual Studio, click on New Project, and choose U-SQL (if you don't see the U-SQL project type, you have to install the U-SQL Tools for Visual Studio; click here for instructions).

Create new U-SQL Project

Sample data and script


Sample datasets are made available in the Azure Data Lake Analytics account to help us get started with ADL. To get them, click on Explore Sample Jobs and then Copy Samples; this process copies the entire set of sample datasets to our ADL Store account. Alternatively, you can download the sample datasets and keep them in your local environment.

Copy Sample Datasets

Sample Data sets in your Azure Data Analytics Store

Now that we have the data, let's go ahead and execute the sample script below. This script is a very basic query; it essentially just copies the contents of one TSV file into another TSV file.

@searchlog =
    EXTRACT UserId          int,
            Start           DateTime,
            Region          string,
            Query           string,
            Duration        int,
            Urls            string,
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

OUTPUT @searchlog
    TO @"/Samples/Output/SearchLog_output.tsv"
    USING Outputters.Tsv();

Sample Script in the Solution Explorer

Compile and run the script

Click the Start button; this will compile the script and run it against the Azure ADL account. After successful execution, the output file will be created in the Output folder inside the Samples folder in the Azure Data Lake Store.

Key aspects to note

Extracting data from a File

You may have noticed that there is no SELECT statement in this first U-SQL script; that's because the input data has no pre-defined schema, so the EXTRACT statement is used to read data from the file. Once the data is extracted and created as a rowset (i.e., schematized), we can apply SELECT on top of it. When using EXTRACT you must specify both the column name and the column type.
To perform the extraction, we must use an extractor, a built-in function of Azure ADL that knows how to parse the file and produce rowsets. U-SQL comes with several such built-in extractors for reading data from files; the one used in this script is Extractors.Tsv ("TSV" stands for tab-separated values).

Rowsets

The rows derived from the EXTRACT statement are captured in a rowset named @searchlog. A rowset is essentially an intermediate table structure created by U-SQL to pass data internally during script execution. A rowset can be further transformed into one or more other rowsets.

Outputters

The @searchlog rowset is written to an external output file using the OUTPUT statement. A built-in outputter called Outputters.Tsv is available to write a file in TSV format; similarly, we have Outputters.Csv for outputting the data in CSV format.

Query Execution Environment

An ADL query can be executed locally or in the Azure ADL account, and the input and output data can be placed locally as well as in the Azure ADL account. However, when you execute a query locally, it may not store the output in the defined location (a known bug); instead it will store it under the path below.

“C:\Users\UserAccount<>\AppData\Local\USQLDataRoot”

Note: To check the exact path, follow the below screenshots.

Step 1


Step 2

Key Things Learned

Always use “/” as the path Separator

Stick with the "/" separator, as it works both locally and for remote execution, unlike "\", which only works for local execution.

U-SQL Keywords are Upper-Case

U-SQL is case sensitive; for example, writing extract instead of EXTRACT will cause a compilation error, so make sure to write keywords such as EXTRACT, FROM, and USING in upper case.

About Header rows

Azure ADL doesn't understand a first row containing column headers, so make sure the input data doesn't contain a header row. Similarly, the output data produced by ADL will not contain a column header. However, this feature is expected to be added by the time ADL Analytics and Store become generally available.

Friday, March 25, 2016

Azure Solution Architects Meetup Group Organizes Azure Global bootcamp @ Chennai, OMR on 16th April 2016

Azure Global Bootcamp is a global event conducted around the globe on the same day, at the same time. The Chennai Microsoft User Group has been conducting the event ever since it was first introduced. Unlike other events, this one is completely organized and run by independent communities, not by Microsoft.

A little bit of history about the past Bootcamp events conducted in Chennai: this is the 4th year of the event. For the last 3 years we conducted it at college venues for students, but this year and going forward we have decided to aim the event at practicing Dev/IT professionals; of course students are allowed, but the primary audience is working IT professionals.

More information about the event is below.




Global Azure Bootcamp main website:
http://global.azurebootcamp.net

Global Azure Bootcamp Chennai website:
http://chennaiazureglobalbootcamp2016.azurewebsites.net/

Event Registration Website:

http://www.eventbrite.com/e/azure-global-bootcamp-2016-chennai-tickets-22141543954?aff=Bornoncloud.com

Chennai Azure Solution Architects Meetup Group Website

http://www.meetup.com/Chennai-Microsoft-Azure-Solution-Architects-Group/

Event Location:
    Gavs Technologies Private Limited,
    No.11, Old Mahabalipuram Rd, Sholinganallur, Chennai, Tamil Nadu 600119

Schedule:
    Date: April 16, 2016
    Time: 9:30 AM to 5:00 PM
    Registration start: 9:00 AM

Agenda:

Please refer to the following link to view the agenda for this event.
http://chennaiazureglobalbootcamp2016.azurewebsites.net/

Parking, Food and Drink:

Parking for the event is free in the office parking lots on campus. We have made arrangements with the parking authority to allow free parking; no parking citations will be issued on the day of the event.
We will have food and drink.

What You Need:

To participate in the demos and labs, you will need the following:
    *  Microsoft Azure Subscription. Don't have one? Get a free trial subscription here.
    *  Visual Studio 2013/2015. Free versions are available here if you don't already have it.
    *  Azure .NET SDK for Visual Studio 2013/15. Also free and available here.

What To Expect:

The event is organized with a developer audience in mind and will comprise presentations, demos, and labs covering various Azure and open source topics.
This is a community driven event to help developer community in Chennai to learn and take advantage of the Azure cloud platform.

While I don't have all the details yet, we will be passing out swag and prizes.

Wednesday, March 9, 2016

How and where do I get started with Azure?

It's been quite some time since I blogged; I've been off track and away from blogging for more than 6 months now. But I've always wanted to pen down something really valuable and help fellow members learn something new.

Alright, back to the topic. Of late I have been receiving many requests through email and LinkedIn from technical folks on how to get started with Microsoft Azure. I used to help them with study materials, e-books, tutorials, and video links, but later I thought it would be a good idea to consolidate all the materials I had shared individually in one place so that everybody can benefit from them.

Although it's good to understand and learn all the services offered by Azure, which I personally advise as well, it is sometimes overwhelming, and there is a good chance you will get lost somewhere. I advise my fellow friends to begin with what they are already comfortable with. For instance, if you are a developer, begin with Azure Web Apps, Mobile Apps, Service Bus, etc.; similarly, IT professionals should begin with Azure Virtual Machines, networking, understanding VPNs, etc. Starting with what you already know can accelerate the learning path and help you quickly understand the services.

Now let's look at the getting-started materials that can help you begin your Azure cloud journey. By no means are they restricted to starters; they can also help advanced users of Azure at times.

Note: Now that you have decided to get started with Azure, it's always better to get a free trial account (credit card required just for validation) if you or your organization doesn't have an Azure account already. If you are not part of any organization and are trying to self-learn, do not hesitate to create the account, because you will never learn until you try things out yourself.

When you sign up for Azure, you get $200 worth of Azure credit valid for 1 month, which is more than enough to get started with the key services. However, please be cautious about the services you try; for example, if you boot up a D14-series Windows virtual machine, your credit will run out in no time, so please be careful. Always shut down the services you were trying after your practice session and make sure nothing is left running. Enough with the advice, let's get started!

Ebooks

Free Ebooks from Microsoft Azure press can be a good starting point for getting started with Azure.
There are a number of ebooks on various topics, but below are the few I strongly recommend.

1. Microsoft Azure Essentials: Fundamentals of Azure

This ebook explores six foundational features of the Microsoft Azure platform, along with insights on getting started with Azure, management tools, and common business scenarios.

2. Microsoft Azure Essentials: Azure Web Apps for Developers

 This book provides information you can use to start building web applications using Azure Web Apps. It will guide you through development, deployment, and configuration tasks that are common for today’s developer building cloud applications.

3. Introducing Azure for IT Professionals

This ebook describes most Azure services in some detail, with screenshots used to demonstrate some of the multitude of capabilities of the platform.

You can find other free e-books here and paid books with discounts here.

Azure Documentation

Almost all the services are covered in the Azure documentation section here. The documentation is updated almost daily with new updates and enhancements. The good thing is that the content is managed through GitHub and Microsoft welcomes contributions; an editorial team reviews all changes.

https://azure.microsoft.com/en-in/documentation/


MSDN Self Learning (Curated Training Path)

MSDN (Microsoft Developer Network) subscribers have access to curated training videos and free certifications (not professional ones). Check with your organization's IT team or whoever engages with Microsoft on such activities. If your organization doesn't have a subscription, talk to them and request one; if it already has one, you can associate yourself with the subscription and leverage the benefits.


Video Tutorials

There are several video training courses and tutorials available for learning Azure. The two best sources for video tutorials are:


Blogs to Follow:

There are plenty of active Azure experts who consistently blog on various topics; out of them, I personally follow:

Azure Blog(Official): https://azure.microsoft.com/en-us/blog/
Scott Guthrie: https://weblogs.asp.net/scottgu
Scott Hanselman: http://www.hanselman.com/blog/
Keith Mayer: http://blogs.msdn.com/b/keithmayer/
Chris Pietschmann: http://buildazure.com/
Michael Collier: https://michaelcollier.wordpress.com/
Mike Mckeown: http://www.michaelmckeown.com/

Hope this helps, and I wish you good luck getting started with Microsoft Azure.