Tuesday, April 12, 2016

Declaring Parameters in U-SQL

Declaring re-usable parameters is always handy when writing a lengthy U-SQL Scripts. U-SQL also supports using DECLARE statement while writing queries. You can use the DECLARE instruction to save values of Native U-SQL datatypes.
DECLARE Out_File string = "/Output/sample.txt";

@searchlog = 
EXTRACT UserId          int, 
            Start           DateTime, 
            Region          string, 
            Query           string, 
            Duration        int, 
            Urls            string, 
            ClickedUrls     string
    FROM @"/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

@output = 
         (Duration>300 ? "long" :"short") AS DwellType
    FROM @searchlog;

OUTPUT @output
    TO @Out_File
    USING Outputters.Tsv();
U-SQL Data Types DECLARE can be used with all the U-SQL native datatypes. String Type
DECLARE @text1 string = "Hello World";
DECLARE @text2 string = @"Hello World";
DECLARE @text3 char = 'a';
You can use other variable declarations in a #DECLARE statement to combine strings.
DECLARE @text4 string = "BEGIN" + @text1 + "END";
DECLARE @text5 string = string.Format("BEGIN{0}END", @text1);
Numeric signed
DECLARE @numeric1 sbyte = 0;
DECLARE @numeric2 short = 1;
DECLARE @numeric3 int = 2;
DECLARE @numeric4 long = 3L;
DECLARE @numeric5 float = 4.0f;
DECLARE @numeric6 double = 5.0;
Numeric unsigned
DECLARE @unumeric1 byte = 0;
DECLARE @unumeric2 ushort = 1;
DECLARE @unumeric3 uint = 2;
DECLARE @unumeric4 ulong = 3L;
DECLARE @misc1 bool = true;
DECLARE @misc2 Guid = System.Guid.Parse("BEF7A4E8-F583-4804-9711-7E608215EBA6");
DECLARE @misc4 byte [] = new byte[] { 0, 1, 2, 3, 4};
DECLARE @d1 DateTime = System.DateTime.Parse("1979/03/31");
DECLARE @d2 DateTime = DateTime.Now;
Usage of user definied function in DECLARE is currently not supported. For example, you cannot use the below statement which will result in error
DECLARE myName string = MyHelper.GetMyName();
Similalry Values cannot be assigned from a RowSet. For example, you cannot use the below statement which will result in error
DECLARE @maxval int = SELECT MAX(value) FROM data;
The above statements are not explicitly supported by U-SQL, however an alternative is to get a single-row rowset with a single column and then JOIN that tiny rowset other rowset to get what you need.