Writing your first
U-SQL Hello World
To get started with writing your first U-SQL query, open
visual studio and click on new project à
and click on U-SQL (If you don’t see the U-SQL project, you have to install
U-SQL tools for Visual Studio, click here
for instructions)
![]() |
Create new U-SQL Project |
Sample data and script
The sample datasets are available in the Azure Data lake analytics account for us to get started with ADL. To get the sample datasets, click on Explore Sample Jobs and click on copy samples, this process will copy the entire dataset samples to our ADL store account. Additionally you can also download the sample datasets and keep it in your local environment.
![]() |
Copy Sample Datasets |
![]() |
Sample Data sets in your Azure Data Analytics Store |
Now that we have the data, let’s go ahead and execute the
below sample script. This script is a very basic query and it essentially just
copies the contents of one TSV file into another TSV file.
@searchlog = EXTRACT UserId int, Start DateTime, Region string, Query string, Duration int, Urls string, ClickedUrls string FROM @"/Samples/Data/SearchLog.tsv" USING Extractors.Tsv(); OUTPUT @searchlog TO @"/Samples/Output/SearchLog_output.tsv" USING Outputters.Tsv();
![]() |
Sample Script in the Solution Explorer |
Click on start button, this will compile the script and run
it against the Azure ADL account. After successful completion of the execution,
the output file will be created in the output folder inside the samples folder
in the Azure Data Analytics Store.
Key aspects to note
Extracting data from a File
You must have noticed that there is no SELECT statement in the first U-SQL, it’s because the input data has
no pre-defined schema and hence EXTRACT
statement is used to read data from files. However once the data is extracted
and created as a Rowset (Schematized), we can apply SELECT on top of it. While using
EXTRACT you must specify both the column name and column type.
To perform the EXTRACT
function, we must use an Extractor –
it’s an inbuilt function of Azure ADL that knows how to parse the file and
produce rowsets. U-SQL comes with several such built-in extractors for reading
data from files. The one used in this script is Extractors.Tsv. (“TSV” stands for tab-separated value).
Rowsets
The rows derived from the EXTRACT statement are defined by a
RowSet named @searchlog. Rowsets – is essentially an intermediate table
structure created by U-SQL to internally pass data during script execution. The rowsets could be further transformed
to one or more rowsets.
Outputters
The @searchlog
RowSet is written to an externa output file using the OUTPUT statement. A built-in Outputter called Outputters.Tsv is available to write a file in TSV format,
similarly we also have Outputters.Csv for outputting the data into CSV format.
Query Execution
Environment
The ADL query can be executed locally or in the Azure ADL
Account. The input data and output data can be placed locally as well as in the
Azure ADL account, but while you execute the query locally it may not store (Known
bug) the output in the defined location and instead it will store in the below path.
“C:\Users\UserAccount<>\AppData\Local\USQLDataRoot”
Note: To check the exact path, follow the below screenshots.
![]() |
Step -1 |
![]() |
Step 2 |
Key Things Learned
Always use “/” as the path Separator
Likewise, stick the "/" separator as it works both
locally and for remote execution unlike "\" which only works on local
execution.
U-SQL Keywords are Upper-Case
U-SQL is very case sensitive, for e.g. writing extract instead of EXTRACT will cause a compilation error, so make sure to use EXTRACT, FROM, and USING in upper
case.
About Header rows
Azure ADL doesn’t understand the first row that contains
column headers, so make sure the input data doesn’t contain any header column.
Similarly the output data produced by ADL will also not contain any column header.
However, this feature is expected to be
added while the ADL and store generally available.