Glass Jar Limited
Technology Overview

SAS

SAS provides many features for manipulating and processing data. Much of its functionality is similar to that offered by traditional database products (e.g. Oracle), and many organisations use it primarily in that way. However, SAS has two particular strengths. Firstly, the range of built-in functions is enormous and offers huge potential for fine-grained manipulation of numeric and textual data. Secondly, it has a wide range of statistical procedures; just about everything you could conceive of is covered. Some of these procedures come with the base package, while others require you to purchase additional modules.

If SAS has a weakness, it is its rather idiosyncratic syntax, particularly that of the data step, the core means of manipulating data. There are some tutorials available which will help get you started. Perseverance with the data step is rewarded by its extreme flexibility: with it, the user can iterate over a dataset while simultaneously reading from multiple sources and writing to multiple output datasets.

The SQL programmer may be pleased to learn that the SAS procedure "proc sql" gives access to a largely recognisable dialect of SQL. SAS SQL is enhanced by the fact that most SAS functions can be used inline within SQL code. A brief description of the key parts of the SAS system is given below.

SQL

SQL is a set-oriented language used to manipulate and manage data stored in relational database systems (e.g. Oracle, Teradata, SQL Server). Its style is very different from that of the procedural languages more often encountered (e.g. Basic). There is no flow control: the user specifies how a collection of data is to be treated, and the database system (via its optimizer) works out the best method to achieve this. Look at our SQL tutorials here. Key components of the SQL language are

A relatively recent extension to the SQL language is the ability to write "recursive" queries. With a little ingenuity, these open up the possibility of using SQL to create simulations (i.e. where the output of one period depends on the output of previous periods), something that was not previously possible without resorting to cursors.
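
As a rough illustration of the idea, the sketch below runs a recursive query in which each period's value is calculated from the previous period's value. It is only an example under stated assumptions: it uses SQLite (through Python's standard sqlite3 module) rather than any of the database systems named above, and the model itself (a quantity that starts at 100, doubles each period and then loses 10 units) and the names sim, period and quantity are hypothetical. Recursive syntax varies slightly between database systems, so the query may need adjusting elsewhere.

```python
import sqlite3

# Hypothetical model, for illustration only: each period the quantity
# doubles and then 10 units are removed.
SIMULATION_SQL = """
WITH RECURSIVE sim(period, quantity) AS (
    -- Anchor row: the starting state at period 0
    SELECT 0, 100
    UNION ALL
    -- Recursive step: each new row is computed from the previous row
    SELECT period + 1, quantity * 2 - 10
    FROM sim
    WHERE period < 3   -- stop once period 3 has been produced
)
SELECT period, quantity FROM sim ORDER BY period
"""


def run_simulation():
    """Run the recursive query in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(SIMULATION_SQL).fetchall()
    finally:
        conn.close()


rows = run_simulation()
for period, quantity in rows:
    print(period, quantity)
```

The query has two parts: an anchor row that supplies the starting state, and a recursive step that derives each new period from the one before it. The WHERE clause in the recursive step is what ends the simulation; without a stopping condition of this kind the query would not terminate.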