
5. Extracting information from tables

5.1 General

The following examples show how the system is typically used, which involves a combination of operators. Using the table named 'sample', the command:

      nosql column NAME AMT TYP < sample | nosql sort TYP | nosql print
    

gives the output:

                NAME     AMT  TYP
                ------  ----  ---
                Bush     133  A
                Hansen    23  A
                Perry    244  B
                Hart    1111  D
                Holmes  1111  D
                Jones     77  X
    

Note that the column COUNT was excluded by the 'column' operator, and that the order of the selected columns was changed from that in the input table. To save the output in a file, redirect STDOUT, for example:

      nosql column ... < sample | nosql sort TYP | nosql print > file.out
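
To make the selection and reordering done by 'column' more concrete, here is a rough sketch in plain awk. It is not the NoSQL implementation. It assumes the table is a tab-separated file whose first line holds the column names, and 'pick_columns' is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# pick_columns NAME...  (hypothetical stand-in for the 'column' operator)
# Assumption: input is tab-separated and line 1 holds the column names.
pick_columns() {
    awk -F '\t' -v OFS='\t' -v want="$*" '
        BEGIN { n = split(want, names, " ") }
        NR == 1 {
            # Map each column name to its field position.
            for (i = 1; i <= NF; i++) pos[$i] = i
            for (j = 1; j <= n; j++)
                if (!(names[j] in pos)) {
                    print "unknown column: " names[j] > "/dev/stderr"
                    exit 1
                }
        }
        {
            # Emit the requested fields in the requested order.
            line = $(pos[names[1]])
            for (j = 2; j <= n; j++) line = line OFS $(pos[names[j]])
            print line
        }
    '
}
```

Called as 'pick_columns NAME AMT TYP < sample', it would drop every column not named and emit the rest in the order given on the command line, which is the behaviour described above.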
    

An example using the operator 'row' on the table 'sample' is:

      nosql row 'AMT < 900' < sample | nosql column NAME AMT TYP |\
            nosql sort NAME | nosql print
    

Note that the row selection expression is enclosed in single quotes to prevent the shell from interpreting it. The "\" character at the end of the first line tells the UNIX shell that the command continues on the next line. Here we select some rows using 'row', select some columns using 'column', sort the result with 'sort', and print it with 'print'. The output is:

                NAME    AMT  TYP
                ------  ---  ---
                Bush    133  A
                Hansen   23  A
                Jones    77  X
                Perry   244  B
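
The quoting and line-continuation rules used in the command above can be tried without NoSQL installed. The sketch below uses 'show_args', a hypothetical helper that prints each argument it receives, to show what the shell actually passes to a command.

```shell
#!/bin/sh
# show_args is a hypothetical helper: it prints each argument it
# receives on its own line, in brackets.
show_args() {
    printf '[%s]\n' "$@"
}

# Quoted: the whole expression reaches the command as one argument.
show_args 'AMT < 900'

# Unquoted, the shell would treat "<" as input redirection and try to
# open a file named 900; the command would see only "AMT":
#   show_args AMT < 900

# A trailing backslash joins two physical lines into one command.
show_args one \
          two
```

The first call prints '[AMT < 900]' on a single line, and the last call prints '[one]' and '[two]', showing that the continued line is read as a single command with two arguments.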
    

5.2 A real world problem

The following shell script shows how the NoSQL operators and other UNIX utilities can be fitted together to solve a real world problem. The problem was to find out whether the rows in a large table were unique over four columns. Since 'summ' reports uniqueness for a single column only, we need to construct a temporary table in which the four columns are combined into one.

To illustrate the solution on a small table, the script below works on a table like 'sample' but with some rows repeated. First the script selects four columns (using 'column') and adds a dummy column named 'uniq' (using 'addcol'). It then puts the combined values of the four columns into the dummy column (using 'compute'). Next it examines the values of the dummy column 'uniq' for uniqueness (using 'summ') and then uses the UNIX command 'egrep' to show only the lines of interest, i.e. those lines whose count starts with something other than a '1'.


      nosql column < table  Unit  Day  Time  MSN |\
        nosql addcol uniq | nosql compute 'uniq=Unit"."Day"."Time"."MSN' |\
        nosql summ -cuu uniq | egrep -v '^  *1'

    

The output was like the following:


      Rows: 9
      Unique values for uniq: 6
             2 Jones77X77
             3 Perry77B244

    

meaning that there were (in this example) two rows that had duplicates over the four columns with one set of values, and three rows that had duplicates with another set of values.
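
For comparison, the same idea (build one combined key per row, count the keys, report those seen more than once) can be sketched with plain awk and sort. This is not the NoSQL pipeline, only an illustration of the technique. It assumes a tab-separated table whose first two lines are the column names and a dashed rule, and whose first four fields are the columns of interest. 'dup_keys' is a hypothetical helper name.

```shell
#!/bin/sh
# dup_keys [FILE]  (hypothetical helper, plain awk rather than NoSQL)
# Assumptions: tab-separated input, two header lines (names and dashes),
# and the four columns to check are fields 1 to 4.
dup_keys() {
    awk -F '\t' '
        NR <= 2 { next }                        # skip names and dashed rule
        { count[$1 "." $2 "." $3 "." $4]++ }    # combined key, joined with "."
        END {
            # Report only the keys that occur more than once.
            for (k in count)
                if (count[k] > 1)
                    printf "%8d %s\n", count[k], k
        }
    ' "$@" | sort
}
```

Each output line is a count followed by the combined key. If nothing is printed, every row is unique over the four columns.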

