FDL is a functional database language, which extends the functional data model to computational completeness. The functions are the database and update is by modification of the function definitions. A function is defined by its type declaration and a set of equations specifying its value for the various possible values of its argument(s). There is no notion of a separate database schema although the user may choose so to regard the type and function declarations. However type and function declarations may be introduced or deleted at any time, subject only to minimal constraints to ensure that the database (i.e the function definitons) remain well defined. There are four primitive data types, string, integer, boolean, and abstract entity, and the usual tuple and list constructors which may be arbitrarily nested.
FDL is implemented over a persistent semantic free software triple store in which all information is held. This triple store (or software machine) has a small set of primitive instructions and for sets of triples retrieval is lazy. Deriving from the persistence of the triple store FDL provides complete persistence for all type and function declarations and for the function defining equations.
FDL could thus be regarded simply as a persistent functional programming language with type extensions to give it a database flavour. However this would be to neglect the importance of its semantic data model orientation and its consequent ability to support object oriented database formalisms as a higher level view.
The essential characteristics of any database management system must include:
It is through the associated database programming language that these features are made available. Other characteristics essential to a database system (as opposed to a more simple file manager or repository) such as transaction management, recovery, and managing the multi-user environment do not influence the nature of the database programming language to the same extent and are not discussed here.
In section 2 we give a brief history of FDL to early 1993; in section 3 we give a tutorial introduction to the language by way of a simple exampleand in 4 we discuss the data model which FDL implements.
FDL was developed in the context of TriStarp (Triple Store application research project) at Birkbeck College which was initiated by the author of this chapter with support from IBM Hursley in October 1984. A description of the first phase of this project can be found in King et al [KIN90]. FDL was first presented at the INFORSID Conference at La Rochelle in 1988 [KIN88] and further discussed at BNCOD [POU 88] and at EDBT90 [POU 90]. FDL forms the subject of the PhD thesis of Alexandra Poulovasslis [POU 89] and the persistent triple store over which it is implemented is the subject of the PhD thesis of Mir Derakhshan [DER 89] both of whose contributions to TriStarp have been outstanding. FDL has been under further continuous development by the TriStarp group since these various publications and the implementation improved and stabilised for application experimentation with real life examples. The description and discussions in this chapter relate specifically to FDL release 1.2, the version current in early 1993. Further releases are planned during 1994 and 1995.
We introduce the language by means of a toy example. Suppose we wish to create a database with information about employees and are working interactively at a simple terminal. The create_db facility (see section 3.?) enables us to establish and name an "empty" FDL database. In response to the operating system prompt we could then initate FDL and open that database:
FDL staffdb
and enter some information:
employee :: nonlex;
salary == integer;
name:employee -> string;
earns:employee -> salary;
age:employee -> integer;
bonus:employee -> integer;
These statements having respectively declared a new abstract entity subtype, employee, introduced salary as a type alias and declared four functions, name, earns, age, and bonus.
More information might now be entered:
create employee $e;
name $e <= "bill";
earns $e <= 15000;
age $e <= 30;
the first statement of which brings into existence a new abstract entity of type employee and assigns it as the value of the temporary global variable $e, such global variables being available implicitly to the user during an active session, but having no persistence between or existence outside such sessions.
The following three statements are definitions (equations) relating to three of the functions we have declared, the first saying that name for that abstract entity currently the value of $e has value "bill" etc. If we now continue:
create employee $e;
name $e <= "mary";
earns $e <= 18000;
age $e <= 33;
bonus x <= * 0.05 earns x;
we have entered similar information to that previously on a further employee but also added an equation for bonus stating that for any employee x, bonus is calculated as 5% of salary, operators in FDL being written prefix.
We could now query the database by using the list abstraction:
[{name x, age x, earns x, bonus x, "\n"} <- All_employee];
and would obtain the result:
bill 30 15000 750
mary 33 18000 900
Notice the uniform way in which the four functions are used in the list abstraction notwithstanding that bonus is defined "intensionally" and the other three "extensionally". All_employee is a zero argument function automatically maintained whose value is a list of all abstract entities of type employee which have been created (and not subsequently deleted). It is used above in a generator, in which x can be thought of as taking in turn each value in the list.
If we now wish to terminate the session we have two options (see 3.?). We can either close the database (simply using;) in the state to which it has been modified during our session, or we can specify that all modifications in our session be ignored (by using quit;). We suppose we have used the former.
Suppose james is to receive a bonus of 1250 rather than the 1000 which the only equation for the bonus function we have so far entered would give. We could proceed:
$x == head inv_name "james";
bonus $x <= 1250;
the first statement of which assigns to the global local variable $x the first element of the list produced by the function inv_name. Such inverse functions are automatically available in FDL for all functions of a single argument of type abstract entity. In the above case its result would be a single element list in consequence of the data we have entered, but this would not in general be so.
The list abstraction:
[{name x, bonus x, "\n"}3x <- All_employee
& < 17500 salary x];
would now produce:
mary 900
james 1250
since the function bonus is now defined by two equations, one of which has the james entity as argument and will match only that, whereas the other equation matches any entity of type employee. The matching algorithm gives priority to that equation with the most specific match and thus uses the general case for mary but the specifically stored case for james. Our bonus function previously defined only intensionally is now of mixed definition, part intensional and part extensional.
If we introduce a further function as:
count: (list alpha) -> integer;
count [] <= 0;
count [x3y] <= + 1 count y;
then, with the database we have defined:
count [All_employee];
would give the value 4. In the declaration of count alpha is a type variable and list a type forming operator. Thus the declaration of count states that it is a function from a list, the members of which must all be of the same type, although that type can be any type, to an integer. The first defining equation states that count for the empty list is zero, and the second equation that for a list with head x and tail y (which may be empty) it is computed by adding one to the result of applying it to y. Thus FDL is conventionally recursive and polymorphic.
It we now also introduce the function sum:
sum: (list integer) -> integer;
sum [x] <= x;
sum [x3y] <= + x sum y;
we could then introduce the zero argument function:
average_sal :-> integer;
average_sal <= / sum [earns x || x <- All_employee];
count All_employee;
which, with the present database state, would have the value 16375, which we would obtain with the statement:
average sal;
If, following the definition of average-sal above, we closed our session with the database in its current state (using;) the function definitions stored in the database would be:
age All_employee;
average_sal inv_age;
bonus inv_bonus;
count inv_earns;
earns sum;
those on the left being functions we have explicitly defined and those on the right the consequential automatically maintained functions. Notice that the usual arithmetic functions (+, -, etc) and the boolean functions (>, <, etc) are built in as are various constructor functions. Not to complicate the above discussion we assumed also that the function head was available but it should, in fact, have been defined:
head: (list alpha) -> alpha;
head [x | y] <= x;
We now introduce a further primitive abstract entity into our toy example corresponding to the concept of department:
dept :: nonlex;
works_in : employee -> dept;
manager : dept -> employee;
dname : dept -> string;
nstaff : dept -> integer;
with three associated functions which specify a department's manager, its name (dname) and the number of staff in the department (nstaff). The latter function requires only the single defining equation:
nstaff x <= count inv_works_in x;
To record that bill and mary work in data processing (dp) of which bill is the manager we could proceed:
create dept $d;
dname $d <= "dp";
$e == head inv_name "bill";
manager $d <= $e;
works_in $e <= $d;
$e == head inv_name "mary";
works_in $e <= $d;
Notice that we have used the global local variables, $e and $d, to force the evaluation of the expressions which specify the department and employee entities concerned so that they are used directly in the equation to be stored in the definition of works_in. The LHS of such equations being limited to constants, variables or patterns, it was necessary to force the evaluation in the case of the employee entity. However this was not necessary in the case of department and we could have specified the equation as:
works_in $e <= head inv_name $y;
but had we done so the RHS would not have been evaluated and the expression itself would have been stored and evaluated at run-time whenever required.
The query to find the names of all employees who earn more than the manager of the department in which they work, often discussed in the earlier literature on the relational model as an example of a query with the need for two distinct variables ranging over the same relation, is readily formulated in FDL as:
[name x || x <- All_employee & >
(earns x) (earns manager works_in x)];
Notice the goal directedness in the way this query is formulated. Each employee i considered in turn. We obtain the employee's salary and compare it with that of his manager which we obtain by following the path from the employee to the salary of his manager, applying in turn the functions denoted by the arcs works_in, manager, and earns.
The conventional SQL formulation of this query is equivalent to the FDL list abstraction.
[name x || x<- All-employee & y <- All-employee
& (= y manager works_in x) & (> earns x earns y)];
which in effect is considering the cartesian product of the set of employee with itself, and selecting relevant pairs. We suggest that the natural approach of FDL which starts with the object of interest and traverses the graph to determine whether it meets the selection criterion is to be preferred.
From a database viewpoint FDL has as its data model the functional view of the binary relational model usually known as the functional data model. In the binary relational model as presented in NIAM ( ), entities may be lexical, ie they can be written down directly, printed, viewed on a screen etc or they represent real world objects, concepts, things ..., ie they are non-lexical and cannot be directly represented, but only referred to or described indirectly using lexicals and the relationships they have with them. In other descriptions of this model the terms abstract entity and scalar are used for non-lexical and lexical entity respectively.
The functional data model views a binary relationship R(A,B) as defining two functions:
F : A -> Set (B)
G : B -> Set (A)
It is clear that either function would define the binary relationship and hence also the other function. In the approach of FDL these two functions are regarded as "inverses" of each other. In practical data models it is found that most binary relationships are one-many rather than many-many so that usually the pair of functions would be of the less general form:
F : A -> B
G : B -> Set (A)
The advantage of this data model is that it is generally well understood by ordinary people without any mathematical training or knowledge. They readily understand the evaluation of functions such as "age of", "earns", "value of" etc when applied to their friends, colleagues, and objects around them. In consequence it is a persuasive data model for capturing the real worls semantics of an area and thus on which analysis and design methods with their attendant CASE tools can be based.
Moreover the binary relational model has, for relatively simple examples, an easily understood diagramtic representation readily adapted to the functional view by making the arcs directional with the function names denoting also direction. Thus the schema information in our toy example above could be represented as shown in Figure 1 below. We term such a diagram a datamap.