next up previous contents
Next: Knowledge-Bases Up: Java Uniform Document Environment Previous: Introduction   Contents

Subsections


A Simple Case Study

This chapter is a tutorial about Jude Applications development. It describes a little problem and its solution using Jude. It compares also the Jude solution with the SQL one.

Problem

The problem to solve is the same described in [Bekke 1997]:

``Assume a simplified database with items and sales. The relational model for this consists of the following relations:

* items, for items are relevant: item identification,description, quantity in stock and price. These properties are stored in the ITEMS table.

* sales, for each sales transaction are relevant: sales identification, item number, date (week and day),number and amount. These properties are stored in the SALES table. ``

Database Schema

First we must create the database schema. The schema is the structure of data in the database. In Jude, or better in SimpleLogic language, the structure is represented by involved types and relations. A type is a set containing elements with a common characteristic, or using object-oriented terminology a type is a class having objects as instances. A relation identifies a possible link between two or more objects of a certain type. This link in SimpleLogic is called fact. The schema can be easily obtained paraphrasing the informal description. First we introduce the two main types:

assert item is_subtype_of object;

assert sale is_subtype_of object;

Item and sale are two types, each of them is a sub type object, that contains all the objects of the database.

Reading the informal description we can deduce important relations of the type item:

declare relation item has_description string;

declare relation item has_stock number;

declare relation item has_price number;

The relation has_description permit linking an item with a string representing its description. For example:

assert jude_cdrom is_a item;

 

assert jude_cdrom has_description ``A smart programming environment !'';

In this case jude_cdrom is an item (an instance of the type item, an element of the set item). The entire statement is a fact.

Now we can introduce the relations of type sale:

declare relation sale regards item 

                      for_quantity number 

                      with_amount number;

declare relation sale has_week week and_day day;

The two relations regards_for_quantity_with_amount and has_week_and_day, have an arity of three because they permit to link three objects together. An example of facts involving the two relations should be:

assert sale_1 is_a sale;

assert sale_1 has_week 1 and_day monday;

assert sale_1 regards jude_cdrom 

              for_quantity 5

              and_amount 50;

assert sale_1 regards linux_cdrom

              for_quantity 1

              and_amount 100;

Note that we have two facts concerning the same relation regards_for_quantity_with_amount on the same object sale_1. This permit to specify compound sales.

In order to complete the specification we must define also the two types week and day:

assert week is_subtype_of number;

  

assert day is_subtype_of object;

assert monday is_a day;

assert tuesday is_a day;

assert wednesday is_a day;

assert thursday is_a day;

assert friday is_a day;

assert saturday is_a day;

Database Rules

In a traditional database you can work only on explicitly specified facts, called base facts. In Jude you can specify also rules that permits to derives new facts from other (base or derived) facts. The used formalism is the first order predicate logic. This is a formalism simple to grasp and rather powerful.

Suppose we want to monitor all items having a low stock. We can introduce a new type:

assert item_with_low_stock is_subtype_of item;
Now we can write a rule that permits to assign or retract items from this set according to theirs stock value:

assert ITEM is_a item_with_low_stock 

if  ITEM has_stock STOCK

and STOCK is_less_than 5;

The rule is composed from an implication part followed from a condition part. The inference-engine of the DDBMS used by Jude assert the fact specified in the implication part whenever the condition part is satisfied. ITEM and STOCK are variables and the rule introduces conditions/constraints that the variables must satisfy. The inference-engine assert a fact for each value of ITEM satisfying the condition.

Suppose that we are interested at the turnover per item during a certain week. The relation declaration is simple:

declare relation item has_turnover number during_week number;
This example is more problematic than the previous one because involves the sum of all values respecting a certain condition. In SimpleLogic this is performed collecting all amount values in a list and performing the sum of all list values:

assert ITEM has_turnover TURNOVER during_week WEEK 

if  ALL_AMOUNTS = {AMOUNT | 

                      SALE regards ITEM 

                           for_quantity _

                           and_amount AMOUNT

                  and SALE has_week WEEK and_day _ } 

and TURNOVER is_sum_of ALL_AMOUNTS;

ALL_AMOUNTS is the list containing all the amounts satisfying the condition expressed in the part after ``|''.

``_'' is a special variable that indicates that its values are not important. If in the same rule you uses two or more ``_'' special variables they are considered as separated, not interfering variables.

Suppose we want to redefine our concept of item_with_low_stock taking in consideration the turnover of the last week. We can easily introduce this simple but powerful rule:

assert ITEM is_a item_with_low_stock 

if  today has_week WEEK

and LAST_WEEK is_equal_to WEEK - 1

and ITEM has_turnover LAST_TURNOVER 

         during_week LAST_WEEK

and LOW_STOCK is_equal_to LAST_TURNOVER * 1.5

and ITEM has_stock STOCK

and STOCK is_less_than LOW_STOCK;

Database Queries

What is the turnover per item during a certain week ? The already specified rules permit to specify formally this query. In SimpleLogic as in any good declarative language the specification of the meaning of a query represents also the command to use in order to retrieve the result.

get {ITEM, TURNOVER, WEEK |

  ITEM has_turnover TURNOVER during_week WEEK };

The result of this query is a table with ITEM, TURNOVER and WEEK as column header and the corresponding variables result values as cell values.

What are the items for which the turnover decreased in this week compared with last week ? In this case the query is a little more complex:

get { ITEM | 

      today has_week THIS_WEEK

and   LAST_WEEK is_equal_to THIS_WEEK - 1

and   ITEM has_turnover LAST_TURNOVER during_week LAST_WEEK

and   ITEM has_turnover THIS_TURNOVER during_week THIS_WEEK

and   THIS_TURNOVER is_less_than LAST_TURNOVER

};

This query is hard to read and it can not be reused in other queries or rules. Instead of expressing complex query is better to define new concepts and then rewrite the query in a more simple form:

declare relation item has_decreased_turnover_on_week week

                      respect_week week;

 

assert ITEM has_decreased_turnover_on_week WEEK_1 

            respect_week WEEK_2

if  ITEM has_turnover TURNOVER_1 during_week WEEK_1

and ITEM has_turnover TURNOVER_2 during_week WEEK_2

and TURNOVER_1 is_less_than TURNOVER_2;

 

assert item_with_decreased_turnover_respect_last_week

       is_subtype_of item;

 

assert ITEM is_a item_with_decreased_turnover_respect_last_week

if  today has_week THIS_WEEK

and LAST_WEEK is_equal_to THIS_WEEK - 1

and ITEM has_decreased_turnover_on_week THIS_WEEK 

         respect_week LAST_WEEK; 

Now we can rewrite the query:

get { ITEM | ITEM is_a 

item_with_decreased_turnover_respect_last_week };

The introduction of a new relation and of the new type permits to extend the ``vocabulary'' of our database (or better of our knowledge-base) and to reuse them in other queries and rules.

Database Refinements

During our problem formalization/specification we have introduced some ``programming tricks''. Now it is the time to fix them in order to obtain a more comprehensible and maintainable knowledge-base.

In our case a trick is a problem formalization that differs too much from reality. For example we have said that: ``assert week is_subtype_of number;'', but not all numbers are available weeks. A more precise definition can be:

assert week is_subtype_of integer;

assert WEEK is_a week

if  WEEK is_less_or_equal_to 52

and WEEK is_greater_or_equal_than 1;

A consequence of the last trick is the usage of statements like: ``LAST_WEEK is_equal_to THIS_WEEK - 1''. They are based on the assumption that a week coincide with number, but what happens if we want compare the turnover of an item during the first week of the year with the previous week ? In our world conceptualization there is not a week before the first week of the year (this is problem deriving from the usage of weeks instead of date). So we can introduce a relation and use it in our rules:

declare relation week is_previous_week_of week;

assert PREVIOUS_WEEK is_previous_week_of WEEK

if  PREVIOUS_WEEK is_equal_to WEEK - 1

and PREVIOUS_WEEK is_a week;

Note that in Jude it is very easy to evolve a database model because it is rather compact. A future enhance of Jude based on Transaction Logic will make more easy also update base facts towards new database schema. See section [*]for more details.


Relation Cardinality

The relation cardinality is the number of facts related to a given relation. Many formalisms such Entity Relationship permit assigning some constraints on the cardinality of a relation. On the contrary SimpleLogic does not permit this easily.

For example the has_stock relation associate to an item must accept a single fact given a certain item. So we can express this using this rule:

assert ITEM is_a bad_object

if  ITEM has_stock STOCK1

and ITEM has_stock STOCK2

and STOCK1 is_not_equal_to STOCK2;

bad_object is a set containing all the malformed objects of the system. It can be used in order to express all possible constraints about our database, not only relation cardinality constraints.

However you can note that the majority of relations allow multiple values insisting on the same object. In our case we can extend our database in order to avoid the specification of the previous constraint about has_stock relation. First we introduce a new relation:

declare relation number of item enter_into_magazine_at_week week; 
Obviously this relation has not a limit on cardinality. This relation permits facts like:

assert 100 of jude_cdrom enter_into_magazine_at_week 5;

assert 500 of linux_cdrom enter_into_magazine_at_week 10;

Now we can associate some rules involving the count of available stocks:

declare relation number of item enter_into_magazine_to_week week;

 

assert TOTAL of ITEM enter_into_magazine_to_week LIMIT

if ALL_ENTERS = { NUMBER | 

                  NUMBER of ITEM enter_into_magazine_at_week WEEK              and  WEEK is_less_or_equal_to LIMIT}

and TOTAL is_sum_of ALL_ENTERS;

 

declare relation number of item was_sold_to_week week;

 

assert TOTAL of ITEM was_sold_to_week LIMIT

if ALL_SALES = { NUMBER | 

                 SALE regards ITEM 

                 for_quantity NUMBER

                 and_amount _

             and SALE has_week WEEK and_day _

             and WEEK is_less_or_equal_to LIMIT}

and TOTAL is_sum_of ALL_SALES;

 

declare relation item has_stock number at_week week;

 

assert ITEM has_stock STOCK at_week WEEK

if  TOTAL_OUT of ITEM was_sold_to_week WEEK

and TOTAL_IN of ITEM enter_into_magazine_to_week WEEK

and STOCK is_equal_to TOTAL_IN - TOTAL_OUT;

All these rules permit extending considerably our vocabulary. In particular we can reuse them in order to specify the has_stock relation:

assert ITEM has_stock NUMBER

if  today has_week WEEK

and ITEM has_stock NUMBER at_week WEEK;

So after a redesign of our knowledge-base the cardinality limit of the has_stock relation is now an implicit consequence of the rule. We can avoid the constraint specification, as we avoid to express all other constraints that we can derive from the rule. The rule itself is self-explanatory.

User Interface

Jude offers a default rich of features user interface. This is the result obtained from the compilation of the knowledge-base previous specified:

!! insert screenshot !!

If you want you can add personalized viewers for item and sales types. For example you can add a photo to each item and display them in the item form.

!! when added is_sum_of support to SimpleLogic implement the solution and add screen shoot in this place. !!


Comparison with SQL

At the moment SQL is the most used data base language. In [Bekke 1997] there is a comparison between SQL and Xplain query language, that can be used also in the SimpleLogic case. In this chapter I have used the same problem, so in this section I can directly insert the SQL code suggested by [Bekke 1997]. Given the problem the resulting suggested SQL schema is:

CREATE TABLE ITEMS (

ITEM# CHAR (4) NOT NULL, 

DESCRIPTION CHAR (13) NOT NULL,

STOCK NUMERIC (4) NOT NULL, 

PRICE NUMERIC (4,2) NOT NULL,

PRIMARY KEY (ITEM#) ); 

 

CREATE TABLE SALES (

SALE# CHAR (4) NOT NULL, 

WEEK NUMERIC (2) NOT NULL,

DAY CHAR (3) NOT NULL, 

ITEM# CHAR (4) NOT NULL,

NUMBER NUMERIC (4) NOT NULL, 

AMOUNT NUMERIC (4,2) NOT NULL,

CHECK (DAY IN 'Mon','Tue','Wed','Thu','Fri','Sat'), 

PRIMARY KEY (SALE#),

FOREIGN KEY (ITEM#) REFERENCES ITEMS );

The first query must return the turnover per item. A first naive solution is:

SELECT ITEM#, TURNOVER = SUM (AMOUNT)

FROM SALES 

GROUP BY ITEM#;

However this query is not the right solution because does not take in consideration items having no sales. The right solution is:

SELECT ITEM#, TURNOVER = SUM (AMOUNT)

FROM SALES 

GROUP BY ITEM#

UNION 

SELECT ITEM#, TURNOVER = 0

FROM ITEMS 

WHERE ITEM# NOT IN

      (SELECT ITEM#

       FROM SALES); 

This is the SQL query returning items for which the turnover decreased in week 2 compared with week 1.

SELECT I.ITEM#, DESCRIPTION

FROM ITEMS I, SALES S2 

WHERE I.ITEM# = S2.ITEM#

AND S2.WEEK = 2

GROUP BY I.ITEM#, DESCRIPTION

HAVING SUM(S2.AMOUNT) <

       (SELECT SUM(S1.AMOUNT)FROM SALES S1 

        WHERE S1.ITEM# = I.ITEM#

        AND S1.WEEK = 1); 

UNION

SELECT I.ITEM#, DESCRIPTION 

FROM ITEMS I, SALES S

WHERE I.ITEM# = S.ITEM# 

AND WEEK = 1

AND I.ITEM# NOT IN

    (SELECT ITEM#

    FROM SALES 

    WHERE WEEK = 2); 

Obviously the SQL query is rather difficult to understand. The SimpleLogic counterpart is:

assert ITEM has_decreased_turnover_on_week WEEK_1 

            respect_week WEEK_2

if  ITEM has_turnover TURNOVER_1 during_week WEEK_1

and ITEM has_turnover TURNOVER_2 during_week WEEK_2

and TURNOVER_1 is_less_than TURNOVER_2;

  

get { ITEM, DESCRIPTION |  

      ITEM has_decreased_turnover_on_week 1

      respect_week 2

and   ITEM has_description DESCRIPTION };

The SimpleLogic specification is clearly more readable and less error-prone than SQL specification. This because we can specify new rules and queries using already specified rules and types. This is a consequence of object-oriented (type-based) approach that favors the reuse of already specified building-blocks. The relation is_sum_of and the list with all values to sum are the counterpart of aggregate predicates present in SQL and other logical formalisms. This formalism is to prefer because also the list is as first-class citizen and not only the results of the operation as in the case of SQL.

So in SimpleLogic you can divide a complex problem into simpler sub problems. Instead ``[..]in SQL one is often inclined to formulate one single statement for a problem. [..] On the other hand, in SQL one encounters very soon the limits of what humans can comprehend. For example, experiences from programming languages have learned that the number of abstraction levels humans can understand at once is limited to about four levels. Above this number one must split up the problem into smaller problems. At that moment a standard splitting procedure is missing and the advantages of an optimizer are lost. ``[Bekke 1997].

Conclusion

In SimpleLogic, as in any good declarative language, the specification of the problem represents also the code to use in order to produce the solution. The usage of types, relations and rules permit building readable and reusable knowledge-base. Unlike SQL you can reuse the already definite concepts in order to build new powerful concepts. The database became a conceptualization of the domain of interest and it plays the role of a knowledge-base.

Jude provides also a compound-document framework that permits editing and viewing all the objects of the system. You can add new viewers and editors specific to your application domain in order to personalize the user interface.


next up previous contents
Next: Knowledge-Bases Up: Java Uniform Document Environment Previous: Introduction   Contents
Massimo Zaniboni 2001-03-10