Feature Wiki

Information about planned and released features

Tabs

Establish Referential Integrity

1 Initial Problem

Currently (ILIAS 8) ILIAS doesn't use advanced database-built-in functionalities that ensure the integrity of stored data.

The benefits ILIAS demands from the Database Management System (DBMS) regarding data value correctness on field-level currently are:

  • uniqueness, by primary or unique indexes
  • 'not null' (without a defined default) for essential required fields
  • low-level datatype warranty on field-level, reasonable for numeric- and date/time-types, sometimes poor for unspecific varchar fields (examples: email and client_ip in [usr_data])
To improve data quality and to support code maintaining modern DBMSs offer several options ILIAS COULD use:
  1. Referential Integrity (Foreign Keys)
  2. Stored Procedures & Functions
  3. Trigger
Options 2 and 3 COULD be used (in combination) to check, format, generate and archive data on both field- and row-level, in sophisticated manner. These options, however, are not addressed by this feature request. In this FR, Option 1 only is to be discussed.

The lack of RI/Foreign Keys currently results in data inconsistency. This problem is very obvious, because at the time being we need additional cleanup features to handle orphaned and corrupted data, e.g.:
  • System Check: Missing Entries in Repository Tree
  • System Check: Missing Object References
Furthermore every administrator/developer/user certainly noticed errors like `Whoops\Exception\ErrorException thrown with message "<b>Error: There is no dataset with id 92771!</b><br />class: ilObjUser<b />Script: /srv/www/ilias/htdocs/Services/User/classes/class.ilObjUser.php<br />Line: 292"` in their ILIAS installations. This is also an evidence for missing RI.

2 Conceptual Summary

To improve the quality of the ILIAS data stored in the DMBS, the next goal SHOULD be to establish foreign keys in and between tables where ever it is semantic consequential.

Once a semantic relation between data fields is established by a foreign key, this level of data integrity will be guaranteed by the DBMS for every attempt of data manipulation!
This means the integrity is no longer reliant solely on:

  • correctness of data values provided by manipulating PHPcode
  • correctness of data values provided by direct (interactive) manipulations (CLI, phpmyadmin, ...)
  • completeness of complex actions without an appropriate sequence of steps, (which may be interrupted by external technical incidents)
which can be failing unintentionally or by purpose.

As another benefit, if data manipulating code has yet unrecognized semantic bugs violating a relation, those bugs will emerge due to DBMS denying the execution, leading to attention so they can be fixed.

On the other hand, once a relation is established in the DBMS, every PHP class that manipulates (insert, update, delete) an involved data field must be checked whether the now existing relation requires a different behavior of its actions.
Potential activities which may be affected by a relation are:
  • deletes of data in the referenced table
  • updates of data in the referencing table
  • insertions in both tables, the referencing and the referenced table, as ONE unit of work
The kind of necessary or desirable code modifications may be:
  • changing the sequence of actions (insertion:  first in referenced table, secondary in referencing table)
  • removing of now unnecessary actions (deletion:  using ON DELETE CASCADE of relation)
  • changing the strategy when updating or inserting referencing data,  (example: If the function formerly could not rely on the presumption whether the new value already exists in the (now referenced) superior table, a check for its existance had to be scheduled for every update/insert.  With RI and the presumption that the value exists in very most cases, the former check can be optimistically removed and a (new) handling for violating the constraint must be implemented. The benefit by RI lies in smaller resource consumption for regular cases.)

The process of establishing RI in ILIAS implies actions in many contexts:

  • introduce the intentions of using Referential Integrity to the ILIAS community
  • modification of ILIAS' database abstraction layer
  • guidelines/documentation for maintainers to handle RI
  • migration of individual installations to use an appropriate database engine
  • changes to database structures
  • introduction of foreign keys in portions of at least one module or one service by its maintainer
  • management of the communication process when a modules maintainer can't act self-sufficient
  • reflecting an ongoing RI coverage for a general data deletion strategy (?)
  • presumably as a final phase: process superior services like 'User'
  • ...
Some of this actions are one-shots, some are ongoing.

As a pure technical view, following actions MUST take place in sequence:

  1. Migration of individual installations to use an appropriate database engine
  2. Modification of ILIAS' database abstraction layer
  3. Introduction of foreign keys in portions of at least one module or one service by its maintainer
  4. presumably as a final phase: process superior services like 'User'
Step 1. is addressed in feature request Abandon support for MyISAM engine and MySQL fulltext search
Step 2. is addressed in feature request Generalized Management Functions for Referential Integrity

A major challenge for Step 3. will be to design a RI-establishing process which handles existing violating data. This shouldn't happen theoretically if everything was fine in whole past - but this is unrealistic in the wild!
Therefore data checking scripts will be REQUIRED to ensure acceptable data in all affected client databases before an installation updates its ILIAS codebase to a new version including a new "portion of RI".
At least, this checking scripts (or at best: with data correcting ability) MUST cover those data-structures, which are going to be used for RI by this certain update portion.
The task to design and provide a data checking/preparation process is addressed by feature request Generalized Management Functions for Referential Integrity as well.

3 User Interface Modifications

3.1 List of Affected Views

{Please list all views (screens) of ILIAS that should be modified, newly introduced or removed.}

3.2 User Interface Details

{For each of these views please list all user interface elements that should be modified, added or removed. Please provide the textual appearance of the UI elements and their interactive behaviour.}

3.3 New User Interface Concepts

{If the proposal introduces any completely new user interface elements, you might consult UI Kitchen Sink in order to find the necessary information to propose new UI-Concepts. Note that any maintainer might gladly assist you with this.}

4 Technical Information

{The maintainer has to provide necessary technical information, e.g. dependencies on other ILIAS components, necessary modifications in general services/architecture, potential security or performance issues.}

5 Privacy Information

{ Please list all personal data that will need to be stored or processed to implement this feature. For each date give a short explanation why it is necessary to use that date. }

6 Contact

  • Author of the Request: Tesche, Uwe [utesche]
  • Maintainer: {Please add your name before applying for an initial workshop or a Jour Fixe meeting.}
  • Implementation of the feature is done by: {The maintainer must add the name of the implementing developer.}

7 Funding

If you are interest in funding this feature, please add your name and institution to this list.

8 Discussion

9 Implementation

{The maintainer has to give a description of the final implementation and add screenshots if possible.}

Test Cases

Test cases completed at {date} by {user}

  • {Test case number linked to Testrail} : {test case title}

Approval

Approved at {date} by {user}.

Last edited: 21. Feb 2022, 12:00, Jansen, Michael [mjansen]