Transcription

Sonja Ristić1, Ph.D, Assistant prof., Ivan Luković1, Ph.D., Associate prof.,Pavle Mogin2, Ph.D., Senior Lecturer, Jelena Pavićević3, M.Sc, System Engineer1Faculty of Technical Sciences,21 000 Novi Sad, Dositej Obradovic Square 6, SCG, [email protected], [email protected] University of Wellington, School of Mathematical and Computing Sciences,P.O. Box 600, Wellington, New Zealand, [email protected] Crna Gora d.o.o., Bulevar Lenjina 2, 81000 Podgorica, SCG, [email protected] A DATABASE SCHEMA USING IIS*Case TOOLAbstractIIS*Case is a relational database schema design tool. It is based on a methodology of gradual integration of subschemas intoa database schema. By integrating independently designed subschemas IIS*Case produces a database schema. The process ofindependent subschema design may lead to collisions in expressing the real world constraints and business rules. IIS*Caseuses a specialized algorithm for checking the consistency of constraints embedded in the database schema and thesubschemas. It also generates two types of reports: reports on database schema design process and reports on detectedcollisions. The paper discusses the use of the new release of IIS*Case (R.6.0) in detecting collisions, including the analysis ofrelated reports and possible designer's actions. A number of examples given in the paper, illustrate how IIS*Case detectsconstraint collisions.Key Words: Form Type; Relational Database Schema; Subschema; Subschema integration; Formal consistency by constrainttype; IIS*Case.1 INTRODUCTIONIIS*Case is a relational database schema design tool. It is based on a methodology of gradual integration of subschemas into adatabase schema. The first step of the design process is to identify groups of similar user requests. For each group of userrequests a designer defines an external schema. An external schema is expressed in a data model that is based on the formtypes. A form type is an abstract description of a human-computer interface form. IIS*Case automatically maps an externalschema into a relational subschema by inferring attributes and constraints from the external schema and embedding them intothe relational subschema. A subschema is the data description part of a transaction program specification. Whereas the processof integration of external schemas that are based on the form type data model is hard to formalize, the process of relationalsubschema integration may be based on the well known synthesis algorithm, and may be formally described. By integratingindependently designed subschemas, IIS*Case produces the first version of the database schema. We also call this schema thepotential database schema. The process of independent design of external schemas may lead to collisions in expressing the realworld constraints and business rules. If the collisions exist, at least one subschema is formally not consistent with the potentialdatabase schema. Programs made over an inconsistent subschema do not guarantee safe database updates.IIS*Case uses a specialized algorithm to check the consistency of constraints embedded in a database schema and thecorresponding subschemas. Consistency checking is performed for each constraint type separately. IIS*Case detects collisionsof: attribute sets, key and unique constraints, null value constraints, and referential integrity constraints. IIS*Case generatestwo types of reports: reports on database schema design process and reports on detected collisions. Resolving collisions maylead to producing a new version of a database schema. The paper discusses applying of the new release of IIS*Case (R.6.0) [1]in detecting collisions, including the analysis of related reports and possible designer's actions. Examples given in the paper,illustrate how IIS*Case detects constraint collisions.The main goal of the paper is to present the practical appli-cation of the results of the theoretical research given in [2, 3]. Theformal definitions of the terms used in the paper may be also found in [2, 3].Apart from the Introduction and Conclusion, the paper has five sections. Section two defines the notion of constraint collision.Section three is the presentation of a case study. Collisions of attribute sets are considered in Section four and collisions of thekey, unique and null value constraints are considered in Section five. Collisions of referential integrity constraints areconsidered in Section six.

2 COLLISION OF CONSTRAINTSA subschema and a database schema are formally consistent if:1. The set of attributes of each subschema relation scheme, is a subset of the attribute set of a corresponding database relationscheme;2. Each set of attributes X with a unique value property (as it is defined in [2] and [3]) in a subschema relation scheme has thesame property in the corresponding database relation scheme; and3. All the constraints that can be inferred from the database schema and that are relevant for the subschema are embedded intoit.A constraint o is embedded into a subschema Pk if it can be inferred from the subschema contraint set Ok, which is denoted bythe implication Ok o.The first and the second condition are expressed in a formal way in [2]. Satisfying these two conditions is a prerequisite for thevalidation of the third condition, which is expressed by the logical implication:(1)Ok OrPk,where Ok is the set of all constraints of the subschema Pk, andOrPk is the set of all database schema constraints that are relevant for Pk.A database schema constraint o should belong to the set of relevant constraints for subschema Pk, if the operation that mightviolate o is allowed in Pk. The operations that might violate some constraint are called critical operations.There are two kinds of relevant constraints: The includible constrasints, denoted by Ini(O, Pk); and The extending constrasints, denoted by Exi(O, Pk),where O is the set of all constraints of the database schema.Suppose a database schema constraint o is relevant for subschema Pk. and satisfies conditions 1. and 2.The constraint o belongs to Ini(O, Pk) if it can be expressed using concepts of subschema Pk.A constraint o belongs to Exi(O, Pk) if and only if it is relevant for Pk, and o Ini(O, Pk) holds.A database constraint o is potentially inconsistent if it is a relevant for at least one subschema Pk and: it cannot be inferred from the set of subschema constraints Ok, i.e. (Ok o) holds, or it cannot be expressed using the concepts of the subschema Pk, i.e. o Exi(O, Pk) holds.Described constraint inconsistencies are called constraint collisions. They must be resolved since programs made over aninconsistent subschema do not guarantee safe database updates [4].For each potentially inconsistent constraint, a designer has to decide whether it should be embedded into the database schema.If the decision is positive, the potentially inconsistent constraint must be embedded into all the subschemas, for which it isrelevant.Otherwise, a potentially inconsistent constraint must not be embedded into the set of database constraints. It must beemphasized that subschema constraints may be stronger, but not weaker than the corresponding database constraints.Consequently, some subschema constraints may not be embedded into the database schema.A subschema constraint is considered as locally valid if it is embedded into a subschema, but it must not be embedded into thedatabase schema. Subschema constraints that are embedded into a database schema are considered as globally valid.Let us consider a potentially inconsistent constraint and the subschema into which it has already been embedded as a relevantone. During the process of collision resolving, a designer may decide between the following two alternatives: A potentially inconsistent constraint may be excluded from the subschema; or It may be pronounced as a locally valid constraint of the subschema.In the first step of the integration process, all constraints of the subschema may be pronounced as globally valid. Some of themmay be pronounced as locally valid in the subsequent iterations.There are three possible relationships between a subschema Pk and a potentially inconsistent database constraint o. A potentially inconsistent constraint o is not relevant for Pk, and consequently o does not induce inconsistency between Pkand the database schema. A designer does not need to redesign the subschema Pk, but probably needs to redesign anothersubschema. A potentially inconsistent constraint o is relevant for Pk, but it is not embedded into the set of constraints of Pk. Pk is potentially inconsistent and the designer may redesign it by embedding o into its set of constraints or by excluding criticaloperations from Pk. A potentially inconsistent constraint o is relevant for Pk and it is embedded into the set of constraints of Pk, but there isanother subschema Pl, for which o is also relevant, but not embedded into it. Pk has "introduced" o into the set of database

constraints. Accordingly, Pk is potentially inconsistent. A designer may redesign it by excluding o from its set of constraintsor by pronouncing o as a locally valid constraint for the subschema Pk.Figure 1. Form types for Personnel, Working Units, Work Orders and Operating Plans3 A CASE STUDYWe use a case study based on an imaginary production system to illustrate the application of IIS*Case. The example ispurposely simplified, in order to clearly present the process of detecting and resolving constraint collisions.We identified four groups of similar user requests: Personnel – insert/update/delete details of staff members; Working Units (WU) – insert/update/delete details of working units and update some details of staff members working inparticular WU; Work Orders (WO) – insert/update/delete the details of working orders and display some details of staff member wholaunched WOs; and Operating Plans (OP) – insert/update/delete the details of operating plans containing the work orders that should beaccomplished during a time period.Figure 1 contains some of the form types, generated by IIS*Case.For each group of user requests, IIS*Case maps form types into a relational subschema by inferring attributes and constraintsfrom form types and embedding them into the relational subschema. It also generates the appropriate database schema designprogress reports. This way, we get four subschemas, organized in IIS*Case as so called application systems:PERSONNEL Staff{{IdWU, SurN, DatB, Addr, JMBG, Name},{JMBG}}.WORKING UNIT WU{{WRoom, IdWU, NamWU, ManagWU}, {IdWU}}, Staff{{IdWU, SurN, DatB, Addr, JMBG, Name, School, IdS, Manag, CelTel},{IdWU IdS, JMBG}}, WU[ManagWU] Staff[JMBG],

Staff[IdWU] WU[IdWU], Staff[Manag] Staff[JMBG].WORK ORDERS WO{{IdPr, DatWO, Amount, IdWO, JMBG}, {IdWO}}, Staff{{SurN, JMBG, Name},{JMBG}}, WO[JMBG] Staff[JMBG].OPERATING PLANS WO{{SurN, IdWO, Name, IdOplan},{IdWO}} OP{{From,Until,IdOplan,DatB},{IdOplan}} WO[IdOplan] OP[IdOplan].IIS*Case produces the first version of a database schema by integrating independently designed subschemas. The order ofintegration is irrelevant. It is even possible to integrate all the subschemas at once. We believe that the order of integration,described in the paper, enables gradual and clear presentation of the detection and resolving of constraint collisions.IIS*Case uses a specialized algorithm for checking the consistency of constraints embedded in the database schema and thesubschemas. The consistency checking is performed for each constraint type, separately. The order of consistency checking,with respect to the constraint type, is relevant. Consistency of a set of constraints of a given type may be a prerequisite for theconsistency validation of another type of constraints. IIS*Case performs the consistency checking by imposing the followingorder: the attribute sets, the key and unique constraints, the null value constraints, and finally the referential integrityconstraints.4 THE COLLISION OF THE SETS OF ATTRIBUTESLet Pk be a subschema integrated into a database schema, and let Nk be one of relation schemes in Pk. The database schemamust have at least one relation schema that contains the attribute set of Nk. One of such relation schemes is pronounced as thecorresponding relation scheme for Nk. Often, there is only one relation scheme with the given property. However, if there aremore, IIS*Case will choose the one with the maximum set of attributes.Suppose we decide to integrate subschemas WORK-ORDERS and OPERATING PLANS first. Then, IIS*Case will make a newapplication system PLANING with two child application systems containing the starting sub-schemas. Here is the new databaseschema PLANING: OP{{From, Until, IdOplan},{IdOplan}} WO{{IdPr, DatWO, Amount, IdWO, JMBG, IdOPlan}, {IdWO}} Staff{{SurN, JMBG, Name},{JMBG}} WO[IdOplan] OP[IdOplan] WO[JMBG] Staff[JMBG].Figure 2. Report on attribute set collisionsAfter the integration, the analysis of collisions in obtained database schema is performed. In the first step, the attribute setcollisions are detected. The first part of the appropriate report is shown in Figure 2. Apart from this, the Database SchemaCollision Report has two other parts: Rule and Example:The rule for the attribute set collision

Each relation scheme from child application system must have a corresponding relation scheme in the parent applicationsystem, such that its attribute set is a subset of the attribute set of the corresponding scheme.ExampleNotation remarks:N(R, K) denotes the relation scheme N with the set of attributes R and the set of keys .S: N2({A,B},{A}), N3({B,C},{B})Collision:N1 from S1 has no corresponding relation scheme in S because {A,C} is not a subset of {A,B} or {B,C}.S1: N1({A,C},{A})Relation scheme WO from the subschema OPERATING-PLANS does not have a corresponding relation scheme in the databaseschema PLANING. In order to resolve the collision, designer may replace attributes SurN and Name, with the attribute JMBG.The replacement should be made on the form type Working order, which was used to generate the subschemaOPERATING PLANS.5 COLLISOINS OF KEY, UNIQUE AND NULL VALUE CONSTRAINTSNow, we integrate subschemas WORKING UNIT and PERSONNEL, and obtain the following ADMINISTRATION databaseschema: WU{{WRoom, IdWU, NamWU, ManagWU}, {IdWU}}, Staff{{IdWU, SurN, DatB, Addr, JMBG, Name, School, IdS, CelTel},{IdWU IdS, JMBG}}, WU[ManagWU] Staff[JMBG], Staff[IdWU] WU[IdWU].Figure 3. Reports on key and unique constraint collisionsThe analysis of attribute set collisions finishes successfully. However, collisions of key constraints are detected. The first partof the appropriate report is shown in the top part of Figure 3. The other two parts have the form:The rule for the key collisionSuppose there is a relation scheme N1 in the child application system S1, for which N2 is the corresponding relation scheme inthe parent application system S. A key X from N2 must be included in the relation scheme N1 if some of the attributes from Xbelong to the attribute set of N1 and insert or modify of those attributes is allowed in N1.Example

Notation remarks:- N(R, K) denotes the relation scheme N with the set of attributes R and the set of keys K.- Key(N, X) denotes that X is a key of N.- Role(N) denotes a set of database operations that are allowed in a relation instance over N.- Mod(N) denotes a set of attributes from R that are modifiable in a relation instance over N.S1, S2 are the child application systems of the application system S.S1: N1({A, B, C, E}, {A})Role(N1) {r, i, m},// Allowed operations: read, insert and modifyMod(N1) {C} // Modifiable attribute: CS2: N2({A, B, C, D, E}, {A, CD})S: N3({A, B, C, D, E}, {A, CD})Collision:N3 is the corresponding relation scheme for N1 and N2. The constraint Key(N1, CD) is not included in N1 but it must be,because:-The attribute C is included in both N1 and N3,- Insert or modify of C is allowed in N1, and- There is the constraint Key(N3, CD).Accordingly, the attribute D must be included in N1.We change the subschema PERSONNEL by adding attribute IdS and an additional key IdWU IdS into the appropriate formtypes. After those changes, analysis of key collisions successfully finishes.In the next step, unique constraint collisions are detected. The first part of the appropriate report is shown in bottom part ofFigure 3. Because the third part (example) is analogous to the example for the key collision, it is omitted here. The second partof the report is of the following form.The rule for the unique constraint collisionSuppose there is a relation scheme N1 in a child application system S1, for which N2 is the corresponding relation scheme inthe parent application system S. A unique constraint X from N2 must be included in the relation scheme N1 if:- some of the attributes from X belong to the attribute set of N1, insert/ modify of those attributes in N1 is allowed, and- X is not a key of N1.In this case, the collision is resolved by embedding a unique constraint for the attribute NamWU in the subschema PERSONNEL.After that change, unique collision analysis successfully finishes.Figure 4. Report on NULL constraint collisionsIn the fourth step, we analyze null value constraints. All detected collisions of null value constraints are automatically resolved.The appropriate report contains the list of the changes made (Figure 4). Like the others collision reports, it contains a rule andan example parts.The rule for the null value collisionA not null atribute A of a relation scheme N1 will be changed into a nullable one, if there exists a relation scheme N2 in a childapplication system, for wich N1 is the corresponding scheme and A in N2 is nullable.Example

Notation remarks:N(R, K,) denotes the relation scheme N with the set of attributes R and the set of keys KS1 is a child application system of the application system S.S1: N2({A, B}, {B}),Null(N2, A) True // Nulls for A in N2 are allowedS: N1({A, B}, {B})Null(N1, A) False // Nulls for A in N1 are not allowedCollision:The attribute A must be nullable in N1, because it is nullable in N2 and N1 is the corresponding scheme for N2.Automatic Correction Method:A in N1 will be changed into a nullable attribute (Null(N1, A) True).Figure 5. Report on referential integrity collisionsThe final step is the consistency analysis of the referential integrity constraints. After detecting collisions, IIS*CASE producesan appropriate report (Figure 5). The rule and examples for this type of constraint are pretty complex.The rule for the referential integrity collisionSuppose there is a referential constraint RC: N1[X] N2[Y] in the application system S and there is a relation scheme N3 in a childapplication system S1, for which N1 or N2 is the corresponding relation scheme. RC must be included in S1 if:- N1 is the corresponding scheme for N3 and insert or modify operations on X are allowed in N3; or- N2 is the corresponding for N3 and delete is allowed in N3.Example 1:S1: N1({A, B}, {A}),Role(N1) {r, i, m}Mod(N1) {B}S: N2({A, B}, {A}) ,N3({B, C, D}, {B})RC: N2[B] N3[B]Collision:Referential constraint RC must be included in S1 because N2 is the corresponding relation scheme for N1 and insert or modifyof B is allowed in N1.N3 is a missing relation scheme. It is not included in S1 but it should be.Example 2:S1: N4({B, C}, {B})Role(N4) {r,d}S: N2({A, B}, {A}) , N3({B, C, D}, {B})RC: N2[B] N3[B]Collision:Referential constraint RC must be included in S1 because N3 is the corresponding relation scheme for N4 and delete is allowedin N4.N2 is a missing relation scheme. It is not included in S1 but it should be.More examples of collisions may be found in [1].In this case the collision is resolved by adding the attribute Manag in the subschema PERSONNEL. After this change,referential integrity collision analysis successfully finishes.

7 CONCLUSIONThe paper discusses the application of a new release of IIS*Case (R.6.0) in detecting collisions of constraints, analyzes thecorresponding reports and possible designer's reactions. It is a Java integrated CASE tool, based on the relational data model. Itis designed to provide an automated and intelligent support for complex and highly formalized project tasks. Its basic functionsare: Conceptual modeling of a database schema, Automated design of relational database subschemas in the 3rd normal form, Automated integration of relational database subschemas, Checking the consistency of constraints embedded into a database schema and a set of subschemas, and Generating of detailed reports about the database design and analysis process.Some of detected collisions can be resolved automatically, but most of them cannot. The designers themselves must resolve thecollisions at the conceptual level. IIS*Case enables designers to cooperate in order to achieve the most appropriate solution.8 REFERENCE / REFERENCES[1][2][3][4][5][6]Pavićević, J.Razvoj CASE alata za automatizovano projektovanje i integraciju šema baza podataka, Magistarski rad, Univerzitetu Podgorici, Prirodno-matematički fakultet, Podgorica, 2005.Luković, I., Ristić, S., Mogin, P.A Methodology of A Database Schema Design Using The Subschemas, International Conference onComputational Cybernetics IEEE ICCC 2003.;Ristić, S.Istraživanje problema konsolidacije podšema baza podataka, Doktorska disertacija, Univerzitet u Novom Sadu,Ekonomski fakultet Subotica, 2003.Ristić S., Mogin P., Luković I.,Specifying Database Updates Using A Subschema Proceedings of the VII International Conference On IntelligentEngineering Systems INES 2003, March 4-6, 2003, Assiut-Luxor, Egypt, pp. 203-212.Mogin, P., Luković, I., Govedarica, M.Principi projektovanja baza podataka, FTN Izdavaštvo, Novi Sad 2004.Mogin P, Lukovic I.Resolving Collisions of Functional Dependencies in The Course of A Database Schema Integration,III Symposium of Computer Science and Information Technologies YUINFO, Brezovica, Yugoslavia, April 1997,Proceedings, pp. 99-103.

IIS*Case is a relational database schema design tool. It is based on a methodology of gradual integration of subschemas into a database schema. The first step of the design process is to identify groups of similar user requests. For each group of user requests a designer defines an external schema. An ex