Transcription

I.J. Modern Education and Computer Science, 2016, 7, 1-12Published Online July 2016 in MECS (http://www.mecs-press.org/)DOI: 10.5815/ijmecs.2016.07.01The Effect of MySQL Workbench in TeachingEntity-Relationship Diagram (ERD) to RelationalSchema MappingLi YangUniversity of West Georgia/ Department of Computer Science, Carrollton, GA, USAEmail: [email protected] CaoUniversity of West Georgia, Carrollton/Department of Educational Technology and Foundations, GA, USAEmail: [email protected]—While designing a database at the conceptuallevel using the Entity-Relationship (ER) model ischallenging for novices, mapping an Entity-RelationshipDiagram (ERD) to a relational schema is not an easy task,either. One of the problems is students’ lack ofunderstanding of the underlying mapping principles. Themental gap between the two models in students alsomakes it difficult for them to perform the conversion.This paper describes two studies that aimed at helpingstudents fill the gap and improve their ERD-Relationalschema mapping performance by the use of avisualization tool, MySQL Workbench. We investigatedthe effect of using MySQL Workbench in teaching ERDRelational schema mapping. We identified the pros andcons of using visualization in teaching the topic. Ourresults show that visualization can increase studentinterest and engagement, and facilitate studentsconnecting the concepts in the two models. With aninstructional delivery method that emphasizes theunderlying mapping principles, a visualization tool couldhelp undergraduate students improve their performanceon ERD-relational schema mapping and facilitate theirunderstanding of the ER model.Index Terms—Entity-relationship model, relationalmodel, relational schema, entity-relationship diagram,visualization, ERD-relational schema mapping.I. INTRODUCTIONData modeling is an important component of thecurriculum of both Computer Science [1] andInformation Systems programs [20]. According to theComputer Science Curricula 2013 report, ― studentneeds to be able to develop conceptual and physical datamodels‖. To develop a conceptual data model, ittypically involves analyzing the problem domain andcoming up with a database diagram using the EntityRelationship (ER) model [9]. The diagram is thenconverted into a relational schema before the physicaldatabase is created and implemented. While it isCopyright 2016 MECSchallenging for novices to learn to design a database atthe conceptual level due to the abstract and complexnature of data modeling [10, 12, 17, 19, 24], it is not aneasy task mapping an ERD (Entity-Relational Diagram)to a relational schema. Even though it may seemstraightforward to experienced database designersmapping an ERD to a relational schema, our over tenyears’ student performance record on database designshows that our CS students have not performed well onthis topic. Their average grade on ERD-Relationalschema mapping homework over an 11-year period is71.70 (out of 100) which is only 1.70 points higher thantheir average grade on ER modeling homework.Our observation of this challenge is also confirmed byrecent research studies. As described in [10] and [12],―the skills to map a conceptual model to a logical/designare ―hard‖ technical skills‖, and they are ―differentfrom knowing the components of an ER model .‖.To overcome these challenges, it is important forstudents to learn the concepts and skills, and to develop aholistic view of conceptual design and ERD-relationalschema mapping. Student understanding and skills inthese areas may impact the quality of their final databasedesign. One cannot expect a good logical model fromstudents who do not fully understand the principlesbehind the conversion process, even if they could comeup with a good conceptual model. However, littleresearch is found in investigating how to improve studentperformance in ERD-relational schema mappingOne of the problems we encountered frequently whenteaching ERD-Relational schema mapping is students’lack of understanding of the underlying mappingprinciples. Many students often mechanically follow therules to perform the mapping without a clearunderstanding why they should map things a certain way.While memorizing or mechanically following themapping rules may result in the temporary correct result,this kind of rote learning does not promote deep learningand understanding [24]. As a result, students often endup with incorrect relational schema after the initial classinstruction. This is mostly because either they misuse arule or completely forget to apply some rules. ThisI.J. Modern Education and Computer Science, 2016, 7, 1-12

2The Effect of MySQL Workbench in Teaching Entity-Relationship Diagram (ERD) to Relational Schema Mappingproblem reveals a disconnection between applying therules and understanding the effects of the rules amongstudents.This paper reports our research that aimed at helpingstudents establish the connection between the conceptualmodel and logical model via the use of visualization. Weposit that visualization may help in closing the mentalgap students have on the two models and assist studentsto achieve a better understanding of the mappingprinciples, which in turn may improve their mastery ofdatabase design at the conceptual level. The visualizationtool we chose is MySQL Workbench [25], which is afree client program with many features to facilitatedatabase design. One such feature is to create anEnhanced-ER Diagram (EERD). Users can drag and dropicons representing tables and connect them using 1:1(identifying or non-identifying), or 1:M (identifying ornon-identifying), or M:N (identifying) relationships. Thisfeature automatically creates an intersection table andadds the foreign key into it for M:N relationships. For1:1 and 1:M relationships, it adds the foreign key to thetable that user clicks first. The end result is essentially arelational schema, though named EERD.We conducted two studies over a two-year period. Thefirst study examined whether a visualization-basedinstructional approach, compared with a more traditionallecture-based instructional approach, could improvestudent performance on ERD-Relational schemamapping. As a follow-up study, the second studyexamined whether an enhanced studentunderstanding of 1:1 and M:N relationship mapping.The rest of the paper is organized as follows. Section 2presents the background information. Section 3 describesthe research questions, methodologies, procedures,results and discussions of the two studies. A generaldiscussion is presented in Section 4. The conclusions andfuture work are presented in Section 5.II. BACKGROUNDThough the concepts of the ER model are simple,conceptual database design using the ER model is often adifficult and complex task for novices. Researchers haveinvestigated the underlying obstacles to the problem [1, 4,5, 10, 12, 17, 18, 19, 24]. The obstacles include students’poor understanding and conception of the problemdomains, and abstract and complex relationships amongentities etc.To overcome the obstacles, researchers haveinvestigated varied visualization tool-based support forlearning ER modeling [12, 16, 17, 21, 22, 24].Visualization, playing an important role in learning andinstruction, has been used in teaching programming andalgorithms [6, 7, 8] and other topics in Computer Science.Research shows that if properly applied with activestudent engagement, visualization helps students developa correct mental model of the concepts in instructionalsettings [7, 26, 28]. Furthermore, an interactivevisualization with feedback on correctness further helpsCopyright 2016 MECSstudents engage in active learning instead of passivereceiving [7, 16].For instance, Hall and Gordon [16] described a textbased Virtual Learning Environment (VLE) where usersinteracted with objects in a virtual world and were givenimmediate feedback on their design. The preliminaryresults indicated a tool such as VLE might help increasenovice designers’ understanding of ER modeling.Similarly, Kolp and Zimanyi [21] developed anenvironment in which users’ design was checked andexplanation was provided at the time of errors. However,the study focused more on the development of themethod rather than on the effectiveness of theenvironment for student learning. Moreover, Murray andGuimaraes [24] presented a set of animations assupplementary instructional materials to teach databaseanalysis and design. They followed the typical databasedesign process and included animations for ER modeling,ER-Relational schema mapping, normalization, anddenormalization. However, neither qualitative norquantitative results were reported in the study.Alternatively, several researchers applied problembased learning [3] or project-based learning in teachingdatabase design and analysis [10, 12, 22, 23]. Bothproblem-based learning and project-based learning arebased on the constructivist theory of learning which wasattributed to Jean Piaget [27]. Constructivist theoryadvocates that learners internalize knowledge byconstructing it through active engagement in the process.It emphasizes student-centered learning and teachersacting as facilitators to guide rather than teach studentsduring the learning process. The main difference betweenproblem-based and project-based learning is the formeris focused on inquiry and research, and the problemgiven is typically well-articulated; while the latter isfocused on the end project and may involve any numberof problems [10]. Connolly, Stansfield, and McLellan[12] applied the problem-based learning approach in theirstudy. They investigated using an educational simulationgame to supplement traditional teaching methods toincrease student engagement and performance. Theirpreliminary result on a partially implemented visuallearning environment showed the online group that wasgiven access to the game, even though limited in itsfeatures, performed significantly better than the othergroups in terms of grades. Kreie and Ernst [22] alsoapplied problem-based learning in teaching databaseanalysis and design. They used several Oracle productsin hands-on tutorials and problem-solving assignments tofacilitate student learning. However, no qualitative orquantitative results were reported.On the other hand, Connolly & Begg [10] described aproject-based learning approach to teaching databaseanalysis and design. Their preliminary qualitative andquantitative results showed that the constructivistapproach improved student learning. However, ―theresults were not fully conclusive because the effect couldhave been entirely attributable to online delivery ratherthan the project-based approach‖ [10, pp. 50].I.J. Modern Education and Computer Science, 2016, 7, 1-12

The Effect of MySQL Workbench in Teaching Entity-Relationship Diagram (ERD) to Relational Schema MappingAs these studies show, different approaches have beenexplored [10, 11, 12, 22, 23], to facilitate studentlearning of database analysis and design. Databaseanalysis and design process as a whole encompasses userrequirement gathering and analysis, data modeling at theconceptual level using ER or Unified ModelingLanguage (UML), logical database design by mapping aconceptual model (ERD or UML class diagram) to arelational schema, and normalization. Compared to thesestudies, the present study focused only on ERD-relationalschema mapping, which is an integral step in databaseanalysis and design. Our study attempted to address thechallenges that students face in their initial mapping ofan ERD to a relational schema as novices. In particularly,our studies addressed the lack of understanding of the ERmodel and confusion in connecting the ER model andrelational model that students encountered in developinga complete and accurate relational schema. Similar toother colleagues [12, 22, 23], we explored visualizationin facilitating student learning. In particular, we used aproblem-based learning as our instructional approach forin-class exercises and a project-based learning for a termproject in which students were asked to design andimplement a database-driven application. To examine theeffects of visualization, we collected quantitative andqualitative data. Admittedly, it has become a commonpractice to use commercial visualization products inteaching database design including ERD-Relationalschema mapping. However, to the best of our knowledge,there has not been similar empirical studies thatexamined effects of using visualization tools withquantitative and qualitative results in the literature. It isour hope that the current research would shed some lighton the effects of visualization in ERD-Relational schemamapping.III. EVALUATION OF THE EFFECT OF MYSQLWORKBENCH IN TEACHING ERD-RELATIONAL SCHEMAMAPPINGIn order to examine the effect of MySQL Workbenchin ERD-Relational schema mapping, we carried out twostudies.Study A focused on comparing the use of MySQLWorkbench in teaching the topic with a more traditionallecture-based instructional approach. The studyexamined the effect of the visualization-based approachin different instructional settings (face-to-face vs. online),and in different student groups (undergraduate vs.graduate students). It also identified the strengths andweaknesses of the visualization-based instructionalapproach.After identifying some ineffective aspects of thevisualization approach in Study A, we revised theinstructional delivery of the visualization approach.Study B investigated whether the revised visualizationbased approach helped improving student performancefor undergraduate students in a face-to-face class andgraduate students in an online class.Copyright 2016 MECS3A. Study A: An Evaluation Of Visualization In TeachingERD-Relational Schema Mapping1. Research questionsIn Study A, we specifically addressed the followingresearch questions:I.Does the use of a visualization tool (i.e., MySQLWorkbench) improve student performance ofERD-Relational Schema mapping?II. Does the timing of when to introduce avisualization tool in the class have an impact onstudent performance of the mapping?III. What are the pros and cons of using avisualization tool in teaching the ERD-RelationalSchema mapping principles?2. Research methodologyThis study was conducted at a regional publicuniversity in the Southeast of US in fall of 2013. Therewere two sections of a Computer Science (CS)undergraduate face-to-face database course, CS3230Information Management (IM), and one section of anonline CS graduate database course, CS6231 DatabaseSystems I (DB I).The undergraduate IM course is a traditional databasecourse, which covers the relational data model,conceptual database design using the ER model, ERDRelational schema mapping, SQL, normalization,transaction processing, and programmatic databaseaccess. In each class meeting, after the normal instructiontime, the students were given hands-on exercises (e.g.,designing an ERD for a database, writing SQL queries,etc.) to practice and explore what they had just learned inclass, and to construct the knowledge on their own. Thestudents were encouraged to ask their fellow classmatesor the instructor for clarification and to search theinternet for additional resources. The instructor acted as acoach to facilitate their learning by making clarificationson the problems, providing hints, and offering feedback.In addition to the in-class problem-based learningactivities, the students were engaged in project-basedlearning. They were given a semester-long group projectto design and implement a database-driven applicationbased on instructor-designed requirements. They wereasked to integrate what they had learned in SoftwareEngineering I (the prerequisite of CS3230) and thepresent course to design and implement the front-end andback-end of the application. The project ran in parallelwith class instructions. After the topics on ER modeling,and ERD-Relational schema mapping were covered inclass, the students were asked to start to design andimplement the database for the project. The project-basedlearning aimed to create an environment in which thestudents could apply and integrate what they havelearned in class into a small-scale project.Our graduate program is an Applied M.S. program.This unconventional program includes students who donot usually have a Bachelor’s degree in ComputerScience. Because of this feature, the coverage of theI.J. Modern Education and Computer Science, 2016, 7, 1-12

4The Effect of MySQL Workbench in Teaching Entity-Relationship Diagram (ERD) to Relational Schema Mappinggraduate course CS6231 (DB I) is very similar toCS3230 (IM). The only main difference is some topics,such as transaction processing and programmaticdatabase access, are not covered in CS6231. Anotherdifference is the graduate students are not given a termproject because they need more preparation on softwaredevelopment. All these missing components are offeredin the subsequent course, CS6232 Database Systems II.Though the graduate course is on-line, the instructorprovides tutorials, videos, and practice exercises inaddition to homework and projects to facilitate theirlearning.In both courses, ERD-Relational schema mapping wascovered after ER modeling. The students in all thesections had never performed ERD-relational schemamapping previously. The total number of students ineach section was 18, 19, and 13, respectively. Aftereliminating the students who did not participate in all theassessments, there were 11, 12, and 12 studentsrespectively who agreed to participate in the study.For clarity, the demographics and nature of theteaching environment of these groups are shown below: G1 (n 11): undergraduate students in a face-toface classG2 (n 12): undergraduate students in a face-toface classG3 (n 12): graduate students in an online class3. Research procedureAs our teaching experience showed, the majorbottleneck for students to learn the ERD-relationalschema mapping is in the area of mapping binaryrelationships (1:1, 1:m, and m:m relationships) andidentifying referential integrity constraints. Enlightenedby this information, our instruction focused on analyzingthe correctness of students’ work in those areas. Notethat our lessons included additional concepts, such asweak entities, multi-valued attributes, and n-naryrelationships which are also difficult for students tounderstand in ER modeling. We did not, however, focuson how to map those constructs. This is because we useda variation of the crow’s foot notation [14] for ERDs. Inthat notation, multi-valued attributes and n-naryrelationships are represented as weak entities. Weakentities in the ERD are mapped to tables in the relationalschema. Therefore, we did not consider those constructshere. The study consisted of the following steps:1) Students in all the sections were given a pre-testbefore they were introduced to the topic. In the pretest, they were asked to convert an ERD into arelational schema. They had learned about the ERmodel and relational data model, and thus had abasic understanding of both models. The pre-testserved as the baseline measure for the study.2) After the pre-test, students were engaged in thelesson topic. To evaluate the effect of using avisualization tool, we adopted three differentinstructional approaches for the three sections.Copyright 2016 MECSBased on the specific instructional approaches,these sections were grouped into two categories asfollows: Category #1 (Lecture-based format):o Approach #1: For the face-to-face undergraduatesection 1 (G1), the mapping principles werepresented in a regular lecture format in which theinstructor explained the mapping rules withexamples for each case.o Approach #2: For the online graduate section G3,students were given a video of the lecture inwhich the instructor showed the PowerPointlecture slides and explained the mapping ruleswith examples for each case, and demonstratedhow to perform mapping on a sample problem. Category #2 (Visualization-based Demonstration):o Approach #3: For the face-to-face undergraduatesection 2 (G2), the instructor demonstrated howto use MySQL Workbench to perform ERDRelational schema mapping in class withexplanations.3) After the mapping rules were explained, thestudents in all the sections were given in-classexercises on ERD-Relational schema mapping. Forthe students in the undergraduate section (G1) andthe graduate section (G3), they were asked to writeout the mapping results based on the lecture slides.For the undergraduate section (G2), the studentswere asked to perform the mapping using MySQLWorkbench, and then write out the results. Thereasons for writing out the result is to make sure thestudents understand more abstract representation ofthe logical model and can make the transition froma graphical interface to it. From our experience,many students often take little attention onspecifying the primary keys, foreign keys, andwhich tables are the referenced tables. Graphicalrepresentations of tables and relationships mayinadvertently mask those fundamental and criticalcomponents of a relational schema. Writing out theschema requires students to reflect on their mapping,thus provides a better picture of their understandingof the mapping. We used these written results toexamine how the tool impacted studentcomprehension of the mapping rules. In the in-classexercises, there were one 1:1 relationship, three 1:Mrelationships (and we used the average grade toreport the 1:M case), and one M:N relationship.4) Then in the next class session, in order to furthertest the effect of the tool and to give all the studentsthe benefit of different instructional approaches, wereversed the instructional approaches for thesections as follows. For the undergraduate section (G1), students werepresented with the tool and showed how to use thetool to perform the mapping.I.J. Modern Education and Computer Science, 2016, 7, 1-12

The Effect of MySQL Workbench in Teaching Entity-Relationship Diagram (ERD) to Relational Schema Mapping For the undergraduate section (G2), students werepresented with the mapping principles in a lectureformat. For the graduate section (G3), the students weregiven a PDF tutorial that showed how to use theMySQL Workbench to perform the mapping.5) After that, the post-test was given. The post-testproblem was identical to the pre-test problem.Because no answers were given to the students afterthe pre-test, and the mapping problem was notsimple, we felt the likelihood is slim that thestudents may perform better on the post-test due tothe same problem being given at pre- and post-test.In addition, even if some students used rotememorization when performing mapping, weexpect that the impact on the final mapping resultwould be similar in all the sections. We believevisualization is to motivate students to achieve abetter understanding of the mapping and not toeliminate rote learning. There were two 1:1relationships (and we picked the one that wassimilar to the one in the in-class exercise to reportthe result for the 1:1 case), two 1:M relationships(and we used the average grade for the 1:M casebecause both 1:M relationships are very similar tothe ones in the in-class exercise), and one M:Nrelationship. The post-test measure asked thestudents to use MySQL Workbench to perform the5mapping, then to write out their solutions on paper.6) Along with the post-test, we also gave a post-survey,which asked the students how they felt about thetool and if the tool helped or impeded theirunderstanding of ERD-Relational schema mapping.On all the assessments (pre-test, in-class exercises, andpost-test), we assigned grades based on the correctness ofthe mapping of each component (e.g. 1:1, 1:M, M:N,specifying referential integrity) and used them in thequantitative analysis. The total points awarded for eachmapping task are shown in Table 1 and Table 4.4. FindingsTo address Research Question #1 (Does the use of avisualization tool improve student performance of ERDRelational Schema mapping?), we compared studentperformance on the in-class exercise and the pre-test. Thein-class exercises were given after all the groups wereinitially presented the material using differentinstructional approaches. The differences among thegroups would indicate which approach was moreeffective for student learning. The descriptive statistics ofmapping 1:1, 1:M, and M:N relationships, allrelationships (1:1, 1:M, M:N), specifying referentialintegrity constraints and the grand total in the pre-test, inclass exercises, post-test for all three groups arepresented in Table 1.Table 1. Descriptive Statistics of the Performance of G1, G2, and G3 in Mapping 1:1, 1:M, M:N Relationships, Specifying Referential IntegrityConstraints, All Relationships (1:1, 1:M, M:N Relationships Mapping) and the Grand Total in the Pre-test, In-class Exercise, and Post-test.G1 (n 11)(Lecture-based lationship(R1:M ,3pts)PostM(SD)G3 (n 12)(Lecture Slides &Video visualization )2.78(0.46)2.63(0.93)M:Nrelationship(RM:N, 3pts)Specifyingreferentialintegrityconstraints(rf, 0(2.22)6.424.66)10.83(2.98)10.75(2.38)Total of allrelationships(R1:1 R1:m Rm:n9 .82)7.40(2.61)6.04(2.73)7.53(1.96)8.33(1.17)Grand Total(R1:1 R1:m Rm:n rf, 9)Mappingtasks1:1relationship(R1:1 , 3pts)PostG2 (n 12)(Visualization-based Lecture)PostM(SD)Note: M grade mean, SD standard deviation.Copyright 2016 MECSI.J. Modern Education and Computer Science, 2016, 7, 1-12

6The Effect of MySQL Workbench in Teaching Entity-Relationship Diagram (ERD) to Relational Schema MappingAs Table 1 shows, Group 2 (G2) which was appliedwith the visualization-based approach made the biggestimprovement in all the cases, except for specifyingreferential integrity constraints, in which case Group 3(G3) made the biggest improvement. These resultssuggest that the use of MySQL Workbench throughdemonstration prior to lecture improved studentperformance of ERD-Relational Schema mappingcompared to the traditional lecture-based instructionalapproach.To answer Research Question #2 (if the ordering of theinstructions matters), we compared the post-test and pretest results of all three groups. The reason is all thegroups were presented the topic in both lecturepresentation as well as visualization demonstration by thepost-test time. The only difference is when thevisualization tool was introduced. The results in Table 1indicate that on the post-test: the students in the group that was introduced thevisualization tool earlier (G2) consistentlyperformed better on all the cases than the groupthat was introduced the tool later (G1). This resultindicates that introducing the tool earlier wouldenhance undergraduate student performance onERD-Relational schema mapping.the graduate class (G3) which was introduced thetool later performed better than G2 on 1:1, M:Nrelationshipsmapping,totalrelationshipsmapping, and the grand total in terms of meangrades. However, the group (G2) achieved thehighest gains in all the cases, when the post-testresults were compared to the pretest results.Table 2. One-way ANOVA and Bonferroni Post Hoc Test of Performance of G1, G2, and G3 on Pre-test, In-class Exercise, and Post-test with p 0.05.Mapping TasksIn-class (1:1 relationship, R1:1)In-class (1:M relationship, R1:M)In-class (specifying referential integrity constraints)In-class (grand total total relationships specifying referential integrityconstraints)Post-test (1: M relationships, R1:M)Next, we conducted the One-Way Analysis ofVariance (ANOVA) to find out if there was statisticallysignificant performance difference among all threegroups. Table 2 reports the ANOVA result for p 0.05.As Table 2 shows, there is a significant difference amongthe three groups on all in-class tasks except M:Nrelationship mapping, and on the post-test 1:Mrelationship mapping. Furthermore, to find out whichgroups performed significantly higher on the tasksevaluated, we conducted Bonferroni post hoc test (seeTable 2).In order to verify the quantitative data, we alsocollected qualitative data via an open-ended question inthe post-survey. The post-survey question is ―pleaseexplain honestly how you feel about using the tool forERD-Relational mapping (do you feel it helps youunderstand the mapping or it actually hinders yourunderstanding). And if it helps you, please tell me in whatway. If it confuses you, please also tell me in what way‖.The results are reported in Table 3.Table 3. Student Responses to the Survey Question (if the Tool HelpedThem Understand the Mapping).Positive72.72% (8)Neutral18.18% (2)Negative9.09% (1)G2 undergraduate(n 12)91.66% (11)0 (0)8.33% (1)G3(n 12)75% (9)25% (3)0 (0)80% (28)14.29% (5)5.71% (2)G1 undergraduate(n 11)GraduateTotal(n 35)Copyright 2016 MECSMean DifferenceMG3 – MG1 1.77MG2 – MG1 1.53MG3 – MG1 1.46MG2 – MG1 3.67MG3 – MG1 5.20MG2 – MG1 7.15MG3 – MG1 9.04MG2 – MG1 0.0020.0060.0010.044ŋ20.260.450.310.380.19As Table 3 shows, students in three groups all reportedpositively to the helpfulness of the visualization tool inunderstanding the mapping. Their perceived endorsementranged from 72.72% for the undergraduate students in G1,to 91.66% for the undergraduate students in G2, and to75% for the graduate students in G3.In addition to their rating, students commented that thevisual representation o

effects of visualization, we collected quantitative and qualitative data. Admittedly, it has become a common practice to use commercial visualization products in teaching database design including ERD-Relational schema mapping. However, to the best of our kno