
Visualization Methods of Hierarchical Biological Data: A Survey and Review

Irina Kuznetsova
Holzinger Group, HCI-KDD, Institute for Medical Informatics/Statistics, Medical University, Graz

Artur Lugmayr
Visualisation and Interactive Media Lab. (VisLab), Curtin University, Perth, AU & Aalto Univ., Helsinki

Andreas Holzinger
Holzinger Group, HCI-KDD, Institute for Medical Informatics/Statistics, Medical University, Graz

In: Artur Lugmayr, Kening Zhu, Xiaojuan Ma (eds.), Proceedings of the 10th International Workshop on Semantic Ambient Media Experiences (SAME 2017): Artificial Intelligence Meets Virtual and Augmented Worlds (AIVR), International Series for Information Systems and Management in Creative eMedia (CreMedia), International Ambient Media Association (iAMEA), n. 2017/2, ISSN 2341-5576, ISBN 978-952-7023-17-4, 2017. Available at: www.ambientmediaassociation.org/Journal

ABSTRACT

The sheer amount of high-dimensional biomedical data requires machine learning and advanced data visualization techniques to make the data understandable for human experts. Most biomedical data today lies in arbitrarily high-dimensional spaces and is not directly accessible to the human expert for a visual and interactive analysis process. To cope with this challenge, the application of machine learning and knowledge extraction methods is indispensable throughout the entire data analysis workflow. Nevertheless, human experts need to understand and interpret the data and experimental results. Appropriate understanding is typically supported by visualizing the results adequately, which is not a simple task. Consequently, data visualization is one of the most crucial steps in conveying biomedical results. It can and should be considered a critical part of the analysis pipeline. Still, as of today, 2D representations dominate, and human perception is limited to this lower dimension when trying to understand the data. This makes the visualization of results in an understandable and comprehensive manner a grand challenge.

This paper reviews the current state of visualization methods in a biomedical context. It focuses on hierarchical biological data as a source for visualization, and gives a comprehensive survey of visualization techniques for this particular type of data.

CCS CONCEPTS
Human-centered computing → Visualization techniques

KEYWORDS
Visualization, hierarchical data, computer graphics, information visualization, big data, bioinformatics.

1 INTRODUCTION

The research domain of information visualization is broad, and involves a wide range of research fields, such as computer graphics (e.g. 2D and 3D graphics), information design to increase communication and sense making [24], creative aspects (e.g. design, layouts, colour use) [72], and methods from human-computer interaction. Making data understandable from a cognitive and machine learning point of view has emerged recently in the literature [40, 41], including the idea of rendering data throughout smart environments [37–39][49].

The biomedical domain is a complex field of biological processes. In addition, advances in biological technologies have led to a dramatic increase in data volumes [26], which has presented new challenges in knowledge extraction. Working with high-volume data requires the application of data mining that draws upon machine learning techniques. These methods help in extracting knowledge patterns and narrowing the data down to smaller volumes. They include, for example, Support Vector Machines (SVM); Artificial Neural Networks (ANN); clustering [46][4]; statistical techniques (e.g. Bayesian statistics [51]); Hidden Markov Models (HMMs) [10]; Principal Component Analysis (PCA); and classification methods [28][26][1].

The interpretation of extracted knowledge and of the results of analysed data through visualisation is an essential step in the analysis pipeline, and is becoming an important tool in bioinformatics. This includes not only simple visualizations (e.g. bar plots, pie charts, flow charts), but also advanced visualization techniques for representing final results in the biomedical domain (e.g. 3D).

As in other domains, visualisations should follow information design principles, which are defined in [60]:
(1) providing an overview of the data;
(2) zoom in/out options;
(3) filtering of unnecessary information;
(4) detailing of regions of interest;
(5) relations between data points of interest;
(6) a history of actions; and
(7) the possibility of extracting required parameters.

Another example of applying visualisation in biology comes with new technologies such as Next Generation Sequencing (NGS), which delivers enormous volumes of genomic data in a digital format. To visualise the data, genome browser applications are used, which allow real-time visualisation and exploration of genomic sequences, of any region of interest, and at any required scale within a genome [53][31][56].

Within the scope of this paper, we first give an overview of existing visualization techniques, followed by a description of the characteristics of hierarchical data. We focus mainly on traditional visualization techniques, i.e. primarily the visualization of hierarchically organized data in 2D space. We illustrate our approach based on a typical biological analysis workflow as previously discussed in [35]. The workflow narrows genetic information down to a meaningful smaller subset called differentially expressed genes. These represent the active genes in the overall genetic information, and can be utilized to obtain Gene Ontologies

(GO), which categorize the functions of various genes. Visualisation supports the understanding of the results, as well as of the obtained ontologies.

2 RELATED WORKS

We would like to point to the following works in information visualisation and design for further reading: [60], [42], and the excellent introductory guide [64]. Visualization plays a key role in the biomedical domain. Various techniques aim to deliver a correct representation of results in visual format, and follow the visualization design principles described in [60]. For example, the visualization of a protein structure in 3D space enables researchers to get an overview of the studied protein; to rotate a protein image in different dimensions; to see protein-protein interactions; to measure an atomic distance; and to zoom into a region of interest. Several visualisation tools have been developed to support the analysis process. Remaining in the domain of protein structures, Web3DMol, UCSF Chimera and POLYVIEW-3D are just a few examples of tools for visual protein structure investigation [59][48][50].

Another set of tools allowing real-time visualisation and exploration of genomic sequences are the IGV, UCS, or ZEMBU genome browsers, which have been mentioned in the introduction [53][31][56]. These, and many others, allow the exploration of various samples of sequenced genomic data.

Another example of information visualization in biology is phylogeny, where a hierarchical data structure is the basis of the image. The intuitive way of representing hierarchies is as a tree diagram. The ETE Toolkit, PhyD3, EvolView and other visualization programs enable phylogenetic trees to be studied in more detail [21][33][33, 78].

Several tools mentioned within this section are summarized in Table 1, which classifies them according to data type and biological task. Visualisation is a very important research tool, which enables researchers to explore and study biological structures, investigate biological data in various digital formats, and understand molecular processes in an intuitive and comprehensive way.

Table 1: Examples of visualization software.
Biological task | Visualization tool | Reference
1. Protein structure visualization | Web3DMol | see [59]
 | UCSF Chimera | see [48]
 | POLYVIEW-3D | see [50]
2. NGS data visualization | IGV | see [53]
 | UCS | see [31]
 | ZEMBU | see [56]
3. Hierarchical data visualization (phylogeny) | ETE Toolkit | see [21]
 | PhyD3 | see [33]
 | EvolView | see [33, 78]

3 CHARACTERISTICS AND PROCESSING OF HIERARCHICAL DATA

Hierarchical patterns are observed in numerous aspects of our life, and biological fields are no exception: phylogenetics, GO, microarray analysis, differential expression analysis (dendrograms), and protein similarities all represent data as hierarchies. Hierarchically organized data facilitates the comprehension and interpretation of results and provides a global overview of the data. Hierarchical clustering is an unsupervised machine learning technique used for building hierarchies. Data is usually presented as a parent-child relation, where a parent can have zero or more related children (see Fig. 1 A-B).

3.1 Processing Hierarchical Data Structures

To process hierarchical structures, numerous computational methods have been developed, such as neighbour-joining [54], UPGMA [63], maximum parsimony [11][13], and maximum likelihood [57]. These are distance-based (neighbour-joining, UPGMA), character-based, and statistical methods of approaching hierarchies, respectively.

Distance-based clustering is the traditional method for hierarchical clustering. The input data is a matrix, where each row characterizes a unique object and the columns show the object's features. The distance matrix, also called a proximity matrix, is calculated from these data and, together with a linkage method (see Table 2), enables the estimation of dissimilarity/similarity between objects and clusters. There are two types of algorithms for hierarchical clustering: (1) agglomerative, and (2) divisive [1].

The agglomerative, or bottom-up, approach is one of the popular hierarchical clustering algorithms. It starts by grouping the two closest data points of the distance matrix into a cluster, updates the distance matrix for the just-generated cluster based on the selected linkage method, and continues this process until only a single cluster remains [25]. In other words, it starts from grouping the closest data points of the input data (“bottom”), and ends when each data point is assigned to its related cluster (“up”). In contrast, the divisive, or top-down, method groups data values in the opposite way. It considers the input data as one whole cluster and splits the data into smaller clusters, moving from the “top” (one cluster) “down” (many clusters) [1].
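The bottom-up procedure described above can be sketched in a few lines of Python. This is a minimal, naive illustration under stated assumptions, not the implementation used by any of the cited tools: the function name `agglomerative`, the Euclidean distance, and the sample points are all chosen here for illustration, and the cluster distances are recomputed from the raw points at every step rather than updated in a distance matrix.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def agglomerative(points, linkage="single"):
    """Naive bottom-up clustering; returns the merge history.

    Each entry is (cluster_a, cluster_b, distance). Original points have
    ids 0..n-1; every merge creates a new cluster id (n, n+1, ...).
    """
    # How to reduce all pairwise point distances to one cluster distance.
    reduce_distances = {
        "single": min,                            # closest pair
        "complete": max,                          # farthest pair
        "average": lambda ds: sum(ds) / len(ds),  # mean over all pairs
    }[linkage]

    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)

    def cluster_distance(pair):
        a, b = pair
        return reduce_distances(
            [dist(points[p], points[q]) for p in clusters[a] for q in clusters[b]]
        )

    while len(clusters) > 1:
        # Find the two closest clusters under the chosen linkage.
        a, b = min(combinations(sorted(clusters), 2), key=cluster_distance)
        merges.append((a, b, cluster_distance((a, b))))
        # Merge them into a new cluster and continue until one remains.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Hypothetical sample data: two tight pairs and one outlier.
sample = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
for a, b, d in agglomerative(sample, linkage="single"):
    print(f"merge {a} + {b} at distance {d:.2f}")
```

The merge history is exactly the information a dendrogram draws: the pair of clusters joined at each step and the height (distance) at which they join. For large data sets, a real implementation would update the distance matrix incrementally instead of recomputing all pairwise distances.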

In addition to existing clustering algorithms, new techniques have been developed to address the emerging issues associated with the large data volumes produced by new technologies. Loewenstein and colleagues proposed a memory-constrained UPGMA (MC-UPGMA) algorithm, implemented in C, that enables clustering of large data sets [36]. Kannan and Wheeler extended the parsimony score to phylogenetic networks; the algorithm was implemented in OCaml [30].

Table 2: Hierarchical clustering - types of linkage methods [26]. In the update formulas, $C_j = C_m \cup C_n$ denotes a cluster formed by merging $C_m$ and $C_n$.
Linkage method | Formula
Single-linkage | $D(C_i, C_j) = \min_{x_p \in C_i,\, x_q \in C_j} d(x_p, x_q)$
Complete-linkage | $D(C_i, C_j) = \max_{x_p \in C_i,\, x_q \in C_j} d(x_p, x_q)$
Average-linkage, WPGMA | $D(C_i, C_j) = \frac{D(C_i, C_m) + D(C_i, C_n)}{2}$
Average-linkage, UPGMA | $D(C_i, C_j) = \frac{|C_m|\, D(C_i, C_m) + |C_n|\, D(C_i, C_n)}{|C_m| + |C_n|}$
Centroid-linkage, UPGMC | $D(C_i, C_j) = d(c_i, c_j)$, where $c_i = \frac{1}{|C_i|} \sum_{x_p \in C_i} x_p$
Median-linkage, WPGMC | $D(C_i, C_j) = d(w_i, w_j)$, where $w_j = \frac{1}{2}(w_m + w_n)$
Ward's linkage | $ESS = \sum_{x_n \in C} \lVert x_n - \bar{x} \rVert^2$

3.2 Data Representation, Storage and Queries

Other essential aspects in dealing with hierarchical data are data representation and data encoding, e.g. for storage in a database or as in-memory representations for applying algorithms. Theoretical considerations can be found in database theory, formal languages, and query languages. An overview of these techniques can be found in an interesting online article [82]; they are listed in Table 3, and more details about the theoretical aspects can be found in [81][8][52]. The techniques offer different ways of accessing the required information. Hierarchical data can be represented as an adjacency list model (see Fig. 1-A), where each element of the table has a pointer to its parent, or it can be visualized as a tree diagram (see Fig. 1-B).

Figure 1: Hierarchical data representation as: A - the adjacency list model; B - a tree diagram.

Table 3: Methods for storing hierarchical data [82].
Technique | Description
Adjacency List | Recursive method. Each node of the tree has a pointer to a parent node. Intuitive and simple to implement, but slow in performing queries.
Path Enumeration | Each entry is stored as a full path to the root.
Nested Set | Applies a traversal method of numbering nodes. Each node is visited twice, and each time the number of the visit is assigned and stored (each node has two pointers). Fast for retrieving required information, but slow for updating a tree.
Nested Intervals | Similar to the nested set technique; however, the numbering can use real/float/decimal numbers.
Flat Table | Similar to the adjacency method, with the addition of rank and level information.
Closure Table | Transitive way of representing hierarchies.
Multiple Lineage Columns | Applied if the database does not support iterative queries.
The Links column of the table cites [45], [14][80][82], [69, 70], and [70][6].

3.3 Challenges in Hierarchical Data Processing

3.3.1 Data Accessibility. A problem of the modern health sciences is that generated data may not be directly accessible to a health science researcher [18], because certain patterns (“knowledge”) are hidden in arbitrarily high-dimensional spaces. Examples range from longitudinal rheumatology data sets, in which cohorts of patients are attributed with vectors in $\mathbb{R}^{100}$ [62], to the uncertainties of RNA sequence base pairing variants with a potentially arbitrary number of dimensions [27]. The results gained by machine learning and knowledge extraction techniques need to be mapped down into lower dimensions to make them accessible to a human expert [73]. This calls for a closer cooperation between machine learning and visualization experts [74]. A crucial factor in clustering techniques is the curse of dimensionality [32].
With increasing

dimensionality, the volume of the space increases so quickly that the available data becomes sparse, which makes it extremely difficult to find reliable clusters. A further significant problem is that distances become imprecise as the number of dimensions grows, since the distances between any two points in a given data set converge; moreover, different clusters might be found in totally different subspaces. Consequently, a global filtering of attributes on its own is not sufficient.

3.3.2 Subspace Clustering. The subspace clustering problem is difficult, as very different characteristics for grouping can be used: this can be highly subjective and context-specific and requires an expert-in-the-loop [19][20]. What is recognized as comfort for end users of individual systems? It is interesting to note that human experts are quite capable of determining similarities and dissimilarities, which has been described by nonlinear Multidimensional Scaling (MDS) [58][68].

We can represent similarity relations between entities as a geometric model consisting of a set of points within a metric space. The output of an MDS routine is a geometric model of the data, with each object of the data set represented as a point in n-dimensional space. Consequently, there is an urgent need to map very high-dimensional data into a small number of relevant dimensions to make it accessible for human expert analysis. For example, the similarity between patients may change when different combinations of relevant dimensions are considered [22]. This is called subspace analysis and is a very interesting and relevant field of current research [12]. One goal, for example, is to find a k-dimensional subspace of $\mathbb{R}^d$ such that the expected squared distance between the instance vectors and the subspace is minimal. This so-called subspace learning can also be used as a dimensionality reduction technique [15]. Common tools include the stationary subspace analysis toolbox [44], SubVIS [23] and Morpheus [43], to mention just a few.

4 VISUALISATION OF HIERARCHICAL DATA

Tree-like structured graphs are a common way of representing hierarchical data. In general, a tree-structured graph is defined as a root node, which is connected through links or edges to the parent and children nodes [77]. The traditional tree view is drawn upside-down, with the root at the top and the parent-child relations running towards the bottom. However, a tree graph can also be represented as a left-to-right diagram [76].

According to [55], visualizations of hierarchically organized data can be classified as (see Table 4):
(1) explicit vs implicit; or
(2) axes-oriented vs radial.
The implicit method is a space-filling technique that fits the provided data into a defined space, for example a rectangle, triangle, or circle. The explicit method uses a traditional tree-like structure.

Table 4: Types of visualization methods of hierarchical data.
Type | Explicit | Implicit
Description | Visualization method representing hierarchy as a node-link diagram [55] | Visualization method representing hierarchy in a space-filling way [55]
Axes-oriented | Dendrogram, indented tree (see Fig. 2, Fig. 4) | Tree-maps (see Fig. 5)
Radial | Circular tree (see Fig. 3) | Sunburst (see Fig. 6)

There is a range of visualization graphs that enable hierarchies to be shown in 2D format. Dendrograms and indented layouts (see Fig. 2, Fig. 4) are examples of the explicit method in an axes-oriented layout, whereas a circular tree (see Fig. 3) is an explicit method in a radial layout. Space-filling techniques can also use axes-oriented layouts, such as tree-maps (see Fig. 5), or a radial layout, such as the Sunburst (see Fig. 6).

4.1 Explicit Visualization

This part of the paper describes 2D visualization techniques for hierarchical data, using GO data as the example. The example subset is taken from the REVIGO Web server (http://revigo.irb.hr/), which applies a neighbor-joining hierarchical clustering algorithm to obtain hierarchies [66].

The output data of the REVIGO tool may be used as input data for the tree-map or Sunburst visualization methods, whereas distance-based clustering is used for tree-structured diagrams. The second and third columns of Table 5, named “representative” and “description”, hold the parent and the child of each parent-child relationship, respectively. For example, “nucleoside triphosphate metabolism” is the parent node of five related child annotations: “nucleoside triphosphate metabolic process”, “alanine biosynthetic process”, “inositol biosynthetic process”, “isocitrate metabolic process” and “regulation of translation”. “Response to herbicide” and “ion transmembrane transport” have two and four related children respectively, whereas “protein refolding” has no children at all. Although a tree diagram is the traditional way of representing hierarchies, REVIGO can visualize hierarchies as scatter plots, interactive graphs, tree-maps, tag clouds and indented trees [66].

4.1.1 Dendrograms. A dendrogram, also called a binary tree (see Fig. 2), is a visualization technique commonly used to represent groups of similar objects (clusters) in the data produced by the hierarchical clustering method [16][25]. It has a traditional tree-like structure, where the leaves are placed at the same level. The y-axis (height) shows the distance at which a cluster is formed. The labels across the x-axis are equally distributed for readability purposes. The dotted line is an example of a selected distance cut-off that enables the reader to see the number of clusters found within that distance. Figure 2 illustrates that four distinct clusters were identified (represented in red, purple, blue, and green colours) when the closeness of objects was defined as a distance of value two. In biology, a clustering approach

is commonly used to find groups of genes that share similar features based on results from Differential Expression (DE) analysis. Heatmaps represent a matrix of gene expression values in a color-coded way and are accompanied by dendrograms. The dendrograms are drawn along the top and/or left side of the heatmap. The left-side dendrogram represents the similarity between genes, and the top dendrogram the similarity between samples. Dendrograms are also common structures for representing phylogenetic trees [7].

Figure 2: Dendrogram (visualized with R).

Table 5: The hierarchical data as GO. The example data is taken from REVIGO [66], and modified for explanatory purposes.
Term ID | Representative (parent) | Description (child)
 | nucleoside triphosphate metabolism | nucleoside triphosphate metabolic process
 | nucleoside triphosphate metabolism | alanine biosynthetic process
 | nucleoside triphosphate metabolism | inositol biosynthetic process
 | nucleoside triphosphate metabolism | isocitrate metabolic process
 | nucleoside triphosphate metabolism | regulation of translation
 | response to herbicide | response to herbicide
 | response to herbicide | response to acidic pH
 | ion transmembrane transport | ion transmembrane transport
 | ion transmembrane transport | transmembrane transport
…18 | ion transmembrane transport | hydrogen transport
GO:0015797 | ion transmembrane transport | mannitol transport
GO:0042026 | protein refolding | protein …

4.1.2 Circular Trees. There are two further ways of visualizing trees: as radial trees and as circular trees [76]. In the radial tree, the hierarchical tree structure is visualized in an annulus wedge; the algorithm was proposed by P. Eades [9]. In the circular tree visualization, the root is placed at the central position and the leaf nodes are equally distributed around the perimeter of a circle (see Fig. 3) [76]. The hierarchy in this case is shown as a tree-structured graph. Coloring and labelling are used to improve the representation of tree graphs. The circular tree layout is also an explicit method used for representing phylogenetic trees [7].

Figure 3: Circular tree (visualized with R).

4.1.3 Indented Trees. The indented layout is another way of representing hierarchies (Fig. 4). The data is plotted along the vertical axis, and indentation is used to represent parent/children relationships. This type of visualization is commonly used in interface systems or online, as it allows easy access to the required information by scrolling down. However, this technique does not yield a format suitable for publication and hence cannot be used as an effective overview of the data.

Figure 4: Indented tree.
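As a small illustration of the indented layout, the sketch below turns parent-child pairs, stored in the adjacency-list style of Table 5, into an indented text tree. It is a minimal sketch under stated assumptions: the function name `indented_tree`, the artificial root label "GO terms", and the reduced list of pairs are hypothetical choices made for this example and are not part of REVIGO or of the original figure. Pairs in which a term is listed as its own child, as happens in Table 5, are skipped.

```python
from collections import defaultdict

# (parent, child) pairs in the adjacency-list style of Table 5.
# "GO terms" is a hypothetical root added only for this sketch.
PAIRS = [
    ("GO terms", "nucleoside triphosphate metabolism"),
    ("GO terms", "response to herbicide"),
    ("nucleoside triphosphate metabolism", "alanine biosynthetic process"),
    ("nucleoside triphosphate metabolism", "regulation of translation"),
    ("response to herbicide", "response to herbicide"),  # self-pair, skipped
    ("response to herbicide", "response to acidic pH"),
]


def indented_tree(pairs, root, indent="  "):
    """Return the lines of an indented tree, one node per line."""
    children = defaultdict(list)
    for parent, child in pairs:
        if parent != child:  # a term listed as its own child is not a new node
            children[parent].append(child)

    lines = []

    def visit(node, depth):
        # One line per node; the indentation depth encodes the tree level.
        lines.append(indent * depth + node)
        for child in children.get(node, []):
            visit(child, depth + 1)

    visit(root, 0)
    return lines


print("\n".join(indented_tree(PAIRS, "GO terms")))
```

Because every node occupies its own line, the output grows vertically with the number of nodes, which is why the layout suits scrolling interfaces better than a single-page overview.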

4.2 Implicit Visualizations

4.2.1 Tree-maps. Tree-maps are a space-filling (implicit) technique used to represent hierarchical structures; they were proposed in 1992 and are described in [55][67][61]. Tree-maps apply a recursive algorithm for visualizing nested rectangles. The tree-map uses the outer rectangle as the tree's root, and the inner space of this rectangle is filled with nested rectangles representing the parent/children relationships. This space is divided between the parent nodes as rectangles according to their assigned weights, and each parent is subdivided among its related children as further rectangles (see Fig. 5). Alternative algorithms have been proposed [5][79], as the original method suffers from the creation of narrow rectangles that impair the visualization's readability. The original tree-map layout was “slice and dice”; the idea has since been extended with the development of the web-based tree-map by Wattenberg [75], the strip or ordered tree-map algorithm [3], and the spiral layout algorithm that enables the reader to see changes in hierarchical data [66].

Figure 5: Tree-map, R(treemaps).

Tree-maps make efficient use of the available display space, and provide a good overview of the entire data hierarchy. The size of each rectangle is relative to the size of the related data object, which simplifies data interpretation and evaluation. The color-coding helps to distinguish between different cluster groups and also helps to show the relationship between children and parent nodes. The main graphical parameters for tree-map plotting are visualization area size, position and color-coding [71].

On the other hand, tree-map visualization becomes poor as the size of the input data increases. While tree-map graphs still provide a data overview, supporting visualization objects such as labels cannot be drawn on small rectangles. The visualization of GO terms with tree-maps is an example of using the tree-map layout in biology [66].

4.2.2 Sunbursts. An alternative space-filling visualization is to represent data in a radial layout such as the Sunburst (see Fig. 6) [65][29]. The hierarchy is represented from the center outwards. The inner circle is the root of the hierarchical data, and multiple layers of rings represent the parent-child relationships next to each other [67]. As the Sunburst is a circular space-filling technique, the corners of the provided display space are unused. The wedge size is relative to the cluster size. The interpretation of wedge sizes is relatively easy for the reader, as each slice is represented in a familiar proportional way. However, in the case of narrow wedges, the readability and evaluation of the visualization become poor. This leads to a similar problem of losing some graph labels, but it can be addressed by using the empty space around the circular layout. As with tree-maps, the Sunburst uses colouring to improve the readability of the visualization. Other available space-filling visualization techniques are the Voronoi diagram [2], Ellimaps [47], icicle plots [34] and Beamtree [17].

Figure 6: Sunburst diagram.

5 CONCLUSIONS

Within the scope of this paper, we reviewed the most common visualization techniques available for hierarchically structured data in 2D space. Within this conclusion section, we pinpoint the 7 most relevant categories for classifying and characterizing biological visualizations to support the development of visualization taxonomies.

5.1 Visualization Technique

Visualization techniques for hierarchically structured data in 2D can be classified as explicit (dendrograms, circular trees, indented trees) and implicit (tree-maps, Sunburst) methods. All of them have advantages and disadvantages, and the choice of the most suitable visualization technique depends on the final representation goal. For example, space-filling techniques are the best for representing a global overview of final results. However, with increasing data size, details such as labels are often omitted to avoid cluttering the final picture. Tree-maps utilize the complete display space, while the Sunburst uses only part of it. On the other hand, the Sunburst provides a more intuitive understanding of the relationships between data values due to its proportional representation of relationships [29]; it is harder to see the size difference between rectangles in tree-maps.

5.2 Visualization Design

In addition to an appropriate selection of the visualization method, it is important to apply suitable supporting visualization features:

(1) applying color-coding;
(2) providing legend information if necessary;
(3) ordering results appropriately;
(4) zooming into a region of interest;
(5) displaying additional supporting parameters (e.g. numerical proportions of pie slices); and
(6) using the same font for labeling, and others.
The appropriate utilization of such features makes visualization more intuitive to comprehend.

5.3 Interactive Multimedia Features

Modern technologies provide various techniques for exploring big data in real-time, such as interactive methods an
