Transcription

Argonne Training Program on Extreme-Scale Computing (ATPESC): Data Analysis and Visualization

Visualization & Data Analysis
ATPESC 2017, July 30 – August 11, 2017

Time | Title of presentation | Lecturer
8:30 am | Visualization Introduction | Mike Papka, Joe Insley, Silvio Rizzi, ANL
9:30 am | Large Scale Visualization with ParaView (Presentation) | Dan Lipsa, Kitware
10:30 am | Break |
11:00 am | Large Scale Visualization with ParaView (Hands-on Exercises) | Dan Lipsa, Kitware
12:00 pm | Visualization and Analysis of Massive Data with VisIt (Presentation) | Cyrus Harrison, LLNL
12:30 pm | Lunch and Hands-on Exercises |
1:30 pm | Visualization and Analysis of Massive Data with VisIt (Hands-on Exercises) | Cyrus Harrison, LLNL
3:00 pm | Break |
3:30 pm | Scalable Molecular Visualization and Analysis Tools in VMD | John Stone, UIUC
4:30 pm | Exploring Visualization with Jupyter Notebooks | Mike Papka, Joe Insley, Silvio Rizzi, ANL
5:30 pm | Dinner Talk: Big Data Brain Maps at Argonne National Laboratory | Bobby Kasthuri
6:30 pm | Hands-on Exercises |

Argonne Training Program on Extreme-Scale Computing (ATPESC)
Visualization Introduction
Mike Papka, Joe Insley, Silvio Rizzi
Argonne Leadership Computing Facility, Argonne National Laboratory
Q Center, St. Charles, IL (USA)
August 10, 2017

Here’s the plan
  Examples of visualizations
  Visualization resources
  Visualization tools and formats
  Data representations
  Annotation and movie creation
  Visualization for debugging
  In-Situ Visualization and Analysis

Multi-Scale Simulation / Visualization: Arterial Blood Flow
  [Figure labels: Anterior Cerebral, Middle Cerebral, Aneurysm, Platelets, Left Interior Carotid Artery, Right Interior Carotid Artery, Basilar, Vertebral]
  Data courtesy of: George Karniadakis and Leopold Grinberg, Brown University

Climate
  Data courtesy of: Mark Taylor, Sandia National Laboratories; Rob Jacob, Argonne National Laboratory; Warren Washington, National Center for Atmospheric Research

Aerospace (Jet Nozzle Noise)
  Data courtesy of: Anurag Gupta and Umesh Paliath, General Electric Global Research

Materials Science / Molecular
  Data courtesy of: Subramanian Sankaranarayanan, Argonne National Laboratory
  Data courtesy of: Advanced Photon Source, Argonne National Laboratory
  Data courtesy of: Jeff Greeley, Nichols Romero, Argonne National Laboratory

Cosmology
  Data courtesy of: Salman Habib, Katrin Heitmann, and the HACC team, Argonne National Laboratory

Cooley
  Analytics/Visualization cluster
  Peak 223 TF
  126 nodes; each node has
    – Two Intel Xeon E5-2620 Haswell 2.4 GHz 6-core processors
    – NVIDIA Tesla K80 graphics processing unit (24 GB)
    – 384 GB of RAM
  Aggregate RAM of 47 TB
  Aggregate GPU memory of 3 TB
  Cray CS System
  216 port FDR IB switch with uplinks to our QDR infrastructure
  Mounts the same GPFS file systems as Mira, Cetus
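The aggregate figures on this slide follow from the per-node numbers. A quick arithmetic check, assuming the slide's "TB" means 1024 GB (the helper name `aggregate_tb` is ours, not from the slides):

```python
# Per-node figures quoted on the slide.
NODES = 126
RAM_PER_NODE_GB = 384
GPU_MEM_PER_NODE_GB = 24


def aggregate_tb(per_node_gb, nodes=NODES):
    """Total across all nodes, converted from GB to TB (assuming 1 TB = 1024 GB)."""
    return per_node_gb * nodes / 1024


print(f"Aggregate RAM: {aggregate_tb(RAM_PER_NODE_GB):.2f} TB")
print(f"Aggregate GPU memory: {aggregate_tb(GPU_MEM_PER_NODE_GB):.2f} TB")
```

Under that assumption the totals come out slightly above 47 TB of RAM and slightly below 3 TB of GPU memory, which matches the rounded values on the slide.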

VISUALIZATION TOOLS AND DATA FORMATS

All Sorts of Tools
  Visualization Applications
    – VisIt
    – ParaView
    – EnSight
  Domain Specific
    – VMD, PyMol, Ovito
  APIs
    – VTK: visualization
    – ITK: segmentation & registration
  GPU performance
    – vl3: shader-based volume rendering
  Analysis Environments
    – Matlab
    – Parallel R
  Utilities
    – GnuPlot
    – ImageMagick

ParaView & VisIt vs. vtk
  ParaView & VisIt
    – General purpose visualization applications
    – GUI-based
    – Scriptable
    – Extendable
    – Built on top of vtk (largely)
  vtk
    – Programming environment / API
    – Additional capabilities, finer control
    – Smaller memory footprint
    – Requires more expertise (build custom applications)

Data File Formats (ParaView & VisIt)
  VTK
  Parallel (partitioned) VTK
  VTK MultiBlock (MultiGroup, Hierarchical, Hierarchical Box)
  Legacy VTK
  Parallel (partitioned) legacy VTK
  EnSight files
  EnSight Master Server
  Exodus
  BYU
  XDMF
  PLOT2D
  PLOT3D
  SpyPlot CTH
  HDF5 raw image data
  DEM
  VRML
  PLY Polygonal
  Protein Data Bank
  XMol Molecule
  Stereo Lithography
  Gaussian Cube
  Raw (binary)
  AVS
  Meta Image
  Facet
  PNG
  SAF
  LS-Dyna
  Nek5000
  OVERFLOW
  paraDIS
  PATRAN
  PFLOTRAN
  Pixie
  PuReMD
  S3D
  SAS
  Tetrad
  UNIC
  VASP
  ZeusMP
  ANALYZE
  BOV
  GMV
  Tecplot
  Vis5D
  Xmdv
  XSF

Data Wrangling
  XDMF
    – XML wrapper around HDF5 data
    – API for writing from simulation code
    – Can define data sets, subsets, hyperslabs
  vtk
    – Could add to your simulation code
    – Can write small utilities to convert data: use your own read routines, write vtk data structures
    – C++ and Python bindings
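To make "XML wrapper around HDF5 data" concrete, here is a minimal sketch that generates an XDMF description for one scalar array on a uniform grid. It uses only the Python standard library and does not touch the HDF5 file itself. The file name, dataset path, array name, origin, and spacing are illustrative assumptions, and the function `build_xdmf` is ours rather than part of any XDMF API.

```python
import xml.etree.ElementTree as ET


def build_xdmf(h5_file, dataset, dims, name="density"):
    """Return XDMF XML text pointing at one scalar array inside an HDF5 file.

    dims is (nz, ny, nx): slowest-varying axis first, as XDMF expects.
    """
    dim_text = " ".join(str(d) for d in dims)
    root = ET.Element("Xdmf", Version="2.0")
    domain = ET.SubElement(root, "Domain")
    grid = ET.SubElement(domain, "Grid", Name="mesh", GridType="Uniform")
    # Regular grid: topology is implied by the dimensions alone.
    ET.SubElement(grid, "Topology", TopologyType="3DCoRectMesh", Dimensions=dim_text)
    # Geometry given as origin plus spacing (assumed values: origin 0, spacing 1).
    geom = ET.SubElement(grid, "Geometry", GeometryType="ORIGIN_DXDYDZ")
    for values in ("0 0 0", "1 1 1"):
        item = ET.SubElement(geom, "DataItem", Dimensions="3",
                             NumberType="Float", Format="XML")
        item.text = values
    # The heavy data stays in HDF5; XDMF only records where to find it.
    attr = ET.SubElement(grid, "Attribute", Name=name,
                         AttributeType="Scalar", Center="Node")
    data = ET.SubElement(attr, "DataItem", Dimensions=dim_text,
                         NumberType="Float", Precision="4", Format="HDF")
    data.text = f"{h5_file}:{dataset}"
    return ET.tostring(root, encoding="unicode")


print(build_xdmf("output.h5", "/fields/density", (64, 64, 64)))
```

Saving the returned text as a .xmf file next to the HDF5 file is the usual way to hand such data to ParaView or VisIt; the precision and number type must match what the simulation actually wrote.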

Data Organization
  Format
    – Existing tools support many flavors
    – Use one of these formats: vtk data types, XDMF for HDF5 (VisIt and ParaView), custom
    – Use (or write) a format converter
    – Write a custom reader for existing tool
    – Write your own custom vis tool
  Serial vs. Parallel/Partitioned
    – Single big file vs. many small files: middle ground generally best
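As an illustration of the "write a format converter" option, the sketch below writes a scalar field to an ASCII legacy VTK file (STRUCTURED_POINTS dataset) using only the standard library. The function name `write_legacy_vtk`, the unit spacing, and the zero origin are assumptions made for the example; a real converter would take these from the simulation's own read routines.

```python
def write_legacy_vtk(path, values, dims, name="scalar"):
    """Write a flat list of point values as a legacy VTK structured-points file.

    dims is (nx, ny, nz); values must be ordered with x varying fastest.
    """
    nx, ny, nz = dims
    if len(values) != nx * ny * nz:
        raise ValueError("number of values does not match dims")
    with open(path, "w") as f:
        f.write("# vtk DataFile Version 3.0\n")
        f.write("written by example converter\n")
        f.write("ASCII\n")
        f.write("DATASET STRUCTURED_POINTS\n")
        f.write(f"DIMENSIONS {nx} {ny} {nz}\n")
        f.write("ORIGIN 0 0 0\n")    # assumed origin
        f.write("SPACING 1 1 1\n")   # assumed uniform spacing
        f.write(f"POINT_DATA {len(values)}\n")
        f.write(f"SCALARS {name} float 1\n")
        f.write("LOOKUP_TABLE default\n")
        # Six values per line keeps the file readable in a text editor.
        for start in range(0, len(values), 6):
            chunk = values[start:start + 6]
            f.write(" ".join(str(float(v)) for v in chunk) + "\n")
```

ASCII output is convenient for checking a converter on a small case; for large data the binary or XML VTK formats are the more practical choice.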

Data Organization
  Serial vs. Parallel/Partitioned
    – Performance trade-offs
    – vtk/paraview, serial files: all data is read on the head node, then partitioned and distributed
    – vtk/paraview, parallel files: a set of serial files, already partitioned
  Performance example:
  Single serial .vtu file (unstructured grid)
    – Data size: 3.8 GB
    – Read time on 64 processes: 15 minutes (most of this was spent partitioning and distributing)
  Partitioned .pvtu file (unstructured grid)
    – Data size: 8.7 GB (64 partitions)
    – Read time on 64 processes: 1 second
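A partitioned .pvtu file is a small XML index that declares the arrays and lists the per-partition .vtu pieces, so each process can open only its own piece. The sketch below builds such an index with the standard library. The piece file names, the array name, and the Float32 types are assumptions for illustration and must match what the pieces really contain; `build_pvtu` is a name chosen for this example.

```python
import xml.etree.ElementTree as ET


def build_pvtu(piece_files, point_arrays):
    """Return the XML text of a .pvtu index.

    piece_files:  list of .vtu file names, one per partition.
    point_arrays: list of (name, vtk_type) pairs, e.g. ("pressure", "Float32").
    """
    root = ET.Element("VTKFile", type="PUnstructuredGrid",
                      version="0.1", byte_order="LittleEndian")
    grid = ET.SubElement(root, "PUnstructuredGrid", GhostLevel="0")
    # Declare the point-data arrays every piece is expected to provide.
    point_data = ET.SubElement(grid, "PPointData")
    for name, vtk_type in point_arrays:
        ET.SubElement(point_data, "PDataArray", type=vtk_type, Name=name)
    # Declare the coordinate array layout (assumed Float32, 3 components).
    points = ET.SubElement(grid, "PPoints")
    ET.SubElement(points, "PDataArray", type="Float32", NumberOfComponents="3")
    # One Piece entry per partition file.
    for piece in piece_files:
        ET.SubElement(grid, "Piece", Source=piece)
    return ET.tostring(root, encoding="unicode")


pieces = [f"mesh_{rank:04d}.vtu" for rank in range(64)]
index_text = build_pvtu(pieces, [("pressure", "Float32")])
print(index_text[:200])
```

With 64 pieces and 64 reader processes, each process reads one file, which is the situation behind the 1 second read time in the performance example.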

DATA REPRESENTATIONS

Data Representations: Volume Rendering

Data Representations: Glyphs
  2D or 3D geometric object to represent point data
  Location dictated by coordinate
    – 3D location on mesh
    – 2D position in table/graph
  Attributes of the graphical entity are dictated by attributes of the data
    – color, size, orientation
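The mapping from data attributes to glyph attributes can be sketched without any visualization library. In this hypothetical example each point carries a vector and a scalar: the vector magnitude sets the glyph size, the vector direction sets its orientation, and the scalar sets a blue-to-red color. The function name and the particular mapping are choices made for illustration, not how VisIt or ParaView implement glyphs.

```python
import math


def make_glyphs(points, vectors, scalars, max_size=1.0):
    """Return one glyph description (a dict) per data point."""
    magnitudes = [math.sqrt(sum(c * c for c in v)) for v in vectors]
    max_mag = max(magnitudes) or 1.0          # avoid dividing by zero
    lo, hi = min(scalars), max(scalars)
    span = (hi - lo) or 1.0
    glyphs = []
    for p, v, mag, s in zip(points, vectors, magnitudes, scalars):
        # Orientation: unit vector along the data vector.
        direction = tuple(c / mag for c in v) if mag > 0 else (0.0, 0.0, 0.0)
        # Color: linear ramp from blue (lowest scalar) to red (highest).
        t = (s - lo) / span
        glyphs.append({
            "position": tuple(p),                 # location from the coordinate
            "size": max_size * mag / max_mag,     # size from vector magnitude
            "orientation": direction,
            "color": (t, 0.0, 1.0 - t),           # (red, green, blue)
        })
    return glyphs


for glyph in make_glyphs([(0, 0, 0), (1, 0, 0)],
                         [(0, 0, 2), (3, 4, 0)],
                         [10.0, 20.0]):
    print(glyph)
```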

Data Representations: Contours (Isosurfaces)
  A Line (2D) or Surface (3D), representing a constant value
  VisIt & ParaView:
    – good at this
  vtk:
    – same, but again requires more effort
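The idea of a contour as the set of locations with a constant value can be sketched in 2D: wherever the data crosses the chosen value between two neighboring grid points, linear interpolation gives a point on the contour line. This is only the edge-crossing step that contouring algorithms such as marching squares build on, not a full implementation, and the function name is ours.

```python
def edge_crossings(grid, iso):
    """Return (x, y) points where a 2D grid of values crosses the value iso.

    grid[y][x] holds the value at integer position (x, y).
    """
    def crossing(a, b):
        # Fraction along the edge at which the interpolated value equals iso,
        # or None when both ends lie on the same side of iso.
        if (a < iso) == (b < iso):
            return None
        return (iso - a) / (b - a)

    points = []
    rows, cols = len(grid), len(grid[0])
    for y in range(rows):
        for x in range(cols):
            if x + 1 < cols:  # horizontal edge to the right neighbor
                t = crossing(grid[y][x], grid[y][x + 1])
                if t is not None:
                    points.append((x + t, float(y)))
            if y + 1 < rows:  # vertical edge to the neighbor below
                t = crossing(grid[y][x], grid[y + 1][x])
                if t is not None:
                    points.append((float(x), y + t))
    return points


print(edge_crossings([[0.0, 1.0], [0.0, 1.0]], 0.5))
```

Connecting these points cell by cell yields the line (2D) or, with the analogous 3D procedure, the isosurface.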

Data Representations: Cutting Planes
  Slice a plane through the data
    – Can apply additional visualization methods to resulting plane
  VisIt & ParaView & vtk good at this
  VMD has similar capabilities for some data formats
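For regularly gridded data, the simplest cutting plane is an axis-aligned slice, which reduces to indexing. A minimal sketch, assuming the volume is stored as nested lists indexed `volume[z][y][x]` (the function name and storage layout are assumptions for the example; arbitrary oblique planes need interpolation and are what the tools above provide):

```python
def slice_volume(volume, axis, index):
    """Extract a 2D slice from a 3D volume stored as volume[z][y][x]."""
    if axis == "z":   # plane of constant z: rows are y, columns are x
        return [row[:] for row in volume[index]]
    if axis == "y":   # plane of constant y: rows are z, columns are x
        return [plane[index][:] for plane in volume]
    if axis == "x":   # plane of constant x: rows are z, columns are y
        return [[row[index] for row in plane] for plane in volume]
    raise ValueError("axis must be 'x', 'y', or 'z'")


# Small test volume whose values encode their own position.
volume = [[[x + 10 * y + 100 * z for x in range(3)]
           for y in range(2)]
          for z in range(2)]
print(slice_volume(volume, "z", 1))
```

The resulting 2D array can then be contoured or color-mapped like any other 2D dataset.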

Data Representations: Streamlines
  From vector field on a mesh (needs connectivity)
    – Show the direction an element will travel in at any point in time.
  VisIt & ParaView & vtk good at this
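A streamline is traced by repeatedly stepping a seed point along the local vector direction. The sketch below uses midpoint (second-order Runge-Kutta) integration on an analytic 2D field, which sidesteps the mesh lookup and interpolation that a real tool performs using the mesh connectivity. The function names, step size, and the rotating test field are assumptions for illustration.

```python
import math


def trace_streamline(velocity, seed, step=0.01, n_steps=100):
    """Trace a 2D streamline from seed through velocity(x, y) -> (u, v)."""
    x, y = seed
    path = [(x, y)]
    for _ in range(n_steps):
        # Midpoint rule: sample the field half a step ahead, then take the
        # full step using that midpoint velocity.
        u1, v1 = velocity(x, y)
        u2, v2 = velocity(x + 0.5 * step * u1, y + 0.5 * step * v1)
        x, y = x + step * u2, y + step * v2
        path.append((x, y))
    return path


def rotation(x, y):
    """Rigid rotation about the origin; its streamlines are circles."""
    return (-y, x)


path = trace_streamline(rotation, (1.0, 0.0))
print(len(path), math.hypot(*path[-1]))
```

For the rotating field the traced points stay very close to the unit circle, which is a convenient sanity check on the integrator.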

Molecular Dynamics Visualization
  VMD:
    – Lots of domain-specific representations
    – Many different file formats
    – Animation
    – Scriptable
    – Not parallel
  VisIt & ParaView:
    – Limited support for these types of representations, but improving
  VTK:
    – Anything’s possible if you try hard enough

ANNOTATION AND MOVIE CREATION

Annotation, compositing, scaling
  ImageMagick
    – convert, composite, montage, etc.
  The example composites four legend images onto a 3200x2000 frame and labels each one; every label is drawn twice, first with a black stroke as an outline and then with a white fill:

  convert comp-test-in-3200x2000.png -font Arial.ttf -pointsize 40 -gravity northwest \
    -fill black -draw 'rectangle 18,103,821,157' legend-big.png -geometry +20+105 -composite \
    -fill black -draw 'rectangle 2375,103,3178,157' legend-big.png -geometry +2377+105 -composite \
    -fill black -draw 'rectangle 18,1815,821,1869' legend-big.png -geometry +20+1817 -composite \
    -fill black -draw 'rectangle 2375,1815,3178,1869' legend-big.png -geometry +2377+1817 -composite \
    -stroke '#000F' -strokewidth 3 -annotate +13+155 '0.0' -stroke none -fill white -annotate +13+155 '0.0' \
    -stroke '#000F' -strokewidth 3 -annotate +755+155 '25.0' -stroke none -fill white -annotate +755+155 '25.0' \
    -pointsize 40 -gravity northeast \
    -stroke '#000F' -strokewidth 3 -annotate +775+155 '0.0' -stroke none -fill white -annotate +775+155 '0.0' \
    -stroke '#000F' -strokewidth 3 -annotate +20+155 '83.4' -stroke none -fill white -annotate +20+155 '83.4' \
    -gravity southwest \
    -stroke '#000F' -strokewidth 3 -annotate +13+83 '0.0' -stroke none -fill white -annotate +13+83 '0.0' \
    -stroke '#000F' -strokewidth 3 -annotate +702+83 '5.0e-27' -stroke none -fill white -annotate +702+83 '5.0e-27' \
    -gravity southeast \
    -stroke '#000F' -strokewidth 3 -annotate +775+83 '0.0' -stroke none -fill white -annotate +775+83 '0.0' \
    -stroke '#000F' -strokewidth 3 -annotate +28+83 '5.0e-5' -stroke none -fill white -annotate +28+83 '5.0e-5' \
    -depth 8 comp-test-image02.png

Annotation, compositing, scaling
  ImageMagick
    – scale, fade

Movie Creation
  VisIt and ParaView can spit out a movie file (.avi, etc.)
    – can also spit out individual images
  Combine multiple segments of frames
    – Create a directory of symbolic links to all frames in order
  ffmpeg: Movie encoding
    – ffmpeg -sameq -i frame.%04d.png movie.mp4
      (-sameq was removed from later ffmpeg releases; omit it there)
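The "directory of symbolic links" step can be scripted. The sketch below walks several segment directories in the order given, sorts the PNG frames within each, and creates sequentially numbered links that match the `frame.%04d.png` pattern used in the ffmpeg command. The directory names in the usage comment are hypothetical, and the function assumes a platform where `os.symlink` is permitted.

```python
import os


def link_frames(segments, out_dir, pattern="frame.%04d.png"):
    """Symlink every .png in each segment directory into out_dir, renumbered.

    segments: directories listed in the order they should appear in the movie.
    Returns the total number of frames linked.
    """
    os.makedirs(out_dir, exist_ok=True)
    index = 0
    for segment in segments:
        # Sorted order within a segment relies on zero-padded frame names.
        for name in sorted(os.listdir(segment)):
            if not name.endswith(".png"):
                continue
            source = os.path.abspath(os.path.join(segment, name))
            os.symlink(source, os.path.join(out_dir, pattern % index))
            index += 1
    return index


# Hypothetical usage:
#   link_frames(["intro_frames", "flythrough_frames"], "all_frames")
#   then run ffmpeg inside all_frames with -i frame.%04d.png
```

Links avoid copying or renaming the original frames, so segments can be reordered or re-rendered and the link directory simply rebuilt.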

VISUALIZATION FOR DEBUGGING

Visualization for Debugging


Visualization as Diagnostics: Color by Thread ID

IN-SITU VISUALIZATION AND ANALYSIS

Multiple in-situ infrastructures
  LibSim

CAN WE ...
  Enable use of any in situ framework?
  Develop analysis routines that are portable between codes?
  Make it easy to use?
OUR APPROACH
  Data model – to pass data between Simulation & Analysis
  API – for instrumenting simulation and analysis codes

Data Model: VTK
  Used by ParaView/Catalyst and VisIt/Libsim
  Supports common scientific dataset types
  http://www.vtk.org/
  Ongoing independent efforts to evolve for exascale
  Supports using simulation memory directly (zero-copy) for multiple memory layouts
  [Diagram: simulation, DATA MODEL, three analysis blocks]

SENSEI: API
  [Diagram; only the label "analysis" is recoverable]

INSTRUMENTATION TASKS
FOR SIMULATION
  Write a Data Adaptor to map simulation data to VTK data model
  Write a Bridge to define API entry points for simulation
FOR ANALYSIS
  Write analysis adaptor that uses Data Adaptor API to access Data
  Transform data, if needed, and invoke analysis
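The division of labor among data adaptor, bridge, and analysis adaptor can be sketched in a few lines. Everything below is a hypothetical pure-Python illustration of the pattern: the class and method names are invented for this example, and it is not the SENSEI API, which maps simulation data to the VTK data model rather than to plain lists.

```python
class DataAdaptor:
    """Simulation side: exposes simulation arrays by name (hypothetical)."""

    def __init__(self):
        self.step = 0
        self._arrays = {}

    def set_array(self, name, values):
        # Keep a reference rather than a copy, in the spirit of zero-copy access.
        self._arrays[name] = values

    def get_array(self, name):
        return self._arrays[name]


class HistogramAnalysis:
    """Analysis side: reads data only through the DataAdaptor interface."""

    def __init__(self, array_name, bins, lo, hi):
        self.array_name, self.bins, self.lo, self.hi = array_name, bins, lo, hi
        self.results = []

    def execute(self, data):
        counts = [0] * self.bins
        width = (self.hi - self.lo) / self.bins
        for value in data.get_array(self.array_name):
            if self.lo <= value <= self.hi:
                # Values equal to hi fall into the last bin.
                counts[min(int((value - self.lo) / width), self.bins - 1)] += 1
        self.results.append((data.step, counts))
        return True


class Bridge:
    """Entry points the simulation calls; it knows nothing about the analyses."""

    def __init__(self, analyses):
        self.data = DataAdaptor()
        self.analyses = list(analyses)

    def analyze(self, step, arrays):
        self.data.step = step
        for name, values in arrays.items():
            self.data.set_array(name, values)
        return all(analysis.execute(self.data) for analysis in self.analyses)


histogram = HistogramAnalysis("pressure", bins=2, lo=0.0, hi=1.0)
bridge = Bridge([histogram])
bridge.analyze(0, {"pressure": [0.1, 0.2, 0.9]})
print(histogram.results)
```

Because the simulation only calls the bridge and the analysis only sees the data adaptor, either side can be swapped without touching the other, which is the portability the approach is after.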

Adding a Catalyst Python Script analysis
  13 lines of CMake code changes
  18 lines of C++ code
  In situ work can be specified via SENSEI XML

Example with Catalyst Python
  [Diagram labels: adaptor, analysis adaptor, bridge, Catalyst Python Script Analysis]

Catalyst Live through Python
  [Diagram labels: adaptor, ParaView Server]


QUESTIONS?
  Silvio [email protected]
  Joe [email protected]
  [email protected]
