
Software Requirements Specification
for
Linux File RAID
[SQA Example]

Version 1.0 draft

Prepared by Sam Siewert
[To be updated, extended and improved by students]

Embry-Riddle Aeronautical University
Computer, Electrical, Software Engineering
Software Engineering Program

February 19, 2018

Copyright 1999 by Karl E. Wiegers and 2018 by Sam Siewert. Permission is granted to use, modify, and distribute this document.

Table of Contents

Revision History
1. Introduction
   1.1 Purpose
   1.2 Document Conventions
   1.3 Intended Audience and Reading Suggestions
   1.4 Product Scope
   1.5 References
2. Overall Description
   2.1 Product Perspective
   2.2 Product Functions
   2.3 Use Cases
   2.4 Use Case and Requirements Tracing
   2.5 Operating Environment
   2.6 Design and Implementation Constraints
   2.7 User Documentation
   2.8 Assumptions and Dependencies
3. External Interface Requirements
   3.1 User Interfaces
   3.2 Hardware Interfaces
   3.3 Software Interfaces
   3.4 Communications Interfaces
4. System Features
   4.1 System Feature 1
   4.2 System Feature 2 (and so on)
5. Test Driven Design Reference Prototypes
   5.1 Checkout and build
   5.2 Running Unit tests
   5.3 Path coverage
   5.4 Profiling
   5.5 I&T scripts
   5.6 System tests and scripts
   5.7 Regression tests and scripts
   5.8 Acceptance tests
6. Other Nonfunctional Requirements
   6.1 Performance Requirements
   6.2 Safety Requirements
   6.3 Security Requirements
   6.4 Software Quality Attributes
   6.5 Business Rules
7. Other Requirements
Appendix A: Glossary
Appendix B: Analysis Models
Appendix C: To Be Determined List

Revision History

Name          Date        Reason For Changes                                                    Version
Sam Siewert   2/19/2018   Initial creation or draft to provide documentation for prototype.     1.0

1. Introduction

Note that this reference design includes reference prototype code, which can be obtained from a non-managed source for convenience as noted in the text below, but the ideal place to obtain the latest reference code corresponding to this documentation is ID-Unit-Test ). I will make every attempt to keep the code and documentation in reasonable synchronization, but it is up to the reader and the software engineer re-using this example to “make it their own”. If you want to check in improvements, you can request to be added to my github, but otherwise the intent is for you to clone as a starting point and fork your own project including documentation, analysis, design, and code files.

1.1 Purpose

This SRS provides an overall specification for Linux File RAID version 1.x.x (RAID: Redundant Array of Inexpensive Disks), a library of operations and an example application that can be used to create storage as a service, similar to commercial products such as Google Drive, MS OneDrive, AWS S3, or Dropbox. This specification and the associated software are not intended to be a product or Cloud service, but serve as a pedagogical example of how to analyze, design and build software like these products, starting with the data storage and protection services, to which students can add features such as web access, security, user friendliness (e.g. browsing), and a wide range of additional features to create their own STaaS (Storage as a Service) application. File level RAID was selected rather than block storage DAS (Direct Attach Storage) or SAN (Storage Area Networking) for simplicity and portability so that a derived STaaS application can be hosted almost anywhere on any platform file system.
The Linux File RAID software is composed of a library of RAID write (encode), read (decode), rebuild (reconstruction of a lost or corrupted file chunk) and basic data compare functions, a file storage and retrieval system with data protection, and unit test code for the library along with basic system testing for the file RAID application. The library code is intended to be re-useable (perhaps with modification and extension) for use in new file RAID applications for new platforms or to provide new STaaS with a web interface. The goal is to support learning objectives related to software analysis, architecture, design, and implementation quality assurance along with testing examples at the unit, I&T (Integration and Test), system, and acceptance testing levels.

1.2 Document Conventions

This document is based upon the IEEE SRS 830-1998 template and documentation standards for software requirements and specification (IEEE 830-1998, 29148-2011).

In this document, UML analysis and design methods are used, where requirements must correlate to use cases, which have hierarchy and priority. Thus, the requirements are prioritized and placed into hierarchy based upon UML use case hierarchy and priority analysis. This is consistent with principles learned in SE 300 (SA/SD - Structured Analysis and Design), SE 310 (Object Oriented Analysis - OOA, OO Design - OOD, OO Programming - OOP) and SE 420 (Software Quality Assurance).

1.3 Intended Audience and Reading Suggestions

The primary readers of this document are anticipated to be upper division undergraduate students working on exercises and projects related to coursework in software engineering analysis, design, and quality assurance. Given the educational goals, this SRS contains both SA/SD as well as OOA/OOD methods of specification such that the analysis and design supports both standard modular procedural programming implementations (e.g. C) and object oriented class structured programming implementations (e.g. C++).

The SRS contains a complete set of:

1. Concept goals and objectives,
2. Requirements identification through use case analysis,
3. System, architecture and module analysis and design (SA/SD and OOA/OOD) including:
   a. Requirements specification,
   b. System use cases and requirements consistency, completeness and correctness analysis,
   c. Architectural module cohesion and coupling analysis for SA and/or CRC (Class-Responsibility-Collaboration) analysis for OOA,
   d. Module design (architectural decomposition into abstract and concrete modules or classes and interfaces between them and behavior),
   e. Modules detailed design and prototype (detailed structural and behavioral specification of modules – class and/or package diagrams and directory structure as well as flow-charts or activity diagrams, interaction sequence diagrams, and state machines) along with C or C++ module prototype code.
4. Requirements validation and verification with acceptance test plan
5. System design (use cases and high level abstract modules) and end-to-end test plan of a CSCI (Computer Software Configuration Item)
6. I&T test plan for architecture CSC (Computer Software Component)
7. Unit tests for modules or CSUs (Computer Software Units)

Overall, the SRS is intended to provide a reasonably complete specification of the example Linux File RAID software at the level of a working example that can be improved by students for practice and, if they wish, used as a starting point for a more complete STaaS final project.

1.4 Product Scope

Goals for this SRS and the associated analysis, design, testing, validation and verification, and prototype software are strictly educational. The learning objectives include experience reviewing, improving, refactoring, finding defects, composing fixes and regression testing at all levels of analysis, design, and implementation of this example.

Specific objectives include learning objectives related to each phase of the “V” model, with an Agile evolutionary approach to analysis, design, development and test. Specifically, it is assumed that rather than strict waterfall phases of development, students will make quick two-week sprints through phases of analysis, design, and implementation with a test-driven design approach (consideration of testing first or at least concurrent with design).

By the time a student is done reviewing and improving this reference SRS and associated software at the unit level (CSU – Computer Software Unit) and test application (CSCI – Computer Software Configuration Item) level, they should have a good working knowledge of test-driven evolutionary software development. This can also be a starting point for a longer term project or
even a capstone design project to build an entire system service (e.g. STaaS) or platform hosted application suitable for deployment.

1.5 References

1. https://www.dau.mil/glossary/Pages/Default.aspx
2. International Organization for Standardization/International Electrotechnical Commission/Institute of Electrical and Electronics Engineers (ISO/IEC/IEEE) Standard 24765:2010: Systems and Software Engineering – Vocabulary.
3. https://www.computer.org/web/swebok/v3, Bourque, Pierre, and Richard E. Fairley. Guide to the Software Engineering Body of Knowledge (SWEBOK (R)): Version 3.0. IEEE Computer Society Press, 2014.
4. http://agilemanifesto.org/
5. https://www.omg.org/spec/UML/2.5.1/, UML 2.5.x specification.
6. Kendall, Penny A. Introduction to Systems Analysis and Design: A Structured Approach. Business & Educational Technologies, 1995.
7. Rumbaugh, James, Ivar Jacobson, and Grady Booch. The Unified Modeling Language Reference Manual. Pearson Higher Education, 2004.
8. https://raid.wiki.kernel.org/index.php/RAID setup, Linux software RAID using MDADM.
9. Welch, Brent, et al. "Scalable Performance of the Panasas Parallel File System." FAST. Vol. 8. 2008.
10. Libes, Don. Exploring Expect: A Tcl-based Toolkit for Automating Interactive Programs. O'Reilly Media, Inc., 1995.

2. Overall Description

2.1 Product Perspective

The purpose of the Linux File RAID software libraries and example application is to provide a working example of data storage with data protection and to serve as a starting point for a student wishing to build a more complete application or storage service and to practice QA and analysis and design methods to improve this example.
As such, the design includes a basic library of RAID operations, and the example File RAID application simply provides the ability to store files in chunks that can be mapped to distinct and separate storage subsystems to prevent loss of data when a storage subsystem fails or is lost (a chunk erasure). While RAID is most often implemented at a block and HDD (Hard Disk Drive) or SSD (Solid-State Disk) device level, it can also be implemented at a file level where large files are automatically broken down into smaller chunks (sub-files) that are distributed across multiple file systems to provide data protection equivalent to block level RAID, but with more portability and simplified testing compared to block level RAID.

For students, it is not practical to expect them to have access to a SAS/SATA (Serial Attached SCSI or Serial ATA) disk array with root access on a Linux system, so the Linux File RAID allows them to learn about RAID and at the same time learn about software system (service) and
application analysis, design, implementation and test. Students wishing to compare block RAID to file RAID may want to consult Linux MDADM (Multi-Disk Administration) and examples of use with RAM (Random Access Memory) disks for tutorials on configuration, use and operation of software RAID without dedicated HDD arrays. The Linux File RAID is mostly for learning and practice with RAID, but could be used in an actual deployment as long as each file chunk is mapped to a unique file system.

RAID has levels including striping (level 0), mirroring (level 1) and XOR parity (level 5) as well as double XOR parity (level 6). These levels can be combined so that unique block devices are striped and mirrored (level 10, sometimes called 0+1) or XOR parity sets are striped (level 50). Furthermore, the number of independent disks (or file systems) is variable, but normally ranges from the minimum of a mirror pair (two disks or file systems) up to any number of three or more for XOR parity schemes. The fundamentals of RAID systems are covered in CS 317, file and database systems, and in many excellent references and examples such as Linux block level software RAID (MDADM). The same principles used to map blocks (typically 512 bytes of data at a time up to 4 kilobytes) onto a set of independent HDD or SSD storage devices can also be used to break a large file up into file chunks (sub-files) that can be mapped onto multiple independent file systems. The block level is the most common RAID system type, where a file system is then built on top of this new RAID volume (virtual block storage device). However, file level is used in real products (e.g. Panasas - https://www.panasas.com/ ) and serves as a convenient alternative to dealing with low-level block devices. All of the same data protection can be achieved as long as file chunks are mapped to independent file systems rather than independent storage devices.
This approach is actually taken for scale-out file systems (federated file systems) and is an alternative to SAN (Storage Area Networking) and can work with NAS (Network Attached Storage) or be used as a NAS alternative. It is potentially a great way to provide STaaS (Storage as a Service) where the main goal is to provide a “bit bucket” to share files, back them up, or provide a web based distribution for files. Figure 1 shows a basic RAID 10, where mirror pair block level storage devices (HDD or SSD) are striped.

Figure 1. RAID-10 striping and mirroring of data blocks on 6 storage devices
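
As a rough illustration of the striped mirror pairs in Figure 1, the sketch below maps block numbers onto six devices arranged as three mirror pairs. The function names, the device numbering, and the round-robin placement rule are assumptions made for this example; they are not taken from the reference prototype.

```python
# Illustrative sketch only: names and the placement rule are assumptions,
# not the reference prototype's implementation.

def raid10_layout(num_blocks, num_pairs=3):
    """Map each block index to the two devices of the mirror pair holding it.

    Devices are numbered 0..(2 * num_pairs - 1); pair p is devices 2p and 2p + 1.
    Blocks are striped round-robin across the mirror pairs.
    """
    layout = {}
    for block in range(num_blocks):
        pair = block % num_pairs
        layout[block] = (2 * pair, 2 * pair + 1)
    return layout


def readable_copy(block, layout, failed_devices):
    """Return a surviving device that holds the block, or None if both mirrors failed."""
    for device in layout[block]:
        if device not in failed_devices:
            return device
    return None


if __name__ == "__main__":
    layout = raid10_layout(num_blocks=6)
    for block, devices in layout.items():
        print(f"block {block} -> devices {devices}")
    # With device 0 failed, block 0 is still readable from its mirror.
    print("block 0 read from device", readable_copy(0, layout, failed_devices={0}))
```

With three pairs, consecutive blocks land on different pairs, so they can be read or written in parallel, while each block still has two copies.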

The idea behind RAID-10 is to provide speed-up for reads and writes by using multiple disk drives in parallel to read portions of a larger set of blocks composing a file or other storage object and, at the same time, to mirror blocks onto multiple independent disk drives in case one fails. If one does fail (a data erasure), the lost data can simply be read from any device that mirrors the same data. When the failed disk drive is replaced by a disk array technician, the system can restore the mirroring automatically. So, RAID has two main system level design goals: speed-up by writing large data sets to many disks in parallel by chunking the original data into blocks, and protection against data loss due to temporary device malfunction or failure. The key to RAID is that the blocks (or more generally chunks) of data must be allocated (mapped) to independent devices or file systems to provide actual speed-up and data loss protection when devices (or file sub-systems) fail. In order to simply learn about RAID and practice with the operations, actual independent devices are not required, but no speed-up and no failure fault tolerance will actually be realized, for example if file RAID is used with just one file system rather than multiple independent file systems. However, not only can one learn about RAID, the basic features and functions can still be tested, and when a system with “n” file systems or “n” devices is available, then the speed-up and failure fault tolerance can likewise be tested (performance constraints and error recovery requirements). Figure 2 shows a basic RAID 50, where XOR parity block level storage devices are striped.

Figure 2. Striping of data blocks over two 4+1 distributed parity RAID-5 sets
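
One way to picture the layout named in the Figure 2 caption is sketched below: stripes alternate between two RAID-5 sets of five devices each, and the parity position moves by one device for each successive stripe in a set. Both the alternation rule and the parity rotation are assumptions chosen for illustration; real RAID-5 implementations offer several rotation schemes, and the function name is hypothetical.

```python
# Illustrative sketch only: the stripe alternation and parity rotation rules
# are assumptions, not the reference prototype's implementation.

def raid50_layout(num_stripes, num_sets=2, devices_per_set=5):
    """Describe, per stripe, which RAID-5 set is used and where parity and data go.

    Stripes are assigned round-robin to the sets. Within a set, the parity
    device advances by one for each successive stripe written to that set.
    """
    layout = []
    for stripe in range(num_stripes):
        raid5_set = stripe % num_sets
        row = stripe // num_sets  # number of stripes already placed on this set
        parity = row % devices_per_set
        data = [d for d in range(devices_per_set) if d != parity]
        layout.append({"set": raid5_set, "parity": parity, "data": data})
    return layout


if __name__ == "__main__":
    for stripe, row in enumerate(raid50_layout(num_stripes=6)):
        print(f"stripe {stripe}: set {row['set']}, "
              f"parity on device {row['parity']}, data on {row['data']}")
```

Every stripe uses four data devices and one parity device within its set, which is the 4+1 arrangement in the caption.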

For RAID-50, RAID-5 sets are striped and each RAID-5 set of devices can in fact be any number of devices as long as each set is one parity plus at least one data device, typically n+1 (5 shown here). While it’s possible to make one of the devices a dedicated parity device, most often the parity is distributed, and parity for any particular write set (n blocks or chunks) must simply be placed on an independent device (+1 device) so that any missing block can be recovered by re-computing the parity of the remaining blocks.

Due to the ability to emulate and test a wide range of RAID configurations with file RAID, the design does not provide any block level RAID capability, only file chunk RAID. This supports learning objectives better by allowing students to run code on any Linux system (and easily port the solution to a wide range of POSIX compliant systems such as Mac OS X, FreeBSD, etc.).

To simplify and allow for experimentation with RAID anywhere, the Linux File RAID simply models each independent storage chunk in a RAID mapping as a sub-file that composes a larger file as shown in Figure 3.

Figure 3. Linux File RAID Dataflow
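
The parity recovery described above can be sketched in a few lines: the XOR of the surviving chunks, including the parity chunk, reproduces the missing one. This is a minimal in-memory sketch under stated assumptions, namely four data chunks plus one parity chunk, zero padding to equalize chunk sizes, and hypothetical function names; it does not reproduce the reference prototype's code or its on-disk format.

```python
# Illustrative sketch only: chunk count, padding, and names are assumptions.
from functools import reduce


def xor_chunks(chunks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*chunks))


def encode(data, num_data_chunks=4):
    """Split data into equal chunks (zero padded) and append an XOR parity chunk."""
    chunk_size = -(-len(data) // num_data_chunks)  # ceiling division
    padded = data.ljust(chunk_size * num_data_chunks, b"\0")
    chunks = [padded[i * chunk_size:(i + 1) * chunk_size]
              for i in range(num_data_chunks)]
    return chunks + [xor_chunks(chunks)]


def rebuild(chunks, lost_index):
    """Recompute the chunk at lost_index from all of the other chunks."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    return xor_chunks(survivors)


def decode(chunks, original_length):
    """Join the data chunks (all but the last, parity) and strip the padding."""
    return b"".join(chunks[:-1])[:original_length]


if __name__ == "__main__":
    data = b"Linux File RAID example data"
    chunks = encode(data)
    for lost in range(len(chunks)):
        assert rebuild(chunks, lost) == chunks[lost]
    assert decode(chunks, len(data)) == data
    print("every single-chunk erasure was recovered")
```

Because XOR is its own inverse, the same routine recovers a lost data chunk or a lost parity chunk. Losing two chunks leaves too little information, which is why recovery is limited to a single erasure per parity set.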

2.2 Product Functions

This section summarizes the major functions the Linux File RAID system or application must perform. Details are provided in Section 3, so only a high level summary list is provided here. The functions are organized into high level capability requirements, test requirements (for test driven design goals) and system constraints (performance, scaling, sizing). A class diagram is provided that goes beyond what is currently prototyped to provide ideas for extending the design to include additional user authentication, access control, and session features as well as file, directory and file date, time, and version information.

The product must perform:

1. Storage of an entire file as chunks on file systems A, B, C, D and a dedicated XOR file system
2. Read of chunks contained in a file from A, B, C, D and XOR files to reconstruct the originally written file
3. Recovery of any lost chunk from the original file in order to restore the file
4. Detection of corruption of any chunk and recovery of data if one chunk is corrupted
5. Ability to map chunks to any file system: the same file system for testing, or independent file systems for safe use
6. Management of original file names and association with storage chunks for reconstruction and basic file manipulation operations (copy, delete, rename)
7. Error reporting if two or more chunks are lost or corrupted (recovery is not possible)

Test requirements:

1. Show that any chunk, A, B, C, D or XOR, can be deleted or corrupted and recovered
2. Show that any file size up to the constraint size can be written and stored
3. Show that read and recovery results in a file identical to that written, by full data compare or a data digest
4. Test basic number of RAID writes per second possible
5. Test basic number of RAID reads per second possible
6. Test immediate read-after-write RAID operations possible with or without data compare

System constraints:

1. Storage of files up to 1 gigabyte in size (e.g. typical documents, image files, code, text, binary files)
2. RAID operations of 100 K read/write operations per second or better in the library such that a 1 gigabyte file could be written at a rate of 50 kilobytes/second or better, completing file transfer of a megabyte in 20 seconds or less
3. The current prototype can only handle files, not directories of multiple files, but this could be an extension pursued for a project or feature addition exercise which would require management of namespace metadata.
4. The current prototype does not implement any file access control or authentication, but this could be pursued as a feature addition exercise or project.

Figure 4 shows a basic class diagram for file RAID. This model can be found as a working example as a UML model for Modelio http://mercury.pr.erau.edu/ siewerts/se310/design/Modelio/3.4-SD/

Figure 4. More fully featured class diagram for Linux File RAID STaaS or Application

Note that Figure 4 shows class attributes and associations that go beyond what the current prototype has implemented such as:

1. File attributes including time and date, revision, and type
2. File system name space including directory and file path
3. Access control
4. Indexing for search
5. Session management with user authentication

A significant, but useful feature that could be added is journaling, whereby all versions of the same file (same name and path) could be preserved at a chunk level (which therefore works for any file type including binary as well as text) where only modified chunks are re-written on update. This is a complex feature, but many file systems and RAID systems do provide this journaling feature by comparing chunks and writing all changed chunks with time and date metadata. One major
challenge of journaling is the potential read-modify-write operations (and associated workload) for this feature.

2.3 Use Cases

Two main use cases are anticipated for this software. The first is STaaS, where the software provides back-end storage for a front-end web interface, much like Dropbox or AWS S3, but with specific options for data protection and possibly journaling features. The second use case is a basic application that provides software RAID on a Linux system with a browser application, assuming the Linux system has multiple file systems mounted on multiple storage devices to provide data protection. Figure 5 shows use cases for both STaaS and browsing locally for RAID-10 and RAID-50.

Figure 5. Use cases for Linux File RAID

For STaaS, it is assumed that users will mostly want a basic archiving capability for files to be used for backup, file sharing, and file distribution. Common uses would be storing, sharing and distributing digital photos, design files, or other file types other than code. Code is most often best stored in a CMVC system, so while source code files could be stored, the system provides no specific advantage and is intended to mostly provide a bit bucket for files.

For a local Linux File RAID browser and data protection application, the user will want the file system to be as easy to use as current command line directory manipulation tools and graphical file explorers commonly available on Ubuntu. Basic access control (user, group and other permissions) and user authentication should be supported.

The key advantages of the Linux File RAID STaaS and browser are the built in speedup and data protection features, which may be sufficient to make the service and applications built for deployment using this software design advantageous compared to block level software RAID (easier to use on a directory basis rather than volume of disks in array). However, additional
features such as journaling, directory level authentication, web access to Linux File RAID and other possible feature extensions could increase value and advantages for users. These use cases could be broken into smaller more specific use cases and expanded based upon the intended application of STaaS or a local browser.

2.4 Use Case and Requirements Tracing

As studied in SE 310, use cases and requirements can be correlated to validate that all requirements support a use case and all use cases are supported by requirements. Furthermore, use cases can be prioritized (or requirements, as is often done by the commercial industry if all requirements are not mission critical). For example:

This should be completed as a validation of requirements: are the right requirements and use cases driving what is built so that we build the right thing? Whether the design has been built right (verification) can be determined by steps such as code generation from CASE (Computer Aided Software Engineering) tools (e.g. Modelio class C++ generation, MySQL Workbench schema SQL generation, etc.) as well as prototyping, design walk-throughs, and code walk-throughs along with SQA test plans, scripts, and drivers outlined in section 5 for test-driven design with prototyping.

2.5 Operating Environment

The operating environment for Linux File RAID is a standard Linux platform such as Ubuntu, CentOS, or RHEL Linux running on a scalable server or workstation that is configured to have multiple storage devices and file systems. This could be as simple as two disk drives, each with a standard file system (e.g. ext4) for a RAID-1 configuration, but scaling up to an even number of storage devices and file systems for RAID-10, and scaling up to a number of equally sized RAID-5 sets that can be striped for RAID-50.
For data protection each file system must be on an independent device, but for basic testing, the software can be run on one partitioned storage device with multiple file systems or even on one file system with multiple directories (both of which provide no protection, but allow for testing). No specialized hardware is required otherwise, which is a distinct advantage of file RAID compared to block RAID.
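
A quick way to tell a test-only configuration from one that might offer protection is to compare the device identifiers of the chunk directories. The sketch below uses `os.stat` and its `st_dev` field for this; the function name and the example paths are hypothetical. Note the limitation: distinct `st_dev` values show only that the directories are on different file systems, not that those file systems sit on different physical devices, so the partitioned single-disk case described above would still pass this check.

```python
# Illustrative sketch only: the function name and example paths are hypothetical.
import os


def independent_filesystems(paths):
    """Return True if every path resides on a different file system.

    Compares the st_dev device identifier reported by os.stat for each path.
    This cannot detect separate partitions that share one physical disk.
    """
    device_ids = [os.stat(path).st_dev for path in paths]
    return len(set(device_ids)) == len(device_ids)


if __name__ == "__main__":
    # Hypothetical chunk directories; replace with the mount points in use.
    chunk_dirs = ["/mnt/raid_a", "/mnt/raid_b", "/mnt/raid_c",
                  "/mnt/raid_d", "/mnt/raid_xor"]
    existing = [d for d in chunk_dirs if os.path.isdir(d)]
    if len(existing) < len(chunk_dirs):
        print("some chunk directories are missing; nothing checked")
    elif independent_filesystems(existing):
        print("chunk directories are on distinct file systems")
    else:
        print("warning: chunk directories share a file system (testing only)")
```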

2.6 Design and Implementation Constraints

Developers may test on an unsafe system with one file system or fewer file systems than required for the RAID level, but all deployed systems should have a sufficient number of independent file systems for data protection so that a user is not misled into thinking their data is protected when it’s not – the unsafe configur