Freie Universität Berlin

Master thesis at the Department of Mathematics and Computer Science
Intelligent Systems and Robotic Labs

Generating Data to Train a Deep Neural Network End-To-End within a Simulated Environment

Josephine Mertens
Student ID: [email protected]

First Examiner: Prof. Dr. Daniel Göhring
Second Examiner: Prof. Dr. Raul Rojas

Berlin, October 10, 2018

Abstract

Autonomously driving cars are no longer a rarity. Major manufacturers such as Audi, BMW and Google have been researching successfully in this field for years, and universities such as Princeton or the FU Berlin are also among the leaders. The main focus is on deep learning algorithms. However, these have the disadvantage that enormous amounts of data are needed as situations become more complex. In addition, the testing of safety-relevant functions is increasingly difficult. Both problems can be transferred to the virtual world: on the one hand, a virtually unlimited amount of data can be generated there, and on the other hand, we are, for example, independent of weather conditions. This work presents a data generator for autonomous driving that generates ideal and undesired driving behavior in a 3D environment without the need for manually generated training data. A test environment based on a round track was built using the Unreal Engine and AirSim. Then, a mathematical model for the calculation of a weighted random angle to drive alternative routes is presented. Finally, the approach was tested with the CNN of NVIDIA by training a model and connecting it to AirSim.

Declaration of Authorship

I hereby certify that this work has been written by no one other than myself. All tools, such as reports, books, internet pages or similar, are listed in the bibliography. Quotations from other works are marked as such. The work has so far not been presented in the same or a similar form to any other examination committee and has not been published.

October 10, 2018
Josephine Mertens


Contents

1 Introduction
    1.1 Challenges in Generating Training Data in Automatic Mode
    1.2 Approach
    1.3 Methodology
2 State of the Art
    2.1 Classical Machine Learning Compared to Deep Learning Algorithms
    2.2 Elevation of Data for Autonomous Driving
    2.3 Simulation Frameworks for Driving Tasks and Artificial Intelligent Research
        2.3.1 TORCS
        2.3.2 VDrift
        2.3.3 CARLA
        2.3.4 AirSim
    2.4 Methods to Follow a Path in a 3D Simulation Environment
3 Concept and Architecture
    3.1 Approach for Generating Training and Test Data
    3.2 Challenges to Accomplish
    3.3 Simulation Environment
    3.4 Calculation of a Steering Angle in a Simulated 3D Environment
    3.5 Distribution Model for Deviations in Ideal Driving Behaviour
4 Generation of Training and Test Data with AirSim
    4.1 Modelling a Single-Lane Track
    4.2 Extending AirSim
    4.3 Generating Training and Test Data within the Simulator
5 Applicability of Automatically Generated Data to Train a Neural Network
    5.1 Evaluation Approach
    5.2 Integration of CNN Model to Simulation Environment
        5.2.1 CNN Architecture
        5.2.2 Training of CNN within Simulated Environment
        5.2.3 Autonomous Driving within AirSim Environment
    5.3 Experimental Setup
    5.4 Interpretation of the Results
6 Summary and Future
    6.1 Summary
    6.2 Future Work
Bibliography


1 Introduction

Companies, research institutes, universities and governments work together to enable cars to drive autonomously on public roads. Google's car fleet, one of the best-known examples, passed its five-millionth self-driven mile in February 2018, and the number appears to be growing exponentially, as shown in Figure 1. Google required many years of development and research to achieve driving on public roads. Given the complexity of driving tasks without human intervention, the best-performing approaches use deep learning to extract patterns (see Chapter 5.2.1).

Figure 1: Total autonomous miles driven by Waymo [27].

Deep learning methods need a large amount of data, which is difficult to obtain, and require significantly more time to train than classical machine learning algorithms based on statistical methods such as regression. With deep learning methods, a machine is able to generate new features by itself based on the given data. This process can be divided into two main phases. First, data has to be collected, which is discussed in more detail in Chapter 2.2. The data is then divided into training data for the first phase and test data for the second phase.

In general, it is possible to drive manually around the track to record driving behaviour for the first phase. This process is very time-consuming, so a solution is required that generates data automatically to train autonomous driving. There are many successful groups in the field of autonomous driving, but this work only considers those that use simulated driving and have published their results. Looking again at Google's car fleet, nowadays known as Waymo [36], we see that it uses simulated driving: in 2017, Waymo drove 2.7 billion simulated miles in combination with 5 million self-driven miles on public roads "to build the world's most experienced driver" [27]. Additionally, the RAND Corporation determined how many miles an autonomous car needs to drive to demonstrate a statistical degree of safety and reliability [21], see Table 1.

Table 1: RAND Corporation, Driving to Safety [21]

Statistical question: How many miles (years*) would autonomous vehicles have to be driven ...

Benchmark failure rates:
    (A) 1.09 fatalities per 100 million miles
    (B) 77 reported injuries per 100 million miles
    (C) 190 reported crashes per 100 million miles

(1) ... without failure to demonstrate with 95% confidence that their failure rate is at most ...
    (A) 275 million miles (12.5 years)
    (B) 3.9 million miles (2 months)
    (C) 1.8 million miles (1 month)

(2) ... to demonstrate with 95% confidence their failure rate to within 20% of the true rate of ...
    (A) 8.8 billion miles (400 years)
    (B) 125 million miles (5.7 years)
    (C) 65 million miles (3 years)

(3) ... to demonstrate with 95% confidence and 80% power that their failure rate is 20% better than the human driver failure rate of ...
    (A) 11 billion miles (500 years)
    (B) 161 million miles (7.3 years)
    (C) 65 million miles (3 years)

* We assess the time it would take to complete the requisite miles with a fleet of 100 autonomous vehicles (larger than any known existing fleet) driving 24 hours a day, 365 days a year, at an average speed of 25 miles per hour.

The figures in Table 1 show why virtual driving simulation has become necessary to fulfil such requirements. It can significantly reduce both the time needed and the risk of harm to people when testing a car on public streets. Moreover, simulations make it possible to reproduce extreme situations, such as bad weather conditions and slippery roads. A realistic environment offers the opportunity to test new functions and refine existing ones by adapting many parameters without any risk.
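The year figures in Table 1 follow from the fleet assumption stated in the table footnote (100 vehicles, 24 hours a day, 365 days a year, 25 miles per hour). The following minimal sketch reproduces that conversion; the function and constant names are illustrative and not taken from the RAND report.

```python
HOURS_PER_YEAR = 24 * 365  # continuous operation, as in the table footnote


def fleet_miles_per_year(vehicles=100, mph=25.0):
    """Miles the whole fleet covers in one year of nonstop driving."""
    return vehicles * mph * HOURS_PER_YEAR


def years_to_drive(required_miles, vehicles=100, mph=25.0):
    """Years the fleet needs to accumulate the required mileage."""
    return required_miles / fleet_miles_per_year(vehicles, mph)


if __name__ == "__main__":
    # Mileage requirements for benchmark (A), rows (1) to (3) of Table 1.
    for label, miles in [("(1)", 275e6), ("(2)", 8.8e9), ("(3)", 11e9)]:
        print(f"{label}: {years_to_drive(miles):.1f} years")
```

Under these assumptions the fleet covers 21.9 million miles per year, so 275 million miles take roughly 12.6 years, in line with the rounded value given in Table 1.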
In this regard, this work investigates how to generate training data in automatic mode within a virtual driving simulator to demonstrate autonomous driving on a simulated circular track.

1.1 Challenges in Generating Training Data in Automatic Mode

This thesis aims to contribute to reducing the effort of generating data for neural networks by using a virtual driving environment. For this purpose, the following three main challenges have to be faced:
- generating high-quality camera data that allows modelling details like rain or snow to train realistic weather conditions without the need to add noise,
- deriving or calculating trajectories for the vehicle to drive automatically with sufficient coverage for machine learning,
- training a neural network within the simulation environment to demonstrate and evaluate the approach.

1.2 Approach

Figure 2 shows the high-level process to train a neural network to drive autonomously on a road. As mentioned in Section 1, the main challenge is the large amount of data that is required to train highly realistic driving behaviour.

Figure 2: Process to drive autonomously on a road

To reduce the effort required when driving data is collected manually while a human driver controls the car, the data collection step will be automated. With the training data generator, described in Chapter 2.2, it is possible to automatically generate data within a simulated 3D environment for the training and validation phase to train a neural network. The current market offers many simulation frameworks for driving tasks (see Chapter 2.3). Therefore, a simulation framework is needed which fulfils the following requirements:
- nearly photorealistic rendering to include realistic details such as reflections and weather conditions,
- an accessible framework which can be supplemented by additional functions,
- open source and cross-platform in order to enable further development,
- an active repository which offers documentation and tutorials,
- APIs which provide access to the car or the extraction of camera data.

AirSim [29], developed as a research platform for autonomous driving, meets all of these requirements. It is available as a plugin for the Unreal Engine [5]. The Unreal Engine is a complete suite of creation tools that is typically used to build highly realistic 3D scenes for games.

Based on the test environment for the model vehicle of the FZI-Branch Office in Berlin, a single-lane round track within a closed room will be built in Unreal to prove the concept. For this, basic 3D objects such as walls and road segments are required to build the scene, which is used later to render the training and test data.

The simulation environment will be used to train a convolutional neural network end-to-end by collecting camera data from the onboard cameras of the vehicle. For this purpose, AirSim has to be supplemented by functions for generating data automatically. The data should be individually configurable within a suitable interface to fit the requirements of the respective use case. The goal is to develop an algorithm that allows following a given path. To achieve this, the target point closest to the position of the vehicle must be calculated. Once the target point is known, an algorithm is needed to calculate the required steering angle in the scene. Since undesired driving behaviour is also to be simulated, an additional method is developed to generate specific deviations in the steering angle.

1.3 Methodology

The following work is divided into four main parts. First, common methods to collect data to train neural networks and common simulation frameworks for autonomous driving are introduced. The appropriate 3D simulation framework is chosen by comparing the mentioned frameworks based on criteria such as access to the code, supported operating systems and technical requirements. This is followed by a short overview of path planning strategies in 3D environments that are suitable for autonomous driving.

The second part deals with the concept and architecture of the framework to generate data. A use case is derived to decide how the round track is designed and then connected to AirSim as the simulation framework. Additionally, the mathematical model that is used to calculate the steering angle is described in detail. Finally, the concept of generating undesired driving behaviour is explained, which is based on a weighted random value that defines the deviation from the optimal steering angle.

The third part covers the implementation of the approach together with a description of how the scene is built in the Unreal Engine. This is followed by an explanation of the changes made in the AirSim client to calculate the steering angle within the scene.

The fourth part covers the implementation of a convolutional neural network (CNN) developed by NVIDIA [24] and an evaluation of the training results. This includes a detailed description of how AirSim can be used for training and validation of the neural network. The automatically and manually generated data are used to train the CNN within the simulated environment. Afterwards, the results are evaluated with respect to the pre-defined challenges. The work concludes with a summary and discussion of the results as well as an outlook on future development.

2 State of the Art

This chapter introduces the dependency between the data required to train autonomous driving and machine learning algorithms. Criteria are derived to choose an appropriate simulation environment. After that, a number of simulation frameworks that enable the exploration of deep learning algorithms are presented and compared on various criteria. This is followed by a short overview of path planning strategies in 3D environments that are suitable for autonomous driving.

2.1 Classical Machine Learning Compared to Deep Learning Algorithms

The nature of the input data needed for the neural network depends on the underlying machine learning algorithm. In general, we can roughly distinguish between classical machine learning and deep learning algorithms.

Classical machine learning algorithms require classified data as input. This means that each object in the scene is labeled. With the classified data, the vehicle can learn to distinguish between people, other vehicles, the road and objects next to the road like trees or buildings. Table 2 provides a brief insight into the suitability of some labeling types together with their complexity, measured by time and costs for the customer. Further details of these labeling methods are presented in the paper by Fisher et al. [38]. Looking at large companies like Uber, Tesla, Lyft or Waymo, which have huge car fleets that collect millions of hours of driving material, we can imagine that it is nearly impossible to label all of this data.

Table 2: Brief insight into costs and complexity of some labeling types provided by the company Playment [25]

2D Bounding Boxes
    Suitability: Object detection and localization in images and videos
    Time per annotation: Lowest
    Cost: …
Semantic Segmentation
    Suitability: Pixel level scene understanding
    Time per annotation: Highest
    Cost: …
Video
    Suitability: Locate and track objects frame by frame in a sequence of images
    Time per annotation: Low
    Cost: …sive
3D Cuboids
    Suitability: 3D perception from images and videos
    Time per annotation: High
    Cost: Expensive
Polygon
    Suitability: Precise object shape detection and localization in images and videos
    Time per annotation: Moderate
    Cost: Expensive

Due to the immense effort caused by classifying and labeling a scene and the fact that the labeling is usually done manually, deep learning methods can be considered as an alternative. These methods only require a small training dataset and independently derive rules based on otherwise unprocessed input data. Frequently used deep learning approaches are, for example, imitation learning [7] and reinforcement learning [10].

High-resolution images contain plenty of features. The challenge for deep learning algorithms is to filter out the right ones, for example, to recognise road markings. Due to the high information density of the image data, deep learning algorithms require a lot of data. Furthermore, they cannot be trained in the real world, or only with very high effort, because the vehicle will learn to avoid collisions with objects by driving against them. Therefore, 3D simulation environments are required to provide a safe training and learning environment with the possibility to generate as much data as the algorithm needs to learn realistic driving behaviour. In order to derive criteria that lead to the selection of the appropriate simulation environment, the following section introduces common data formats that are used to train autonomous driving.

2.2 Elevation of Data for Autonomous Driving

As mentioned in the section above, cars that are to drive autonomously first need a large amount of data for the training and validation phase, depending on the machine learning algorithm. The following describes the common formats and how the data for these two main phases is collected.

The type and format of the generated or collected data depend on the sensors installed on a virtual or real car and on the information required by the neural network. Normally, the system consists of high-resolution cameras, light detection and ranging (lidar) sensors and radar systems, as shown in Figure 3.

(a) Virtual car sensor system of Waymo [27].
(b) Real car sensor system of FU Berlin [2].
Figure 3: Example of installed sensors on real cars.

Together, all information gained by the sensors represents a driving behaviour related to a situation captured by one or more cameras.
Training data usually consists of ideal and undesired driving behaviour, such as swerving on a straight track due to microsleep or inattention of the driver.

The data is often recorded while a human controls the vehicle through various situations over multiple hours and miles. This method was used, for example, by the researchers at the Karlsruhe KIT to create the KITTI dataset [16], by Daimler AG R&D et al. to collect the Cityscapes dataset [9] and by Xinyu Huang et al. to collect the Apollo dataset [20]. Further, Table 3 shows training data recorded by a team from Udacity. They recorded 70 minutes of driving data through mountains, which amounts to 223 GB of image and log data that is openly accessible on GitHub [34]. Each entry contains the actual position (represented as latitude and longitude), gear, brake, throttle, speed and the steering angle related to an image frame, shown in Figure 4.

Table 3: Sample log of mountain driving dataset collected by Udacity.

Figure 4: Sample images from Udacity Mountain dataset [34].

The datasets include data covering a variety of weather conditions, nature and city environments and different terrains. The simulation environment should, therefore, be able to reflect these situations and calculate different information such as the current position in GPS coordinates, the orientation of the car, e.g. by an IMU sensor, or a distance measurement by one or more lidar sensors. The information collected by the sensors and the captured camera views are used to set control commands such as steering, velocity or brake inputs for the car.

In the following, simulation environments that are used for autonomous driving research are discussed and compared according to the aforementioned criteria in order to select the appropriate simulation environment.

2.3 Simulation Frameworks for Driving Tasks and Artificial Intelligent Research

In the following, simulation environments are presented that have already been used in research to generate data for autonomous driving. Only results from companies and research institutes that have published their work and provide open source access to it are considered. Finally, the mentioned frameworks are compared to each other in terms of:
1. extensibility and flexibility,
2. full access to scenery and vehicle parameters,
3. physical model,
4. sensor model integration,
5. graphical quality,
6. simulation of weather conditions and different terrains.

2.3.1 TORCS

The open car racing simulator (TORCS) was first developed by Eric Espié and Christophe Guionneau in 1999. Since 2005 it has been managed by Bernhard Wymann and is available as open source software under the GNU GPL license [37]. Its primary purpose is to support the development of AI-driven cars next to manually driven cars. A user can choose between popular Formula 1 car models and different racing tracks, and the variety can be extended by add-ons. A big advantage is the opportunity to add a computer-controlled driver or a custom self-driving car to the game if it is written in C/C++. Furthermore, the user is able to let the self-driving cars race against up to 8 other players on the TORCS racing board platform [3], shown in Figure 5. The portability, modularity and extensibility of the framework are further reasons why TORCS is often used as a base for research purposes.

Figure 5: Sample front view in TORCS [3].

The Princeton Vision Group used TORCS to evaluate and test their approach to learning affordance for direct perception [5]. They collected screenshots associated with speed values as training data while the car was controlled manually around the track. TORCS is also useful for capturing different views of the lane and generating datasets [22].

2.3.2 VDrift

VDrift is a car racing simulator with drift racing in mind. It is open source, cross-platform and released under the GNU GPL license [35]. It was created by Joe Venzon in 2005 with the goal of accurately simulating vehicle physics and providing a platform for artists and developers. VDrift offers over 45 tracks and cars based on real-world models with several different camera modes.

Figure 6: Sample front view in VDrift [35].

The BMW Group developed a framework for the generation of synthetic ground truth data for driver assistance applications [17] in cooperation with the TU Munich. They modified the VDrift game engine to simulate real traffic scenarios and to generate data for different image modalities, such as depth image, segmentation view or optical flow. They used the integrated replay functionality and drove the track multiple times while recording each round. In this way, a scene with multiple vehicles could be created to simulate traffic. The limited graphics resolution and low-resolution textures are the main disadvantages of this engine.

2.3.3 CARLA

CARLA is an open source framework based on the Unreal Engine [14], which is able to render highly realistic textures, shadows, weather conditions and illumination, as depicted in Figure 7 [11]. It was released in 2017 under the MIT license and developed by a group of five research scientists to "support development, training, and validation of autonomous urban driving systems" [12].

Figure 7: Scene with vehicle parameters captured from the town environment in CARLA.

Among the frameworks examined in this work, CARLA is the only one that provides an autopilot module. This is realized by encoding driving instructions into the ground, see Figure 8. A function iterates over each pixel of the scene and calculates the direction to encode the associated colour. Green defines the driving direction for the lane. White points signal intersection areas, where the car can decide between multiple routes. Areas next to the road are displayed with black pixels. There, the car does not receive an instruction and turns left to get back to the road.

Figure 8: Debug view of road instructions rendered in the scene [12].

2.3.4 AirSim

AirSim is developed by the Microsoft Research Group with the aim of providing a platform to experiment with deep learning, computer vision and reinforcement learning for autonomous vehicles [30]. To use this framework, a 3D environment built with the Unreal Engine is needed first. For this, there are prebuilt binaries of some environments, such as the neighbourhood environment shown in Figure 9. These binaries can be used to train neural networks or to collect data.

Figure 9: Screenshot of the Neighborhood Package [13]

Although AirSim is only one year old, many research institutes already use it in their projects; they are listed on the GitHub repository [1]. The centre for intelligent systems laboratory, CS, Technion is one of them and uses AirSim as a self-driving simulation for an autonomous Formula Technion project [19]. Their approach was to teach a formula car to drive fully autonomously through the racing track, based on reinforcement and supervised imitation learning approaches.

This work only considers open source frameworks. Therefore, all frameworks offer the possibility to access and extend the entire structure and the parameters of the vehicle. Furthermore, all frameworks have a physics engine that can represent realistic driving behaviour. As described in Section 2.2, sensors such as radar, lidar, IMU and GPS are often used in reality. The whole sensor suite is only available in CARLA. AirSim provides the mentioned sensors except for the lidar sensor. Simulation frameworks such as VRep [8] or Gazebosim [15] offer an even larger sensor suite but are not considered due to the low frame rate during scene rendering. Furthermore, they do not allow building a photorealistic 3D environment with different weather conditions. This is possible with the Unreal Engine, which comes with an editor to build realistic environments.

Therefore, only AirSim and CARLA are considered in the following. For CARLA, only a small number of tutorials and little documentation are available. Furthermore, it is written for the Linux operating system and difficult to install on other operating systems. Therefore, the Unreal Engine in combination with AirSim will be used as the simulation environment. The advantages of AirSim are discussed in detail in Chapter 3.3.

2.4 Methods to Follow a Path in a 3D Simulation Environment

If the vehicle is to drive autonomously in the 3D environment, a path to follow is required. There are different methods available to achieve this. Codevilla et al. developed an algorithm based on conditional imitation learning that allows an autonomous vehicle trained end-to-end to be directed by high-level commands [7]. The driving command is then produced by a neural network architecture. A further approach is given by Chou and Tasy [6], who developed an algorithm for trajectory planning which uses a range sensor to take the path where no obstacle is detected. A third method, developed by Chu et al., uses predefined waypoints, which are the base frame of a curvilinear coordinate system. The optimal path is derived by considering the path safety cost, path smoothness and consistency.

The calculation of trajectories is a complex task with a high computational effort. Therefore, the approach of Chu et al. is adopted. Based on the geometry model of the road, a spline is calculated from several points in order to calculate the steering angle to the nearest point on the path. To enable driving on alternative routes, the steering angle is adjusted using a distribution function and a given error rate.
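The waypoint-based idea can be illustrated with a small sketch. It is not the implementation used in this work: it assumes a closed 2D polyline instead of a spline, picks the waypoint one step after the nearest one as the target (the `lookahead` parameter is a hypothetical addition that avoids an undefined direction when the car sits exactly on the nearest point), and returns the angle in radians with counter-clockwise as positive. The sign convention in a real simulator depends on its coordinate system.

```python
import math


def nearest_index(waypoints, position):
    """Index of the waypoint with the smallest distance to the vehicle."""
    px, py = position
    return min(
        range(len(waypoints)),
        key=lambda i: (waypoints[i][0] - px) ** 2 + (waypoints[i][1] - py) ** 2,
    )


def steering_angle(waypoints, position, yaw, lookahead=1):
    """Signed angle (radians) between the vehicle heading and the target point.

    waypoints: list of (x, y) points describing the track centre line
    position:  (x, y) of the vehicle
    yaw:       vehicle heading in radians, measured from the x axis
    """
    i = nearest_index(waypoints, position)
    # Closed round track: wrap around at the end of the list (assumption).
    tx, ty = waypoints[(i + lookahead) % len(waypoints)]
    bearing = math.atan2(ty - position[1], tx - position[0])
    diff = bearing - yaw
    # Wrap the difference into [-pi, pi] so the car takes the shorter turn.
    return math.atan2(math.sin(diff), math.cos(diff))
```

For a car on a circular track that drives counter-clockwise, the result is a small positive angle pointing along the chord to the next waypoint; a car that has drifted to the left of a straight segment gets a negative angle that brings it back.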

3 Concept and Architecture

This chapter presents an approach for generating training and test data to train a neural network to drive autonomously in a simulated environment. This includes the experimental setup together with the chosen framework AirSim and the mathematical model which calculates the steering angle within a geometrical model. At the end, I introduce an algorithm to generate ideal and undesired data based on a given distribution function and an error rate.

3.1 Approach for Generating Training and Test Data

The large amount of data required to train realistic driving behaviour is the main bottleneck of autonomous driving, as mentioned in Chapter 2.1. Therefore, a data generator, shown in Figure 10, should be developed.

Figure 10: Process to generate data within the driving environment

The approach, shown in Figure 10, is to use AirSim to calculate a steering angle within the driving environment (step 1). After that, the steering angle is passed to a Python client (step 2), which calculates a randomized weighted offset that is added to the steering angle. The calculated offset should depend on an input error rate and on a given distribution function. The resulting angle can then lead to ideal or undesired driving behaviour. In steps 3 and 4, the weighted angle is applied to the car within the driving environment by AirSim. The last step handles the writing of image files and related information such as the applied steering angle, the image path, the position data of the car and the distance between the car and the centre line of the track.
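Step 2 of this process can be sketched as follows. The sketch rests on assumptions that are not taken from the thesis: the error rate is read as the probability that a frame receives a deviation, the deviation is drawn from a Gaussian distribution with a hypothetical spread `sigma`, and steering values are normalised to the range [-1, 1]. The distribution model actually used is the subject of Section 3.5.

```python
import random


def weighted_steering(ideal_angle, error_rate, sigma=0.2, limit=1.0, rng=random):
    """Return the ideal angle, or a randomly disturbed one.

    ideal_angle: steering value computed from the path (normalised)
    error_rate:  probability in [0, 1] that an offset is applied (assumption)
    sigma:       standard deviation of the Gaussian offset (assumption)
    limit:       maximum absolute steering value accepted by the car
    rng:         random source; pass random.Random(seed) for reproducible runs
    """
    offset = rng.gauss(0.0, sigma) if rng.random() < error_rate else 0.0
    # Clamp so that the disturbed angle stays a valid steering command.
    return max(-limit, min(limit, ideal_angle + offset))
```

With an error rate of 0 the generator produces only ideal driving behaviour; raising the rate increases the share of frames that show undesired behaviour, which makes the mix of both kinds of data configurable.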

3.2 Challenges to Accomplish

With the aim of generating realistic driving behaviour, it is important to simulate ideal driving behaviour in combination with undesired behaviour. Undesired driving behaviour describes situations where the driver steers the vehicle in a direction that does not correspond to the destination, for example, due to carelessness or fatigue. Furthermore, the steering angle of human drivers varies slightly while driving, for example, to react to road irregularities or to avoid holes or small
