- Project report
One of the main application fields of derived LiDAR 3D point clouds is the generation of DTMs, which have revolutionized topographic terrain capturing (KRAUS & PFEIFER 2001). DTMs describe the bare Earth surface as a mathematical representation in a digital form, while digital surface models (DSMs) describe the top surface of the ground including natural and man-made objects such as plants, buildings etc. (VOSSELMAN & MAAS 2010). Traditional approaches for the acquisition of DTMs through manual field measurements (stereo-photogrammetry or tachymetry) are difficult to apply in riverine environments with steep river banks and dense vegetation. They have mostly been replaced by state-of-the-art semi-automated laser scanning techniques (MANDLBURGER et al. 2009). Furthermore LiDAR DTMs provide higher vertical and horizontal accuracy (sub-meter) and high resolution compared with digital elevation models derived by satellite remote sensing. Further advantages are that acquiring data is relatively affordable and that it is possible to receive the ground points under the canopy cover as well as a high potential of automated post-processing (HOHENTHAL et al. 2011). While Airborne Laser Scanning (ALS) is more suitable for area-wide surveys, Terrestrial Laser Scanning (TLS) can generate more accurate data, higher resolution, better flexibility and cheaper surveys. Field campaigns usually can be done by two people within one day; however, the survey area is limited.
The aim of Working Group I of the HydroChange³ study project is the creation and implementation of an automated workflow to derive a DTM which can be used as geometric basis for further investigations, for example roughness mapping and multi-temporal volume balances of the gravel bar (see chapter 4 and 5). The derived 3D point cloud provides the basis of the investigation. The main task within the DTM generation and for geomorphological applications in general is to distinguish terrain points and off-terrain points (in this case vegetation), the so-called process of filtering. There are several filter approaches suggested by PFEIFER & MANDLBURGER (2008) with different complexity and performance, but there is no best or standardized algorithm. One possibility to filter is the geometric parameter Echo Ratio (ER) which is derived from the vertical distribution of laser points and was successfully used for vegetation classification in ALS (HÖFLE et al. 2009; 2012) and TLS (HÄMMERLE et al. 2013). Recent approaches for identifying and classifying objects such as vegetation are using the full-waveform of the recorded point cloud and derived parameters therefrom (e.g. HÖFLE et al. 2012).
Previous to DTM generation some pre-processing of the point cloud is necessary, such as registration and geo-referencing (Chapter 5). Based on this the actual DTM generation can be processed. As the point cloud includes both the terrain surface and off-terrain objects (vegetation) the point cloud has to be filtered whereby these objects are being removed. The filtering can be done in the raw point cloud or in the already rasterized DSM, as well as by a combination of both (HÖFLE et al. 2009). Rasterization aggregates the points to pixel cells which is always accompanied by a loss of 3D information but also provides a better data handling (HÖFLE & RUTZINGER 2011). There are several options to build raster cells from the point cloud, using, for instance its lowest, highest or mean Z value (elevation) within the raster cell’s area in given defined resolution (PFEIFER & MANDLBURGER 2008). Such initial DTMs are unsatisfying and can still contain vegetation or any other objects, especially in areas with a high density of vegetation without terrain points (e.g. bushes) or built-up areas. An approach to solve this problem is the usage of a morphological filter. In this study, the filter concept of SIART et al. 2013 is applied and extended. In this process it is possible to consider the ER as additional vegetation detector.
The morphological filter erodes objects iteratively by comparing the height of the considered cell with the minimum height of adjacent raster cells. If the height difference exceeds a specified value (Δz_threshold) the considered raster cell is eroded to the local minimum (Fig. 3.1). Choosing an appropriate value for Δz_threshold results in eroding only real objects but not the terrain's inclination. Due to the fact that objects can extend over few raster cells the erosions process is performed over a given number of iterations (gapsize). For the next erosion step the previous result is used to remove bigger objects (SIART et al. 2013).
The ER is derived for each laser point and measures the local transparency and roughness. It describes the ratio of 3D neighbors to 2D neighbors of a laser point and is defined as:
ER [%] = n3D / n2D • 100.0; with n3D ≤ n2D
where n3D defines the number of points found in a fixed search 3D distance and n2D defines the number of points found in same distance measured in 2D (Höfle et al. 2009).
A high ER indicates a low vertical distribution of points like flat surfaces or bare ground while a low ER (< 50 %) is an indicator for a high vertical distribution of points like vegetation covering surfaces (Höfle et al. 2012). By using the ER it is possible to erode only raster cells with a certain ER value. This could be useful in steep areas which exceeds the Δz_threshold and would be eroded even without vegetation.
The DTM generation is carried out of a 3D point cloud using the LiDAR software OPALShttp://geo.tuwien.ac.at/opals/html/index.html (last access: Oct. 09, 2013). For the automation of the entire workflow the single OPALS commands are implemented in a Python script (Fig. 3.2). The script uses the previously explained morphological filter. In order to receive a sufficient quality of the DTM, several parameter combinations are used by applying a brute force method. The combination is built up of four different parameters. According to the morphological filter these are Δz_threshold and gapsize; additionally the DTM's cell size and an echo-filter to erode only cells which do not exceed a certain mean value of the ER are used. To obtain the best fitting parameter combination all computed DTMs are compared with reference data. For this purpose several points were measured via tachymeter. According to variable ground conditions which reflect a different potential for errors of the DTM generation, several profiles were registered representing the relative planar gravel bar, vegetation covered areas and such with inclination (Fig. 3.5). A separate script calculates the descriptive statistics of the comparison of the vertical difference between DTMs and reference data.
During the import of the raw point cloud (ASCII) into the OPALS Data Manager (ODM) format, which is necessary for further processes, it is possible to sort out outliers. In this case points with an incorrect Z value outside a range from 91.5 m to 115 m were filtered out. The Z-range was selected based on manual observation of the point cloud. Such ghost points occur e.g. through reflections during the scanning process (Fig. 3.3). Due to the large volume of data a special import mode is necessary, using a spatial index to load the data fractional into the main memory to avoid memory problems (OPALS parameter: tileSize). Besides the simple XYZ values the full-waveform information can be imported, also if an adjusted import format is used. The same procedure is executed for the reference data.
On the basis of the ODM file OPALS provides a module to calculate a raster (opalsCell) by a given cell size (0.25 m) and aggregation method (Z-min). The result is the initial DTM mentioned above. Dependent on the particular parameter combination the ER is necessary. To use the ER as vegetation detector it is required to calculate it according to the ODM (opalsEchoRatio). Based on the mean ER in each cell, a raster (cell size = 0.25 m) is computed. Since the calculation of the ER is very time-consuming, it is possible to turn this step on/off depending on the chosen parameter combination (with/without echo-filter).
In a next step the morphological filter is applied with every possible parameter combination. Within a loop the DSM is eroded step by step, returning for each erosion stage an interim raster file which is eroded again as long as it has been limited by gapsize. According to Δz_threshold and echo-filter only those cells are eroded which fulfill the requirements (height difference and ER). The final raster (final DTM) is interpolated to smooth the terrain and fill holes (cells with no-data values) which result by the erosion. For the interpolation two modes are used: one to fill areas with only a few holes and a second mode to fill larger areas of linked cells with no-data values. The first mode uses the values of less neighboring cells (search radius = 0.75 m) to fill holes compared to the second mode where values of more distant cells (search radius = 3 m) are necessary to fill larger holes since more neighboring cells contain no-data. Appropriate settings for a moving plane interpolation (movingPlanes) are used to fill all holes on the gravel bar and its closer area without affecting too many valid cells. Besides the search radius, important parameters are the cell size for the interpolation which is the same as the cell size for raster generation (0.25 m) and the number of cells (nearest neighbours) which are selected quadrant-wise and used for the interpolation. Due to the automation, several final DTMs are generated depending on the count of parameter combinations.
For the quality check the height difference between every DTM and reference point is calculated (opalsAddInfo → NormalizedZ=Zreference - ZDTM). A negative value implies that the DTM is higher whereas positive values indicate a lower DTM at this position. Furthermore, for each reference profile the root mean square error (RMSE) is calculated. In a last step both the best parameter combination (lowest RMSE) for every reference profile and their lowest by aggregating all reference profiles are printed out. A separate script provides more detailed statistics such as mean values, median and the RMSE for every point.
Through the interpolation the water areas (in this case the Neckar) and other regions with no-data are filled-up. For post-classification the water area can be derived from the none-interpolated DTM (after the last erosion stage) by converting the raster into a polygon vector file which differs between data and no-data. Since there is only one linked water area (Neckar) and no puddles, the largest no-data polygon represents the Neckar and has to be cut out from the interpolated DTM. The area for potential water zones has to be limited because in the lower and upper region of the study area the availability of 3D point cloud is not sufficient to represent the exact waterside in the DTM.
To compute a best possible DTM several values for the different parameters were chosen:
This results in 45 different parameter combinationsIn the following parameter combinations are represented in this form: Δz_threshold_gapsize_echo-filter_cell size; e.g. 0.4_3_75_0.25, which were all applied to compute a DTM and were compared with the reference data. Since the captured reference data by tachymeter did not fit to the point cloud's height (Z-values; Fig. 3.3) they had to be adjusted manually. The respective correction of the adjustment of the Z values was carried out by an affine transformationA Python script was used, written by Bernhard Höfle (2012). More information about the affine transformation is available on elonen.iki.fi/code/misc-notes/affine-fit/ (last access: Oct. 09, 2013). For this purpose, six reference points were selected which are evenly distributed on the reference points’ bounding box. The origin Z value of the reference points was manually replaced by the Z value of corresponding points in the cloud based on XY coordinates. These six points were used to transform all other reference points. A manual verification of all transformed reference points was executed to ensure that all transformed points fit to the point cloud. Sixteen points were not satisfyingly transformed and had been removed so that 70 reference points remained.
As stated above, in order to obtain the most accurate result, further statistical analyses have been made, using the corrected reference points (Fig. 2.2). The focus lies on mean, median and root mean square errors (RMSE), as well as in detecting extreme values.
In summary, the mean error of all combinations averages -2.0 cm with only four combinations with a positive mean error, which indicates that the DTMs tend to be a bit too high. The absolute mean error ranges between 7.4 cm and 12.1 cm, the median error between 6.7 cm and 7.5 cm. The small differences in median errors indicate that the differences between the varied parameters are quite small. The main difference between the combinations is determined in its different extreme values. While the maximum error within the negative range (DTM is too high) varies only between -25.6 cm and -33.2 cm, the positive errors (DTM is too low) vary between 20.0 cm and 68.2 cm. Values >40 cm only appear if the echo ratio parameter is turned off (echo-filter=0). This results in RMSEs between 9.8 cm and 18.9 cm. In summary, the most influencing parameter is echo-filter, for which reason its influence will be discussed later. Furthermore, a too high Δz_threshold seems to produce too high DTMs, while a variation of gapsize mostly produces only mean variations in the range of some millimeters. The latter is probably due to the lack of greater linked objects (in this case dense vegetation objects).
Due to its lowest RMSE (9.8 cm) and the second lowest mean absolute error (7.4 cm) of all combinations, the combination 0.2_4_50_0.25 (Δz_threshold_gap-size_echo-filter_cell size) has been chosen for further examinations and as resulting DTM for the other groups in this study. The maximum errors of this DTM are -27.7 cm and +21.0 cm. A standard deviation of 9.7 cm has been calculated with 94.3 % of errors within two standard deviations (Fig. 3.6). A lot of combinations like 0.2_3_50_0.25 (variation in gapsize) provide similar results with slightly higher maximum errors. The central section of the final DTM with the research area is shown in figure 3.4.
As shown in figure 3.5 and table 3.1 the recorded reference profiles are quite different in matters of slope and vegetation coverage. While the profiles 2, 4 and 5 are located on the more or less flat gravel bar, the profiles 1, 3 and 6 also extend partially over the steep slope edge. Also in respect of vegetation coverage there are huge differences among the profiles. While the profiles 4 and 5 do not have any vegetation cover, profile 2 which is located around the central bush of the gravel bar is strongly influenced by branches and leaves at the border of bush and gravel bar even if there are no reference points within the bush. Also there is a drawn out depression along the north side of the bush. The profiles 1, 3 and 6 are partially vegetation covered with the densest vegetation in profile 1 and the lowest vegetation coverage with mostly sandy soil in profile 6.
A Pearson product-moment correlation coefficient of 0.09 with a p-value of 2*10-7 has been calculated between investigated errors and DTMs' slope for which reason no direct relation between error and slope could be determined. Instead the quality of the DTM seems to be strongly influenced by the vegetation coverage and the choice of echo-filter. In table 1 the RMSE for the different profiles is listed for two chosen parameter combinations (0.2_4_0_0.25 and 0.2_4_50_0.25) which differ in echo-filter. For profile 5, both combinations show the same result since the echo-filter does not have any influence here. It is the best result for 0.2_4_0_0.25. For the other combination (echo-filter=50) profile 1 and 6 show slightly better results. If there is a vegetation cover within the reference profile, a major difference between echo-filter can be observed (profile 1, 3, 6). With echo-filter = 50, a major improvement can be achieved, compared with a DTM generated without echo-filter=0. The discrepancy between the two parameter settings is particularly apparent in profile 1, in which echo-filter leads to an improvement from 26.7 cm to 7.3 cm due to a dense vegetation cover.
|No.||RMSE [cm] echo off (filter = 0)||RMSE [cm] echo on (filter = 50)||Profile characteristics|
|1||26.7||7.3||gravel bar, slope edge with dense vegetation|
|2||10.2||10.4||gravel bar around bush|
|3||26.1||16.3||gravel bar, slope edge with sand, dense vegetation cover|
|4||12.3||7.9||gravel bar, some big stones|
|6||13.6||6.1||gravel bar, slope edge with sand, thin vegetation cover|
The result around the bush (profile 2) seems to be influenced more by the depression on the north side of the bush since there are no major differences between the different echo-filter settings. The difference of profile 4 (4.4 cm) results by only one reference point which is located at the waterside and is not covered by vegetation. The height difference at the waterside exceeds 20 cm so that turning off the echo-filter (=0) results in an erosion although there is no object to remove. For both combinations two reference points registered at the top of large stones show comparative high errors. This is explainable since those stones are too small for a 0.25 m raster and so thus are removed during raster creation using z-min. The influence of echo ratio is finally shown in figure 3.6. In comparison with the parameter combination with echo-filter = 0, the echo-filter = 50 combination reduces high errors in the positive value range (DTM is too low) due to the removal of vegetation, but also increases the number of errors in the negative value range (DTM is too high). Altogether, the parameter echo-filter=50 leads to lower maximum errors and hence to a lower root mean square error and an overall better result.
The implementation of a morphological filter for DTM generation requires testing multiple settings of different influencing parameters to find appropriate settings for the distinct investigation area, for example a suitable Δz_threshold corresponding to vegetation height. The impact of different values is reflected in the wide range (9.8 cm to 18.9 cm) of determined root mean square errors. The RMSE (9.8 cm) of the best fitting DTM bears resemblance to a previous study of the same investigation area last year (10.9 cm); however, for the gravel bar that is classified as non-vegetation area, the same method (Z-min) was used (HÄMMERLE et al. 2013). The separate analysis of the reference profiles shows good results for the Z-min method for non-vegetation areas. For areas without terrain points (e.g. vegetation covered areas), the morphological filter is used. The uncertainty and thus the potential for errors rises for such areas as well as for cliffy ones due to the erosion process. By using the morphological filter, an echo ratio filter is added as another requirement besides the height difference, and this leads to better results (Fig. 3.6) in vegetated areas with a mean difference in RMSE of approximately 9 cm. This becomes clear by considering the profiles with dense vegetation or high inclination at gravel bar's slope edges (Tab. 3.1).
Due to the mentioned problem with the reference data (Chapter 3.2.2), the evaluation of the results has to be viewed critically. Besides this problem more reference data for areas with dense vegetation and without terrain points would increase the validity of the echo ratio filter. Also, due to the interpolation, especially larger holes (generated by erosion or based on missing registered points by the laser scanner) are a source of error potential. For example, the greater bush on the gravel bar is surrounded by a very irregular ground: a small sink towards the Neckar, a small elevation in the middle and a ramp towards south. In this case, reference data of the ground in the middle of the bush would be useful to validate results in this area.
The determined RMSE of 9.8 cm has to be considered as a limiting factor for the further usage of the DTM in the following groups sections. For multi-temporal analysis, DTMs of 2011 and 2012 have been regenerated with the developed script to ensure comparability of the different years point clouds. However, a potential error can arise due to the possible displacement between the different point clouds, despite performed fine-registration. Due to potential height changes of only some decimeters, these errors can be major limiting factors for the evidence of fluvial studies, as last years' study indicates (HÄMMERLE et al. 2013).