A brief introduction
At present, there are two main formats of spaceborne lidar data: full waveform data and photon-counting data.
An explanation and comparison of the two data formats is available at https://zhuanlan.zhihu.com/p/596205670
LiDAR data processing – methods and implementation
Different data formats require different processing.
Three data formats in lidar remote sensing:
Discrete return data – airborne lidar (Optech, Leica)
Waveform data – spaceborne lidar (ICESat, GEDI)
Photon-counting data – spaceborne lidar (ICESat-2)
LiDAR data processing – waveform data pre-processing
Pre-processing
Normalization
Background Noise Removal
Gaussian filtering
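The three pre-processing steps can be sketched in MATLAB as follows. This is a minimal illustration on a synthetic waveform, not the processing chain of any particular mission product: the noise-only window (first 100 bins), the 3-sigma threshold, the kernel width, and the order of the steps are all assumptions made for the example.

```matlab
% Synthetic waveform: two Gaussian echoes on a noisy background (illustrative only)
t = (1:500)';                                   % sample index (bins)
rng(0);                                         % reproducible noise
w = 80*exp(-(t-200).^2/(2*8^2)) + 120*exp(-(t-320).^2/(2*6^2)) ...
    + 10 + 2*randn(size(t));                    % echoes + background level + noise

% 1) Background noise removal
%    Assumption: the first 100 bins contain no signal, only background.
noiseMean = mean(w(1:100));
noiseStd  = std(w(1:100));
wClean = w - noiseMean;                         % remove the background level
wClean(wClean < 3*noiseStd) = 0;                % assumed 3-sigma threshold

% 2) Gaussian filtering: convolve with a normalized Gaussian kernel
sigma  = 3;                                     % assumed kernel width in bins
x      = (-4*sigma:4*sigma)';
kernel = exp(-x.^2/(2*sigma^2));
kernel = kernel/sum(kernel);                    % unit area keeps the energy scale
wSmooth = conv(wClean, kernel, 'same');

% 3) Normalization: scale so that the strongest return has amplitude 1
wNorm = wSmooth/max(wSmooth);
```

Normalizing last keeps the noise statistics in the original units while the threshold is applied; a different order may suit other data.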
LiDAR data processing – waveform decomposition
The return waveform can be regarded as the sum of multiple components, such as the returns from trees, buildings, and the land surface.
The waveform therefore needs to be decomposed into these components.
Initial parameter estimates % create a search window for waveform peaks
Least-squares fitting % MATLAB fminsearch
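The two steps above can be sketched as follows, assuming each component is modeled as a Gaussian (amplitude, center, width). The Gaussian model, the window half-width `halfWin`, the amplitude threshold `minAmp`, and the initial width of 5 bins are assumptions for illustration. The implicit expansion in `model` requires MATLAB R2016b or later.

```matlab
% Synthetic waveform with two Gaussian components plus a little noise
t = (1:400)';
rng(1);
gauss = @(p, t) p(1)*exp(-(t - p(2)).^2/(2*p(3)^2));
w = gauss([1.0 150 10], t) + gauss([0.6 260 8], t) + 0.01*randn(size(t));

% Step 1: initial estimates from a sliding search window.
% A sample is a peak if it is the maximum of its window and above minAmp.
halfWin = 15;                                   % assumed window half-width (bins)
minAmp  = 0.2;                                  % assumed amplitude threshold
peaks = [];
for i = 1+halfWin : numel(w)-halfWin
    win = w(i-halfWin : i+halfWin);
    if w(i) == max(win) && w(i) > minAmp
        peaks(end+1) = i; %#ok<AGROW>
    end
end
nComp = numel(peaks);
p0 = zeros(nComp, 3);                           % rows: [amplitude, center, width]
p0(:, 1) = w(peaks);
p0(:, 2) = t(peaks);
p0(:, 3) = 5;                                   % assumed initial width

% Step 2: least-squares fit of the sum of Gaussians with fminsearch
model = @(p) sum(p(:,1)'.*exp(-(t - p(:,2)').^2 ./ (2*p(:,3)'.^2)), 2);
cost  = @(pv) sum((w - model(reshape(pv, [], 3))).^2);
opts  = optimset('MaxFunEvals', 1e4, 'MaxIter', 1e4);
pFit  = reshape(fminsearch(cost, p0(:), opts), [], 3);
pFit(:, 3) = abs(pFit(:, 3));                   % the sign of the width is irrelevant

for k = 1:nComp
    fprintf('component %d: amp %.2f, center %.1f, width %.1f\n', k, pFit(k, :));
end
```

`fminsearch` is a derivative-free simplex search, so the quality of the initial estimates matters; a poor peak search can leave it in a local minimum.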
LiDAR data processing – photon-counting data
Batch processing techniques
We need to plot lidar points at global scale. How can the data volume be reduced and the processing efficiency improved?
I have improved the code through:
- algorithm design (mainly)
- programming design
- the Parallel Computing Toolbox in MATLAB
Algorithm design to reduce the data volume (my first paper)
The accuracy of ICESat-2 data varies with environmental conditions (cloud cover, mountains, terrain undulation, etc.), so low-accuracy data can be removed first.
From 6 TB to 3.6 TB: First, I analyzed the relationship between data quality and several environmental factors (slope, land cover, acquisition time, etc.). I concluded that data accuracy decreases as terrain slope and cloud cover increase, and that data from strong laser beams are more accurate than data from weak beams. Threshold values were then set based on this analysis and used to filter the raw data.
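Threshold-based filtering of this kind maps naturally onto logical indexing. The sketch below uses hypothetical per-point attributes (`slopeDeg`, `cloudFlag`, `isStrong`) and made-up threshold values; the actual variables and thresholds used in the paper are not given here, and discarding weak-beam points entirely is an assumption.

```matlab
% Hypothetical per-point attributes (illustrative values only)
slopeDeg  = [2; 35; 5; 12; 1; 8];               % terrain slope in degrees
cloudFlag = [0;  0; 3;  1; 0; 0];               % cloud indicator, 0 = clear
isStrong  = logical([1; 1; 1; 0; 1; 1]);        % true for strong-beam points
h         = [101.2; 350.7; 98.4; 120.1; 99.8; 110.5];  % heights in meters

% Assumed thresholds; in practice they come from the accuracy analysis
maxSlopeDeg  = 10;
maxCloudFlag = 0;

% One logical mask combines all criteria; no loop is needed
keep = slopeDeg <= maxSlopeDeg & cloudFlag <= maxCloudFlag & isStrong;
hFiltered = h(keep);

fprintf('kept %d of %d points\n', nnz(keep), numel(keep));
```

With these sample values, points 1, 5, and 6 pass all three criteria.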
Algorithm design to reduce the data volume (my second paper)
In this paper, lidar data are used as height control points, so only the (lon, lat, h) coordinates need to be retained.
From 3.6 TB to 21 GB: The coordinates of the lidar points are extracted from the HDF5 files and stored as a matrix.
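A minimal sketch of the extraction step with `h5read` is shown below. To keep it runnable without real data, it first writes a tiny stand-in HDF5 file. The group and dataset names (`/gt1l/land_segments/...`) are assumptions for illustration and should be replaced with the paths of the product actually used.

```matlab
% Build a tiny stand-in HDF5 file so the sketch runs without real data.
% The group and dataset names are assumptions, not taken from the text.
fname = [tempname '.h5'];
grp   = '/gt1l/land_segments';
dsLon = [grp '/longitude'];
dsLat = [grp '/latitude'];
dsH   = [grp '/terrain/h_te_best_fit'];

n   = 5;
lon = linspace(100, 100.1, n)';
lat = linspace(30, 30.1, n)';
hgt = [10; 12; 11; 15; 14];
h5create(fname, dsLon, [n 1]);  h5write(fname, dsLon, lon);
h5create(fname, dsLat, [n 1]);  h5write(fname, dsLat, lat);
h5create(fname, dsH,   [n 1]);  h5write(fname, dsH,   hgt);

% Extraction: read only the three coordinate datasets and store them
% as one numeric n-by-3 matrix [lon, lat, h]; everything else is dropped.
pts = [h5read(fname, dsLon), h5read(fname, dsLat), h5read(fname, dsH)];

delete(fname);                                  % remove the temporary file
disp(size(pts));
```

Reading individual datasets rather than whole files is what makes the reduction possible: the remaining variables in each file are never loaded.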
Programming design to improve processing efficiency
- a) Use numeric matrices rather than cell arrays to speed up calculation
- b) Out-of-memory errors: check the memory usage of each function and release objects promptly
- c) Convert the code to C++ to improve processing speed
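Points a) and b) can be illustrated with a small comparison. The sketch below stores the same values in a cell array and in a preallocated matrix, compares their memory footprint with `whos`, and releases the cell array with `clear`. The data and sizes are made up for the example.

```matlab
n = 1e5;

% Cell array: every element is a separately allocated array with its own header
c = cell(n, 1);
for k = 1:n
    c{k} = [k, 2*k, 3*k];
end

% Numeric matrix: one preallocated, contiguous block
m = zeros(n, 3);
for k = 1:n
    m(k, :) = [k, 2*k, 3*k];
end

% Check the memory usage of each variable
infoC = whos('c');
infoM = whos('m');
fprintf('cell: %.1f MB, matrix: %.1f MB\n', infoC.bytes/1e6, infoM.bytes/1e6);

% Release the large object as soon as it is no longer needed
clear c

% Matrices also allow vectorized operations instead of loops
colSum = sum(m, 1);
```

The same `whos` check can be applied inside each function to find where memory peaks.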
MATLAB Parallel Computing Toolbox
‘parfor’ can be used for batch data processing. The syntax forms are:
parfor loopVar = initVal:endVal; statements; end             % run on the current or default parallel pool
parfor (loopVar = initVal:endVal, M); statements; end        % use at most M workers
parfor (loopVar = initVal:endVal, opts); statements; end     % use options created with parforOptions
parfor (loopVar = initVal:endVal, cluster); statements; end  % run on a cluster object without creating a pool
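A minimal batch-processing sketch with the first syntax form is given below. The per-file work is replaced by a synthetic calculation, and `nFiles` and `meanH` are hypothetical names. In real use the loop body would read and process one data file per iteration.

```matlab
nFiles = 8;                                     % hypothetical number of files
meanH  = zeros(nFiles, 1);                      % sliced output: one slot per iteration

parfor k = 1:nFiles
    % Stand-in for reading and processing file k; iterations must be
    % independent of each other for parfor to be valid.
    h = 100 + 10*sin((1:1000)*k/50);
    meanH(k) = mean(h);
end

disp(meanH');
```

Because each iteration writes only to `meanH(k)`, MATLAB can distribute the iterations across workers in any order. Without the Parallel Computing Toolbox the same loop still runs, but serially.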