A LAScatalog class is a representation in R of a file or a set of las files not loaded in memory. Indeed, a regular computer cannot load the entire point cloud in R if it covers a wide area. For very high density datasets it can even fail loading a single file (see also the “LAS formal class”" vignette). In lidR, we use a LAScatalog to process datasets that cannot fit in memory.
ctg <- catalog("path/to/las/files/")
# or
ctg <- catalog("path/to/las/files/big_file.las")
print(ctg)
#> class : LAScatalog
#> features : 62
#> extent : 883166.1, 895250.2, 625793.6, 639938.4 (xmin, xmax, ymin, ymax)
#> coord. ref. : +init=epsg:3005 +proj=aea +lat_1=50 +lat_2=58.5 +lat_0=45 +lon_0=-126 +x_0=1000000 +y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0
#> variables : 7
#> names : Max.X, Min.X, Max.Y, Min.Y, Min.Z, Max.Z, filename
#> min values : 883657.8, 883166.1, 625882.9, 625793.6, 0.61, 109.26, path/to/las/files/file1.las
#> max values : 895250.2, 894233.1, 639938.4, 638602.1, 338.86, 1021.87, path/to/las/files/file9.las
A LAScatalog inherits a SpatialPolygonsDataFrame. Thus, it has the structure of a SpatialPolygonsDataFrame plus some extra slots that store information relative to how the LAScatalog will be processed.
The slot data
of a LAScatalog object contains a data.frame
with the most important information read from the header of .las or .laz files. Reading only the header of the file provides an overview of the content of the files very quickly without actually loading the point cloud. The columns of the table are named after the LAS specification version 1.4
The other slots are well documented in the documentation ?`LAScatalog-class` so they are not described in this vignette.
A LAScatalog is formally a SpatialPolygonsDataFrame although its purpose is not to manipulate spatial data in R. The purpose of a LAScatalog is to represent a set of existing las/laz files. Thus a LAScatalog cannot be modified because it must be related to the actual content of the files. The following throws an error:
ctg$Min.Z <- 0
#> Error in `$<-`(`*tmp*`, Min.Z, value = 0): LAScatalog data are read from standard files and cannot be modified
Obviously it is always possible to modify an R object by bypassing such simple restrictions. In this case the user will break something internally and a correct output is not guaranteed.
Users commonly report bugs arising from the fact that the point cloud is invalid. This is why we introduced the function lascheck
to perform an inspection of the LAScatalog objects. This function checks if a LAScatalog object is consistent (files are all of the same type for example). For example, it may happen that a catalog mixes files of type 1 with files of type 3 or files with different scale factors.
lascheck(ctg)
#>
#> Checking headers consistency
#> - Checking file version consistency... ok
#> - Checking scale consistency... ok
#> - Checking offset consistency... ok
#> - Checking point type consistency... ok
#> - Checking VLR consistency... ok
#> - Checking CRS consistency... ok
#> Checking the headers
#> - Checking scale factor validity... ok
#> - Checking Point Data Format ID validity... ok
#> Checking preprocessing already done
#> - Checking negative outliers...
#> warning: 2 file(s) with points below 0
#> - Checking normalization... no
#> Checking the geometry
#> - Checking overlapping tiles...
#> warning: Some tiles seem to overlap each other
#> - Checking point indexation... no
The function lascheck
when applied to a catalog does not perform a deep inspection of the point cloud unlike when applied to a LAS object. Indeed the point cloud is not actually read.
lidR provides a simple plot
function to plot a LAScatalog object:
plot(ctg)
The option mapview = TRUE
displays the catalog on an interactive map with pan and zoom and allows the addition of a satellite map in the background. It uses the package mapview
internally. It is often useful to check if the CRS of the file are properly registered. The epsg code of the las file appears to be often incorrect, according to our own experience.
plot(ctg, mapview = TRUE, map.type = "Esri.WorldImagery")
.
Being a SpatialPolygonsDataFrame, the function spplot
from sp
can be used on a LAScatalog to check the content of the files. In the following we can immediately see that the catalog is not normalized and is likely to contain outliers:
spplot(ctg, "Min.Z")
Most of lidR functions are compatible with a LAScatalog and work almost like with a single point cloud loaded in memory. In the following example we use the function grid_metrics
to compute the mean elevation of the points. The output is a continuous wall-to-wall RasterLayer. It works exactly as if the input was a LAS object.
hmean <- grid_metrics(ctg, mean(Z), 20)
However, processing a LAScatalog usually requires some tuning of the processing options to get better control of the computation. Indeed, if the catalog is huge the output is likely to be huge as well, and maybe the output cannot fit in the R memory. For example, lasnormalize
throws an error if used ‘as is’ without tuning the processing options. Using lasnormalize
like in the following example the expected output
would be a huge point cloud loaded in memory. The lidR package forbids such a call:
output <- lasnormalize(ctg, tin())
#> Error in check_and_fix_options(ctg, need_buffer, check_alignment, need_output_file, : This function requires that the LAScatalog provides an output file template.
Instead, one can use the processing option opt_output_files
. Processing options drive how the big files are split in small chunks and how the outputs are either returned into R or written on disk into files.
opt_output_files(ctg) <- "folder/where/to/store/outputs/{ORIGINALFILENAME}_normalized"
output <- lasnormalize(ctg, tin())
Here the output is not a point cloud but a LAScatalog pointing to the newly created files. The user can check how the catalog will be processed by calling summary
summary(ctg)
#> class : LAScatalog
#> extent : 883166.1 , 895250.2 , 625793.6 , 639938.4 (xmin, xmax, ymin, ymax)
#> coord. ref. : +init=epsg:3005 +proj=aea +lat_1=50 +lat_2=58.5 +lat_0=45 +lon_0=-126 +x_0=1000000 +y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0
#> area : 111.1 km²
#> points : 0 points
#> density : 0 points/m²
#> num. files : 62
#> Summary of the processing options:
#> - Catalog will be processed by file respecting the original tiling pattern
#> - Catalog will be processed using 1 core(s).
#> Summary of the output options:
#> - Outputs will be returned in R objects.
#> Summary of the input options:
#> - readLAS will be called with the following select option: *
#> - readLAS will be called with the following filter option:
Also the plot
function can displays the chunks pattern i.e. how the dataset is split into small chunks that will be sequentially processed
opt_chunk_size(ctg) <- 0
plot(ctg, chunk = TRUE)
opt_chunk_size(ctg) <- 900
plot(ctg, chunk = TRUE)
Load a catalog. Process each file sequentially. Returns a RasterLayer into R.
ctg <- catalog("path/to/las/files/")
hmean <- grid_metrics(ctg, mean(Z), 20)
Load a catalog. Process each file sequentially. For each file write a raster on disk named after the name of the processed files. Returns a lightweight virtual raster mosaic.
ctg <- catalog("path/to/las/files/")
opt_output_files(ctg) <- "folder/where/to/store/outputs/dtm_{ORIGINALFILENAME}"
dtm <- grid_terrain(ctg, tin())
Load a single big file too big to be actually loaded in memory. Process small chunks of 100 x 100 meters at a time. Returns a RasterLayer into R.
ctg <- catalog("path/to/las/files/bigfile.las")
opt_chunk_size(ctg) <- 100
chm <- grid_canopy(ctg, p2r())
Load a catalog. Process small chunks of 200 x 200 meters at a time. Each chunk is loaded with an extra 20 m buffer. Use 3 cores. Returns a Shapefile into R.
ctg <- catalog("path/to/las/files/")
opt_chunk_size(ctg) <- 200
opt_chunk_buffer(ctg) <- 20
opt_cores(ctg) <- 3
ttops <- tree_detection(ctg, lmf(5))
This is forbidden. The output would be too big.
ctg <- catalog("path/to/las/files/")
decimated <- lasfilterdecimate(ctg, homogenize(4))
Load a catalog. Process small chunks of 500 x 500 meter sequentially. For each chunk write a laz file on disk named after the coordinates of the chunk. Returns a lightweight LAScatalog. Note that the original catalog has been retiled.
ctg <- catalog("path/to/las/files/")
opt_chunk_size(ctg) <- 500
opt_output_files(ctg) <- "folder/where/to/store/outputs/project_{XLEFT}_{YBOTTOM}_decimated"
opt_laz_compression(ctg) <- TRUE
decimated <- lasfilterdecimate(ctg, homogenize(4))
Load a catalog. Load a shapefile of plot centers. Extract the plots. Returns a list of extracted point clouds in R.
ctg <- catalog("path/to/las/files/")
shp <- shapefile("plot_center.shp")
plots <- lasclip(ctg, shp, radius = 11.2)
Load a catalog. Load a shapefile of plot centers. Extract the plots and immediately write them into a file named after the coordinates of the plot and an attributes of the shapefile (here PLOTID
if such an attribute exists in the shapefile). Returns a lightweight LAScatalog.
ctg <- catalog("path/to/las/files/")
shp <- shapefile("plot_center.shp")
opt_output_files(ctg) <- "folder/where/to/store/outputs/plot_{XCENTER}_{YCENTER}_{PLOTID}"
plots <- lasclip(ctg, plot_centers, radius = 11.2)