logo


Image Processing.


Part 1 - R and Image Geoprocessing


Basic packages and images

In this quick intro we will talk about two basic package that we will use to load and manipulate images, the raster package. Install it using:

> install.packages('raster')

To show how to use the packages you will need to load the following Sentinel2 images and ALOS World 3D DEM.

  • Band 2
  • Band 3
  • Band 4
  • Band 5
  • Band 6
  • Band 7
  • Band 8
  • Band 8A
  • Band 11
  • Band 12
  • DEM


  • Part 2 - Multi-spectral images


    Visible (TCC)

    Band 4 covers the visible red wavelength, 3 green and 2 blue. By creating an RGB (red, green, blue) image with this bands (4,3,2) we will create a True Color Composite (TCC).

    > library(raster)
    Loading required package: sp
    > tcc<-stack('b4lr.tif','b3lr.tif','b2lr.tif')
    > plotRGB(tcc,stretch='lin')
    



    Very Near Infra Red (VNIR)

    A VNIR composite is used to evaluate vegetation. We will create a False Color Composite (FCC) with the Sentinel2 bands 7, 6 and 5. Note the healthy vegetation shown in yellow.

    > vnir<-stack('b7.tif','b6.tif','b5.tif')
    > plotRGB(vnir,stretch='lin')
    



    Short Wave Infra Red (SWIR) and Near Infra Red (NIR)

    The SWIR wavelength is sensible to changes in soil and rock composition. Using a combination of sentinel2 swir bands 12 and 11 with the NIR band 8 in the RGB composite we generate an image with good contrast of soil and rocks.

    > swir<-stack('b12.tif','b11.tif','b8lr.tif')
    > plotRGB(swir,stretch='lin')
    



    These color composites are a few examples. Try different band composites and compare their results.


    Part 3 - Digital Elevation Model


    DEM

    A DEM is an image where each pixel has an elevation value. We used an image generated from JAXA "ALOS World 3D - 30m (AW3D30)".

    > dem<-raster('dem.tif')
    > plot(dem,col=grey(0:100/100),main='Digital Elevation Model - DEM')
    


    Other elevation related images from DEM

    Other elevation related products can be generated from a DEM image.
    Gradient and Aspect
    Slope is the tangent plane at a surface point and it is divided into two components:
  • Gradient: It is the angle between the surface and a horizontal plane in radians. From 0 to pi/2.
  • Aspect: It is the direction of the maximum plane dip related to the north in radians. From 0 to 2*pi.

    > gradient<-terrain(dem,opt='slope',unit='radians')
    > aspect<-terrain(dem,opt='aspect',unit='radians')
    > plot(gradient,zlim=c(0,1.6),axes=FALSE)
    > plot(aspect,zlim=c(0,6.3),axes=FALSE)
    

    Gradient

    Aspect

    TPI
    Topographic Position Index (TPI) is the difference between the value of this pixel (cell) with the average of the surrounding cells. Positive values means that this cell is higher, negative values lower and values close to zero means that the surface is flat or with a constant gradient.

    > tpi<-terrain(dem,opt='TPI')
    > plot(tpi,zlim=c(-12,12),axes=FALSE)
    


    TRI
    Terrain Ruggedness Index (TRI) is the mean absolute difference between the cell value and the values of each one of the eight surrounding cell adjacent to it.

    > tri<-terrain(dem,opt='TRI')
    > plot(tri,zlim=c(0,30),axes=FALSE)
    


    Roughness
    Terrain Roughnees is expressed by the difference of the maximum and minimum of the eight surrounding cells.

    > roughness<-terrain(dem,opt='roughness')
    > plot(roughness,zlim=c(0,55),axes=FALSE) 
    


    Flow
    Flow indicates the greatest fall or smallest incline (if all the surround cells are higher)

    > flow<-terrain(dem,opt='flowdir',unit='radians')
    > plot(flow,zlim=c(0,20),axes=FALSE)
    


    Hillshade

    Hillshade is generated by using the previous gradient and aspect values created above.

    > hillshade<-hillShade(gradient,aspect,angle=45,direction=270,normalize= TRUE)
    > plot(hillshade,col=grey(0:100/100),zlim=c(0,250))
    


    We can overlay the SWIR composite with the hillshade.


    > plotRGB(swir,stretch='lin')
    > plot(hillshade,add=TRUE,legend=FALSE,axes=FALSE,col=grey(0.4:0.9),alpha=0.3,zlim=c(100,200))
    



    Saving images.

    Raster images in R can be saved as geoTiff (or other format).

    > writeRaster(hillshade, filename='hillshade.tif', drive='GeoTiff')
    



  • Part 4 - Principal Component Analysis


    PCA

    PCA is an statistic procedure that uses an orthogonal transformation to convert a dataset of possibly correlated variables into a dataset of non linear correlated variables called principal components.
    If n observations of p variables occur, then the number of distinct principal components will be smallest of n-1 and p. This transformation it is done in such a way that the first principal component has the greatest possible variance and each subsequent component has the greatest variance respecting that it is orthogonal to the previous component.
    First we will create a raster dataset with the bands 2, 3, 4, 5, 6, 7, 8, 11 and 12 where we will perform the PCA.

    > library(raster)
    Loading required package: sp
    > all<-stack('b2lr.tif','b3lr.tif','b4lr.tif','b5.tif','b6.tif','b7.tif','b8lr.tif','b11.tif','b12.tif')
    > all
    class      : RasterStack 
    dimensions : 5490, 5490, 30140100, 9  (nrow, ncol, ncell, nlayers)
    resolution : 20, 20  (x, y)
    extent     : 499980, 609780, 7990240, 8100040  (xmin, xmax, ymin, ymax)
    crs        : +proj=utm +zone=23 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
    names      :  b2lr,  b3lr,  b4lr,    b5,    b6,    b7,  b8lr,   b11,   b12 
    min values :     0,     0,     0,     0,     0,     0,     0,     0,     0 
    max values : 65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535 
    

    Random sample selection of our dataset (population) with a size that represents the population. Due to the size of this sample, it may take 20 to 30 minutes to process, depending on you computer configuration!

    > set.seed(1)
    > sample<-sampleRandom(all,300000)
    

    Execute the PCA to get the predictors of each one of the components and apply then into our population generating the 9 principal components of our dataset.

    > pca <- prcomp(sample, scale = TRUE)
    > pca
    Standard deviations (1, .., p=9):
    [1] 2.4938825 1.4405068 0.6506558 0.3245173 0.2507924 0.2101730 0.1725472
    [8] 0.1521670 0.1297261
    
    Rotation (n x k) = (9 x 9):
                PC1         PC2          PC3          PC4         PC5        PC6
    b2lr -0.3500839  0.21786006 -0.487521294  0.451152030 -0.37757283 -0.2949587
    b3lr -0.3703183  0.14580034 -0.445177787 -0.009218293  0.00742522  0.2270376
    b4lr -0.3709968  0.18873698 -0.240328170 -0.331789692  0.65117918  0.2191660
    b5   -0.3833114  0.07602207  0.178701902 -0.636503618 -0.27668479 -0.5143922
    b6   -0.3020342 -0.44426207  0.045971733 -0.193578987 -0.30085933  0.2728383
    b7   -0.2546075 -0.53012978  0.017193159  0.049618293 -0.11197494  0.2017823
    b8lr -0.2447077 -0.53589076  0.008766872  0.319041692  0.39225323 -0.3215784
    b11  -0.3505529  0.21457965  0.520360445  0.306977293  0.24457257 -0.2678492
    b12  -0.3419363  0.27972942  0.448538596  0.212598578 -0.20015435  0.5084364
                 PC7        PC8           PC9
    b2lr  0.33783895 -0.1951474  0.0835675215
    b3lr -0.69691336  0.2935101 -0.1410700518
    b4lr  0.37005967 -0.2149661  0.0606220352
    b5    0.06198892  0.2171109 -0.1344925163
    b6   -0.12450615 -0.2888241  0.6387034513
    b7    0.10786017 -0.2715668 -0.7160607841
    b8lr  0.08512160  0.5058650  0.1750840968
    b11  -0.37475020 -0.4405098 -0.0035602696
    b12   0.29081065  0.4186952 -0.0004243326
    > pci1<-predict(all,pca,index=1)
    > pci2<-predict(all,pca,index=2)
    > pci3<-predict(all,pca,index=3)
    > pci4<-predict(all,pca,index=4)
    > pci5<-predict(all,pca,index=5)
    > pci6<-predict(all,pca,index=6)
    > pci7<-predict(all,pca,index=7)
    > pci8<-predict(all,pca,index=8)
    > pci9<-predict(all,pca,index=9)
    

    Plotting the components.

    > par(mfrow=c(3,3))
    > plot(pci1,axes=FALSE,col=grey(0:100/100))
    > plot(pci2,axes=FALSE,col=grey(0:100/100))
    > plot(pci3,axes=FALSE,col=grey(0:100/100))
    > plot(pci4,axes=FALSE,col=grey(0:100/100))
    > plot(pci5,axes=FALSE,col=grey(0:100/100))
    > plot(pci6,axes=FALSE,col=grey(0:100/100))
    > plot(pci7,axes=FALSE,col=grey(0:100/100))
    > plot(pci8,axes=FALSE,col=grey(0:100/100))
    > plot(pci9,axes=FALSE,col=grey(0:100/100))
    



    Saving each one of the components into a geoTiff file

    > writeRaster(pci1,filename='pca1.tif',driver='Geotiff')
    > writeRaster(pci2,filename='pca2.tif',driver='Geotiff')
    > writeRaster(pci3,filename='pca3.tif',driver='Geotiff')
    > writeRaster(pci4,filename='pca4.tif',driver='Geotiff')
    > writeRaster(pci5,filename='pca5.tif',driver='Geotiff')
    > writeRaster(pci6,filename='pca6.tif',driver='Geotiff')
    > writeRaster(pci7,filename='pca7.tif',driver='Geotiff')
    > writeRaster(pci8,filename='pca8.tif',driver='Geotiff')
    > writeRaster(pci9,filename='pca9.tif',driver='Geotiff')
    



    Part 5 - Intensity, Hue and Saturation


    RGB to IHS

    Let's see how to create an IHS (intensity, hue, saturation) composite from a RGB composite. This procedure is used to generate a wider spectral response in some cases, mostly for H and S.
    We will use the SWIR composite in this processing.

    > library(raster)
    > swir<-stack('b12.tif','b11.tif','b8lr.tif')
    

    The data has to be normalized to values between 0 and 1:

    > Nor<-swir/65535
    > Nor
    class      : RasterBrick 
    dimensions : 5490, 5490, 30140100, 3  (nrow, ncol, ncell, nlayers)
    resolution : 20, 20  (x, y)
    extent     : 499980, 609780, 7990240, 8100040  (xmin, xmax, ymin, ymax)
    crs        : +proj=utm +zone=23 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
    source     : /tmp/RtmpGjvJwB/raster/r_tmp_2019-10-16_102611_32256_56357.grd 
    names      :        b12,        b11,       b8lr 
    min values :          0,          0,          0 
    max values : 0.38136873, 0.19211109, 0.08664073 
    

    We convert RGB to IHS using the following formulas:
    I=(R+G+B)/3
    S=1-(3/(R+G+B))*mínimo(R,G,B)
    H*=acos(0.5*((R-G))+(R-B)))/sqrt((R-B)²+((R-G)*(G-B)) ) ))


    > I<-(Nor[[1]]+Nor[[2]]+Nor[[3]])/3
    > S<-1-(3/(Nor[[1]]+Nor[[2]]+Nor[[3]]))*min(Nor[[1]],Nor[[2]],Nor[[3]])
    > H<- acos((0.5*((Nor[[1]]-Nor[[2]])+(Nor[[1]]-Nor[[3]])))/(sqrt((Nor[[1]]-Nor[[2]])^2+((Nor[[1]]-Nor[[3]])*(Nor[[2]]-Nor[[3]])))))
    

    Plotting images I, H and S individually.

    > par(mfrow=c(1,3))
    > plot(I,axes=FALSE,legend=FALSE,col=grey(0:100/100))
    > plot(H,axes=FALSE,legend=FALSE,col=grey(0:100/100))
    > plot(S,axes=FALSE,legend=FALSE,col=grey(0:100/100))
    


    Creating an IHS composite

    > ihs<-brick(I,H,S)
    > plotRGB(ihs,stretch='lin')
    


    Saving the I, H and S as geoTiff:

    > writeRaster(I,filename='I.tif',driver='Geotiff')
    > writeRaster(H,filename='H.tif',driver='Geotiff')
    > writeRaster(S,filename='S.tif',driver='Geotiff')
    



    Part 6 - Band Combination


    IHS and PCA combinations

    Let's load the I, H, S and the first 5 principal components to compose some images.

    > library(raster)
    > ihs<-stack('I.tif','H.tif','S.tif')
    > pca<-stack('pca1.tif','pca2.tif','pca3.tif','pca4.tif','pca5.tif')
    > relevo<-raster('hillshade.tif')
    

    Creating 4 combinations of the loaded images. Try different ones if you wish.

    > par(mfrow=c(2,2))
    > PC1PC2H<-stack(pca[[1]],pca[[2]],ihs[[2]])
    > plotRGB(PC1PC2H,stretch='lin')
    > PC1PC2S<-stack(pca[[1]],pca[[2]],ihs[[3]])
    > plotRGB(PC1PC2S,stretch='lin')
    > PC1PC5H<-stack(pca[[1]],pca[[5]],ihs[[2]])
    > plotRGB(PC1PC5H,stretch='lin')
    > PC1PC5S<-stack(pca[[1]],pca[[5]],ihs[[3]])
    > plotRGB(PC1PC5S,stretch='lin')
    



    I selected the PC1PC2H combination and applied the hillshade file. See the result below.

    > writeRaster(PC1PC2H,filename='p1p2h.tif',driver='Geotiff')
    > par(mfrow=c(1,1))
    > plotRGB(PC1PC2H,stretch='lin')
    > plot(relevo,add=TRUE,legend=FALSE,axes=FALSE,col=grey(0.4:0.9),alpha=0.3,zlim=c(100,200))
    




    Part 7 - Band Ratio


    Indexes

    Band ratio is often used in image geoprocessing, mostly to create indexes. Let's check some of them now.

    NDVI

    NDVI or normalized difference vegetation index highlights the vegetation chlorophyll response difference of bands 4 (RED) and 8 (NIR). Healthy vegetation has a good reflectance in both bands but, dry vegetation has a good reflectance only for band 8 (NIR). The index, calculated by using the normalized band ratio, gives higher values for healthy vegetation.

    > library(raster)
    > b4<-raster('b4lr.tif')
    > b8<-raster('b8lr.tif')
    > ndvi<-(b8-b4)/(b8+b4)
    > co<-colorRampPalette(c('gray','yellow','green'), bias = 1, space = c("rgb", "Lab"),interpolate = c("linear", "spline"), alpha = FALSE)
    > plot(ndvi,col=co(24),zlim=c(0,0.8))
    


    NDMI

    NDMI or normalized difference moisture index highlights the moisture difference of bands 11 (SWIR) and 8 (NIR) from healthy vegetation response. The index, calculated by using the normalized band ratio, gives higher values for moisture and wet zones.

    > library(raster)
    > b11<-raster('b11.tif')
    > b8<-raster('b8lr.tif')
    > ndmi<-(b8-b11)/(b8+b11)
    > co<-colorRampPalette(c('gray','yellow','green'), bias = 1, space = c("rgb", "Lab"),interpolate = c("linear", "spline"), alpha = FALSE)
    > plot(ndmi,col=co(24),zlim=c(-0.4,0.7))
    


    NDWI

    NDWI or normalized difference water index highlights the water presence difference of bands 5 (VNIR) and 3 (green). The index, calculated by using the normalized band ratio, gives higher values for zones with water.

    > library(raster)
    > b5<-raster('b5.tif')
    > b3<-raster('b3lr.tif')
    > ndwi<-(b3-b5)/(b3+b5)
    > co<-colorRampPalette(c('gray','blue'), bias = 1, space = c("rgb", "Lab"),interpolate = c("linear", "spline"), alpha = FALSE)
    > plot(ndwi,col=co(24),zlim=c(0,1))
    





    Part 8 - Kmeans Classification


    Loading the data

    Here we will follow a sequence of steps in order to generate an unsupervised classification that will result in a ‘geological map’.
    We will need to install the 'cluster' and 'randomForest'. Load the libraries and the band composite created above to perform the classification.

    > install.packages('cluster')
    > install.packages('randomForest')
    > library('raster')
    > library('cluster')
    > library('randomForest')
    > relevo<-raster('hillshade.tif')
    > img<-brick('p1p2h.tif')
    

    Performing the classification

    Use the code below to create 5 kmeans classifications with 3, 6, 9, 12 and 15 classes. The complete procedure may take 2 hours to complete since we are using a full Sentinel2 scene.

    > ar <- getValues(img) #loading values into a vector
    > values <- which(!is.na(ar)) #index of valid values (non NA values)
    > ar <- na.omit(ar) #removing NA’s from de vector
    > Result <- clara(ar,3,samples=500,metric="manhattan",pamLike=T)
    > kmraster3 <- raster(img)
    > #creating the raster to be
    > kmraster3[values] <- Result$clustering #populated with class values
    > Result <- clara(ar,6,samples=500,metric="manhattan",pamLike=T)
    > kmraster6 <- raster(img)
    > kmraster6[values] <- Result$clustering
    > Result <- clara(ar,9,samples=500,metric="manhattan",pamLike=T)
    > kmraster9 <- raster(img)
    > kmraster9[values] <- Result$clustering
    > Result <- clara(ar,12,samples=500,metric="manhattan",pamLike=T)
    > kmraster12 <- raster(img)
    > kmraster12[values] <- Result$clustering
    > Result <- clara(ar,15,samples=500,metric="manhattan",pamLike=T)
    > kmraster15 <- raster(img)
    > kmraster15[values] <- Result$clustering
    > par(mfrow=c(2,3))
    > plotRGB(img,stretch='lin',axes=T,main='PC1PC2H')
    > plot(relevo,add=TRUE,legend=F,axes=F,col=grey(0:100/100),alpha=0.4)
    > plot(kmraster3, legend=F,axes=F, col = rainbow(3),main='Kmeans 3')
    > plot(relevo,add=TRUE,legend=F,axes=F,col=grey(0:100/100),alpha=0.4)
    > plot(kmraster6, legend=F,axes=F, col = rainbow(6),main='Kmeans 6')
    > plot(relevo,add=TRUE,legend=F,axes=F,col=grey(0:100/100),alpha=0.4)
    > plot(kmraster9, legend=F,axes=F, col = rainbow(9),main='Kmeans 9')
    > plot(relevo,add=TRUE,legend=F,axes=F,col=grey(0:100/100),alpha=0.4)
    > plot(kmraster12, legend=F,axes=F, col = rainbow(12),main='Kmeans 12')
    > plot(relevo,add=TRUE,legend=F,axes=F,col=grey(0:100/100),alpha=0.4)
    > plot(kmraster15, legend=F,axes=F, col = rainbow(15),main='Kmeans 15')
    > plot(relevo,add=TRUE,legend=F,axes=F,col=grey(0:100/100),alpha=0.4)
    



    Kmeans3 showed a good response in differentiating the main units and it highlights the structural architecture.
    Kmeans6, Kmeans9 and Kmeans12 resulted in a good lithological classification response.