Crop classification using polarimetric information from the SIR-C/X-SAR mission: Bebedouro region, Brazil

SUELI PISSARRA CASTELLARI RIBEIRO

LUCIANO VIEIRA DUTRA

CAMILO DALELES RENNÓ

JOÃO VIANEI SOARES

INPE - Instituto Nacional de Pesquisas Espaciais

DPI - Divisão de Processamento de Imagens

Av. dos Astronautas, 1758

12227-010 São José dos Campos, SP, Brazil

{sueli,dutra@dpi.inpe.br}

{camilo,vianei@ltid.inpe.br}

ABSTRACT

This paper investigates the use of polarimetric information extracted from SIR-C/X-SAR complex images for crop discrimination. The test site is the Bebedouro hydrology supersite in Brazil. In addition to the widely used amplitudes (absolute values) of the complex channels, two further channels were investigated: the modulus and the phase of the complex correlation coefficient between the HH and VV complex channels. The phase of the correlation coefficient is a smoothed version of the phase difference between HH and VV and was used in its place. Five classes were defined: corn, soya beans, stubble, bare soil and a regional savanna type named "caatinga". To assess the discrimination power of the extracted features, maximum likelihood confusion matrices and Jeffries-Matusita distances were calculated. The results show that phase information can greatly improve classification accuracy (average performance rose from 61.4% with the amplitude channels alone to 79.5% with all channels) and that the modulus of the correlation coefficient also carries discrimination power.

1. Introduction

The objective of this report is to evaluate the performance of SIR-C L-band polarimetric information for crop classification in a semi-arid irrigated region in northeast Brazil. Amplitude channels and channels derived from the modulus and phase of the complex correlation coefficient were used. Section 2 describes the features used, section 3 briefly describes the site and material, section 4 presents the methodology, section 5 presents the results, and conclusions follow in section 6.

2. Feature extraction

In this section we introduce the two quantities used for feature extraction: the phase difference and the complex correlation coefficient.

2.1 Phase Difference

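The phase difference ($\phi_{hh} - \phi_{vv}$) between the two co-polarized channels is calculated by

$$\phi_{hh} - \phi_{vv} = \arctan\left[\frac{\Im(S_{hh} S_{vv}^{*})}{\Re(S_{hh} S_{vv}^{*})}\right] \qquad (1)$$

where $\Re$ and $\Im$ indicate the real and imaginary parts, respectively, and $^{*}$ denotes complex conjugation. $S_{hh}$ and $S_{vv}$ are the co-polarized complex scattering components, obtained from the scattering matrix. Phase differences can also be calculated between cross-polarized channels.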
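As a minimal sketch (not the authors' implementation), the co-polarized phase difference can be computed per pixel from two co-registered complex arrays. The function name `phase_difference` and the array names are assumptions introduced here for illustration. The four-quadrant `np.arctan2` is used so that the result is resolved over the full (-π, π] range rather than only (-π/2, π/2).

```python
import numpy as np

def phase_difference(s_hh, s_vv):
    """Per-pixel phase difference (HH minus VV), in radians, wrapped to (-pi, pi]."""
    # The phase of S_hh * conj(S_vv) equals phase(S_hh) - phase(S_vv).
    product = np.asarray(s_hh) * np.conj(np.asarray(s_vv))
    return np.arctan2(product.imag, product.real)

# Hypothetical single-pixel example: HH phase 1.0 rad, VV phase 0.25 rad.
example = phase_difference(2.0 * np.exp(1j * 1.0), 3.0 * np.exp(1j * 0.25))
```

Working on the conjugate product rather than subtracting two separately computed phases keeps the difference wrapped to a single 2π interval.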

2.2 Complex correlation coefficient

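The complex correlation coefficient between the co-polarized elements of the Stokes matrix is calculated as:

$$\rho = \frac{\langle S_{hh} S_{vv}^{*} \rangle}{\sqrt{\langle |S_{hh}|^{2} \rangle \, \langle |S_{vv}|^{2} \rangle}} \qquad (2)$$

where $\langle \cdot \rangle$ denotes averaging and $^{*}$ complex conjugation, and from which one can obtain the magnitude $|\rho|$ and the phase $\arg(\rho)$. This definition can also be extended to the cross-polarized channels.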
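The modulus and phase channels can be illustrated with the sketch below, which estimates the correlation coefficient over non-overlapping square blocks of pixels. The block averaging scheme, the window size `win` and the function names are assumptions made for this example; the averaging window actually used in the study is not stated here.

```python
import numpy as np

def block_mean(a, win):
    """Average a 2-D array over non-overlapping win x win blocks."""
    rows = (a.shape[0] // win) * win
    cols = (a.shape[1] // win) * win
    a = a[:rows, :cols]  # drop incomplete edge blocks
    return a.reshape(rows // win, win, cols // win, win).mean(axis=(1, 3))

def complex_correlation(s_hh, s_vv, win=4):
    """Return (modulus, phase) of the HH-VV complex correlation coefficient."""
    cross = block_mean(s_hh * np.conj(s_vv), win)   # <S_hh S_vv*>
    power_hh = block_mean(np.abs(s_hh) ** 2, win)   # <|S_hh|^2>
    power_vv = block_mean(np.abs(s_vv) ** 2, win)   # <|S_vv|^2>
    rho = cross / np.sqrt(power_hh * power_vv)
    return np.abs(rho), np.angle(rho)
```

The modulus lies between 0 and 1, and the phase is the power-weighted average of the per-pixel phase difference, which is why it behaves as a smoothed version of that difference.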

3. Image and Site description

The study area includes the "Projeto de Irrigação de Bebedouro" (PIB), a SIR-C/X-SAR supersite situated in the "Sub-médio São Francisco" region (9°07'S, 40°18'W Gr), about 40 km northeast of Petrolina, Pernambuco state [2].

3.1 Site description

The "PIB" is divided into two parts, "PIB I" and "PIB II", with total areas of about 3500 ha and 2000 ha, respectively. "PIB I" consists of small properties of 5 to 12 ha, large areas owned by private enterprises, natural vegetation reserves and small residential centers. "PIB II" has one area belonging to private enterprises and another belonging to the Basic Seeds Production Service of EMBRAPA, the Brazilian Agronomic Research Institute.

In this study we focus on the "PIB II" area, specifically the region composed of 4 center pivots with the classes corn, soya beans, stubble and bare soil, plus the natural savanna ("caatinga") class.

At the time of the SIR-C/X-SAR overpasses (imagery acquisition), the region was surveyed and a full agronomic characterization of the crops was carried out.

3.2 Images parameters

Images were acquired by the Space Shuttle SIR-C/X-SAR mission in April 1994. Table I shows the parameters of the images used in this study, which were acquired on April 13.

Table I - Image parameters

| Parameter | Value |
| --- | --- |
| Frequency | L (1.254 GHz), C (5.304 GHz) |
| Polarization | HH, HV, VV, VH |
| Incidence angle | 37.97° |
| Platform altitude | 219.38 km |
| Orbital direction | descending |
| Number of looks | 16 |
| Geometric representation | Ground range |
| Pixel spacing | range 12.5 m / azimuth 12.5 m |

4. Methodology

As mentioned previously, only the L-band complex images in three polarizations (HH, HV, VV) were investigated in this study. Figure 2 shows an HH-polarization image of the study area along with the analyzed classes. Table II presents the number of samples and the number of pixels for each class.

Figure 2 - L Band, HH polarization, classes : 1 - corn, 2 - soya beans, 3 - stubble, 4 - bare soil, 5 - "caatinga"

Table II - Study classes, number of samples and pixels

| Class | Number of samples | Number of pixels |
| --- | --- | --- |
| corn | 2 | 6245 |
| soya beans | 2 | 6635 |
| stubble | 1 | 1498 |
| bare soil | 1 | 5592 |
| "caatinga" | 4 | 5245 |

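We investigated the following combinations of channels to assess the discrimination power of the extracted features:

  1. HH, HV and VV amplitude (the original channels);

  2. modulus and angle of the complex correlation coefficient between HH and VV;

  3. all five channels together: HH, HV and VV amplitude plus the modulus and angle of the complex correlation coefficient.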

These features are assumed here to follow a joint Gaussian distribution, owing to the high number of looks of the original channels and the averaging involved in computing the correlation coefficient. Even the phase can be considered approximately Gaussian for high values of the correlation coefficient, because in this case all phase histograms were relatively narrow and entirely contained in the [-π, π] range around the mode, as in [3].

The contribution of these channel combinations to crop discrimination is evaluated in two ways:

  1. comparing the average performance (AP) and average confusion (AC), calculated from the confusion matrices obtained over the training areas for each of the three combinations, using the classical maximum likelihood classification method (section 4.1).

  2. comparing statistical distances between the class distributions, evaluated over the study area for all combinations (original amplitude plus derived channels). This method has the advantage of being independent of the classification algorithm (section 4.2).

4.1. Maximum Likelihood Classification

The maximum likelihood method ([4], [5]), the most common supervised classification method, is used to classify the training areas of the study classes.
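A Gaussian maximum likelihood classifier of this kind can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: equal prior probabilities are assumed, no abstention threshold is applied, and the function names are hypothetical.

```python
import numpy as np

def fit_gaussian(samples):
    """Estimate the mean vector and covariance matrix of one training class.

    samples: array of shape (n_pixels, n_channels).
    """
    samples = np.asarray(samples, dtype=float)
    return samples.mean(axis=0), np.cov(samples, rowvar=False)

def log_likelihood(x, mean, cov):
    """Gaussian log-likelihood of each row of x, up to an additive constant."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (mahalanobis + logdet)

def classify(x, class_params):
    """Assign each row of x to the class with the highest likelihood."""
    scores = np.stack([log_likelihood(x, m, c) for m, c in class_params], axis=1)
    return scores.argmax(axis=1)
```

Here each pixel is a feature vector (for instance the three amplitudes, or the modulus and angle of the correlation coefficient), and `class_params` holds one `(mean, covariance)` pair per class estimated from its training samples.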

The average performance (AP), average abstention (AA) and average confusion (AC), given by AC = 1 - (AP + AA), were derived from the confusion matrix calculated for each set of channels. AP is the population-weighted average of the correct classification rate of each class (main diagonal of the confusion matrix). A similar procedure was used to calculate AA from the abstention column.
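The three indices can be sketched as below for a confusion matrix of pixel counts whose rows are the true classes and whose last column holds the abstentions. The matrix layout, the function name and the toy counts are assumptions made for this example only.

```python
import numpy as np

def average_indices(counts):
    """Return (AP, AA, AC) from a confusion matrix of pixel counts.

    counts: shape (n_classes, n_classes + 1); rows are true classes,
    the first n_classes columns are assigned classes and the last
    column is the abstention count.
    """
    counts = np.asarray(counts, dtype=float)
    n_classes = counts.shape[0]
    row_totals = counts.sum(axis=1)
    weights = row_totals / row_totals.sum()           # class population weights
    correct_rate = np.diag(counts[:, :n_classes]) / row_totals
    abstain_rate = counts[:, n_classes] / row_totals
    ap = float(np.sum(weights * correct_rate))
    aa = float(np.sum(weights * abstain_rate))
    ac = 1.0 - (ap + aa)
    return ap, aa, ac

# Hypothetical two-class example with 100 pixels per class.
toy_counts = [[80, 15, 5],
              [20, 70, 10]]
ap, aa, ac = average_indices(toy_counts)
```

When the classification threshold forces every pixel into a class, the abstention column is zero, AA vanishes and AC reduces to 1 - AP.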

4.2. J-M Distance

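The Jeffries-Matusita (JM) distance between a pair of probability distributions is defined as [6]

$$J_{ij} = \int_{x} \left[ \sqrt{p(x \mid \omega_i)} - \sqrt{p(x \mid \omega_j)} \right]^{2} dx \qquad (3)$$

where $p(x \mid \omega_i)$ and $p(x \mid \omega_j)$ are the conditional probability density functions (pdf) of the ith and jth class distributions. For normally distributed classes equation (3) becomes

$$J_{ij} = 2\left(1 - e^{-B}\right) \qquad (4)$$

in which

$$B = \frac{1}{8} (m_i - m_j)^{T} \left[ \frac{\Sigma_i + \Sigma_j}{2} \right]^{-1} (m_i - m_j) + \frac{1}{2} \ln \left[ \frac{\left| (\Sigma_i + \Sigma_j)/2 \right|}{\sqrt{|\Sigma_i| \, |\Sigma_j|}} \right] \qquad (5)$$

where $m_i$ and $\Sigma_i$ are the mean vector and covariance matrix of class i.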

The JM distance can be used to measure the separability of a given set of classes, either for a fixed set of channels or for a chosen subset of channels. If there are more than two classes, the best subset of channels can be chosen either by maximizing the average JM distance over all pairs of classes or by maximizing the minimum JM distance over all pairs of classes.
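For Gaussian class models the JM distance takes the closed form 2(1 - exp(-B)), where B is the Bhattacharyya distance computed from the class means and covariances, so it ranges from 0 (identical classes) to 2 (fully separable classes). The sketch below is one way to compute it, together with the minimum and average over all class pairs. The function names are assumptions, and the single-feature example ignores the covariances between channels.

```python
import itertools
import numpy as np

def jm_distance(mean_i, cov_i, mean_j, cov_j):
    """JM distance between two Gaussian classes (between 0 and 2)."""
    mean_i = np.atleast_1d(np.asarray(mean_i, dtype=float))
    mean_j = np.atleast_1d(np.asarray(mean_j, dtype=float))
    cov_i = np.atleast_2d(np.asarray(cov_i, dtype=float))
    cov_j = np.atleast_2d(np.asarray(cov_j, dtype=float))
    cov_avg = 0.5 * (cov_i + cov_j)
    diff = mean_i - mean_j
    # Bhattacharyya distance: mean-separation term plus covariance term.
    mean_term = 0.125 * diff @ np.linalg.solve(cov_avg, diff)
    logdet_avg = np.linalg.slogdet(cov_avg)[1]
    logdet_i = np.linalg.slogdet(cov_i)[1]
    logdet_j = np.linalg.slogdet(cov_j)[1]
    cov_term = 0.5 * (logdet_avg - 0.5 * (logdet_i + logdet_j))
    return 2.0 * (1.0 - np.exp(-(mean_term + cov_term)))

def jm_min_and_average(means, covs):
    """Minimum and average JM distance over all pairs of classes."""
    pairs = itertools.combinations(range(len(means)), 2)
    dists = [jm_distance(means[i], covs[i], means[j], covs[j]) for i, j in pairs]
    return min(dists), sum(dists) / len(dists)

# Single-feature illustration with the modulus statistics of Table VI
# (soya beans vs. stubble); variances are the squared standard deviations.
d_modulus = jm_distance(0.2556, 0.1313 ** 2, 0.6239, 0.1359 ** 2)
```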

5. Results

5.1. Classification

The three amplitude bands corresponding to the HH, HV and VV polarizations will be called the original channels. Maximum likelihood classification of these original data yielded an average performance (AP) of 61.4% and an average confusion (AC) of 38.6% (see Table III). The average abstention (AA) in this table and in the following ones is null because the classification threshold was set so that all points were classified. Considering the classes individually, bare soil reached 89.21%, well above the AP, whereas soya beans reached only 34.05%, well below it, being confused mainly with stubble (29.14%) and "caatinga" (20.32%). This behavior is due to the fact that the average tone values of these classes are quite close, as can be seen in Table IV. The other classes showed results close to the average performance.

Table III - Confusion matrix using HH, HV and VV amplitude (rows: true class; columns: classified as)

| True \ Classified | corn | soya beans | stubble | bare soil | caatinga |
| --- | --- | --- | --- | --- | --- |
| corn | 65.56 % | 8.12 % | 14.46 % | 1.32 % | 10.52 % |
| soya beans | 6.86 % | 34.05 % | 29.14 % | 9.60 % | 20.32 % |
| stubble | 5.80 % | 8.41 % | 65.02 % | 19.75 % | 1.00 % |
| bare soil | 1.00 % | 0.59 % | 9.17 % | 89.21 % | 0.01 % |
| caatinga | 9.28 % | 25.07 % | 6.93 % | 0.57 % | 58.13 % |

Average performance (AP) : 61.4 %

Average confusion (AC) : 38.6 %

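Table IV - Average tone and standard deviation for the HH, HV and VV channels

| Channel | Statistic | corn | soya beans | stubble | bare soil | caatinga |
| --- | --- | --- | --- | --- | --- | --- |
| HH | mean | 0.3453 | 0.2514 | 0.2104 | 0.1363 | 0.2864 |
| HH | std. dev. | 0.1129 | 0.1012 | 0.0718 | 0.0539 | 0.0832 |
| HV | mean | 0.1116 | 0.1022 | 0.0642 | 0.0336 | 0.1449 |
| HV | std. dev. | 0.0328 | 0.0425 | 0.0212 | 0.0133 | 0.0440 |
| VV | mean | 0.4074 | 0.2207 | 0.2078 | 0.1524 | 0.2598 |
| VV | std. dev. | 0.1422 | 0.0809 | 0.0924 | 0.0606 | 0.0783 |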

Table V shows the classification result when using the modulus and angle of the complex correlation coefficient. Average performance improves, in comparison to the previous classification, from 61.4% to 69.1%. Classification accuracy for corn and soya beans also improved, from 65.56% to 87.00% and from 34.05% to 45.89%, respectively. The main factor improving the separability of the corn class is the phase difference between HH and VV (around -1.69 rad, see Table VI).

Table V - Confusion matrix using the modulus and angle of the complex correlation coefficient (rows: true class; columns: classified as)

| True \ Classified | corn | soya beans | stubble | bare soil | caatinga |
| --- | --- | --- | --- | --- | --- |
| corn | 87.00 % | 9.37 % | 0.19 % | 0.00 % | 3.42 % |
| soya beans | 12.76 % | 45.89 % | 6.61 % | 0.22 % | 34.50 % |
| stubble | 0.06 % | 0.00 % | 60.08 % | 29.17 % | 10.68 % |
| bare soil | 0.00 % | 0.00 % | 12.41 % | 87.41 % | 0.17 % |
| caatinga | 3.20 % | 22.36 % | 17.21 % | 0.01 % | 57.19 % |

Average performance (AP) : 69.1 %

Average confusion (AC) : 30.9 %

The modulus of the complex correlation coefficient also behaves very differently from the amplitude channels. Soya beans showed reduced confusion with stubble but increased confusion with "caatinga". This result can be explained by the class averages of the modulus of the complex correlation coefficient (see Table VI): the average for soya beans (0.2556) differs clearly from that for stubble (0.6239), unlike in the previous case, where the amplitude averages of the two classes were close. The performance of the stubble class dropped by about 5 percentage points and its confusion with bare soil increased by about 10 percentage points. The "caatinga" class kept practically the same performance, but its confusion with corn decreased while its confusion with stubble increased. Although some isolated results were not satisfactory, the combination of these features extracted from the polarimetric data improved the overall classification. The HV channel was not used in this case.

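Table VI - Average and standard deviation of the modulus and angle of the complex correlation coefficient

| Feature | Statistic | corn | soya beans | stubble | bare soil | caatinga |
| --- | --- | --- | --- | --- | --- | --- |
| Modulus | mean | 0.4590 | 0.2556 | 0.6239 | 0.7897 | 0.3126 |
| Modulus | std. dev. | 0.1397 | 0.1313 | 0.1359 | 0.0823 | 0.1431 |
| Angle (rad) | mean | -1.6939 | -0.4557 | 0.1430 | 0.1608 | -0.0074 |
| Angle (rad) | std. dev. | 0.5418 | 1.3094 | 0.2523 | 0.1351 | 0.8240 |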

The use of all channels, original and extracted, improved the classification performance for all classes, as shown in Table VII. AP improved by 18.1 percentage points in relation to the first classification and by 10.4 percentage points in relation to the second.

Table VII - Confusion matrix using all channels together: HH, HV and VV amplitude, modulus and angle of the complex correlation coefficient (rows: true class; columns: classified as)

| True \ Classified | corn | soya beans | stubble | bare soil | caatinga |
| --- | --- | --- | --- | --- | --- |
| corn | 87.15 % | 10.09 % | 0.46 % | 0.00 % | 2.27 % |
| soya beans | 7.59 % | 66.19 % | 5.66 % | 0.11 % | 20.43 % |
| stubble | 0.20 % | 2.46 % | 83.51 % | 12.28 % | 1.53 % |
| bare soil | 0.05 % | 0.00 % | 7.42 % | 92.47 % | 0.05 % |
| caatinga | 2.32 % | 22.45 % | 4.38 % | 0.00 % | 70.82 % |

Average performance : 79.5 %

Average confusion : 20.5 %

The performance of the corn class remained at the level obtained in the second classification, while the soya beans and stubble classes improved by about 20 percentage points, and the bare soil and "caatinga" classes by about 5 and 13 percentage points, respectively. The confusion between soya beans and "caatinga" remained at the level observed in the first classification.

5.2 Distance between distributions

The JM distance was calculated for each set of channels mentioned in the previous section. Table VIII shows the minimum and average JM distances for each set of channels. The results of this analysis confirm those obtained previously.

As an additional way to measure the relative importance of the features used here, the JM distance was used to select the best 3 channels out of all 5. Table IX shows all possible combinations of 3 channels, considering all defined classes. The combinations of the angle of the complex correlation coefficient with pairs of amplitude channels, (HH and HV), (HH and VV) and (HV and VV), which are selections 4, 5 and 6 of Table IX, showed the best performances.
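The exhaustive search over channel subsets can be sketched as follows, assuming the per-class mean vectors and covariance matrices over all channels are already available. The function names, the array layout and the choice to rank by minimum JM distance are assumptions made for this example; ranking by the average distance would only change the sort key.

```python
import itertools
import numpy as np

def jm_distance(m_i, c_i, m_j, c_j):
    """JM distance between two Gaussian classes (between 0 and 2)."""
    c_avg = 0.5 * (c_i + c_j)
    diff = m_i - m_j
    b = 0.125 * diff @ np.linalg.solve(c_avg, diff)
    b += 0.5 * (np.linalg.slogdet(c_avg)[1]
                - 0.5 * (np.linalg.slogdet(c_i)[1] + np.linalg.slogdet(c_j)[1]))
    return 2.0 * (1.0 - np.exp(-b))

def rank_channel_subsets(means, covs, subset_size=3):
    """Rank all channel subsets of a given size by minimum JM distance.

    means: shape (n_classes, n_channels)
    covs:  shape (n_classes, n_channels, n_channels)
    Returns a list of (subset, jm_min, jm_average), best subset first.
    """
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    n_classes, n_channels = means.shape
    results = []
    for subset in itertools.combinations(range(n_channels), subset_size):
        idx = np.array(subset)
        sub = np.ix_(idx, idx)  # selects the covariance sub-matrix
        dists = [jm_distance(means[i, idx], covs[i][sub],
                             means[j, idx], covs[j][sub])
                 for i, j in itertools.combinations(range(n_classes), 2)]
        results.append((subset, float(min(dists)), float(np.mean(dists))))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

With 5 channels and subsets of 3 this evaluates 10 combinations, so an exhaustive search is inexpensive.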

Table VIII - Minimum and average JM distance

| | HH, HV, VV amplitude | Modulus and angle of compl. correl. coef. | HH, HV, VV amplitude + modulus and angle of compl. correl. coef. |
| --- | --- | --- | --- |
| JM min | 0.364460 | 0.568256 | 0.673633 |
| JM ave | 0.895871 | 1.06852 | 1.31420 |

Table IX - Minimum and average JM distance for all combinations of 3 out of the 5 channels

| Selection | Channels | JM min | JM ave |
| --- | --- | --- | --- |
| 1 | B1/B2/B3 | 0.60 | 1.12 |
| 2 | B1/B2/B4 | 0.57 | 1.16 |
| 3 | B1/B2/B5 | 0.57 | 1.13 |
| 4 | B1/B3/B4 | 0.63 | 1.18 |
| 5 | B1/B3/B5 | 0.59 | 1.20 |
| 6 | B1/B4/B5 | 0.57 | 1.19 |
| 7 | B2/B3/B4 | 0.33 | 0.74 |
| 8 | B2/B3/B5 | 0.24 | 0.75 |
| 9 | B2/B4/B5 | 0.17 | 0.79 |
| 10 | B3/B4/B5 | 0.36 | 0.89 |

B1 - Angle of the complex correlation coefficient
B2 - Modulus of the complex correlation coefficient
B3 - HH Amplitude
B4 - HV Amplitude
B5 - VV Amplitude

Table X - Classification average performance

| Features | Average performance |
| --- | --- |
| HH, HV, VV amplitude + modulus and angle of the complex correlation coefficient | 79.5 % |
| Angle of the complex correlation coefficient + HV, VV amplitude | 73.7 % |
| Angle of the complex correlation coefficient + HH, HV amplitude | 71.1 % |
| Modulus and angle of the complex correlation coefficient | 69.1 % |
| Angle of the complex correlation coefficient + HH, VV amplitude | 67.6 % |
| HH, HV, VV amplitude | 61.4 % |

Classifications were also performed using these 3 sets of features; the results are shown in Table X, together with the results of the classifications presented previously. Note that in Table X all combinations that include the angle showed a higher average performance than the combination of the amplitude channels only.

6. Conclusions

It was shown that the phase information (between HH and VV) can distinctly improve discrimination; phase information should therefore not be discarded a priori in the classification process when complex polarimetric data are used.

The modulus of the correlation coefficient can also improve separability, as was observed for the soya beans and stubble classes.

Further studies using other features extracted from complex polarimetric data will be carried out and tested at other sites with different types of classes.

7. Acknowledgment

This work was partially supported by the Projeto Temático CNPq Geotec (Process No. 680.061/940).

The authors would also like to thank Mr. Iedo B. de Sá and Mr. Gilberto Cordeiro from EMBRAPA for their very valuable logistical help.

References

[1] J.J. van Zyl, H.A. Zebker, and C. Elachi, "Imaging Radar Polarization Signatures: Theory and Observations," Radio Science, vol. 22, pp. 529-543, 1987.

[2] C.D. Rennó, "Avaliação de Medidas Texturais na Discriminação de Classes de Uso Utilizando Imagens SIR-C/X-SAR do Perímetro Irrigado de Bebedouro, Petrolina, PE," MSc dissertation, INPE, São José dos Campos, 1996.

[3] J.S. Lee, K.W. Hoppel, S.A. Mango, and A.R. Miller, "Intensity and Phase Statistics of Multilook Polarimetric and Interferometric SAR Imagery," IEEE Trans. Geosci. Remote Sensing, vol. 32, pp. 1017-1028, Sept. 1994.

[4] K. Fukunaga, "Introduction to Statistical Pattern Recognition," Academic Press, Inc., 1990.

[5] N.D.A. Mascarenhas and F.R.D. Velasco, "Processamento Digital de Imagens," IV EBAI, 1989.

[6] J.A. Richards, "Remote Sensing Digital Image Analysis - An Introduction," Springer-Verlag, Berlin Heidelberg, 1986.