What can we say about a dock if we examine it based on the Sentinel-2 satellite images? Can we determine how much the port was used during a period of time? Is it possible to produce a trend curve showing the change of utilization? In this post I will attempt to do exactly that …
Trend analysis with remote sensing
The whole experiment started when I was browsing some Copernicus pages and found www.race.esa.int, where you can find great statistics based on satellite images, such as how the COVID epidemic has affected the EU economy, air pollution and agriculture. One of my personal favorites is the study of steel production in Hunedoara (RO).
Hunedoara grew into a very important center of the iron and steel industry during the Austro-Hungarian Monarchy, and later became one of the largest and most important iron and steel centers in Romania. Such a huge steelworks has a large open-air area where big steel blocks are stored during production, and these can be clearly seen on satellite images.
Based on this satellite data, www.race.esa.int investigated how production volumes were affected by the economic downturn caused by the COVID epidemic. I really recommend the site to statistics and data enthusiasts!
After browsing the page above, I was wondering if I could extract data from an area in a similar way, just by analyzing the Sentinel-2 satellite images.
One of the main considerations in selecting the study area was that the objects whose changes I wanted to detect had to be at least 10 meters in size, since the best spatial resolution of the satellite is 10 meters. Anything smaller than that is simply not visible.
The other aspect of selecting a study area was that the result should not be affected by changes in the environment, such as seasonal changes in vegetation.
Considering the requirements above, the larger ports of Lake Balaton (HU) seemed to be a very good choice for my study: the ships moored there are large enough to be seen on Sentinel-2 satellite images, and the water around them changes very rarely. Water absorbs near-infrared light and therefore appears dark, so the satellite’s NIR (near-infrared) band gives a very good contrast between the water and the piers and boats.
In case of Balatonaliga the satellite image in the NIR band looks like the following:
Looking at the satellite images taken since the beginning of 2020, the utilization of the port of Balatonaliga has noticeably changed: there is a period when the piers are barely visible, while in summer their outlines are noticeably stronger due to the ships moored there. The difference is clearly visible when comparing the images from March and August.
Data processing model
Anyone who has read this far has probably already guessed that I will try to produce a trend curve for the utilization of the docks of Balatonaliga based on the Sentinel-2 satellite images.
But how will the images become trend curves? The following infographics are intended to show you exactly this.
In Step 1, I downloaded all satellite images from 2020 that contain the dock of Balatonaliga. To do this, I used the QGIS SCP plugin, which downloads data from the Copernicus Open Access Hub.
In Step 2, I created a shapefile of the examined area in QGIS, that is, the Region Of Interest (ROI).
In Step 3, I used the shapefile to cut the ROI out of every satellite image, since the rest of each image is of no use here and only slows down the processing. I used Python and OpenCV for this and for the following operations as well.
In Step 4, I converted the grayscale images to two-color, black-and-white images with thresholding. This means that a pixel can have a value of either 0 (black) or 255 (white) and nothing in between. The result is a set of images in which the pier and the moored boats are represented by white pixels, while the water is represented by black ones.
In Step 5, I counted the white pixels in each image and recorded the acquisition date of each image.
In Steps 6 and 7, I converted the white pixel counts to percentages with some tedious mathematical operations, where 0% indicates the lowest utilization of the year (fewest white pixels) and 100% the highest (most white pixels).
In Step 8, I plotted the percentage values against their dates, which is basically the raw data set. As you can clearly see in the chart below, the resulting curve is relatively “noisy”. Fortunately, this can be smoothed out quite well by applying a simple moving average (SMA) to the data points, but to do this I had to fill in the days without data.
In Step 9, the data set is resampled to daily frequency and the percentage values of the empty days are filled in using interpolation.
Finally, in Step 10, all that is left is to draw the 15-day moving average.
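Steps 4 to 10 can be sketched in Python roughly as follows. This is only a minimal illustration, not the code behind the charts: it uses NumPy instead of OpenCV for the thresholding, it runs on made-up 8x8 arrays instead of real clipped NIR images, and the threshold of 128 as well as all function and variable names are my own assumptions.

```python
from datetime import date, timedelta

import numpy as np

THRESHOLD = 128  # assumed limit value; the right one depends on the scene


def count_white_pixels(gray, threshold=THRESHOLD):
    """Steps 4-5: binarize a grayscale ROI and count the white pixels."""
    # Brighter than the threshold -> 255 (pier, boats), otherwise 0 (water).
    binary = np.where(gray > threshold, 255, 0).astype(np.uint8)
    return int(np.count_nonzero(binary))


def to_percent(counts):
    """Steps 6-7: scale counts so the yearly minimum is 0% and the maximum 100%."""
    lo, hi = min(counts), max(counts)
    if hi == lo:
        return [0.0 for _ in counts]
    return [100.0 * (c - lo) / (hi - lo) for c in counts]


def daily_series(dates, values):
    """Step 9: one value per day, gaps filled by linear interpolation.

    `dates` must be sorted in ascending order.
    """
    start = dates[0]
    n_days = (dates[-1] - start).days + 1
    known_x = [(d - start).days for d in dates]
    all_x = np.arange(n_days)
    filled = np.interp(all_x, known_x, values)
    days = [start + timedelta(days=int(i)) for i in all_x]
    return days, filled


def moving_average(values, window=15):
    """Step 10: simple moving average; only full windows are kept."""
    if len(values) < window:
        raise ValueError("series is shorter than the averaging window")
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


def synthetic_roi(n_bright):
    """Made-up 8x8 ROI: dark water (20) with n_bright bright pixels (220)."""
    roi = np.full((8, 8), 20, dtype=np.uint8)
    roi.flat[:n_bright] = 220
    return roi


# Three hypothetical acquisitions, ten days apart.
acquisitions = [
    (date(2020, 5, 1), synthetic_roi(4)),
    (date(2020, 5, 11), synthetic_roi(14)),
    (date(2020, 5, 21), synthetic_roi(9)),
]
dates = [d for d, _ in acquisitions]
counts = [count_white_pixels(img) for _, img in acquisitions]
percent = to_percent(counts)
days, filled = daily_series(dates, percent)
trend = moving_average(filled, window=15)
print(counts, percent, len(days), len(trend))
```

Because only full windows are kept, the smoothed series is 14 days shorter than the daily one. Whether the curve should be padded at the edges instead is a design choice I leave open here.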
Based on the trend curve, utilization of the dock peaked in May, right at the beginning of the sailing season and then declined slightly in the summer months, presumably due to ever-changing traffic.
However, it is also important to mention that the trend curve above can be influenced by many factors, so it is really only suitable for indicating trends.
The parameters that may affect the results of the data processing model above are the following:
- proper definition of test area
- atmospheric conditions, cloud cover, lighting
- spatial resolution of satellite images
- frequency of data
- thresholding method, limit value selection
- correct selection of the moving average sampling window
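To give a feeling for the last two items, here is a small experiment on made-up numbers. It shows how the white pixel count reacts to the threshold limit value, and how the width of the moving average window changes the amount of remaining noise. The pixel values, the alternating test series and all names are assumptions chosen for illustration only; they are not taken from the Balatonaliga data.

```python
import numpy as np

# Made-up pixel values: dark water, mixed shoreline pixels, bright pier and boats.
roi = np.array([15, 20, 25, 90, 110, 130, 150, 210, 220, 230], dtype=np.uint8)


def white_count(gray, threshold):
    """Number of pixels that would turn white for a given threshold."""
    return int(np.count_nonzero(gray > threshold))


# The same image yields different counts depending on the limit value.
threshold_sweep = {t: white_count(roi, t) for t in (80, 120, 160)}


def sma(values, window):
    """Simple moving average over full windows only."""
    return np.convolve(values, np.ones(window) / window, mode="valid")


# Artificial "noisy" daily series that jumps between 70% and 30%.
noisy = np.array([70.0 if i % 2 == 0 else 30.0 for i in range(30)])

# Spread (max - min) that is left after smoothing with each window.
spread_5 = float(sma(noisy, 5).max() - sma(noisy, 5).min())
spread_15 = float(sma(noisy, 15).max() - sma(noisy, 15).min())

print(threshold_sweep)
print(spread_5, spread_15)
```

A wider window removes more noise, but it also flattens genuine short-lived peaks, so the window has to be chosen with the expected length of the real changes in mind.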
Similarities with other docks
Now let’s see what the trend curve looks like for other docks:
For each of the docks above, it is clear that port utilization peaks at some point in May, or possibly in early June.