Tutorial 2: Natural Hazard Risk Assessment
In the second tutorial of this week, we are going to use publicly available hazard and exposure data to do a risk assessment for an area of your choice within Europe. More specifically, we will look at damage due to windstorms and flooding.
We will use both Copernicus Land Cover data and OpenStreetMap to estimate the potential damage of natural hazards to the built environment. We will use Copernicus Land Cover data to estimate the damage to specific land-uses, whereas we will use OpenStreetMap to assess the potential damage to the road system.
Important before we start

Make sure that you save this file before you continue, otherwise you will lose everything. To do so, go to File (Bestand) and click Save a copy in Drive (Een kopie opslaan in Drive)!

Now, rename the file to Week5_Tutorial2.ipynb. You can do so by clicking on the name at the top of this screen.
Learning Objectives

- To know how to download data from the Copernicus Climate Data Store using the cdsapi package and access it through Python.
- To be able to open and visualize this hazard data.
- To know how to access and open information from the Copernicus Land Monitoring Service, specifically the Corine Land Cover data.
- To understand the basic approach of a natural hazard risk assessment.
- To be able to use the DamageScanner to do a damage assessment.
- To interpret and compare the damage estimates.
1. Introducing the packages
Within this tutorial, we are going to make use of the following packages:
- GeoPandas is a Python package that extends the datatypes used by pandas to allow spatial operations on geometric types.
- OSMnx is a Python package that lets you download geospatial data from OpenStreetMap and model, project, visualize, and analyze real-world street networks and any other geospatial geometries. You can download and model walkable, drivable, or bikeable urban networks with a single line of Python code, and then easily analyze and visualize them. You can just as easily download and work with other infrastructure types, amenities/points of interest, building footprints, elevation data, street bearings/orientations, and speed/travel time.
- xarray is a Python package that allows for easy and efficient use of multi-dimensional arrays.
- Matplotlib is a comprehensive Python package for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.
We will first need to install the missing packages in the cell below. If they are commented out, uncomment them to make sure we can pip install them.
!pip install osmnx
!pip install rioxarray
!pip install cdsapi
Now we will import these packages in the cell below:
import os
import cdsapi
import shapely
import matplotlib
import urllib3
import pyproj
import osmnx as ox
import numpy as np
import xarray as xr
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
import networkx as nx
from matplotlib.colors import ListedColormap
from zipfile import ZipFile
from io import BytesIO
from urllib.request import urlopen
from tqdm import tqdm
urllib3.disable_warnings()
Connect to Google Drive

To be able to read the data from Google Drive, we need to mount our Drive to this notebook.

As you can see in the cell below, make sure that in your My Drive folder you have created a BigData folder, and within that folder a Week5_Data folder in which you can store the files that are required to run this analysis.

Please go to the URL when it is prompted in the box underneath the following cell, and copy the authorization code into that box.
from google.colab import drive
drive.mount('/content/gdrive/')
import sys
sys.path.append("/content/gdrive/My Drive/BigData/")
data_path = os.path.join('/content/gdrive/My Drive/BigData/','Week5_Data')
2. Downloading and accessing natural hazard data
We are going to perform a damage assessment using both windstorm data and flood data for Europe.
Windstorm Data

The windstorm data will be downloaded from the Copernicus Climate Data Store. As we have seen during the lecture, and as you can also see by browsing this website, there is an awful lot of climate data available through this Data Store. As such, it is very valuable to understand how to access and download this information to use within an analysis. To keep things simple, we only download one dataset today: a winter windstorm.
We will do so using an API, which is the acronym for application programming interface. It is a software intermediary that allows two applications to talk to each other. APIs are an accessible way to extract and share data within and across organizations. APIs are all around us. Every time you use a rideshare app, send a mobile payment, or change the thermostat temperature from your phone, you’re using an API.
However, before we can access this API, we need to take a few steps. Most importantly, we need to register ourselves on the Copernicus Climate Data Store portal. To do so, we need to register, as explained in the video clip below:
Now, the next step is to access the API. You can now log in on the website of the Copernicus Climate Data Store. After you log in, you can click on your name in the top right corner of the webpage (next to the login button). On the personal page that has just opened, you will find your user ID (uid) and your personal API key. You need to add those in the cell below to be able to download the windstorm.
As you can see in the cell below, we download a specific windstorm that occurred on the 28th of October 2013. This is storm Carmen (also called St Jude).
uid = XXX
apikey = 'XXX'
c = cdsapi.Client(key=f"{uid}:{apikey}", url="https://cds.climate.copernicus.eu/api/v2")
c.retrieve(
    'sis-european-wind-storm-indicators',
    {
        'variable': 'all',
        'format': 'zip',
        'product': 'windstorm_footprints',
        'year': '2013',
        'month': '10',
        'day': '28',
    },
    'Carmen.zip')
Flood Data

The flood data we will extract from a repository maintained by the European Commission Joint Research Centre. We will download river flood hazard maps from their Flood Data Collection.

Here we do not need to use an API, and we also do not need to register ourselves, so we can download any of the files directly. To do so, we use the urllib package.
## this is the link to the 1/100 flood map for Europe
zipurl = 'https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/FLOODS/EuropeanMaps/floodMap_RP100.zip'
# and now we open and extract the data
with urlopen(zipurl) as zipresp:
    with ZipFile(BytesIO(zipresp.read())) as zfile:
        zfile.extractall(data_path)
The download and unzip in the cell above sometimes does not work. If that is indeed the case (e.g., when it seems to remain stuck), download the files manually through the link and upload them to the data folder for this week (as explained at the start of this tutorial).
Set location to explore

Before we continue, we need to specify our location of interest. This should be a province with some flooding and relatively high wind speeds occurring (otherwise we will find zero damage).

Specify the region in the cell below by using the geocode_to_gdf() function again. I have chosen Gelderland, but feel free to choose a different region. It would be good to double-check later on whether there are actually high wind speeds and flooding in your chosen area.
place_name = "Gelderland, The Netherlands"
area = ox.geocode_to_gdf(place_name)
3. Explore the natural hazard data

As you can see in the section above, we have downloaded the storm footprint in a zipfile. Let’s open the zipfile and load the dataset using the xarray package through the open_dataset() function.
Windstorm Data

with ZipFile('Carmen.zip') as zf:
    # Let's get the filename first
    file = zf.namelist()[0]

    # And now we can open and select the file within Python
    with zf.open(file) as f:
        windstorm_europe = xr.open_dataset(f)
Let’s have a look at the storm we have downloaded!
windstorm_europe['FX'].plot()
Unfortunately, our data does not have a proper coordinate system defined yet. As such, we will need to use the rio.write_crs() function to set the coordinate system to EPSG:4326 (the standard global coordinate reference system).

We also need to make sure that the functions know which parameters to use for our spatial dimensions (i.e., longitude and latitude). rioxarray prefers these to be named x and y, so we use the rename() function before we use the set_spatial_dims() function.
windstorm_europe.rio.write_crs(4326, inplace=True)
windstorm_europe = windstorm_europe.rename({'Latitude': 'y','Longitude': 'x'})
windstorm_europe.rio.set_spatial_dims(x_dim="x",y_dim="y", inplace=True)
Next, we also make sure it will be in the European coordinate system EPSG:3035, to ensure we can easily use it together with the other data. To do so, we use the reproject() function. You can simply add the EPSG number of the coordinate system.
windstorm_europe = windstorm_europe.rio.reproject(XXXX)
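If you are unsure what to fill in here: the paragraph above already names the European coordinate system, so one possible completion would be:

# one possible completion, using the EPSG number mentioned above
windstorm_europe = windstorm_europe.rio.reproject(3035)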
Now we have all the information to clip the windstorm data to our area of interest:
windstorm_map = windstorm_europe.rio.clip(area.envelope.values, area.crs)
And let’s have a look as well by using the plot() function. Please note that the legend is in meters per second.
windstorm_map['FX']. XXXX
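One possible completion, using the function named above:

windstorm_map['FX'].plot()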
Flood data

And similarly, we want to open the flood map. But now we do not have to unzip the file anymore, and we can directly open it using xarray:
flood_map_path = os.path.join(data_path,'floodmap_EFAS_RP100_C.tif')
flood_map = xr.open_dataset(flood_map_path, engine="rasterio")
flood_map
And let’s make sure we set all the variables and the CRS correctly again to be able to open the data properly. Note that we now use EPSG:3035. This is the standard coordinate system for Europe, in meters (instead of degrees).
flood_map.rio.write_crs(3035, inplace=True)
flood_map.rio.set_spatial_dims(x_dim="x",y_dim="y", inplace=True)
Now it is pretty difficult to explore the data for our area of interest, so let’s clip the flood data.

We want to clip our flood data to our chosen area. Clipping directly on the exact geometry, however, is very inefficient and will run into memory issues on Google Colab. As such, we first need to clip the data using a bounding box, followed by the actual clip.
A few hints:

- Carefully read the documentation of the .clip_box() function of rioxarray. Which information do you need?
- Is the GeoDataFrame of your region (the area GeoDataFrame) in the same coordinate system? Perhaps you need to convert it using the .to_crs() function.
- How do you get the bounds from your area GeoDataFrame?
- The final step of the clip would be to use the .rio.clip() function, using the actual area file and the flood map clipped to the bounding box. Please note that you should not use the envelope here, like we did in the previous clip. Here we really want to use the exact geometry values.

As you will see, we first clip it very efficiently using the bounding box. After that, we do an exact clip. A possible completion is sketched after the next cell.
min_lon = area.to_crs(epsg=3035).bounds.minx.values[0]
min_lat = area.to_crs(epsg=3035).bounds.miny
max_lon = area.to_crs(epsg=3035).bounds
max_lat = area.to_crs(epsg=3035).
flood_map_area = flood_map.rio.clip_box(minx=.... )
flood_map_area = flood_map_area.rio.clip(area.XXXX.values, area.crs)
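If you get stuck, a possible completion of the cell above could look as follows (a sketch, assuming you kept the variable names used so far):

# get the bounds of our area in EPSG:3035
bounds = area.to_crs(epsg=3035).bounds
min_lon = bounds.minx.values[0]
min_lat = bounds.miny.values[0]
max_lon = bounds.maxx.values[0]
max_lat = bounds.maxy.values[0]

# first the efficient bounding-box clip, then the exact clip on the geometry values
flood_map_area = flood_map.rio.clip_box(minx=min_lon, miny=min_lat, maxx=max_lon, maxy=max_lat)
flood_map_area = flood_map_area.rio.clip(area.geometry.values, area.crs)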
And let’s have a look as well. Please note that the legend is in meters.
flood_map_area['band_data'].plot(cmap='Blues',vmax=10)
4. Download and access Copernicus Land Cover data
Unfortunately, there is no API option to download the Corine Land Cover data. We will have to download the data from the website first.
To do so, we will first have to register ourselves again on the website. Please find in the video clip below how to register yourself on the website of the Copernicus Land Monitoring Service:
Now click on the Login button in the top right corner to log in on the website. There are many interesting datasets on this website, but we just want to download the Corine Land Cover data, and specifically the latest version: Corine Land Cover 2018. To do so, please select the Corine Land Cover - 100 meter. Now click on the large green Download button. Your download should start any minute.
Slightly annoyingly, the file you have downloaded is double-zipped, which makes it inconvenient to open through Python and within Google Drive. So let’s unzip it twice outside of Python (on your local machine) and then navigate to the DATA directory within the unzipped folder. Here you can find a file called U2018_CLC2018_V2020_20u1.tif. Drop this file into this week’s data directory, as specified at the start of this tutorial when we mounted our Google Drive.
CLC_location = os.path.join(data_path,'U2018_CLC2018_V2020_20u1.tif')
CLC = xr.open_dataset(CLC_location, engine="rasterio")
Similarly to the flood map data, we need to do a two-stage clip again (like we did before in this tutorial) to ensure we get only our area of interest without exceeding our RAM.
CLC_region = CLC.rio.clip_box(minx=.....,)
CLC_region = CLC_region.rio.clip(area.geometry.values,area.crs)
CLC_region = CLC_region.rename({'x': 'lat','y': 'lon'})
CLC_region.rio.set_spatial_dims(x_dim="lat",y_dim="lon", inplace=True)
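In case you are stuck on the clip_box arguments: one possible completion of the first line, assuming the bounding-box variables computed for the flood map are still in scope, is:

# possible completion, reusing the EPSG:3035 bounds computed earlier
CLC_region = CLC.rio.clip_box(minx=min_lon, miny=min_lat, maxx=max_lon, maxy=max_lat)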
And now we create a color_dict again, similar to what we did for the raster data in the previous tutorial:
CLC_values = [111, 112, 121, 122, 123, 124, 131, 132, 133, 141, 142, 211, 212, 213, 221, 222, 223, 231, 241, 242,
243, 244, 311, 312, 313, 321, 322, 323, 324, 331, 332, 333, 334, 335, 411, 412, 421, 422, 423, 511, 512, 521, 522, 523]
CLC_colors = ['#E6004D', '#FF0000', '#CC4DF2', '#CC0000', '#E6CCCC', '#E6CCE6', '#A600CC', '#A64DCC', '#FF4DFF', '#FFA6FF', '#FFE6FF', '#FFFFA8', '#FFFF00', '#E6E600',
'#E68000', '#F2A64D', '#E6A600', '#E6E64D', '#FFE6A6', '#FFE64D', '#E6CC4D', '#F2CCA6', '#80FF00', '#00A600',
'#4DFF00', '#CCF24D', '#A6FF80', '#A6E64D', '#A6F200', '#E6E6E6', '#CCCCCC', '#CCFFCC', '#000000', '#A6E6CC',
'#A6A6FF', '#4D4DFF', '#CCCCFF', '#E6E6FF', '#A6A6E6', '#00CCF2', '#80F2E6', '#00FFA6', '#A6FFE6', '#E6F2FF']
color_dict_raster = dict(zip(CLC_values,CLC_colors))
# We create a colormap from our list of colors
cm = ListedColormap(CLC_colors)
# Let's also define the labels of each category: these are the CLC class codes listed above. The order of the labels should match the order of the colors!
labels = np.array(CLC_values)
len_lab = len(labels)
# prepare normalizer
## Prepare bins for the normalizer
norm_bins = np.sort([*color_dict_raster.keys()]) + 0.5
norm_bins = np.insert(norm_bins, 0, np.min(norm_bins) - 1.0)
## Make normalizer and formatter
norm = matplotlib.colors.BoundaryNorm(norm_bins, len_lab, clip=True)
fmt = matplotlib.ticker.FuncFormatter(lambda x, pos: labels[norm(x)])
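Note that the plotting call below does not actually use norm and fmt. A minimal sketch (not part of the original analysis) of how they could be used to add a colorbar with the CLC class codes as labels:

fig, ax = plt.subplots(1, 1, figsize=(14, 10))
im = CLC_region['band_data'].squeeze().plot(ax=ax, cmap=cm, norm=norm, add_colorbar=False)
tick_positions = norm_bins[:-1] + np.diff(norm_bins) / 2  # centre of each bin
fig.colorbar(im, ax=ax, format=fmt, ticks=tick_positions)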
And let’s plot the Corine Land Cover data for our area of interest
fig, ax = plt.subplots(1, 1,figsize=(14,10))
CLC_region["band_data"].plot(ax=ax,levels=len(CLC_colors),colors=CLC_colors)
5. Perform a damage assessment using Corine Land Cover
To calculate the potential damage from both windstorms and floods, we use stage-damage curves, which relate the intensity of the hazard to the fraction of maximum damage that can be sustained by a certain land use. As you can see on the Corine Land Cover map that we just plotted, there are a lot of land-use classes (44), though not all will suffer damage from either the windstorm or the flood event. For each of the land-use classes, a curve and a maximum damage number are assigned.
To assess the damage for both the flood and the windstorm event, we are going to make use of the DamageScanner, which is a tool to calculate potential flood damages based on inundation depth and land use, using depth-damage curves in the Netherlands. The DamageScanner was originally developed for the ‘Netherlands Later’ project (Klijn et al., 2007). The original land-use classes were based on the Land-Use Scanner in order to evaluate the effect of future land-use change on flood damages. We have tailored the input of the DamageScanner to make sure it can estimate the damages using Corine Land Cover.

Because of the simplicity of the model, we can use it for any raster-based hazard map with some measure of intensity. Hence, we can use it for both hazards.
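To make the mechanics concrete before we define the full function below: for each land-use class, the hazard intensity in every affected cell is converted into a damage fraction via the curve, and multiplied by the maximum damage of that class. A minimal sketch with made-up numbers (these are not the real curves we load later):

# hypothetical depth-damage curve for a single land-use class:
# column 0 = hazard intensity (here: water depth in meters)
# column 1 = fraction of the maximum damage that is sustained
curve = np.array([[0.0, 0.0],
                  [0.5, 0.2],
                  [1.0, 0.4],
                  [2.0, 0.7],
                  [5.0, 1.0]])
maxdam_per_cell = 200 * 100  # hypothetical: 200 euro per m2 times a 100 m2 cell
water_depths = np.array([0.3, 1.5, 4.0])  # three flooded cells of this class

# interpolate the damage fraction for each depth and scale by the maximum damage
damage_fractions = np.interp(water_depths, curve[:, 0], curve[:, 1])
total_damage = (damage_fractions * maxdam_per_cell).sum()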
def DamageScanner(landuse_map, inun_map, curve_path, maxdam_path, cellsize=100):
    # load land-use map
    landuse = landuse_map.copy()

    # load inundation (or wind speed) map
    inundation = inun_map.copy()
    inundation = np.nan_to_num(inundation)

    # load curves
    if isinstance(curve_path, pd.DataFrame):
        curves = curve_path.values
    elif isinstance(curve_path, np.ndarray):
        curves = curve_path

    # load maximum damages
    if isinstance(maxdam_path, pd.DataFrame):
        maxdam = maxdam_path.values
    elif isinstance(maxdam_path, np.ndarray):
        maxdam = maxdam_path

    # speed up calculation by only considering feasible points
    inun = inundation * (inundation >= 0) + 0
    inun[inun >= curves[:, 0].max()] = curves[:, 0].max()
    waterdepth = inun[inun > 0]
    landuse = landuse[inun > 0]

    # calculate damage per land-use class for structures
    numberofclasses = len(maxdam)
    alldamage = np.zeros(landuse.shape[0])
    damagebin = np.zeros((numberofclasses, 4,))
    for i in range(0, numberofclasses):
        n = maxdam[i, 0]
        damagebin[i, 0] = n
        wd = waterdepth[landuse == n]
        alpha = np.interp(wd, curves[:, 0], curves[:, i + 1])
        damage = alpha * (maxdam[i, 1] * cellsize)
        damagebin[i, 1] = sum(damage)
        damagebin[i, 2] = len(wd)
        if len(wd) == 0:
            damagebin[i, 3] = 0
        else:
            damagebin[i, 3] = np.mean(wd)
        alldamage[landuse == n] = damage

    # create pandas dataframe with output
    loss_df = pd.DataFrame(damagebin.astype(float),
                           columns=['landuse', 'losses', 'area', 'avg_depth']).groupby('landuse').sum()

    # return output
    return loss_df.sum().values[0], loss_df
Windstorm Damage
To estimate the potential damage of our windstorm, we use the vulnerability curves developed by Yamin et al. (2014). Following Yamin et al. (2014), we will apply a sigmoidal vulnerability function satisfying two constraints: (i) a minimum threshold for the occurrence of damage with an upper bound of 100% direct damage; (ii) a high power-law function for the slope, describing an increase in damage with increasing wind speeds. Due to the limited amount of vulnerability curves available for windstorm damage, we will use the damage curve that represents low-rise reinforced masonry buildings for all land-use classes that may contain buildings. Obviously, this is a large oversimplification of the real world, but this should be sufficient for this exercise. When doing a proper stand-alone windstorm risk assessment, one should take more effort in collecting the right vulnerability curves for different building types.
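To get a feel for what such a sigmoidal curve looks like, here is an illustrative sketch with purely hypothetical parameters (the actual curves are loaded from the spreadsheet below):

# illustrative sigmoidal wind vulnerability curve (hypothetical parameters!)
v = np.linspace(0, 250, 251)  # wind gust speed in km/h
v_thresh = 90                 # assumed minimum gust speed for damage to occur
v_half, k = 180, 4            # assumed midpoint and power-law slope
damage_fraction = 1 / (1 + ((v_half - v_thresh) / np.maximum(v - v_thresh, 1e-6)) ** k)
damage_fraction[v <= v_thresh] = 0  # no damage below the threshold
plt.plot(v, damage_fraction)
plt.xlabel('wind gust speed (km/h)')
plt.ylabel('fraction of maximum damage')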
wind_curves = pd.read_excel("https://github.com/ElcoK/BigData_AED/raw/main/week5/damage_curves.xlsx",sheet_name='wind_curves')
maxdam = pd.read_excel("https://github.com/ElcoK/BigData_AED/raw/main/week5/damage_curves.xlsx",sheet_name='maxdam')
Unfortunately, we run into a classic problem when we want to overlay the windstorm data with the Corine Land Cover data. The windstorm data is not only stored in a different coordinate system (we had to convert it from EPSG:4326 to EPSG:3035), it is also in a different resolution (1km instead of the 100m of Corine Land Cover).

Let’s first have a look at what our clipped data looks like. If you have decided to use Gelderland, you will see that we have 102 columns (our Latitude/lat) and 74 rows (our Longitude/lon). If you scroll up to our Corine Land Cover data, you see that the dimensions are different: 1270 columns (Latitude/lat/x) and 870 rows (Longitude/lon/y).
windstorm_map
The first thing we are going to do is make sure our data will be in the correct resolution (moving from 1km to 100m). To do so, we will use the rio.reproject() function. You will see that we specify the resolution as 100. Because EPSG:3035 is a coordinate system in meters, we can simply use meters to define the resolution. We use the rio.clip() function to make sure we clip it again to our area of interest. The function below (match_rasters) will do the hard work for us. Please read through all the input variables to understand what’s happening.
def match_rasters(hazard, landuse, haz_crs=3035, lu_crs=3035, resolution=100, hazard_col=['FX']):
    """
    Clips, reprojects, and matches the resolutions of two rasters, `hazard` and `landuse`,
    to prepare them for further analysis.

    Parameters
    ----------
    hazard : xarray.DataArray
        A 2D or 3D array containing hazard data.
    landuse : xarray.DataArray
        A 2D array containing land use data.
    haz_crs : int, optional
        The CRS of `hazard`. Default is EPSG:3035.
    lu_crs : int, optional
        The CRS of `landuse`. Default is EPSG:3035.
    resolution : float, optional
        The desired resolution in meters for both `hazard` and `landuse` after reprojection. Default is 100.
    hazard_col : list, optional
        A list of column names or indices for the hazard variable. Default is ['FX'].

    Returns
    -------
    tuple
        A tuple containing two xarray.DataArray objects:
        - The land use variable with matching resolution and dimensions to the hazard variable.
        - The hazard variable clipped to the extent of the land use variable, with matching resolution and dimensions.
    """
    # Set the crs of the hazard variable to haz_crs
    hazard.rio.write_crs(haz_crs, inplace=True)

    # Set the x and y dimensions in the hazard variable to 'x' and 'y' respectively
    hazard.rio.set_spatial_dims(x_dim="x", y_dim="y", inplace=True)

    # Reproject the landuse variable to EPSG:3857 with the desired resolution
    landuse = landuse.rio.reproject("EPSG:3857", resolution=resolution)

    # Get the minimum longitude and latitude values in the landuse variable
    min_lon = landuse.x.min().to_dict()['data']
    min_lat = landuse.y.min().to_dict()['data']

    # Get the maximum longitude and latitude values in the landuse variable
    max_lon = landuse.x.max().to_dict()['data']
    max_lat = landuse.y.max().to_dict()['data']

    # Create a bounding box using the minimum and maximum latitude and longitude values
    area = gpd.GeoDataFrame([shapely.box(min_lon, min_lat, max_lon, max_lat)], columns=['geometry'])

    # Set the crs of the bounding box to EPSG:3857
    area.crs = 'epsg:3857'

    # Convert the crs of the bounding box to the CRS of the hazard data
    area = area.to_crs(f'epsg:{haz_crs}')

    # Clip the hazard variable to the extent of the bounding box
    hazard = hazard.rio.clip(area.geometry.values, area.crs)

    # Reproject the hazard variable to EPSG:3857 with the desired resolution
    hazard = hazard.rio.reproject("EPSG:3857", resolution=resolution)

    # Clip the hazard variable again to the extent of the bounding box
    hazard = hazard.rio.clip(area.geometry.values, area.crs)

    # If the hazard variable has fewer columns and rows than the landuse variable,
    # reproject the landuse variable to match the hazard variable
    if (len(hazard.x) < len(landuse.x)) & (len(hazard.y) < len(landuse.y)):
        landuse = landuse.rio.reproject_match(hazard)

    # If the hazard variable has more columns and rows than the landuse variable,
    # reproject the hazard variable to match the landuse variable
    elif (len(hazard.x) > len(landuse.x)) & (len(hazard.y) > len(landuse.y)):
        hazard = hazard.rio.reproject_match(landuse)

    # return the new landuse and hazard map
    return landuse, hazard
Now let’s run the match_rasters function and let it do its magic.
CLC_region_wind, windstorm = match_rasters(windstorm_europe,
                                           CLC_region,
                                           haz_crs=3035,
                                           lu_crs=3035,
                                           resolution=100,
                                           hazard_col=['FX'])
And let’s have a look if the two rasters now have the same extent:
CLC_region_wind
windstorm
It worked! And to double-check, let’s also plot it:
windstorm.FX.plot()
Now it’s finally time to do our damage assessment! To do so, we need to convert our data to numpy arrays to do our calculation:
landuse_map = CLC_region_wind['band_data'].to_numpy()[0,:,:]
wind_map = windstorm['FX'].to_numpy()[0,:,:]
wind_map.shape
And remember that our windstorm data was stored in m/s. Hence, we need to convert it to km/h:
wind_map_kmh = wind_map*XXX
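If you are unsure about the conversion factor: 1 m/s equals 3.6 km/h, so one possible completion is:

wind_map_kmh = wind_map * 3.6  # 1 m/s = 3.6 km/h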
And now let’s run the DamageScanner to obtain the damage results
wind_damage_CLC = DamageScanner(landuse_map,wind_map_kmh,wind_curves,maxdam)[1]
wind_damage_CLC
Flood Damage

To assess the flood damage, we are again going to make use of the DamageScanner. The Corine Land Cover data is widely used in European flood risk assessments. As such, we can simply make use of pre-developed curves. We are using the damage curves as developed by Huizinga et al. (2007). Again, let’s first load the maximum damages and the depth-damage curves:
flood_curves = pd.read_excel("https://github.com/ElcoK/BigData_AED/raw/main/week5/damage_curves.xlsx",sheet_name='flood_curves',engine='openpyxl')
maxdam = pd.read_excel("https://github.com/ElcoK/BigData_AED/raw/main/week5/damage_curves.xlsx",sheet_name='maxdam')
And we convert our data to numpy arrays to do our calculation:
landuse_map = CLC_region['band_data'].to_numpy()
flood_map = flood_map_area['band_data'].to_numpy()
And now let’s run the DamageScanner to obtain the damage results
flood_damage_CLC = DamageScanner(landuse_map,flood_map,flood_curves,maxdam)[1]
flood_damage_CLC
6. Perform a damage assessment of the road network using OpenStreetMap

Generally, windstorms do not cause much damage to roads. There will be clean-up costs for the trees that fall on the roads, but structural damage is rare. As such, we will only do a flood damage assessment for the road network of our region.
To do so, we first need to extract the roads again. We will use the graph_from_place() function again to do so. However, the area would be too large to extract all roads, so we will focus our analysis on the main network.
cf = '["highway"~"trunk|motorway|primary|secondary"]'
G = ox.graph_from_place(place_name, network_type="drive", custom_filter=cf)
And convert the road network to a GeoDataFrame, as done in the previous tutorial as well.
roads = gpd.GeoDataFrame(nx.to_pandas_edgelist(G))
roads.highway = roads.highway.astype('str')
In case the above does not work, you can continue the assignment by using the code below (make sure you remove the hashtags to run it).
# from urllib import request
# remote_url = 'https://github.com/ElcoK/BigData_AED/raw/main/week5/kampen_roads.gpkg'
# file = 'kampen_roads.gpkg'
# request.urlretrieve(remote_url, file)
# roads = gpd.GeoDataFrame.from_file('kampen_roads.gpkg')
And let’s have a look at the data:
fig, ax = plt.subplots(1, 1,figsize=(12,10))
roads.plot(column='highway',legend=True,ax=ax,legend_kwds={'loc': 'lower right'});
# remove the ax labels
ax.set_xticks([])
ax.set_yticks([])
ax.set_axis_off()
It is actually quite inconvenient to have all these lists in the data when we want to do the damage assessment. Let’s clean this up a bit. To do so, we first make sure that all the lists are represented as actual lists, and not as lists wrapped within a string.
roads.highway = roads.highway.apply(lambda x: x.strip('][').split(', '))
Now we just need to grab the first element of each of the lists.
roads.highway = roads.highway.apply(lambda x: x[0] if isinstance(x, list) else x)
roads.highway = roads.highway.str.replace("'","")
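To see what these two clean-up steps do, consider a hypothetical value as it comes out of OSMnx:

# hypothetical example of the clean-up above
s = "['primary', 'secondary']"       # a list wrapped in a string
as_list = s.strip('][').split(', ')  # ["'primary'", "'secondary'"]
first = as_list[0]                   # "'primary'"
first = first.replace("'", "")       # 'primary'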
And let’s have a look whether this worked:
fig, ax = plt.subplots(1, 1,figsize=(12,10))
roads.plot(column='highway',legend=True,ax=ax,legend_kwds={'loc': 'upper left','ncol':1});
# remove the ax labels
ax.set_xticks([])
ax.set_yticks([])
ax.set_axis_off()
Nice! Now let’s start with the damage calculation. As you may have noticed, our data is now not stored in raster format, but in vector format. One way to deal with this issue would be to convert our vector data to raster data, but then we would lose a lot of information and detail. As such, we will perform the damage assessment on the road elements, using the xarray flood map.
Let’s start with preparing the flood data into vector format:
# convert the flood map to a dataframe
flood_map_vector = flood_map_area['band_data'].to_dataframe().reset_index()

# create geometry values and drop lat lon columns
flood_map_vector['geometry'] = [shapely.points(x) for x in list(zip(flood_map_vector['x'], flood_map_vector['y']))]
flood_map_vector = flood_map_vector.drop(['x', 'y', 'band', 'spatial_ref'], axis=1)

# drop all NaN values to reduce size
flood_map_vector = flood_map_vector.loc[~flood_map_vector['band_data'].isna()].reset_index(drop=True)

# and turn the cell centroids back into 100x100m squares:
flood_map_vector.geometry = shapely.buffer(flood_map_vector.geometry, distance=100/2, cap_style='square').values
And let’s plot the results:
gpd.GeoDataFrame(flood_map_vector.copy()).plot(column='band_data',cmap='Blues',vmax=5,linewidth=0)
We will need a bunch of functions to make sure we can do our calculations. They are specified below.
def reproject(df_ds, current_crs="epsg:4326", approximate_crs="epsg:3035"):
    # reproject all geometries by transforming their coordinates directly
    geometries = df_ds['geometry']
    coords = shapely.get_coordinates(geometries)
    transformer = pyproj.Transformer.from_crs(current_crs, approximate_crs, always_xy=True)
    new_coords = transformer.transform(coords[:, 0], coords[:, 1])
    return shapely.set_coordinates(geometries.copy(), np.array(new_coords).T)

def buffer_assets(assets, buffer_size=100):
    # add a buffered geometry column, so line assets can pick up nearby hazard cells
    assets['buffered'] = shapely.buffer(assets.geometry.values, buffer_size)
    return assets

def overlay_hazard_assets(df_ds, assets):
    # build a spatial index of the hazard cells and query which assets intersect them
    hazard_tree = shapely.STRtree(df_ds.geometry.values)
    if (shapely.get_type_id(assets.iloc[0].geometry) == 3) | (shapely.get_type_id(assets.iloc[0].geometry) == 6):
        # (multi)polygon assets can be queried directly
        return hazard_tree.query(assets.geometry, predicate='intersects')
    else:
        # line assets are queried through their buffered geometry
        return hazard_tree.query(assets.buffered, predicate='intersects')

def get_damage_per_asset(asset, df_ds, assets):
    # find the exact hazard overlays:
    get_hazard_points = df_ds.iloc[asset[1]['hazard_point'].values].reset_index()
    get_hazard_points = get_hazard_points.loc[shapely.intersects(get_hazard_points.geometry.values, assets.iloc[asset[0]].geometry)]

    asset_geom = assets.iloc[asset[0]].geometry

    maxdam_asset = 100  # assumed maximum damage per meter of road
    hazard_intensity = np.arange(0, 10, 0.1)  # water depth from 0 to 10 meters
    fragility_values = np.arange(0, 1, 0.01)  # linearly increasing damage fraction

    if len(get_hazard_points) == 0:
        return asset[0], 0
    else:
        # length of the road segment that overlays each flooded cell
        get_hazard_points['overlay_meters'] = shapely.length(shapely.intersection(get_hazard_points.geometry.values, asset_geom))
        return asset[0], np.sum((np.interp(get_hazard_points.band_data.values, hazard_intensity, fragility_values)) * get_hazard_points.overlay_meters * maxdam_asset)
Now we need to make sure that the road data is in the same coordinate system as the flood data.
roads.geometry = reproject(roads)
And we can now overlay the roads with the flood data
overlay_roads = pd.DataFrame(overlay_hazard_assets(flood_map_vector,buffer_assets(roads)).T,columns=['asset','hazard_point'])
And estimate the damages
collect_output = []
for asset in tqdm(overlay_roads.groupby('asset'), total=len(overlay_roads.asset.unique()),
                  desc='polyline damage calculation'):
    collect_output.append(get_damage_per_asset(asset, flood_map_vector, roads))

damaged_roads = roads.merge(pd.DataFrame(collect_output, columns=['index', 'damage']),
                            left_index=True, right_on='index')[['highway', 'geometry', 'damage']]
And let’s plot the results
fig, ax = plt.subplots(1, 1,figsize=(12,10))
damaged_roads.plot(column='damage',cmap='Reds',ax=ax);