[ ]:
%load_ext autoreload
%autoreload 2
Application of rfactor on data det Flanders
Aim
Implementation to compute and analyse the R-factor of the RUSLE-equation. The R-factor is a measure for the total erosivity of a number of rainfall events within a defined timeframe (year, month, number of days). The factor is computed by calculating the yearly sum of -for every rainfall event- the sum of the depth of rainfall (mm) and the kinetic energy, and taking the mean over all years.
Package imports
[ ]:
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Load the rfactor
package functionalities
[ ]:
from rfactor import compute_erosivity, maximum_intensity_matlab_clone, rain_energy_verstraeten2006
from rfactor.process import load_rain_folder , load_rain_file_matlab_legacy, compute_rainfall_statistics, get_rfactor_station_year
Input data and compute erosivity
Set folders
The main function compute_erosivity
requires a pandas DataFrame containing non-zero rainfall data with the following columns:
datetime (pd.Timestamp): Time stamp
rain_mm (float): Rain in mm
station (str): Measurement station identifier
However, the legacy format of the input files used in the original Matlab implementation is still supported as well. For this format, te input files are defined by text files (extension: txt
) that hold non-zero rainfall timeseries. The data are split per station and per year with a specific data file tag STATION_YYYY
:
KMI_6414_2004.txt
KMI_6414_2005.txt
...
KMI_6434_2003.txt
KMI_6434_2004.txt
...
Note: the files in tests/data/test_rainfalldata
are an incomplete data set for Belgium (RMI rainfall data are not included), an alternative is provided in the section ‘Analysis’.
[ ]:
fmap_rainfall = Path(r"../../tests/data/test_rainfalldata")
Use the load_rain_folder
function to read all rain data in a folder:
[ ]:
all_rainfall = load_rain_folder(fmap_rainfall, load_rain_file_matlab_legacy)
[ ]:
all_rainfall
Compute rainfall statistics (and add station metadata)
[ ]:
stations = pd.read_csv("data/stations.csv")
rain_stats = compute_rainfall_statistics(all_rainfall, stations)
rain_stats.head()
Select stations to compute erosivity for
[ ]:
consider_stations = pd.read_csv("data/files.csv")
sel_rainfall = all_rainfall.loc[all_rainfall["tag"].isin(consider_stations["datafile"])]
[ ]:
sel_rainfall.head()
Run Model
It needs to be decided which calculation functions are used to compute the rain energy depth
and maximum intensity
. There are a few implemented functions available in the package for which we choose the rain_energy_verstraeten
and maximum_intensity
by default. The maximum_intensity
function follows a slightly different approach to the original Matlab implementation. However, one can opt to use the matlab_clone
method as well, which is a Python implementation mimicking the
original Matlab one:
[ ]:
erosivity = compute_erosivity(all_rainfall, rain_energy_verstraeten2006, maximum_intensity_matlab_clone)
erosivity.head()
Users can choose to implement their own custom function and apply it, e.g.
```python def rain_energy_per_unit_depth_brown_and_foster(rain): “””Calculate rain energy per unit depth according to Brown and Foster.
Parameters
rain : numpy.ndarray Rain (mm)
Returns
energy : float Energy per unit depth.
Notes
The rain energy per unit depth :math:e_r
(:math:\\text{J}.\\text{mm}^{-1}. \\text{m}^{-2}
) is defined by Brown and Foster (1987) and Renard et al. (1997)
e_r = 29*(1-0.72*exp(-0.05*i_r)
with
:math:
i_r
the rain intensity for every 10-min increment (mm :math:\\text{h}^{-1}
).
References
Brown, L.C., Foster, G.R., 1987. Storm erosivity using idealized intensity distributions. Transactions of the ASAE 30, 0379–0386. https://doi.org/10.13031/2013.31957.
Renard, K.G., Foster, G.R., Weesies, G.A., McCool, D.K., Yoder, D.C., 1997, Predicting soil erosion by water: a guide to conservation planning with the revised universal soil loss equation (RUSLE), Agriculture Handbook. U.S. Department of Agriculture, Washington. https://www.ars.usda.gov/ARSUserFiles/64080530/RUSLE/AH_703.pdf
“”” rain_energy = 29 * (1 - 0.72 * np.exp(-0.05rain6))*rain return rain_energy.sum()
Analysis
If one does not have acces to the RMI data, the output erosivity from the computation with the full data set is provided in this repository as an alternative:
[ ]:
erosivity = pd.read_csv("data/erosivity_belgium.csv", parse_dates=[0, 1], index_col=0)
erosivity.head()
Ukkel
Compute R-factor for specific periods for Ukkel (KMI_6447 and KMI_F3).
[ ]:
timeseries = [range(1898, 2003, 1),
range(2003, 2021, 1),
range(1898, 2021, 1),
range(1996, 2021, 1),
range(1991, 2021, 1),
range(1990, 2001, 1),
range(2000, 2021, 1)]
for period in timeseries:
rfactor = get_rfactor_station_year(erosivity,["KMI_6447","KMI_FS3"], period)
print(f"R-factor for period from {period[0]} till {period[-1]} is: {rfactor['erosivity_cum'].mean():.2f}")
Make a figure for Ukkel
[ ]:
rfactor_all= get_rfactor_station_year(erosivity, stations=["KMI_FS3", "KMI_6447"]).sort_values("year")
[ ]:
fig, ax = plt.subplots(figsize=(12, 7.5))
rfactor_fs3 = get_rfactor_station_year(erosivity, ["KMI_FS3"])
rfactor_fs3 = rfactor_fs3[~(rfactor_fs3["erosivity_cum"].isnull())].sort_values("year")
rfactor_6447 = get_rfactor_station_year(erosivity, ["KMI_6447"])
rfactor_6447 = rfactor_6447[~(rfactor_6447["erosivity_cum"].isnull())].sort_values("year")
# add first line of KMI6447 to end of KMI FS3 in order to make a continuous timeseries on the plot
rfactor_fs3 = pd.concat([rfactor_fs3,rfactor_6447.iloc[0]])
ax.plot(rfactor_fs3["year"], rfactor_fs3["erosivity_cum"], label="KMI_FS3 (1898-2002)")
ax.plot(rfactor_6447["year"], rfactor_6447["erosivity_cum"], label="KMI_6447 (2003-2020)")
# rolling mean
rfactor_all = get_rfactor_station_year(erosivity, stations=["KMI_FS3", "KMI_6447"]).sort_values("year")
windows = [10, 20, 30]
for window in windows:
y = rfactor_all.rolling(window)["erosivity_cum"].mean()
ax.plot(rfactor_all["year"], y, ls=':', lw=3,
label=f"Moving average ({window} year window)")
ax.set_ylim([0,3500])
ax.set_xlim([1890,2020])
ax.tick_params(axis='both', which='major', labelsize=14)
ax.set_xlabel(r'Year',fontsize=16)
ax.set_ylabel(r'Yearly R-factor [MJ mm ha$^{-1}$ h$^{-1}$ year$^{-1}$]',fontsize=16)
ax.legend(prop={'size':16})
All stations except for Ukkel
[ ]:
stations_belgium_excl_ukkel = [station for station in stations["station"]
if station not in ["KMI_6447","KMI_FS3", "KMI1990_2002", "KMI2003_nu_v2"]]
years = range(1898, 2021, 1)
years = [year for year in years if year in
erosivity[erosivity["station"].isin(stations_belgium_excl_ukkel)]["year"].unique()]
rfactor_non_ukkel = get_rfactor_station_year(erosivity,
stations=stations_belgium_excl_ukkel,
years=years)
print(rfactor_non_ukkel["erosivity_cum"].mean())
len(rfactor_non_ukkel["station"].unique())
All stations flanders
[ ]:
stations_flanders = [station for station in stations["station"]
if station not in ["KMI_6447", "KMI_FS3", "KMI_6455", "KMI_6459",
"KMI_6472", "KMI_6494", "KMI_6484", "KMI1990_2002", "KMI2003_nu_v2"]]
years = range(1898, 2021, 1)
years = [year for year in years if year in
erosivity[erosivity["station"].isin(stations_flanders)]["year"].unique()]
rfactor_flanders = get_rfactor_station_year(erosivity,stations_flanders, years)
print(rfactor_flanders["erosivity_cum"].mean())
len(rfactor_flanders["station"].unique())
Plot the distribution of all R-values for Belgium, excluding the data from Ukkel
[ ]:
years = [year for year in range(1991, 2021) if year in
erosivity[erosivity["station"].isin(stations_belgium_excl_ukkel)]["year"].unique()]
rfactor_non_ukkel = get_rfactor_station_year(erosivity, stations_belgium_excl_ukkel, years)
fig, ax = plt.subplots()
ax.hist(rfactor_non_ukkel["erosivity_cum"], bins=20, color=[0.8]*3,
label=r"Jaarlijkse waarden voor alle 55 stations in België")
ax.plot([1239, 1239], [0, 120], color=[0.2]*3, ls=":", label="Ukkel (30-jaar referentie periode)")
ax.set_ylabel(r"#")
ax.set_xlabel(r"R-waarde [MJ mm ha$^{-1}$ h$^{-1}$ jaar$^{-1}$]")
ax.set_ylim([0,180])
ax.legend()
All stations of the VMM (Flanders)
[ ]:
stations_vmm = [station for station in stations["station"] if "KMI" not in station]
rfactor_vmm = get_rfactor_station_year(erosivity, stations_vmm)
print(rfactor_vmm["erosivity_cum"].mean())
len(rfactor_vmm["station"].unique())
Compute values per year over all stations
[ ]:
(get_rfactor_station_year(erosivity)
.groupby("station")
.aggregate(
{"erosivity_cum":[np.mean, np.std],
"year": "count"})
.sort_values(('year', 'count'), ascending=False)
.reset_index()
)
Analysis (monthly resolution)
Get the EI30-values for 2018 based on two Ukkel station (“KMI_6447”,”KMI_FS3”)
[ ]:
erosivity_monthly = (erosivity.loc[erosivity["station"].isin(["KMI_6447", "KMI_FS3"]), "erosivity"]
.resample("M")
.sum()
.to_frame())
erosivity_monthly["month"] = erosivity_monthly.index.month
erosivity_monthly.head()
Get mean and interquartile range of monthly R-factor
[ ]:
em = (erosivity_monthly
.groupby("month")
.aggregate({
"erosivity":[np.mean,
lambda x: np.percentile(x, 25),
lambda x: np.percentile(x,75)]
}))
[ ]:
em["mean"] = em["erosivity"]["mean"]
em["l_e"] = em["erosivity"]["mean"] - em["erosivity"]["<lambda_0>"]
em["u_e"] = em["erosivity"]["<lambda_1>"] - em["erosivity"]["mean"]
Plot
[ ]:
fig = plt.figure(figsize=(8, 4))
x = np.arange(len(em))
y_gv = [26, 20, 24, 27, 70, 77, 144, 102, 68, 50, 37, 32]
y_gv2 = [35, 27, 32, 36, 86, 96, 177, 135, 89, 66, 48, 43]
plt.bar(x-0.3,em["mean"],yerr=em[["l_e","u_e"]].T.values,color=[0.80]*3,width=0.3,label="Ukkel (1898-2020))")
plt.bar(x,y_gv,width=0.3,color=[0.5]*3,label="Verstraeten $\it{et. al}$ (2001)")
plt.bar(x+0.3,y_gv2,width=0.3,color=[0.2]*3,label="Verstraeten $\it{et. al}$ (2006)")
plt.ylabel("R-value")
ax = fig.axes
plt.xticks(x,["Januari","Februari","Maart","April","Mei","Juni","Juli","Augustus","September","Oktober","November","December"],rotation=45)
plt.legend(loc=2,facecolor ="white")
plt.ylabel(r"R-waarde [MJ mm ha$^{-1}$ h$^{-1}$ maand$^{-1}$]");
Additional: monthly analysis (detail)
Evolution monthly R-value
Generate plot to analyse evolution of montly R over the years
[ ]:
def subplot_montlhy_R(df,ax,ind1,ind2):
if (ind1==0) & (ind2==2):
ax.plot(df["year"],df["erosivity"],color=[0.1]*3,alpha=0.5,label="EI30")
else:
ax.plot(df["year"],df["erosivity"],color=[0.1]*3,alpha=0.5)
ax.fill_between([1998,2021], 0,500,facecolor='grey', alpha=0.3)
# rolling mean
scales = [10,20,30]
for scale in scales:
y = df["erosivity"].rolling(scale).mean()
if scale==10:
y_max=np.nanmax(y.values)*1.1
x = y.index.year
ax.plot(x,y,lw=2,label=f'voortschrijdend gemiddelde ({scale} jaar)')
if ind1!=3:
ax.set_xticks([])
else:
if ind2==1:
ax.set_xlabel(r'Jaar (-)',fontsize=16)
ax.set_ylim([0,500])
ax.set_title(name)
ax.tick_params(axis='y', which='major', labelsize=12,rotation=90)
ax.tick_params(axis='x', which='major', labelsize=12)
if ind2!=0:
ax.set_yticklabels([])
else:
if ind1==2:
ax.set_ylabel('\t\t\t Maandelijkse R [MJ mm ha$^{-1}$ h$^{-1}$ maand$^{-1}$]',fontsize=16)
if (ind1==0) & (ind2==2):
ax.legend(ncol=2)
fig,ax = plt.subplots(4,3,figsize=[9,9])
months = ["Januari","Februari","Maart","April","Mei","Juni","Juli","Augustus","September","Oktober","November","December"]
erosivity_monthly["year"] = erosivity_monthly.index.year
for month,name in enumerate(months):
ind1=int(np.floor((month)/3))
ind2=np.mod(month,3)
df_plot = erosivity_monthly.loc[erosivity_monthly["month"]==month+1]
df_plot = df_plot [~np.isnan(df_plot["erosivity"])].sort_values("year")
subplot_montlhy_R(df_plot,ax[ind1,ind2],ind1,ind2)