[ ]:
%load_ext autoreload
%autoreload 2

Application of rfactor on data det Flanders

Aim

Implementation to compute and analyse the R-factor of the RUSLE-equation. The R-factor is a measure for the total erosivity of a number of rainfall events within a defined timeframe (year, month, number of days). The factor is computed by calculating the yearly sum of -for every rainfall event- the sum of the depth of rainfall (mm) and the kinetic energy, and taking the mean over all years.

Package imports

[ ]:
from pathlib import Path

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Load the rfactor package functionalities

[ ]:
from rfactor import compute_erosivity, maximum_intensity_matlab_clone, rain_energy_verstraeten2006
from rfactor.process import load_rain_folder , load_rain_file_matlab_legacy, compute_rainfall_statistics, get_rfactor_station_year

Input data and compute erosivity

Set folders

The main function compute_erosivity requires a pandas DataFrame containing non-zero rainfall data with the following columns:

  • datetime (pd.Timestamp): Time stamp

  • rain_mm (float): Rain in mm

  • station (str): Measurement station identifier

However, the legacy format of the input files used in the original Matlab implementation is still supported as well. For this format, te input files are defined by text files (extension: txt) that hold non-zero rainfall timeseries. The data are split per station and per year with a specific data file tag STATION_YYYY:

KMI_6414_2004.txt
KMI_6414_2005.txt
...
KMI_6434_2003.txt
KMI_6434_2004.txt
...

Note: the files in tests/data/test_rainfalldata are an incomplete data set for Belgium (RMI rainfall data are not included), an alternative is provided in the section ‘Analysis’.

[ ]:
fmap_rainfall =  Path(r"../../tests/data/test_rainfalldata")

Use the load_rain_folder function to read all rain data in a folder:

[ ]:
all_rainfall = load_rain_folder(fmap_rainfall, load_rain_file_matlab_legacy)
[ ]:
all_rainfall

Compute rainfall statistics (and add station metadata)

[ ]:
stations = pd.read_csv("data/stations.csv")

rain_stats = compute_rainfall_statistics(all_rainfall, stations)
rain_stats.head()

Select stations to compute erosivity for

[ ]:
consider_stations = pd.read_csv("data/files.csv")

sel_rainfall = all_rainfall.loc[all_rainfall["tag"].isin(consider_stations["datafile"])]
[ ]:
sel_rainfall.head()

Run Model

It needs to be decided which calculation functions are used to compute the rain energy depth and maximum intensity. There are a few implemented functions available in the package for which we choose the rain_energy_verstraeten and maximum_intensity by default. The maximum_intensity function follows a slightly different approach to the original Matlab implementation. However, one can opt to use the matlab_clone method as well, which is a Python implementation mimicking the original Matlab one:

[ ]:
erosivity = compute_erosivity(all_rainfall, rain_energy_verstraeten2006, maximum_intensity_matlab_clone)
erosivity.head()

Users can choose to implement their own custom function and apply it, e.g.

```python def rain_energy_per_unit_depth_brown_and_foster(rain): “””Calculate rain energy per unit depth according to Brown and Foster.

Parameters

rain : numpy.ndarray Rain (mm)

Returns

energy : float Energy per unit depth.

Notes

The rain energy per unit depth :math:e_r (:math:\\text{J}.\\text{mm}^{-1}. \\text{m}^{-2}) is defined by Brown and Foster (1987) and Renard et al. (1997)

\[\]
e_r = 29*(1-0.72*exp(-0.05*i_r)

with

  • :math:i_r the rain intensity for every 10-min increment (mm :math:\\text{h}^{-1} ).

References

Brown, L.C., Foster, G.R., 1987. Storm erosivity using idealized intensity distributions. Transactions of the ASAE 30, 0379–0386. https://doi.org/10.13031/2013.31957.

Renard, K.G., Foster, G.R., Weesies, G.A., McCool, D.K., Yoder, D.C., 1997, Predicting soil erosion by water: a guide to conservation planning with the revised universal soil loss equation (RUSLE), Agriculture Handbook. U.S. Department of Agriculture, Washington. https://www.ars.usda.gov/ARSUserFiles/64080530/RUSLE/AH_703.pdf

“”” rain_energy = 29 * (1 - 0.72 * np.exp(-0.05rain6))*rain return rain_energy.sum()

Analysis

If one does not have acces to the RMI data, the output erosivity from the computation with the full data set is provided in this repository as an alternative:

[ ]:
erosivity = pd.read_csv("data/erosivity_belgium.csv", parse_dates=[0, 1], index_col=0)
erosivity.head()

Ukkel

Compute R-factor for specific periods for Ukkel (KMI_6447 and KMI_F3).

[ ]:
timeseries = [range(1898, 2003, 1),
              range(2003, 2021, 1),
              range(1898, 2021, 1),
              range(1996, 2021, 1),
              range(1991, 2021, 1),
              range(1990, 2001, 1),
              range(2000, 2021, 1)]

for period in timeseries:
    rfactor = get_rfactor_station_year(erosivity,["KMI_6447","KMI_FS3"], period)
    print(f"R-factor for period from {period[0]} till {period[-1]} is: {rfactor['erosivity_cum'].mean():.2f}")

Make a figure for Ukkel

[ ]:
rfactor_all= get_rfactor_station_year(erosivity, stations=["KMI_FS3", "KMI_6447"]).sort_values("year")
[ ]:
fig, ax = plt.subplots(figsize=(12, 7.5))

rfactor_fs3 = get_rfactor_station_year(erosivity, ["KMI_FS3"])
rfactor_fs3 = rfactor_fs3[~(rfactor_fs3["erosivity_cum"].isnull())].sort_values("year")

rfactor_6447 = get_rfactor_station_year(erosivity, ["KMI_6447"])
rfactor_6447 = rfactor_6447[~(rfactor_6447["erosivity_cum"].isnull())].sort_values("year")

# add first line of KMI6447 to end of KMI FS3 in order to make a continuous timeseries on the plot
rfactor_fs3 = pd.concat([rfactor_fs3,rfactor_6447.iloc[0]])

ax.plot(rfactor_fs3["year"], rfactor_fs3["erosivity_cum"], label="KMI_FS3 (1898-2002)")


ax.plot(rfactor_6447["year"], rfactor_6447["erosivity_cum"], label="KMI_6447 (2003-2020)")

# rolling mean
rfactor_all = get_rfactor_station_year(erosivity, stations=["KMI_FS3", "KMI_6447"]).sort_values("year")
windows  = [10, 20, 30]
for window in windows:
    y = rfactor_all.rolling(window)["erosivity_cum"].mean()
    ax.plot(rfactor_all["year"], y, ls=':', lw=3,
            label=f"Moving average ({window} year window)")


ax.set_ylim([0,3500])
ax.set_xlim([1890,2020])
ax.tick_params(axis='both', which='major', labelsize=14)
ax.set_xlabel(r'Year',fontsize=16)
ax.set_ylabel(r'Yearly R-factor [MJ mm ha$^{-1}$ h$^{-1}$ year$^{-1}$]',fontsize=16)
ax.legend(prop={'size':16})

All stations except for Ukkel

[ ]:
stations_belgium_excl_ukkel = [station for station in stations["station"]
                               if station not in ["KMI_6447","KMI_FS3", "KMI1990_2002", "KMI2003_nu_v2"]]
years = range(1898, 2021, 1)
years = [year for year in years if year in
         erosivity[erosivity["station"].isin(stations_belgium_excl_ukkel)]["year"].unique()]

rfactor_non_ukkel = get_rfactor_station_year(erosivity,
                                             stations=stations_belgium_excl_ukkel,
                                             years=years)
print(rfactor_non_ukkel["erosivity_cum"].mean())
len(rfactor_non_ukkel["station"].unique())

All stations flanders

[ ]:
stations_flanders = [station for station in stations["station"]
                     if station not in  ["KMI_6447", "KMI_FS3", "KMI_6455", "KMI_6459",
                                         "KMI_6472", "KMI_6494", "KMI_6484", "KMI1990_2002", "KMI2003_nu_v2"]]
years = range(1898, 2021, 1)
years = [year for year in years if year in
         erosivity[erosivity["station"].isin(stations_flanders)]["year"].unique()]

rfactor_flanders = get_rfactor_station_year(erosivity,stations_flanders, years)
print(rfactor_flanders["erosivity_cum"].mean())
len(rfactor_flanders["station"].unique())

Plot the distribution of all R-values for Belgium, excluding the data from Ukkel

[ ]:
years = [year for year in range(1991, 2021) if year in
         erosivity[erosivity["station"].isin(stations_belgium_excl_ukkel)]["year"].unique()]

rfactor_non_ukkel = get_rfactor_station_year(erosivity, stations_belgium_excl_ukkel, years)

fig, ax = plt.subplots()
ax.hist(rfactor_non_ukkel["erosivity_cum"], bins=20, color=[0.8]*3,
        label=r"Jaarlijkse waarden voor alle 55 stations in België")
ax.plot([1239, 1239], [0, 120], color=[0.2]*3, ls=":", label="Ukkel (30-jaar referentie periode)")
ax.set_ylabel(r"#")
ax.set_xlabel(r"R-waarde [MJ mm ha$^{-1}$ h$^{-1}$ jaar$^{-1}$]")
ax.set_ylim([0,180])
ax.legend()

All stations of the VMM (Flanders)

[ ]:
stations_vmm = [station for station in stations["station"] if "KMI" not in station]

rfactor_vmm = get_rfactor_station_year(erosivity, stations_vmm)

print(rfactor_vmm["erosivity_cum"].mean())
len(rfactor_vmm["station"].unique())

Compute values per year over all stations

[ ]:
(get_rfactor_station_year(erosivity)
 .groupby("station")
 .aggregate(
    {"erosivity_cum":[np.mean, np.std],
     "year": "count"})
 .sort_values(('year', 'count'), ascending=False)
 .reset_index()
)

Analysis (monthly resolution)

Get the EI30-values for 2018 based on two Ukkel station (“KMI_6447”,”KMI_FS3”)

[ ]:
erosivity_monthly = (erosivity.loc[erosivity["station"].isin(["KMI_6447", "KMI_FS3"]), "erosivity"]
                     .resample("M")
                     .sum()
                     .to_frame())
erosivity_monthly["month"] = erosivity_monthly.index.month
erosivity_monthly.head()

Get mean and interquartile range of monthly R-factor

[ ]:
em = (erosivity_monthly
      .groupby("month")
      .aggregate({
          "erosivity":[np.mean,
                       lambda x: np.percentile(x, 25),
                       lambda x: np.percentile(x,75)]
      }))
[ ]:
em["mean"] = em["erosivity"]["mean"]
em["l_e"] = em["erosivity"]["mean"] - em["erosivity"]["<lambda_0>"]
em["u_e"] = em["erosivity"]["<lambda_1>"] - em["erosivity"]["mean"]

Plot

[ ]:
fig = plt.figure(figsize=(8, 4))

x = np.arange(len(em))
y_gv = [26, 20, 24, 27, 70, 77, 144, 102, 68, 50, 37, 32]
y_gv2 = [35, 27, 32, 36, 86, 96, 177, 135, 89, 66, 48, 43]
plt.bar(x-0.3,em["mean"],yerr=em[["l_e","u_e"]].T.values,color=[0.80]*3,width=0.3,label="Ukkel (1898-2020))")
plt.bar(x,y_gv,width=0.3,color=[0.5]*3,label="Verstraeten $\it{et. al}$ (2001)")
plt.bar(x+0.3,y_gv2,width=0.3,color=[0.2]*3,label="Verstraeten $\it{et. al}$ (2006)")
plt.ylabel("R-value")
ax = fig.axes
plt.xticks(x,["Januari","Februari","Maart","April","Mei","Juni","Juli","Augustus","September","Oktober","November","December"],rotation=45)
plt.legend(loc=2,facecolor ="white")
plt.ylabel(r"R-waarde [MJ mm ha$^{-1}$ h$^{-1}$ maand$^{-1}$]");

Additional: monthly analysis (detail)

Evolution monthly R-value

Generate plot to analyse evolution of montly R over the years

[ ]:
def subplot_montlhy_R(df,ax,ind1,ind2):

    if (ind1==0) & (ind2==2):
        ax.plot(df["year"],df["erosivity"],color=[0.1]*3,alpha=0.5,label="EI30")
    else:
        ax.plot(df["year"],df["erosivity"],color=[0.1]*3,alpha=0.5)
    ax.fill_between([1998,2021], 0,500,facecolor='grey', alpha=0.3)
    # rolling mean
    scales  = [10,20,30]
    for scale in scales:
        y = df["erosivity"].rolling(scale).mean()
        if scale==10:
            y_max=np.nanmax(y.values)*1.1
        x = y.index.year
        ax.plot(x,y,lw=2,label=f'voortschrijdend gemiddelde ({scale} jaar)')
    if ind1!=3:
        ax.set_xticks([])
    else:
        if ind2==1:
            ax.set_xlabel(r'Jaar (-)',fontsize=16)
    ax.set_ylim([0,500])
    ax.set_title(name)
    ax.tick_params(axis='y', which='major', labelsize=12,rotation=90)
    ax.tick_params(axis='x', which='major', labelsize=12)
    if ind2!=0:
        ax.set_yticklabels([])
    else:
        if ind1==2:
            ax.set_ylabel('\t\t\t Maandelijkse R [MJ mm ha$^{-1}$ h$^{-1}$ maand$^{-1}$]',fontsize=16)
    if (ind1==0) & (ind2==2):
        ax.legend(ncol=2)

fig,ax = plt.subplots(4,3,figsize=[9,9])
months = ["Januari","Februari","Maart","April","Mei","Juni","Juli","Augustus","September","Oktober","November","December"]
erosivity_monthly["year"] = erosivity_monthly.index.year

for month,name in enumerate(months):
    ind1=int(np.floor((month)/3))
    ind2=np.mod(month,3)
    df_plot = erosivity_monthly.loc[erosivity_monthly["month"]==month+1]
    df_plot  = df_plot [~np.isnan(df_plot["erosivity"])].sort_values("year")
    subplot_montlhy_R(df_plot,ax[ind1,ind2],ind1,ind2)