rfactor package

rfactor.rfactor module

exception rfactor.rfactor.RFactorInputError[source]

Bases: Exception

Raised when input data do not conform to the input format required by rfactor.

exception rfactor.rfactor.RFactorKeyError[source]

Bases: Exception

Raised when input data are missing required column names.

exception rfactor.rfactor.RFactorTypeError[source]

Bases: Exception

Raised when the data type of an input data column is wrong.

rfactor.rfactor.compute_erosivity(rain, energy_method=<function rain_energy_verstraeten2006>, intensity_method=<function maximum_intensity>, **kwargs)[source]

Calculate erosivity for each year/station combination

Parameters:

rain (pandas.DataFrame) –
DataFrame with rainfall time series. Needs to contain the following columns:
- datetime (pandas.Timestamp): Time stamp
- rain_mm (float): Rain in mm
- station (str): Measurement station identifier
energy_method (Callable, default rain_energy_verstraeten2006) – Function to compute the rain energy per unit depth
intensity_method (Callable, default maximum_intensity) – Function to derive the maximal rain intensity (over 30min).

Returns:

all_erosivity – DataFrame with erosivity output for each event.

station (str)
year (int)
tag (str): unique tag for each year/station pair.
event_rain_cum (float): Cumulative rain for each event
all_events_cum (float): Cumulative rain over the whole timeseries
max_30min_intensity (float): Maximal 30min intensity for each event
event_energy (float): Rain energy per unit depth for each event
erosivity (float): Erosivity for each event
erosivity_cum (float): Cumulative erosivity over all events together

Return type:

pandas.DataFrame

Notes

NaN- and 0-values are removed from the input timeseries.

rfactor.rfactor.maximum_intensity(df)[source]

Maximum rain intensity for 30-min interval (Pandas rolling) expressed as mm/hour

The implementation uses a rolling window of the chosen interval to derive the maximal intensity.

Parameters:

df (pandas.DataFrame) –

DataFrame with rainfall time series. Needs to contain the following columns:

datetime (pandas.Timestamp): Timestamp
rain_mm (float): Rain in mm. No NaN or 0-values allowed

Returns:

maxprecip_30min – Maximal 30-minute intensity during event (in mm/h).

Return type:

float
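The rolling-window idea described above can be sketched without pandas. This is only an illustration, not the package's implementation: the function name `max_30min_intensity` is hypothetical, plain lists replace the DataFrame, and the window is assumed to cover the 30 minutes ending at each timestamp, with the start excluded and the end included.

```python
from datetime import datetime, timedelta


def max_30min_intensity(timestamps, rain_mm, window=timedelta(minutes=30)):
    """Return the maximal 30-minute rain intensity in mm/h.

    Sketch only: timestamps must be sorted, and each window is assumed
    to cover the interval (t - 30 min, t].
    """
    best = 0.0
    running = 0.0
    start = 0
    for t, depth in zip(timestamps, rain_mm):
        running += depth
        # Drop records that no longer fall inside the window ending at t.
        while timestamps[start] <= t - window:
            running -= rain_mm[start]
            start += 1
        best = max(best, running)
    # Rain depth over 30 minutes (mm) -> intensity (mm/h).
    return best * 2.0


t0 = datetime(2020, 6, 1, 14, 0)
times = [t0 + timedelta(minutes=10 * i) for i in range(5)]
depths = [0.5, 2.0, 3.0, 1.0, 0.2]
print(max_30min_intensity(times, depths))  # 12.0
```

The wettest 30 minutes here hold 2.0 + 3.0 + 1.0 = 6.0 mm, which corresponds to 12.0 mm/h.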

rfactor.rfactor.maximum_intensity_interpolate(df)[source]

Maximum rain intensity for 30-min interval (fixed Matlab clone). This implementation is a corrected version of the Python translation of the original Matlab implementation by Verstraeten et al. (2006) [3].

Changes to the original script are:

In the if-statement ‘if timestamps[-1] - timestamps[0] <= 30:’, this method calculates the total amount of rain during the interval, whereas the original method only looks at the first rainfall entry.
In the same if-statement, the *2 was removed, since this multiplication is already applied in the ‘return’ step. The extra *2 caused a strong overestimation of the rainfall during short rainfall events.

Parameters:

df (pandas.DataFrame) –

DataFrame with rainfall time series. Needs to contain the following columns:

datetime (pandas.Timestamp): Time stamp
rain_mm (float): Rain in mm. No NaN or 0-values allowed
event_rain_cum (float): Cumulative rain in mm

Returns:

maxprecip_30min – Maximal 30-minute intensity during event (in mm/h).

Return type:

float

Notes

The Python and original Matlab implementations linearly interpolate zero and NaN-values within one event.

rfactor.rfactor.maximum_intensity_matlab_clone(df)[source]

Maximum rain intensity for 30-min interval (Matlab clone).

The implementation is a direct Python-translation of the original Matlab implementation by Verstraeten et al. (2006) [3].

Parameters:

df (pandas.DataFrame) –

DataFrame with rainfall time series. Needs to contain the following columns:

datetime (pandas.Timestamp): Time stamp
rain_mm (float): Rain in mm. No NaN or 0-values allowed
event_rain_cum (float): Cumulative rain in mm

Returns:

maxprecip_30min – Maximal 30-minute intensity during event (in mm/h).

Return type:

float

Notes

The Python and original Matlab implementations linearly interpolate zero and NaN-values within one event.

rfactor.rfactor.rain_energy_brown_and_foster1987(rain)[source]

Calculate rain energy per unit depth according to Brown and Foster.

Brown and Foster is applied considering a 10-minute interval input rainfall data set.

Parameters:

rain (numpy.ndarray) – Rain (mm)

Returns:

energy – Energy per unit depth.

Return type:

float

Notes

The rain energy per unit depth \(e_r\) (\(\text{MJ}.\text{mm}^{-1}. \text{ha}^{-1}\)) is defined by [4] and [5]:

\[e_r = 0.29 \left(1 - 0.72 \exp(-0.05 i_r)\right)\]

with

\(i_r\) the rain intensity for every 10-min increment (mm \(\text{h}^{-1}\) ).

The rain energy is multiplied by the volume of rain (per 10 minutes) and summed per event to compute the total energy of the event. The formula applies for a 10 minute rainfall input data set.
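The step from unit energy to event energy described in the Notes can be sketched as follows. This is a minimal illustration using a plain Python list instead of a numpy array. The function name `event_energy_brown_foster` is hypothetical, and the input is assumed to hold the 10-minute rain depths (mm) of a single event.

```python
import math


def event_energy_brown_foster(rain_mm_10min):
    """Total rain energy of one event (MJ/ha) from 10-minute rain depths (mm)."""
    total = 0.0
    for depth in rain_mm_10min:
        # A depth over 10 minutes (mm) corresponds to 6x that value in mm/h.
        intensity = depth * 6.0
        # Energy per unit depth (MJ/mm/ha) according to Brown and Foster.
        unit_energy = 0.29 * (1.0 - 0.72 * math.exp(-0.05 * intensity))
        # Multiply by the rain volume of the increment and sum over the event.
        total += unit_energy * depth
    return total


print(event_energy_brown_foster([0.5, 2.0, 3.0, 1.0]))
```

Because the unit energy is bounded above by 0.29, the event energy can never exceed 0.29 times the event's total rain depth.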

References

rfactor.rfactor.rain_energy_mcgregor1995(rain)[source]

Calculate rain energy per unit depth according to McGregor with 10 minute interval data.

McGregor is applied considering a 10-minute interval input rainfall data set.

Parameters:

rain (numpy.ndarray) – Rain (mm)

Returns:

energy – Energy per unit depth.

Return type:

float

Notes

The rain energy per unit depth \(e_r\) (\(\text{MJ}.\text{mm}^{-1}. \text{ha}^{-1}\)) is defined by [6]:

\[e_r = 0.29 \left(1 - 0.72 \exp(-0.08 i_r)\right)\]

with

\(i_r\) the rain intensity for every 10-min increment (mm \(\text{h}^{-1}\) ).

The rain energy is multiplied by the volume of rain (per 10 minutes) and summed per event to compute the total energy of the event. The formula applies for a 10 minute rainfall input data set.

References

rfactor.rfactor.rain_energy_verstraeten2006(rain)[source]

Calculate rain energy per unit depth according to Salles/Verstraeten with 10 minute interval data.

Verstraeten is applied considering a 10-minute interval input rainfall data set.

Parameters:

rain (numpy.ndarray) – Rain (mm)

Returns:

energy – Energy per unit depth.

Return type:

float

Notes

The rain energy per unit depth \(e_r\) (\(\text{MJ}.\text{mm}^{-1}. \text{ha}^{-1}\)) for an application for Flanders/Belgium is defined by [1] , [2] and [3]:

\[e_r = 0.1112i_r^{0.31}\]

with

\(i_r\) the rain intensity for every 10-min increment (mm \(\text{h}^{-1}\) ).

The rain energy is multiplied by the volume of rain (per 10 minutes) and summed per event to compute the total energy of the event. The formula applies for a 10 minute rainfall input data set.
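To see how the three energy relations in this module differ, the unit energies can be evaluated side by side for a few intensities. The helper names below are hypothetical and the coefficients are copied from the formulas documented in this module. This is a comparison sketch, not the package's code.

```python
import math


def unit_energy_brown_foster(i_r):
    """Energy per unit depth (MJ/mm/ha), Brown and Foster relation."""
    return 0.29 * (1.0 - 0.72 * math.exp(-0.05 * i_r))


def unit_energy_mcgregor(i_r):
    """Energy per unit depth (MJ/mm/ha), McGregor relation."""
    return 0.29 * (1.0 - 0.72 * math.exp(-0.08 * i_r))


def unit_energy_verstraeten(i_r):
    """Energy per unit depth (MJ/mm/ha), Verstraeten relation."""
    return 0.1112 * i_r ** 0.31


# Rain intensities in mm/h for which the relations are compared.
for i_r in (1.0, 10.0, 50.0):
    print(
        f"i_r={i_r:5.1f}  "
        f"Brown&Foster={unit_energy_brown_foster(i_r):.4f}  "
        f"McGregor={unit_energy_mcgregor(i_r):.4f}  "
        f"Verstraeten={unit_energy_verstraeten(i_r):.4f}"
    )
```

The two exponential relations level off towards 0.29 at high intensities, whereas the power law keeps increasing with intensity.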

References

rfactor.process module

rfactor.process.compute_diagnostics(rain)[source]

Compute diagnostics for input rainfall.

This function computes coverage (per year, station) and missing rainfall for each month (per year, station).

Parameters:

rain (pandas.DataFrame) –

DataFrame with rainfall time series. Contains at least the following columns:

rain_mm (float): Rain in mm
datetime (pandas.Timestamp): Time stamp
station (str): station name
year (int): year of the measurement
tag (str): tag identifier, formatted as STATION_YEAR

Returns:

diagnostics – Diagnostics per station, year with coverage and identifier for no-rain per month. Computed based on non-zero rainfall timeseries.

station (str): station identifier.
year (int): year.
coverage (float): percentage coverage non-zero timeseries (see Notes).

In addition, one column is added per month (IDs 1 to 12):

months (int): 1: no rain observed in month, 0: rain observed.

Return type:

pandas.DataFrame

Notes

The coverage is computed as:

\[C = 100*[1-\frac{\text{number of NULL-data}} {\text{length of non-zero timeseries}}]\]
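A worked example of the coverage formula, as a sketch under two stated assumptions: missing (NULL) records are represented as NaN, and the "non-zero timeseries" is the series after dropping zero values, so it still contains the NULL records. The function name `coverage` is hypothetical.

```python
import math


def coverage(rain_mm):
    """Coverage (%) of a rainfall series in which missing values are NaN."""
    # NaN != 0.0 evaluates to True, so missing records stay in the series.
    non_zero = [value for value in rain_mm if value != 0.0]
    if not non_zero:
        raise ValueError("series holds no non-zero records")
    n_null = sum(1 for value in non_zero if math.isnan(value))
    return 100.0 * (1.0 - n_null / len(non_zero))


series = [0.0, 1.2, float("nan"), 0.4, 0.0, 2.0]
print(coverage(series))  # 75.0
```

Four records remain after dropping the zeros and one of them is missing, giving 100 * (1 - 1/4) = 75 %.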

rfactor.process.compute_rainfall_statistics(df_rainfall, df_station_metadata=None)[source]

Compute general statistics for rainfall timeseries.

Statistics (number of records, min, max, median and years with data) are computed for each measurement station.

Parameters:

df_rainfall (pandas.DataFrame) – See rfactor.process.load_rain_file()
df_station_metadata (pandas.DataFrame) –
Dataframe holding station metadata. This dataframe has the following mandatory columns:
- station (str): Name or code of the measurement station
- x (float): X-coordinate of measurement station.
- y (float): Y-coordinate of measurement station.

Returns:

df_statistics – Apart from the station, x, y when df_station_metadata is provided, the following columns are returned:

year (list): List of the years for which data is available for the station.
records (int): Total number of records for the station.
min (float): Minimal measured value for the station.
median (float): Median measured value for the station.
max (float): Maximal measured value for the station.

Return type:

pandas.DataFrame

rfactor.process.get_rfactor_station_year(erosivity, stations=None, years=None)[source]

Get R-factor at end of every year for each station from cumulative erosivity.

Parameters:

erosivity (pandas.DataFrame) – See rfactor.rfactor.compute_erosivity()
stations (list) – List of stations to extract R for.
years (list) – List of years to extract R for.

Returns:

erosivity – Updated with:

year (int): year
station (str): station
erosivity_cum (float): cumulative erosivity at end of year and at station.

Return type:

pandas.DataFrame
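The extraction of the end-of-year value can be sketched with plain dictionaries instead of a DataFrame. The function name `rfactor_per_station_year` is hypothetical, and the sketch assumes that `erosivity_cum` accumulates within each station/year combination, so that the end-of-year value equals the largest cumulative value.

```python
def rfactor_per_station_year(events, stations=None, years=None):
    """Map (station, year) to the cumulative erosivity at the end of the year."""
    result = {}
    for event in events:
        if stations is not None and event["station"] not in stations:
            continue
        if years is not None and event["year"] not in years:
            continue
        key = (event["station"], event["year"])
        # Cumulative erosivity never decreases, so the maximum is the final value.
        result[key] = max(result.get(key, 0.0), event["erosivity_cum"])
    return result


events = [
    {"station": "P01", "year": 2018, "erosivity_cum": 10.0},
    {"station": "P01", "year": 2018, "erosivity_cum": 35.5},
    {"station": "P01", "year": 2019, "erosivity_cum": 4.0},
    {"station": "P02", "year": 2018, "erosivity_cum": 20.0},
]
print(rfactor_per_station_year(events, stations=["P01"]))
```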

rfactor.process.write_erosivity_data(df, folder_path)[source]

Write output erosivity data to a folder in the legacy Matlab format.

The written data are split up per year and station (file name format: SOURCE_STATION_YEAR.txt) and do not contain any headers. The columns in the written text files represent the following:

days_since (float): Days since the start of the year.
erosivity_cum (float): Cumulative erosivity over events.
all_event_rain_cum (float): Cumulative rain over events.

Parameters:

df (pandas.DataFrame) –
DataFrame with rfactor/erosivity time series. Can contain multiple columns, but should have at least the following:
- datetime (pandas.Timestamp): Time stamp
- station (str): Station identifier
- erosivity_cum (float): Cumulative erosivity over events
- all_event_rain_cum (float): Cumulative rain over events
folder_path (pathlib.Path) – Folder path to save data according to legacy Matlab format, see rfactor.process.load_rain_file().
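The `days_since` column is the only value that is not taken directly from the input. One plausible way to derive it from a timestamp is sketched below. The helper name `days_since_start_of_year` is hypothetical, and counting fractional days from 1 January 00:00 of the same year is an assumption about the legacy format.

```python
from datetime import datetime


def days_since_start_of_year(timestamp):
    """Fractional number of days between 1 January 00:00 and the timestamp."""
    start = datetime(timestamp.year, 1, 1)
    return (timestamp - start).total_seconds() / 86400.0


print(days_since_start_of_year(datetime(2018, 1, 2, 12, 0)))  # 1.5
```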

rfactor.valid module

rfactor.valid.valid_column(rain, req_col)[source]

Input dataframe has valid required columns

Parameters:

rain (pd.DataFrame) – Dataframe to test
req_col (set) – Required columns in dataframe, e.g. {“datetime”, “rain_mm”}
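A required-column check of this kind amounts to a set difference. The sketch below works on any iterable of column names. `MissingColumnError` and `check_columns` are hypothetical stand-ins, not the package's own exception or function.

```python
class MissingColumnError(Exception):
    """Hypothetical stand-in for the error raised on missing columns."""


def check_columns(columns, req_col):
    """Raise MissingColumnError if any required column name is absent."""
    missing = set(req_col) - set(columns)
    if missing:
        raise MissingColumnError(f"Missing required columns: {sorted(missing)}")


# Passes silently: both required columns are present.
check_columns(["datetime", "rain_mm", "station"], {"datetime", "rain_mm"})
```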

rfactor.valid.valid_const_freq(rain)[source]

Check if rainfall input data has constant frequency

Parameters:

rain (pandas.DataFrame)

rfactor.valid.valid_freq(df_freq, req_freq=None)[source]

Test for valid frequency of input data

The frequency of the input data is tested against a defined frequency. Use of the R-factor package is limited to data with a resolution of 1 minute or coarser.

Parameters:

df_freq (pandas.DatetimeIndex.freq) – Temporal frequency in the rainfall data
req_freq (int, default None) – Required frequency (minutes). If None, the frequency of the input data only needs to be 1 minute or coarser.

rfactor.valid.valid_rainfall_timeseries(func=None, req_col={'datetime', 'rain_mm'}, req_freq=None)[source]

Customisable decorator to check pandas input rainfall data for functions used in this package.

Parameters:

func (callable, default None)
req_col (set) – See rfactor.valid.valid_column()
req_freq (int) – See rfactor.valid.valid_freq()

Returns:

decorator – Return the execution of the actual decorator

Return type:

callable

Notes

Uses a super decorator to allow for decorator arguments.
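A signature with `func=None` suggests a decorator that can be applied both bare and with arguments. A common way to build one is sketched below. This is an illustration of the general pattern and may differ from how the package implements it. The names `checked`, `total_rain` and `station_name` are hypothetical, and a plain dict stands in for the DataFrame.

```python
import functools


def checked(func=None, *, required=("datetime", "rain_mm")):
    """Decorator usable as @checked or as @checked(required=...)."""

    def decorator(inner):
        @functools.wraps(inner)
        def wrapper(data, *args, **kwargs):
            # Validate the input before calling the wrapped function.
            missing = set(required) - set(data)
            if missing:
                raise KeyError(f"missing keys: {sorted(missing)}")
            return inner(data, *args, **kwargs)

        return wrapper

    if func is None:
        # Called with arguments: return the actual decorator.
        return decorator
    # Applied without arguments: decorate the function directly.
    return decorator(func)


@checked
def total_rain(data):
    return sum(data["rain_mm"])


@checked(required=("datetime", "rain_mm", "station"))
def station_name(data):
    return data["station"]


print(total_rain({"datetime": [1, 2], "rain_mm": [0.5, 1.5]}))  # 2.0
```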