Read KNMI observations using hydropandas

This notebook introduces how to use the hydropandas package to read, process and visualise KNMI data.

Martin & Onno - 2022

Notebook contents

  1. Observation types

  2. Get KNMI data

  3. Get ObsCollections

  4. Precipitation data

  5. Reference evaporation types

  6. Spatial interpolation of evaporation

[1]:
import logging

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from IPython.display import display
from scipy.interpolate import NearestNDInterpolator, RBFInterpolator
from tqdm import tqdm

import hydropandas as hpd
from hydropandas.io import knmi
[2]:
hpd.util.get_color_logger("INFO")
[2]:
<RootLogger root (INFO)>

Observation types

The hydropandas package has a function to read all kinds of KNMI observations. These are stored in an Obs object. There are three types of observations you can obtain from the KNMI:

  • EvaporationObs, for evaporation time series

  • PrecipitationObs, for precipitation time series

  • MeteoObs, for all the other meteorological time series

Get evaporation in [m/day] for KNMI station 344 (Rotterdam Airport).

[3]:
o = hpd.EvaporationObs.from_knmi(stn=344, start="2022")
display(o.head())
o.plot()
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable EV24 from 2022-01-01 to None

hydropandas.EvaporationObs

EV24_ROTTERDAM_344
x 90061.930891
y 441636.854285
location
filename
source KNMI
unit m
station 344
meteo_var EV24

EV24
YYYYMMDD
2022-01-01 01:00:00 0.0003
2022-01-02 01:00:00 0.0003
2022-01-03 01:00:00 0.0002
2022-01-04 01:00:00 0.0002
2022-01-05 01:00:00 0.0001
[3]:
<Axes: xlabel='YYYYMMDD'>
../_images/examples_02_knmi_observations_5_3.png

Get precipitation in [m/day] for KNMI station 344 (Rotterdam Airport).

[4]:
o = hpd.PrecipitationObs.from_knmi(stn=344, start="2022")
display(o.head())
o.plot()
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable RH from 2022-01-01 to None

hydropandas.PrecipitationObs

RH_ROTTERDAM_344
x 90061.930891
y 441636.854285
location
filename
source KNMI
unit m
station 344
meteo_var RH

RH
YYYYMMDD
2022-01-01 01:00:00 0.0000
2022-01-02 01:00:00 0.0002
2022-01-03 01:00:00 0.0064
2022-01-04 01:00:00 0.0000
2022-01-05 01:00:00 0.0008
[4]:
<Axes: xlabel='YYYYMMDD'>
../_images/examples_02_knmi_observations_7_3.png

Get temperature in [°C] for KNMI station 344 (Rotterdam Airport).

[5]:
o = hpd.MeteoObs.from_knmi(stn=344, meteo_var="TG", start="2022")
display(o.head())
o.plot()
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable TG from 2022-01-01 to None

hydropandas.MeteoObs

TG_ROTTERDAM_344
x 90061.930891
y 441636.854285
location
filename
source KNMI
unit
station 344
meteo_var TG

TG
YYYYMMDD
2022-01-01 01:00:00 12.4
2022-01-02 01:00:00 12.1
2022-01-03 01:00:00 11.6
2022-01-04 01:00:00 9.7
2022-01-05 01:00:00 6.7
[5]:
<Axes: xlabel='YYYYMMDD'>
../_images/examples_02_knmi_observations_9_3.png

attributes

All PrecipitationObs, EvaporationObs and MeteoObs objects have the attributes:

  • name: meteo variable and station name

  • x: x-coordinate in m RD

  • y: y-coordinate in m RD

  • station: station number

  • unit: measurement unit

  • meta: dictionary with other metadata

[6]:
print(f"name: {o.name}")
print(f"x,y: {(o.x, o.y)}")
print(f"station: {o.station}")
print("unit", o.unit)
print("metadata:")
for key, item in o.meta.items():
    print(f"    {key}: {item}")
name: TG_ROTTERDAM_344
x,y: (np.float64(90061.9308913925), np.float64(441636.8542853093))
station: 344
unit
metadata:
    location: ROTTERDAM
    TG: Etmaalgemiddelde temperatuur (in graden Celsius) / Daily mean temperature in (degrees Celsius)

Get KNMI data

The from_knmi method can be used to obtain one type of measurements at one station. There are two ways to select the station.

  1. with the station number stn

  2. by specifying the xy coordinates, the station nearest to this coordinates is used.

Below both methods are used to obtain precipitation measurements. Notice that they return the same data because station 344 is nearest to the given xy coordinates.

[7]:
o1 = hpd.PrecipitationObs.from_knmi(stn=344)
o2 = hpd.PrecipitationObs.from_knmi(xy=(90600, 442800))
o1.equals(o2)
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable RH from None to None
INFO:hydropandas.io.knmi.get_knmi_obs:get KNMI data from station nearest to coordinates (90600, 442800) and meteovariable RH
[7]:
True

read options

The from_knmi method has other optional arguments:

  • stn: station number.

  • start: the start date of the time series you want, default is 1st of January 2019.

  • end: the end date of the time series you want, default is today.

  • fill_missing_obs: option to fill missing values with values from the nearest KNMI station. If measurements are filled an extra column is added to the time series in which the station number is shown that was used to fill a particular missing value.

  • interval: time interval of the time series, default is ‘daily’

  • use_api: select the method to obtain the data, default is ‘True’

  • raise_exception: option to raise an error when the requested time series is empty. ***

The 3 examples below give a brief summary of these options

[8]:
# example 1 get daily average temperature (TG) from 1900 till now
o_t = hpd.MeteoObs.from_knmi("TG", stn=344, start="1960")
o_t.plot(figsize=(16, 4), grid=True);
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable TG from 1960-01-01 to None
../_images/examples_02_knmi_observations_15_1.png
[9]:
# example 2 get daily average precipitation from 1972 with and without filling missing measurements
f, axes = plt.subplots(figsize=(16, 6), nrows=2, sharex=True, sharey=True)

o_rd = hpd.PrecipitationObs.from_knmi(
    "RD", stn=892, start="1972", end="2023", fill_missing_obs=False
)
o_rd.plot(figsize=(16, 4), ax=axes[0])

o_rd_filled = hpd.PrecipitationObs.from_knmi(
    "RD", stn=892, start="1972", end="2023", fill_missing_obs=True
)

o_rd_filled.loc[o_rd_filled["station"] == "892", "RD"].plot(
    ax=axes[1], label="oorspronkelijke metingen"
)
o_rd_filled.loc[o_rd_filled["station"] != "892", "RD"].plot(
    ax=axes[1], color="orange", label="opgevulde metingen"
)
axes[0].set_title("precipitation Giersbergen (1972-2023)")
axes[1].legend();
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 892 and variable RD from 1972-01-01 to 2023-01-01
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 892 and variable RD from 1972-01-01 to 2023-01-01
INFO:hydropandas.io.knmi._add_missing_indices:station 892 has no measurements before 1993-11-01 09:00:00
INFO:hydropandas.io.knmi.fill_missing_measurements:Filled 7975 observations from station 827 TILBURG -> 892 GIERSBERGEN
../_images/examples_02_knmi_observations_16_1.png
[10]:
# see the station_opvulwaarde
display(o_rd_filled.head())

hydropandas.PrecipitationObs

RD_GIERSBERGEN_892
x 138500.0
y 407500.0
location
filename
source KNMI
unit m
station 892
meteo_var RD

RD station
1972-01-01 09:00:00 0.0 827
1972-01-02 09:00:00 0.0 827
1972-01-03 09:00:00 0.0 827
1972-01-04 09:00:00 0.0 827
1972-01-05 09:00:00 0.0 827
[11]:
# example 3 get evaporation
logging.getLogger().getEffectiveLevel()
logging.getLogger().setLevel(logging.INFO)

o_ev = hpd.EvaporationObs.from_knmi(
    stn=344, start="1972", end="2023", fill_missing_obs=True
)
o_ev
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable EV24 from 1972-01-01 to 2023-01-01
INFO:hydropandas.io.knmi._add_missing_indices:station 344 has no measurements before 1987-09-12 01:00:00
INFO:hydropandas.io.knmi.fill_missing_measurements:Filled 246 observations from station 210 VALKENBURG -> 344 ROTTERDAM
INFO:hydropandas.io.knmi.fill_missing_measurements:Filled 64 observations from station 348 CABAUW-MAST -> 344 ROTTERDAM
INFO:hydropandas.io.knmi.fill_missing_measurements:Filled 5499 observations from station 260 DE-BILT -> 344 ROTTERDAM
[11]:

hydropandas.EvaporationObs

EV24_ROTTERDAM_344
x 90061.930891
y 441636.854285
location
filename
source KNMI
unit m
station 344
meteo_var EV24

EV24 station
1972-01-01 01:00:00 0.0002 260
1972-01-02 01:00:00 0.0002 260
1972-01-03 01:00:00 0.0002 260
1972-01-04 01:00:00 0.0000 260
1972-01-05 01:00:00 0.0000 260
... ... ...
2022-12-28 01:00:00 0.0003 344
2022-12-29 01:00:00 0.0001 344
2022-12-30 01:00:00 0.0002 344
2022-12-31 01:00:00 0.0001 344
2023-01-01 01:00:00 0.0001 344

18629 rows × 2 columns

Get ObsCollections

You can read multiple Observation objects at once using the hpd.read_knmi method. The observations are stored in an ObsCollection object. Below an example to obtain precipitation (RH) and evaporation (EV24) from the measurement stations Rotterdam (344) and De Bilt (260).

[12]:
oc = hpd.read_knmi(stns=[344, 260], meteo_vars=["RH", "EV24"])
oc
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable RH from None to None
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable RH from None to None
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable EV24 from None to None
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable EV24 from None to None
[12]:
x y location filename source unit station meteo_var obs
name
RH_ROTTERDAM_344 90061.930891 441636.854285 KNMI m 344 RH PrecipitationObs RH_ROTTERDAM_344 -----metadat...
RH_DE-BILT_260 140565.152506 456790.999119 KNMI m 260 RH PrecipitationObs RH_DE-BILT_260 -----metadata-...
EV24_ROTTERDAM_344 90061.930891 441636.854285 KNMI m 344 EV24 EvaporationObs EV24_ROTTERDAM_344 -----metadat...
EV24_DE-BILT_260 140565.152506 456790.999119 KNMI m 260 EV24 EvaporationObs EV24_DE-BILT_260 -----metadata-...

other options in the read_knmi function are:

  • specify locations as a dataframe with x, y coordinates (RD_new), the function will find the nearest KNMI station for every location.

  • specify xmid and ymid which are 2 arrays corresponding to a structured grid to obtain the nearest KNMI station for every cell in the grid.

[13]:
locations = pd.DataFrame(index=["Rotterdam"], data={"x": 90600, "y": 442800})
display(locations)
hpd.read_knmi(locations=locations, meteo_vars=["RH"])
x y
Rotterdam 90600 442800
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable RH from None to None
[13]:
x y location filename source unit station meteo_var obs
name
RH_ROTTERDAM_344 90061.930891 441636.854285 KNMI m 344 RH PrecipitationObs RH_ROTTERDAM_344 -----metadat...
[14]:
hpd.read_knmi(xy=((90600, 442800),), meteo_vars=["RH"])
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable RH from None to None
[14]:
x y location filename source unit station meteo_var obs
name
RH_ROTTERDAM_344 90061.930891 441636.854285 KNMI m 344 RH PrecipitationObs RH_ROTTERDAM_344 -----metadat...

Precipitation

Using hydropandas 3 different precipitation products can be accessed:

  1. Daily data from a meteorological station

  2. Daily data from a neerslag (precipitation) station

  3. Hourly data from a meteorological station

All three products can be obtained using the from_knmi method. Product 1 and 2 can also be accessed without the api.

By default data is used from a meteorological station. Specify meteo_var='RD' to access data from a neerslag (precipitation) station.

[15]:
start = "2010-1-1"
end = "2010-1-10"

# daily meteo station (default)
precip1 = hpd.PrecipitationObs.from_knmi(stn=260, start=start, end=end)
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable RH from 2010-01-01 to 2010-01-10
[16]:
# daily neerslag station
precip2 = hpd.PrecipitationObs.from_knmi(meteo_var="RD", stn=550, start=start, end=end)
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 550 and variable RD from 2010-01-01 to 2010-01-10
[17]:
# hourly meteo station (only works with api)
precip3 = hpd.PrecipitationObs.from_knmi(
    stn=260,
    start=start,
    end=end,
    interval="hourly",
)
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable RH from 2010-01-01 to 2010-01-10
[18]:
# daily meteo station without api
precip4 = hpd.PrecipitationObs.from_knmi(
    stn=260,
    start=start,
    end=end,
    use_api=False,
)
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable RH from 2010-01-01 to 2010-01-10
[19]:
# daily neerslag station without api
precip5 = hpd.PrecipitationObs.from_knmi(
    meteo_var="RD",
    stn=550,
    start=start,
    end=end,
    use_api=False,
)
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 550 and variable RD from 2010-01-01 to 2010-01-10

Below are the differences between the precipitation estimates from different station types.

[20]:
fig, ax = plt.subplots(figsize=(16, 4))
precip1["RH"].plot(
    ax=ax,
    drawstyle="steps",
    ls="--",
    lw=3,
    label=str(precip1.station) + "_dagelijks",
)

precip2["RD"].plot(
    ax=ax,
    drawstyle="steps",
    ls="--",
    lw=3,
    label=str(precip2.station) + "_dagelijks",
)

precip3["RH"].plot(
    ax=ax,
    drawstyle="steps",
    label=str(precip3.station) + "_uurlijks",
)

precip4["RH"].plot(
    ax=ax,
    drawstyle="steps",
    marker="o",
    label=str(precip4.station) + "_dagelijks_geen_api",
)

precip5["RD"].plot(
    ax=ax,
    drawstyle="steps",
    marker="o",
    label=str(precip5.station) + "_dagelijks_geen_api",
)


ax.legend()
ax.grid()
ax.set_ylabel("neerslag [m/dag]");
../_images/examples_02_knmi_observations_31_0.png

The locations of the stations can be plotted onto a map using the contextily package.

[21]:
import contextily as cx

oc = hpd.ObsCollection([precip1, precip2])

gdf = oc.to_gdf()

gdf = gdf.set_crs(28992)

gdf["name"] = gdf.index

ax = gdf.buffer(2000).plot(alpha=0, figsize=(8, 8))

gdf.plot("name", ax=ax, cmap="jet", legend=True, markersize=100)


cx.add_basemap(ax, crs=28992, source=cx.providers.OpenStreetMap.Mapnik)
../_images/examples_02_knmi_observations_33_0.png

Evaporation

KNMI provides the Makking reference evaporation (meteo_var EV24). Hydropandas provides a posssibility to calculate three different types of reference evaporation from data of KNMI meteo stations:

  • Penman

  • Hargreaves

  • Makkink (in the same way as KNMI)

These three types of reference evaporation are calculated the same way as described by Allen et al. 1990 and STOWA rapport. Be aware that the last report is written in Dutch and contains errors in the units.

The following variables from the KNMI are used for each reference evaporation type:

  • Penman: average (TG), minimum (TN) and maximum (TX) temperature, de global radiation (Q), de windspeed (FG) en de relative humidity (PG).

  • Hargreaves: average (TG), minimum (TN) and maximum (TX) temperature

  • Makkink: average temperature (TG) and global radiation (Q)

Comparison Makkink

Lets compare Hypdropandas Makkink verdamping evaporation with the EV24 Makkink verdamping of the KNMI. When Hydropandas Makkink evaporation is rounded (on 4 decimals), the estimate is the same as for the KNMI estimate.

[22]:
ev24 = hpd.EvaporationObs.from_knmi(
    stn=260, start="2022-1-1"
)  # et_type='EV24' by default
makk = hpd.EvaporationObs.from_knmi(meteo_var="makkink", stn=260, start="2022-1-1")
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable EV24 from 2022-01-01 to None
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable makkink from 2022-01-01 to None
[23]:
f, ax = plt.subplots(2, figsize=(11, 4))
ax[0].plot(ev24, label=ev24.name)
ax[0].plot(makk.round(4), label=makk.name)
ax[0].set_ylabel("E [m/d]")
ax[0].set_title("Makkink evaporation")
ax[0].legend()
ax[1].plot(ev24["EV24"] - makk["makkink"].round(4))
ax[1].set_title("Difference Makkink KNMI and Hydropandas")
f.tight_layout()
../_images/examples_02_knmi_observations_37_0.png

Comparison Penman, Makkink en Hargreaves

On average Penman gives a higher estimate for reference evaporation than Makkink (~0.55mm). This can be explained by the fact that Penman takes into account windspeed and Makkink ignores this proces. Hargreaves is a very simple way of estimation the evaporation, only taking into account temperature and extraterrestial radiation. Therefore it gives a lower estimate for the reference evporatoin compared to the two other methods (~-0.35mm wrt Makkink).

[24]:
penm = hpd.EvaporationObs.from_knmi(
    meteo_var="penman", stn=260, start="2022-1-1"
).squeeze()
harg = hpd.EvaporationObs.from_knmi(
    meteo_var="hargreaves", stn=260, start="2022-1-1"
).squeeze()
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable penman from 2022-01-01 to None
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable hargreaves from 2022-01-01 to None
[25]:
f, ax = plt.subplots(figsize=(11, 4))
ax.plot(makk, label=makk.name)
ax.plot(penm, label=penm.name)
ax.plot(harg, label=harg.name)
ax.legend()
[25]:
<matplotlib.legend.Legend at 0x7f75533130e0>
../_images/examples_02_knmi_observations_40_1.png

Spatial interpolate (Makkink) Evaporation?

Does interpolation matter? There are ways to interpolate evaporation datasets. However currently the nearest station is always used in HydroPandas’ methods. Does this give different results? First lets look spatially.

Get all stations where EV24 is measured

[26]:
stns = knmi.get_stations(meteo_var="EV24").sort_index()

Collect all EV24 data ever measured by KNMI

[27]:
tmin = "1950-01-01"
tmax = "2022-04-11"

# empty dataframe
df = pd.DataFrame(
    columns=stns.index
)  # index=pd.date_range(start=tmin, end=tmax, freq='H')

for stn in tqdm(stns.index):
    df_stn = hpd.MeteoObs.from_knmi(
        meteo_var="EV24", stn=stn, fill_missing_obs=False, start=tmin, end=tmax
    )
    df[stn] = df_stn

df
  0%|          | 0/34 [00:00<?, ?it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 210 and variable EV24 from 1950-01-01 to 2022-04-11
  3%|▎         | 1/34 [00:00<00:29,  1.13it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 215 and variable EV24 from 1950-01-01 to 2022-04-11
  6%|▌         | 2/34 [00:01<00:19,  1.64it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 235 and variable EV24 from 1950-01-01 to 2022-04-11
  9%|▉         | 3/34 [00:02<00:22,  1.35it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 240 and variable EV24 from 1950-01-01 to 2022-04-11
 12%|█▏        | 4/34 [00:03<00:24,  1.23it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 249 and variable EV24 from 1950-01-01 to 2022-04-11
 15%|█▍        | 5/34 [00:03<00:21,  1.37it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 251 and variable EV24 from 1950-01-01 to 2022-04-11
 18%|█▊        | 6/34 [00:04<00:19,  1.43it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 257 and variable EV24 from 1950-01-01 to 2022-04-11
 21%|██        | 7/34 [00:04<00:17,  1.53it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable EV24 from 1950-01-01 to 2022-04-11
 24%|██▎       | 8/34 [00:05<00:18,  1.37it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 265 and variable EV24 from 1950-01-01 to 2022-04-11
 26%|██▋       | 9/34 [00:06<00:19,  1.30it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 267 and variable EV24 from 1950-01-01 to 2022-04-11
 29%|██▉       | 10/34 [00:07<00:17,  1.39it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 269 and variable EV24 from 1950-01-01 to 2022-04-11
 32%|███▏      | 11/34 [00:07<00:15,  1.45it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 270 and variable EV24 from 1950-01-01 to 2022-04-11
 35%|███▌      | 12/34 [00:08<00:16,  1.33it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 273 and variable EV24 from 1950-01-01 to 2022-04-11
 38%|███▊      | 13/34 [00:09<00:15,  1.39it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 275 and variable EV24 from 1950-01-01 to 2022-04-11
 41%|████      | 14/34 [00:10<00:15,  1.29it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 277 and variable EV24 from 1950-01-01 to 2022-04-11
 44%|████▍     | 15/34 [00:10<00:13,  1.37it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 278 and variable EV24 from 1950-01-01 to 2022-04-11
 47%|████▋     | 16/34 [00:11<00:12,  1.44it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 279 and variable EV24 from 1950-01-01 to 2022-04-11
 50%|█████     | 17/34 [00:12<00:12,  1.40it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 280 and variable EV24 from 1950-01-01 to 2022-04-11
 53%|█████▎    | 18/34 [00:13<00:12,  1.29it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 283 and variable EV24 from 1950-01-01 to 2022-04-11
 56%|█████▌    | 19/34 [00:13<00:11,  1.33it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 286 and variable EV24 from 1950-01-01 to 2022-04-11
 59%|█████▉    | 20/34 [00:14<00:09,  1.41it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 290 and variable EV24 from 1950-01-01 to 2022-04-11
 62%|██████▏   | 21/34 [00:15<00:10,  1.29it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 310 and variable EV24 from 1950-01-01 to 2022-04-11
 65%|██████▍   | 22/34 [00:16<00:09,  1.21it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 319 and variable EV24 from 1950-01-01 to 2022-04-11
 68%|██████▊   | 23/34 [00:17<00:08,  1.31it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 323 and variable EV24 from 1950-01-01 to 2022-04-11
 71%|███████   | 24/34 [00:17<00:07,  1.39it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 330 and variable EV24 from 1950-01-01 to 2022-04-11
 74%|███████▎  | 25/34 [00:18<00:06,  1.32it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable EV24 from 1950-01-01 to 2022-04-11
 76%|███████▋  | 26/34 [00:19<00:06,  1.28it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 348 and variable EV24 from 1950-01-01 to 2022-04-11
 79%|███████▉  | 27/34 [00:20<00:05,  1.31it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 350 and variable EV24 from 1950-01-01 to 2022-04-11
 82%|████████▏ | 28/34 [00:20<00:04,  1.25it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 356 and variable EV24 from 1950-01-01 to 2022-04-11
 85%|████████▌ | 29/34 [00:21<00:03,  1.30it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 370 and variable EV24 from 1950-01-01 to 2022-04-11
 88%|████████▊ | 30/34 [00:22<00:03,  1.25it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 375 and variable EV24 from 1950-01-01 to 2022-04-11
 91%|█████████ | 31/34 [00:23<00:02,  1.21it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 377 and variable EV24 from 1950-01-01 to 2022-04-11
 94%|█████████▍| 32/34 [00:23<00:01,  1.34it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 380 and variable EV24 from 1950-01-01 to 2022-04-11
 97%|█████████▋| 33/34 [00:24<00:00,  1.26it/s]
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 391 and variable EV24 from 1950-01-01 to 2022-04-11
100%|██████████| 34/34 [00:25<00:00,  1.34it/s]
[27]:
210 215 235 240 249 251 257 260 265 267 ... 330 344 348 350 356 370 375 377 380 391
YYYYMMDD
1987-03-26 01:00:00 0.0006 NaN 0.0006 NaN NaN NaN NaN 0.0005 NaN NaN ... NaN NaN NaN NaN NaN 0.0008 NaN NaN 0.0010 NaN
1987-03-27 01:00:00 0.0016 NaN 0.0015 NaN NaN NaN NaN 0.0016 NaN NaN ... NaN NaN 0.0016 NaN NaN 0.0017 NaN NaN 0.0024 NaN
1987-03-28 01:00:00 0.0008 NaN 0.0007 NaN NaN NaN NaN 0.0007 NaN NaN ... NaN NaN 0.0007 NaN NaN 0.0005 NaN NaN 0.0008 NaN
1987-03-29 01:00:00 0.0013 NaN 0.0007 NaN NaN NaN NaN 0.0010 NaN NaN ... NaN NaN NaN NaN NaN 0.0013 NaN NaN 0.0014 NaN
1987-03-30 01:00:00 0.0019 NaN 0.0021 NaN NaN NaN NaN 0.0016 NaN NaN ... NaN NaN 0.0017 NaN NaN 0.0015 NaN NaN 0.0018 NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2016-04-29 01:00:00 0.0023 0.0022 0.0020 0.0020 0.0015 0.0018 0.0014 0.0022 NaN 0.0021 ... 0.0023 0.0026 0.0023 0.0018 0.0022 0.0024 0.0019 0.0025 0.0025 0.0023
2016-04-30 01:00:00 0.0023 0.0021 0.0025 0.0020 0.0022 0.0027 0.0022 0.0014 NaN 0.0021 ... 0.0025 0.0021 0.0016 0.0014 0.0012 0.0015 0.0014 0.0010 0.0010 0.0009
2016-05-01 01:00:00 0.0023 0.0022 0.0029 0.0022 0.0021 0.0028 0.0026 0.0023 NaN 0.0026 ... 0.0027 0.0025 0.0022 0.0021 0.0021 0.0022 0.0021 0.0009 0.0011 0.0015
2016-05-02 01:00:00 0.0036 0.0035 0.0036 0.0034 0.0034 0.0035 0.0035 0.0035 NaN 0.0034 ... 0.0037 0.0036 0.0036 0.0035 0.0034 0.0031 0.0030 0.0029 0.0031 0.0031
2016-05-03 01:00:00 0.0029 0.0028 0.0022 0.0029 0.0028 0.0026 0.0027 0.0034 NaN 0.0030 ... 0.0026 0.0031 0.0033 0.0034 0.0034 0.0035 0.0034 0.0034 0.0036 0.0035

10632 rows × 34 columns

According to the KNMI, thin plate spline is the best way to interpolate Makkink evaporation. Thats also how they provide the gridded Makkink evaporation :

[28]:
xy = stns.loc[df.columns, ["x", "y"]]

for idx in tqdm(df.index[0 : len(df) : 2000]):
    # get all stations with values for this date
    val = df.loc[idx].dropna() * 1000  # mm
    # get stations for this date
    coor = xy.loc[val.index].to_numpy()
    if (
        len(val) < 3
    ):  # if there are less than 3 stations, thin plate spline does not work
        # options: linear, multiquadric, gaussian,
        kernel = "linear"

    else:
        kernel = "thin_plate_spline"
        # options:
        # 'inverse_quadratic', 'linear', 'multiquadric', 'gaussian',
        # 'inverse_multiquadric', 'cubic', 'quintic', 'thin_plate_spline'

    # create an scipy interpolator
    rbf = RBFInterpolator(coor, val.to_numpy(), epsilon=1, kernel=kernel)

    nea = NearestNDInterpolator(coor, val.to_numpy())

    # interpolate on grid of the Netherlands
    grid = np.mgrid[10000:280000:100j, 300000:620000:100j]
    grid2 = grid.reshape(2, -1).T  # interpolator only takes array [[x0, y0],
    #  [x1, y1]]
    val_rbf = rbf.__call__(grid2).reshape(100, 100)
    val_nea = nea.__call__(grid2).reshape(100, 100)

    # create figure
    fig, ax = plt.subplot_mosaic("AAAABBBBC", figsize=(10, 5.925))
    fig.suptitle(f"Makkink Evaporation [mm] {idx.date()}", y=0.95)
    vmin = 0
    vmax = 5

    ax["A"].set_title(f"Interpolation: {kernel}")
    ax["A"].pcolormesh(grid[0], grid[1], val_rbf, vmin=vmin, vmax=vmax)
    ax["B"].set_title("Interpolation: Nearest")
    ax["B"].pcolormesh(grid[0], grid[1], val_nea, vmin=vmin, vmax=vmax)
    ax["A"].scatter(*coor.T, c=val, s=100, ec="k", vmin=vmin, vmax=vmax)
    p = ax["B"].scatter(*coor.T, c=val, s=100, ec="k", vmin=vmin, vmax=vmax)
    cb = fig.colorbar(p, cax=ax["C"])
    cb.set_label("[mm]")
    fig.tight_layout()
100%|██████████| 6/6 [00:00<00:00, 11.01it/s]
../_images/examples_02_knmi_observations_47_1.png
../_images/examples_02_knmi_observations_47_2.png
../_images/examples_02_knmi_observations_47_3.png
../_images/examples_02_knmi_observations_47_4.png
../_images/examples_02_knmi_observations_47_5.png
../_images/examples_02_knmi_observations_47_6.png

The same method is implemented in Hydropandas for an ObsCollection.

[29]:
sd = "2022-01-01"
ed = "2022-12-31"
oc = hpd.read_knmi(
    stns=stns.index, starts=sd, ends=ed, meteo_vars=["EV24"], raise_exceptions=False
)
oc_et = oc.interpolate(xy=[(100000, 330000)])
eti = oc_et.iloc[0].obs
eti
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 210 and variable EV24 from 2022-01-01 to 2022-12-31
WARNING:hydropandas.io.knmi.parse_data:No columns to parse from file. Returning empty DataFrame.
WARNING:hydropandas.io.knmi.get_timeseries_stn:No data for meteo_var='EV24' at stn=210 betweenstart=Timestamp('2022-01-01 00:00:00') and end=Timestamp('2022-12-31 00:00:00').
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 215 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 235 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 240 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 249 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 251 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 257 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 260 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 265 and variable EV24 from 2022-01-01 to 2022-12-31
WARNING:hydropandas.io.knmi.parse_data:No columns to parse from file. Returning empty DataFrame.
WARNING:hydropandas.io.knmi.get_timeseries_stn:No data for meteo_var='EV24' at stn=265 betweenstart=Timestamp('2022-01-01 00:00:00') and end=Timestamp('2022-12-31 00:00:00').
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 267 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 269 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 270 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 273 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 275 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 277 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 278 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 279 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 280 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 283 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 286 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 290 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 310 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 319 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 323 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 330 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 344 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 348 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 350 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 356 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 370 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 375 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 377 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 380 and variable EV24 from 2022-01-01 to 2022-12-31
INFO:hydropandas.io.knmi.get_knmi_obs:get data from station 391 and variable EV24 from 2022-01-01 to 2022-12-31
[29]:

hydropandas.EvaporationObs

100000_330000
x 100000
y 330000
location
filename
source interpolation
unit
station NaN
meteo_var EV24

100000_330000
YYYYMMDD
2022-01-01 01:00:00 0.000254
2022-01-02 01:00:00 0.000570
2022-01-03 01:00:00 0.000330
2022-01-04 01:00:00 0.000142
2022-01-05 01:00:00 0.000157
... ...
2022-12-27 01:00:00 0.000151
2022-12-28 01:00:00 0.000260
2022-12-29 01:00:00 0.000101
2022-12-30 01:00:00 0.000141
2022-12-31 01:00:00 0.000077

365 rows × 1 columns

[30]:
etn = hpd.MeteoObs.from_knmi(
    xy=(100000, 330000), start=sd, end=ed, meteo_var="EV24", fill_missing_obs=True
)
etn
INFO:hydropandas.io.knmi.get_knmi_obs:get KNMI data from station nearest to coordinates (100000, 330000) and meteovariable EV24
[30]:

hydropandas.MeteoObs

EV24_WESTDORPE_319
x 48425.362049
y 360618.713456
location
filename
source KNMI
unit m
station 319
meteo_var EV24

EV24 station
2022-01-01 01:00:00 0.0002 319
2022-01-02 01:00:00 0.0005 319
2022-01-03 01:00:00 0.0004 319
2022-01-04 01:00:00 0.0002 319
2022-01-05 01:00:00 0.0002 319
... ... ...
2022-12-27 01:00:00 0.0003 319
2022-12-28 01:00:00 0.0003 319
2022-12-29 01:00:00 0.0001 319
2022-12-30 01:00:00 0.0002 319
2022-12-31 01:00:00 0.0001 319

365 rows × 2 columns

As can be seen, for one location the interpolation method is significantly slower. Lets see how the values compare for a time series.

[31]:
fig, ax = plt.subplots(2, 1, figsize=(8, 6), sharex=True)
eti.plot(ax=ax[0])
etn.plot(ax=ax[0], linestyle="--", color="C1")
ax[0].set_title("Comparison Interpolated and Nearest Makkink Evaporation")
ax[0].set_ylabel("[mm]")
ax[0].grid()
ax[0].legend(["Interpolated", "Nearest"])

(eti.squeeze() - etn["EV24"].squeeze()).plot(ax=ax[1], color="C2")
ax[1].set_ylabel("[mm]")
ax[1].grid()
ax[1].legend(["Interpolated - Nearest"])
[31]:
<matplotlib.legend.Legend at 0x7f755165b9b0>
../_images/examples_02_knmi_observations_52_1.png

The interpolated evaporation can also be collected for multiple points (using x and y in a list of DataFrame) in an ObsCollection

[32]:
oc_et = oc.interpolate(xy=[(100000, 320000), (100000, 330000), (100000, 340000)])
oc_et
[32]:
x y location filename source unit station meteo_var obs
name
100000_320000 100000 320000 interpolation NaN EV24 EvaporationObs 100000_320000 -----metadata----...
100000_330000 100000 330000 interpolation NaN EV24 EvaporationObs 100000_330000 -----metadata----...
100000_340000 100000 340000 interpolation NaN EV24 EvaporationObs 100000_340000 -----metadata----...

The interpolation method is slow at first, but if collected for many different locations the time penalty is not that significant anymore.