Reading groundwater observations

This notebook introduces how to use the hydropandas package to read, process and visualise groundwater data from Dino and Bro databases.

Notebook contents

  1. GroundwaterObs

  2. ObsCollection

  3. Read ObsCollections

  4. Write ObsCollections

[1]:
import hydropandas as hpd

import logging
from IPython.display import HTML

import logging
[2]:
hpd.util.get_color_logger("INFO")

GroundwaterObs

The hydropandas package has several functions to read groundwater observations at a measurement well. These include reading data from: - dino (from csv-files). - bro (using the bro-api) - fews (xml dumps from the fews database) - wiski (dumps from the wiski database)

[3]:
# reading a dino csv file
path = "data/Grondwaterstanden_Put/B33F0080001_1.csv"
gw_dino = hpd.GroundwaterObs.from_dino(path=path)
gw_dino
INFO:hydropandas.io.dino:reading -> B33F0080001_1
[4]:
# reading the same filter from using the bro api. Specify a groundwater monitoring id (GMW00...) and a filter number (1)
gw_bro = hpd.GroundwaterObs.from_bro("GMW000000041261", 1)
INFO:hydropandas.io.bro:reading bro_id GMW000000041261
INFO:hydropandas.io.bro:GLD000000009378 contains 36 duplicates (of 66447). Keeping only first values.

Now we have an GroundwaterObs object named gw_bro and gw_dino. Both objects are from the same measurement well in different databases. A GroundwaterObs object inherits from a pandas DataFrame and has the same attributes and methods.

[5]:
gw_bro.describe()
[5]:
values
count 66411.000000
mean 5.562630
std 0.222262
min 4.913000
25% 5.380000
50% 5.569000
75% 5.721000
max 6.397000
[6]:
gw_bro
[6]:
values qualifier
1972-11-28 00:00:00 5.763 goedgekeurd
1972-12-07 00:00:00 5.773 goedgekeurd
1972-12-14 00:00:00 5.703 goedgekeurd
1972-12-21 00:00:00 5.643 goedgekeurd
1972-12-28 00:00:00 5.573 goedgekeurd
... ... ...
2021-10-08 07:00:00 5.486 goedgekeurd
2021-10-08 08:00:00 5.485 goedgekeurd
2021-10-08 09:00:00 5.486 goedgekeurd
2021-10-08 09:47:00 5.491 goedgekeurd
2021-10-08 10:00:00 5.485 goedgekeurd

66411 rows × 2 columns

[7]:
ax = gw_dino["stand_m_tov_nap"].plot(
    label="dinoloket", figsize=(14, 5), legend=True, marker=".", lw=0.2
)
gw_bro["values"].plot(ax=ax, label="bro", legend=True, ylabel=gw_bro.unit)
gw_dino["ground_level"].plot(
    ax=ax,
    label="ground level",
    legend=True,
    grid=True,
    color="green",
    ylabel=gw_dino.unit,
)

ax.set_title(f"same tube from Dinoloket {gw_dino.name} and BRO id {gw_bro.name}")
../_images/examples_01_groundwater_observations_10_0.png

GroundwaterObs Attributes

Besides the standard DataFrame attributes a GroundwaterObs has the following additional attributes: - x, y: x- and y-coordinates of the observation point - name: str with the name - filename: str with the filename (only available when the data was loaded from a file) - monitoring_well: the name of the monitoring_well. One monitoring well can have multiple tubes. - tube_nr: the number of the tube. The combination of monitoring_well and tube_nr should be unique - screen_top: the top of the tube screen (bovenkant filter in Dutch) - screen_bottom: the bottom of the tube screen (onderkant filter in Dutch) - ground_level: surface level (maaiveld in Dutch) - tube_top: the top of the tube - metadata_available: boolean indicating whether metadata is available for this observation point - meta: dictionary with additional metadata

When dowloading from Dinoloket all levels are in meters NAP.

[8]:
print(gw_bro)
GroundwaterObs GMW000000041261_1
-----metadata------
name : GMW000000041261_1
x : 213268.0
y : 473910.0
filename :
source : BRO
unit : m NAP
monitoring_well : GMW000000041261
tube_nr : 1
screen_top : 4.05
screen_bottom : 3.05
ground_level : 6.9
tube_top : 7.173
metadata_available : True

-----time series------
                     values    qualifier
1972-11-28 00:00:00   5.763  goedgekeurd
1972-12-07 00:00:00   5.773  goedgekeurd
1972-12-14 00:00:00   5.703  goedgekeurd
1972-12-21 00:00:00   5.643  goedgekeurd
1972-12-28 00:00:00   5.573  goedgekeurd
...                     ...          ...
2021-10-08 07:00:00   5.486  goedgekeurd
2021-10-08 08:00:00   5.485  goedgekeurd
2021-10-08 09:00:00   5.486  goedgekeurd
2021-10-08 09:47:00   5.491  goedgekeurd
2021-10-08 10:00:00   5.485  goedgekeurd

[66411 rows x 2 columns]

GroundwaterObs methods

Besides the standard DataFrame methods a GroundwaterObs has additional methods. This methods are accessible through submodules: - geo.get_lat_lon(), to obtain latitude and longitude - gwobs.get_modellayer(), to obtain the modellayer of a modflow model using the filter depth - stats.get_seasonal_stat(), to obtain seasonal statistics - stats.obs_per_year(), to obtain the number of observations per year - stats.consecutive_obs_years(), to obtain the number of consecutive years with more than a minimum number of observations - plots.interactive_plot(), to obtain a bokeh plot

Get latitude and longitude with gw.geo.get_lat_lon():

[9]:
print(f"latitude and longitude -> {gw_bro.geo.get_lat_lon()}")
latitude and longitude -> (52.25014706738195, 6.24047994529121)
[10]:
gw_bro.stats.get_seasonal_stat(stat="mean")
[10]:
winter_mean summer_mean
GMW000000041261_1 5.723791 5.40682
[11]:
p = gw_bro.plots.interactive_plot("figure")
HTML(filename="figure/{}.html".format(gw_bro.name))
[11]:
GMW000000041261_1

ObsCollections

ObsCollections are a combination of multiple observation objects. The easiest way to construct an ObsCollections is from a list of observation objects.

[12]:
path1 = "data/Grondwaterstanden_Put/B33F0080001_1.csv"
path2 = "data/Grondwaterstanden_Put/B33F0133001_1.csv"
gw1 = hpd.GroundwaterObs.from_dino(path=path1)
gw2 = hpd.GroundwaterObs.from_dino(path=path2)

# create ObsCollection
oc = hpd.ObsCollection([gw1, gw2], name="Dino groundwater")
oc
INFO:hydropandas.io.dino:reading -> B33F0080001_1
INFO:hydropandas.io.dino:reading -> B33F0133001_1
[12]:
x y filename source unit monitoring_well tube_nr screen_top screen_bottom ground_level tube_top metadata_available obs
name
B33F0080-001 213260.0 473900.0 B33F0080001_1 dino m NAP B33F0080 1.0 3.85 2.85 6.92 7.18 True GroundwaterObs B33F0080-001 -----metadata-----...
B33F0133-001 210400.0 473366.0 B33F0133001_1 dino m NAP B33F0133 1.0 -67.50 -70.00 6.50 7.14 True GroundwaterObs B33F0133-001 -----metadata-----...

Now we have an ObsCollection object named oc. The ObsCollection contains all the data from the two GroundwaterObs objects. It also stores a reference to the GroundwaterObs objects in the ‘obs’ column. An ObsCollection object also inherits from a pandas DataFrame and has the same attributes and methods.

[13]:
# get columns
oc.columns
[13]:
Index(['x', 'y', 'filename', 'source', 'unit', 'monitoring_well', 'tube_nr',
       'screen_top', 'screen_bottom', 'ground_level', 'tube_top',
       'metadata_available', 'obs'],
      dtype='object')
[14]:
# get individual GroundwaterObs object from an ObsCollection
o = oc.loc["B33F0133-001", "obs"]
o
[14]:
stand_m_tov_nap locatie filternummer stand_cm_tov_mp stand_cm_tov_mv stand_cm_tov_nap bijzonderheid opmerking tube_top
1989-12-14 1.20 B33F0133 1 582.0 530.0 120.0 NaN NaN 7.02
1990-01-15 1.57 B33F0133 1 545.0 493.0 157.0 NaN NaN 7.02
1990-01-29 1.70 B33F0133 1 532.0 480.0 170.0 NaN NaN 7.02
1990-02-14 1.53 B33F0133 1 549.0 497.0 153.0 NaN NaN 7.02
1990-03-01 1.56 B33F0133 1 546.0 494.0 156.0 NaN NaN 7.02
... ... ... ... ... ... ... ... ... ...
2011-01-14 3.57 B33F0133 1 357.0 293.0 357.0 NaN NaN 7.14
2011-01-15 3.60 B33F0133 1 354.0 290.0 360.0 NaN NaN 7.14
2011-01-16 3.61 B33F0133 1 353.0 289.0 361.0 NaN NaN 7.14
2011-01-17 3.61 B33F0133 1 353.0 289.0 361.0 NaN NaN 7.14
2011-01-18 3.63 B33F0133 1 351.0 287.0 363.0 NaN NaN 7.14

2212 rows × 9 columns

[15]:
# get statistics
oc.describe()
[15]:
x y tube_nr screen_top screen_bottom ground_level tube_top
count 2.000000 2.000000 2.0 2.000000 2.000000 2.000000 2.000000
mean 211830.000000 473633.000000 1.0 -31.825000 -33.575000 6.710000 7.160000
std 2022.325394 377.595021 0.0 50.452069 51.512729 0.296985 0.028284
min 210400.000000 473366.000000 1.0 -67.500000 -70.000000 6.500000 7.140000
25% 211115.000000 473499.500000 1.0 -49.662500 -51.787500 6.605000 7.150000
50% 211830.000000 473633.000000 1.0 -31.825000 -33.575000 6.710000 7.160000
75% 212545.000000 473766.500000 1.0 -13.987500 -15.362500 6.815000 7.170000
max 213260.000000 473900.000000 1.0 3.850000 2.850000 6.920000 7.180000

ObsCollection methods

Besides the methods of a pandas DataFrame an ObsCollection has additional methods stored in submodules.

geo: - get_bounding_box -> get a tuple with (xmin, ymin, xmax, ymax) - get_extent -> get a tule with (xmin, xmax, ymin, ymax) - get_lat_lon -> to get the lattitudes and longitudes from the x and y coordinates - within_polygon -> to select only the observations that lie within a polygon

gwobs: - set_tube_nr -> to set the tube numbers based on the tube screen depth when there are multiple tubes at one monitoring well - set_tube_nr_monitoring_well -> find out which observations are at the same location with a different screen depth. Set monitoring_well and tube_nr attributes accordingly.

plots: - interactive_figures -> create bokeh figures for each observation point. - interactive_map -> create a folium map with observation points and bokeh figures for each observation point. - section_plot -> create a plot of multiple observations and a plot of the well layout.

stats: - get_first_last_obs_date() -> get the first and the last date of the observations for each observation point - get_no_of_observations() -> get the number of observations - get_seasonal_stat() -> get seasonal stats of the observations

E.g. get the bounding box with gw.geo.get_bounding_box():

[16]:
print(f"bounding box -> {oc.geo.get_bounding_box()}")
bounding box -> (210400.0, 473366.0, 213260.0, 473900.0)
[17]:
oc.geo.set_lat_lon()
oc.plots.interactive_map(plot_dir="figure")
[17]:
Make this Notebook Trusted to load map: File -> Trust Notebook

We can get an overview of the well layout and observations via plots.section_plot:

[18]:
oc.plots.section_plot()
INFO:hydropandas.extensions.plots:created sectionplot -> B33F0133-001
../_images/examples_01_groundwater_observations_29_1.png

ObsCollection Attributes

An ObsCollection also has additional attributes: - name, a str with the name of the collection - meta, a dictionary with additional metadata

[19]:
print(f"name is -> {oc.name}")
print(f"meta is -> {oc.meta}")
name is -> Dino groundwater
meta is -> {}

Read ObsCollections

Instead of creating the ObsCollection from a list of observation objects. It is also possible to read the data from a source into an ObsCollection at once. The following sources can be read as an ObsCollection:

  • bro (using the api)

  • dino (from files)

  • fews (dumps from the fews database)

  • wiski (dumps from the wiski database)

  • menyanthes (a .men file)

  • modflow (from the heads of a modflow model)

  • imod (from the heads of an imod model)

This notebook won’t go into detail on all the sources that can be read. Only the two options for reading data from Dino and BRO are shown below.

[20]:
# read using a .zip file with data
dinozip = "data/dino.zip"
dino_gw = hpd.read_dino(dirname=dinozip, keep_all_obs=False)
dino_gw
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B02H0092001_1.csv
WARNING:hydropandas.io.dino:no NAP measurements available -> Grondwaterstanden_Put/B02H0092001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B02H1007001_1.csv
WARNING:hydropandas.io.dino:no NAP measurements available -> Grondwaterstanden_Put/B02H1007001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B04D0032002_1.csv
WARNING:hydropandas.io.dino:could not read metadata -> Grondwaterstanden_Put/B04D0032002_1.csv
WARNING:hydropandas.io.dino:no NAP measurements available -> Grondwaterstanden_Put/B04D0032002_1.csv
INFO:root:not added to collection -> Grondwaterstanden_Put/B04D0032002_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B27D0260001_1.csv
WARNING:hydropandas.io.dino:could not read metadata -> Grondwaterstanden_Put/B27D0260001_1.csv
WARNING:hydropandas.io.dino:no NAP measurements available -> Grondwaterstanden_Put/B27D0260001_1.csv
INFO:root:not added to collection -> Grondwaterstanden_Put/B27D0260001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B33F0080001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B33F0080002_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B33F0133001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B33F0133002_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B37A0112001_1.csv
WARNING:hydropandas.io.dino:could not read metadata -> Grondwaterstanden_Put/B37A0112001_1.csv
WARNING:hydropandas.io.dino:could not read measurements -> Grondwaterstanden_Put/B37A0112001_1.csv
INFO:root:not added to collection -> Grondwaterstanden_Put/B37A0112001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B42B0003001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B42B0003002_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B42B0003003_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B42B0003004_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0092004_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0092005_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0102001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0167001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0212001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B22D0155001_1.csv
[20]:
x y filename source unit monitoring_well tube_nr screen_top screen_bottom ground_level tube_top metadata_available obs
name
B02H0092-001 219890.0 600030.0 Grondwaterstanden_Put/B02H0092001_1.csv dino m NAP B02H0092 1.0 NaN NaN NaN NaN True GroundwaterObs B02H0092-001 -----metadata-----...
B02H1007-001 219661.0 600632.0 Grondwaterstanden_Put/B02H1007001_1.csv dino m NAP B02H1007 1.0 NaN NaN 1.92 NaN True GroundwaterObs B02H1007-001 -----metadata-----...
B33F0080-001 213260.0 473900.0 Grondwaterstanden_Put/B33F0080001_1.csv dino m NAP B33F0080 1.0 3.85 2.85 6.92 7.18 True GroundwaterObs B33F0080-001 -----metadata-----...
B33F0080-002 213260.0 473900.0 Grondwaterstanden_Put/B33F0080002_1.csv dino m NAP B33F0080 2.0 -10.15 -12.15 6.92 7.17 True GroundwaterObs B33F0080-002 -----metadata-----...
B33F0133-001 210400.0 473366.0 Grondwaterstanden_Put/B33F0133001_1.csv dino m NAP B33F0133 1.0 -67.50 -70.00 6.50 7.14 True GroundwaterObs B33F0133-001 -----metadata-----...
B33F0133-002 210400.0 473366.0 Grondwaterstanden_Put/B33F0133002_1.csv dino m NAP B33F0133 2.0 -104.20 -106.20 6.50 7.12 True GroundwaterObs B33F0133-002 -----metadata-----...
B42B0003-001 38165.0 413785.0 Grondwaterstanden_Put/B42B0003001_1.csv dino m NAP B42B0003 1.0 -2.00 -3.00 6.50 6.99 True GroundwaterObs B42B0003-001 -----metadata-----...
B42B0003-002 38165.0 413785.0 Grondwaterstanden_Put/B42B0003002_1.csv dino m NAP B42B0003 2.0 -34.00 -35.00 6.50 6.99 True GroundwaterObs B42B0003-002 -----metadata-----...
B42B0003-003 38165.0 413785.0 Grondwaterstanden_Put/B42B0003003_1.csv dino m NAP B42B0003 3.0 -60.00 -61.00 6.50 6.95 True GroundwaterObs B42B0003-003 -----metadata-----...
B42B0003-004 38165.0 413785.0 Grondwaterstanden_Put/B42B0003004_1.csv dino m NAP B42B0003 4.0 -107.00 -108.00 6.50 6.97 True GroundwaterObs B42B0003-004 -----metadata-----...
B58A0092-004 186924.0 372026.0 Grondwaterstanden_Put/B58A0092004_1.csv dino m NAP B58A0092 4.0 -115.23 -117.23 29.85 29.61 True GroundwaterObs B58A0092-004 -----metadata-----...
B58A0092-005 186924.0 372026.0 Grondwaterstanden_Put/B58A0092005_1.csv dino m NAP B58A0092 5.0 -134.23 -137.23 29.84 29.62 True GroundwaterObs B58A0092-005 -----metadata-----...
B58A0102-001 187900.0 373025.0 Grondwaterstanden_Put/B58A0102001_1.csv dino m NAP B58A0102 1.0 -3.35 -8.35 29.65 29.73 True GroundwaterObs B58A0102-001 -----metadata-----...
B58A0167-001 185745.0 371095.0 Grondwaterstanden_Put/B58A0167001_1.csv dino m NAP B58A0167 1.0 23.33 22.33 30.50 30.21 True GroundwaterObs B58A0167-001 -----metadata-----...
B58A0212-001 183600.0 373020.0 Grondwaterstanden_Put/B58A0212001_1.csv dino m NAP B58A0212 1.0 26.03 25.53 28.49 28.53 True GroundwaterObs B58A0212-001 -----metadata-----...
B22D0155-001 233830.0 502530.0 Grondwaterstanden_Put/B22D0155001_1.csv dino m NAP B22D0155 1.0 7.80 6.80 8.91 9.94 True GroundwaterObs B22D0155-001 -----metadata-----...
[21]:
# read from bro using an extent (Schoonhoven zuid-west)
oc = hpd.read_bro(extent=(117850, 118180, 439550, 439900), keep_all_obs=False)
oc
  0%|          | 0/4 [00:00<?, ?it/s]
INFO:hydropandas.io.bro:reading bro_id GMW000000036319
INFO:hydropandas.io.bro:GLD000000012818 contains 720 duplicates (of 22303). Keeping only first values.
 25%|██▌       | 1/4 [00:04<00:14,  4.84s/it]
INFO:hydropandas.io.bro:reading bro_id GMW000000036327
INFO:hydropandas.io.bro:GLD000000012821 contains 720 duplicates (of 17856). Keeping only first values.
 50%|█████     | 2/4 [00:08<00:08,  4.02s/it]
INFO:hydropandas.io.bro:reading bro_id GMW000000036365
INFO:hydropandas.io.bro:GLD000000012908 contains 329 duplicates (of 12793). Keeping only first values.
 75%|███████▌  | 3/4 [00:11<00:03,  3.58s/it]
INFO:hydropandas.io.bro:reading bro_id GMW000000049567
INFO:hydropandas.io.bro:no groundwater level dossier for GMW000000049567 and tube number 1
WARNING:hydropandas.io.bro:no measurements found for gmw_id GMW000000049567 and tube number1
INFO:hydropandas.io.bro:reading bro_id GMW000000049567
INFO:hydropandas.io.bro:no groundwater level dossier for GMW000000049567 and tube number 2
WARNING:hydropandas.io.bro:no measurements found for gmw_id GMW000000049567 and tube number2
100%|██████████| 4/4 [00:12<00:00,  3.03s/it]
[21]:
x y filename source unit monitoring_well tube_nr screen_top screen_bottom ground_level tube_top metadata_available obs
name
GMW000000036319_1 117957.010 439698.236 BRO m NAP GMW000000036319 1 -1.721 -2.721 -0.501 -0.621 True GroundwaterObs GMW000000036319_1 -----metadata...
GMW000000036327_1 118064.196 439799.968 BRO m NAP GMW000000036327 1 -0.833 -1.833 0.856 0.716 True GroundwaterObs GMW000000036327_1 -----metadata...
GMW000000036365_1 118127.470 439683.136 BRO m NAP GMW000000036365 1 -0.429 -1.428 1.371 1.221 True GroundwaterObs GMW000000036365_1 -----metadata...
[22]:
# plot wells, use x-coordinate in section plot
oc.plots.section_plot(section_colname_x="x", section_label_x="x coordinate [m]")
INFO:hydropandas.extensions.plots:created sectionplot -> GMW000000036365_1
../_images/examples_01_groundwater_observations_36_1.png

Write ObsCollections

Sometimes reading ObsCollections can be time consuming, especially when you need to download a lot of data. It can be worth to save the ObsCollection to a file and read it later instead of going through the full read process again. There are two basic ways to do this: 1. Write the ObsCollection to a pickle 2. Write the ObsCollection to an excel file

pickle

excel

extension

.pklz

.xlsx

human readable

No

Yes

data lost

None

Some metadata, see hpd.to_excel docstring

Pickling is used to store Python objects into a binary file that is not human readable. Writing and reading a pickle is fast and returns an exact copy of the ObsCollection. Exchanging pickles between machines can be troublesome because of machine settings and differences between package versions.

[23]:
oc.to_pickle("test.pklz")
[24]:
oc_pickled = hpd.read_pickle("test.pklz")

An ObsCollection can be written to Excel file. An Excel file with multiple sheets is created. One sheet with the metdata and another sheet for each observation in the ObsCollection. Writing to an excel file is considerably slower than writing a pickle but it does give you a human readable file format that can be easily exchanged between machines.

[25]:
oc.to_excel("test.xlsx")
WARNING:py.warnings:C:\Users\oebbe\02_python\hydropandas\hydropandas\obs_collection.py:1933: UserWarning: Pandas requires version '1.4.3' or newer of 'xlsxwriter' (version '1.3.8' currently installed).
  with pd.ExcelWriter(path) as writer:

[26]:
oc_excelled = hpd.read_excel("test.xlsx")