Reading groundwater observations
This notebook introduces how to use the hydropandas
package to read, process and visualise groundwater data from Dino and Bro databases.
Notebook contents
ObsCollection
Read ObsCollections
Write ObsCollections
[1]:
import hydropandas as hpd
import logging
from IPython.display import HTML
import logging
[2]:
hpd.util.get_color_logger("INFO")
GroundwaterObs
The hydropandas package has several functions to read groundwater observations at a measurement well. These include reading data from: - dino (from csv-files). - bro (using the bro-api) - fews (xml dumps from the fews database) - wiski (dumps from the wiski database)
[3]:
# reading a dino csv file
path = "data/Grondwaterstanden_Put/B33F0080001_1.csv"
gw_dino = hpd.GroundwaterObs.from_dino(path=path)
gw_dino
INFO:hydropandas.io.dino:reading -> B33F0080001_1
[4]:
# reading the same filter from using the bro api. Specify a groundwater monitoring id (GMW00...) and a filter number (1)
gw_bro = hpd.GroundwaterObs.from_bro("GMW000000041261", 1)
INFO:hydropandas.io.bro:reading bro_id GMW000000041261
INFO:hydropandas.io.bro:GLD000000009378 contains 36 duplicates (of 66447). Keeping only first values.
Now we have an GroundwaterObs
object named gw_bro
and gw_dino
. Both objects are from the same measurement well in different databases. A GroundwaterObs
object inherits from a pandas DataFrame
and has the same attributes and methods.
[5]:
gw_bro.describe()
[5]:
values | |
---|---|
count | 66411.000000 |
mean | 5.562630 |
std | 0.222262 |
min | 4.913000 |
25% | 5.380000 |
50% | 5.569000 |
75% | 5.721000 |
max | 6.397000 |
[6]:
gw_bro
[6]:
values | qualifier | |
---|---|---|
1972-11-28 00:00:00 | 5.763 | goedgekeurd |
1972-12-07 00:00:00 | 5.773 | goedgekeurd |
1972-12-14 00:00:00 | 5.703 | goedgekeurd |
1972-12-21 00:00:00 | 5.643 | goedgekeurd |
1972-12-28 00:00:00 | 5.573 | goedgekeurd |
... | ... | ... |
2021-10-08 07:00:00 | 5.486 | goedgekeurd |
2021-10-08 08:00:00 | 5.485 | goedgekeurd |
2021-10-08 09:00:00 | 5.486 | goedgekeurd |
2021-10-08 09:47:00 | 5.491 | goedgekeurd |
2021-10-08 10:00:00 | 5.485 | goedgekeurd |
66411 rows × 2 columns
[7]:
ax = gw_dino["stand_m_tov_nap"].plot(
label="dinoloket", figsize=(14, 5), legend=True, marker=".", lw=0.2
)
gw_bro["values"].plot(ax=ax, label="bro", legend=True, ylabel=gw_bro.unit)
gw_dino["ground_level"].plot(
ax=ax,
label="ground level",
legend=True,
grid=True,
color="green",
ylabel=gw_dino.unit,
)
ax.set_title(f"same tube from Dinoloket {gw_dino.name} and BRO id {gw_bro.name}")
GroundwaterObs Attributes
Besides the standard DataFrame
attributes a GroundwaterObs
has the following additional attributes: - x, y: x- and y-coordinates of the observation point - name: str with the name - filename: str with the filename (only available when the data was loaded from a file) - monitoring_well: the name of the monitoring_well. One monitoring well can have multiple tubes. - tube_nr: the number of the tube. The combination of monitoring_well and tube_nr should be unique - screen_top: the top of the
tube screen (bovenkant filter in Dutch) - screen_bottom: the bottom of the tube screen (onderkant filter in Dutch) - ground_level: surface level (maaiveld in Dutch) - tube_top: the top of the tube - metadata_available: boolean indicating whether metadata is available for this observation point - meta: dictionary with additional metadata
When dowloading from Dinoloket all levels are in meters NAP.
[8]:
print(gw_bro)
GroundwaterObs GMW000000041261_1
-----metadata------
name : GMW000000041261_1
x : 213268.0
y : 473910.0
filename :
source : BRO
unit : m NAP
monitoring_well : GMW000000041261
tube_nr : 1
screen_top : 4.05
screen_bottom : 3.05
ground_level : 6.9
tube_top : 7.173
metadata_available : True
-----time series------
values qualifier
1972-11-28 00:00:00 5.763 goedgekeurd
1972-12-07 00:00:00 5.773 goedgekeurd
1972-12-14 00:00:00 5.703 goedgekeurd
1972-12-21 00:00:00 5.643 goedgekeurd
1972-12-28 00:00:00 5.573 goedgekeurd
... ... ...
2021-10-08 07:00:00 5.486 goedgekeurd
2021-10-08 08:00:00 5.485 goedgekeurd
2021-10-08 09:00:00 5.486 goedgekeurd
2021-10-08 09:47:00 5.491 goedgekeurd
2021-10-08 10:00:00 5.485 goedgekeurd
[66411 rows x 2 columns]
GroundwaterObs methods
Besides the standard DataFrame
methods a GroundwaterObs
has additional methods. This methods are accessible through submodules: - geo.get_lat_lon()
, to obtain latitude and longitude - gwobs.get_modellayer()
, to obtain the modellayer of a modflow model using the filter depth - stats.get_seasonal_stat()
, to obtain seasonal statistics - stats.obs_per_year()
, to obtain the number of observations per year - stats.consecutive_obs_years()
, to obtain the number of consecutive
years with more than a minimum number of observations - plots.interactive_plot()
, to obtain a bokeh plot
Get latitude and longitude with gw.geo.get_lat_lon()
:
[9]:
print(f"latitude and longitude -> {gw_bro.geo.get_lat_lon()}")
latitude and longitude -> (52.25014706738195, 6.24047994529121)
[10]:
gw_bro.stats.get_seasonal_stat(stat="mean")
[10]:
winter_mean | summer_mean | |
---|---|---|
GMW000000041261_1 | 5.723791 | 5.40682 |
[11]:
p = gw_bro.plots.interactive_plot("figure")
HTML(filename="figure/{}.html".format(gw_bro.name))
[11]:
ObsCollections
ObsCollections
are a combination of multiple observation objects. The easiest way to construct an ObsCollections
is from a list of observation objects.
[12]:
path1 = "data/Grondwaterstanden_Put/B33F0080001_1.csv"
path2 = "data/Grondwaterstanden_Put/B33F0133001_1.csv"
gw1 = hpd.GroundwaterObs.from_dino(path=path1)
gw2 = hpd.GroundwaterObs.from_dino(path=path2)
# create ObsCollection
oc = hpd.ObsCollection([gw1, gw2], name="Dino groundwater")
oc
INFO:hydropandas.io.dino:reading -> B33F0080001_1
INFO:hydropandas.io.dino:reading -> B33F0133001_1
[12]:
x | y | filename | source | unit | monitoring_well | tube_nr | screen_top | screen_bottom | ground_level | tube_top | metadata_available | obs | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
name | |||||||||||||
B33F0080-001 | 213260.0 | 473900.0 | B33F0080001_1 | dino | m NAP | B33F0080 | 1.0 | 3.85 | 2.85 | 6.92 | 7.18 | True | GroundwaterObs B33F0080-001 -----metadata-----... |
B33F0133-001 | 210400.0 | 473366.0 | B33F0133001_1 | dino | m NAP | B33F0133 | 1.0 | -67.50 | -70.00 | 6.50 | 7.14 | True | GroundwaterObs B33F0133-001 -----metadata-----... |
Now we have an ObsCollection
object named oc
. The ObsCollection
contains all the data from the two GroundwaterObs
objects. It also stores a reference to the GroundwaterObs
objects in the ‘obs’ column. An ObsCollection
object also inherits from a pandas DataFrame
and has the same attributes and methods.
[13]:
# get columns
oc.columns
[13]:
Index(['x', 'y', 'filename', 'source', 'unit', 'monitoring_well', 'tube_nr',
'screen_top', 'screen_bottom', 'ground_level', 'tube_top',
'metadata_available', 'obs'],
dtype='object')
[14]:
# get individual GroundwaterObs object from an ObsCollection
o = oc.loc["B33F0133-001", "obs"]
o
[14]:
stand_m_tov_nap | locatie | filternummer | stand_cm_tov_mp | stand_cm_tov_mv | stand_cm_tov_nap | bijzonderheid | opmerking | tube_top | |
---|---|---|---|---|---|---|---|---|---|
1989-12-14 | 1.20 | B33F0133 | 1 | 582.0 | 530.0 | 120.0 | NaN | NaN | 7.02 |
1990-01-15 | 1.57 | B33F0133 | 1 | 545.0 | 493.0 | 157.0 | NaN | NaN | 7.02 |
1990-01-29 | 1.70 | B33F0133 | 1 | 532.0 | 480.0 | 170.0 | NaN | NaN | 7.02 |
1990-02-14 | 1.53 | B33F0133 | 1 | 549.0 | 497.0 | 153.0 | NaN | NaN | 7.02 |
1990-03-01 | 1.56 | B33F0133 | 1 | 546.0 | 494.0 | 156.0 | NaN | NaN | 7.02 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2011-01-14 | 3.57 | B33F0133 | 1 | 357.0 | 293.0 | 357.0 | NaN | NaN | 7.14 |
2011-01-15 | 3.60 | B33F0133 | 1 | 354.0 | 290.0 | 360.0 | NaN | NaN | 7.14 |
2011-01-16 | 3.61 | B33F0133 | 1 | 353.0 | 289.0 | 361.0 | NaN | NaN | 7.14 |
2011-01-17 | 3.61 | B33F0133 | 1 | 353.0 | 289.0 | 361.0 | NaN | NaN | 7.14 |
2011-01-18 | 3.63 | B33F0133 | 1 | 351.0 | 287.0 | 363.0 | NaN | NaN | 7.14 |
2212 rows × 9 columns
[15]:
# get statistics
oc.describe()
[15]:
x | y | tube_nr | screen_top | screen_bottom | ground_level | tube_top | |
---|---|---|---|---|---|---|---|
count | 2.000000 | 2.000000 | 2.0 | 2.000000 | 2.000000 | 2.000000 | 2.000000 |
mean | 211830.000000 | 473633.000000 | 1.0 | -31.825000 | -33.575000 | 6.710000 | 7.160000 |
std | 2022.325394 | 377.595021 | 0.0 | 50.452069 | 51.512729 | 0.296985 | 0.028284 |
min | 210400.000000 | 473366.000000 | 1.0 | -67.500000 | -70.000000 | 6.500000 | 7.140000 |
25% | 211115.000000 | 473499.500000 | 1.0 | -49.662500 | -51.787500 | 6.605000 | 7.150000 |
50% | 211830.000000 | 473633.000000 | 1.0 | -31.825000 | -33.575000 | 6.710000 | 7.160000 |
75% | 212545.000000 | 473766.500000 | 1.0 | -13.987500 | -15.362500 | 6.815000 | 7.170000 |
max | 213260.000000 | 473900.000000 | 1.0 | 3.850000 | 2.850000 | 6.920000 | 7.180000 |
ObsCollection methods
Besides the methods of a pandas DataFrame
an ObsCollection
has additional methods stored in submodules.
geo
: - get_bounding_box
-> get a tuple with (xmin, ymin, xmax, ymax) - get_extent
-> get a tule with (xmin, xmax, ymin, ymax) - get_lat_lon
-> to get the lattitudes and longitudes from the x and y coordinates - within_polygon
-> to select only the observations that lie within a polygon
gwobs
: - set_tube_nr
-> to set the tube numbers based on the tube screen depth when there are multiple tubes at one monitoring well - set_tube_nr_monitoring_well
-> find out which observations are at the same location with a different screen depth. Set monitoring_well and tube_nr attributes accordingly.
plots
: - interactive_figures
-> create bokeh figures for each observation point. - interactive_map
-> create a folium map with observation points and bokeh figures for each observation point. - section_plot
-> create a plot of multiple observations and a plot of the well layout.
stats
: - get_first_last_obs_date()
-> get the first and the last date of the observations for each observation point - get_no_of_observations()
-> get the number of observations - get_seasonal_stat()
-> get seasonal stats of the observations
E.g. get the bounding box with gw.geo.get_bounding_box()
:
[16]:
print(f"bounding box -> {oc.geo.get_bounding_box()}")
bounding box -> (210400.0, 473366.0, 213260.0, 473900.0)
[17]:
oc.geo.set_lat_lon()
oc.plots.interactive_map(plot_dir="figure")
[17]:
We can get an overview of the well layout and observations via plots.section_plot
:
[18]:
oc.plots.section_plot()
INFO:hydropandas.extensions.plots:created sectionplot -> B33F0133-001
ObsCollection Attributes
An ObsCollection
also has additional attributes: - name, a str with the name of the collection - meta, a dictionary with additional metadata
[19]:
print(f"name is -> {oc.name}")
print(f"meta is -> {oc.meta}")
name is -> Dino groundwater
meta is -> {}
Read ObsCollections
Instead of creating the ObsCollection from a list of observation objects. It is also possible to read the data from a source into an ObsCollection at once. The following sources can be read as an ObsCollection:
bro (using the api)
dino (from files)
fews (dumps from the fews database)
wiski (dumps from the wiski database)
menyanthes (a .men file)
modflow (from the heads of a modflow model)
imod (from the heads of an imod model)
This notebook won’t go into detail on all the sources that can be read. Only the two options for reading data from Dino and BRO are shown below.
[20]:
# read using a .zip file with data
dinozip = "data/dino.zip"
dino_gw = hpd.read_dino(dirname=dinozip, keep_all_obs=False)
dino_gw
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B02H0092001_1.csv
WARNING:hydropandas.io.dino:no NAP measurements available -> Grondwaterstanden_Put/B02H0092001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B02H1007001_1.csv
WARNING:hydropandas.io.dino:no NAP measurements available -> Grondwaterstanden_Put/B02H1007001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B04D0032002_1.csv
WARNING:hydropandas.io.dino:could not read metadata -> Grondwaterstanden_Put/B04D0032002_1.csv
WARNING:hydropandas.io.dino:no NAP measurements available -> Grondwaterstanden_Put/B04D0032002_1.csv
INFO:root:not added to collection -> Grondwaterstanden_Put/B04D0032002_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B27D0260001_1.csv
WARNING:hydropandas.io.dino:could not read metadata -> Grondwaterstanden_Put/B27D0260001_1.csv
WARNING:hydropandas.io.dino:no NAP measurements available -> Grondwaterstanden_Put/B27D0260001_1.csv
INFO:root:not added to collection -> Grondwaterstanden_Put/B27D0260001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B33F0080001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B33F0080002_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B33F0133001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B33F0133002_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B37A0112001_1.csv
WARNING:hydropandas.io.dino:could not read metadata -> Grondwaterstanden_Put/B37A0112001_1.csv
WARNING:hydropandas.io.dino:could not read measurements -> Grondwaterstanden_Put/B37A0112001_1.csv
INFO:root:not added to collection -> Grondwaterstanden_Put/B37A0112001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B42B0003001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B42B0003002_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B42B0003003_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B42B0003004_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0092004_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0092005_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0102001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0167001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B58A0212001_1.csv
INFO:hydropandas.io.dino:reading -> Grondwaterstanden_Put/B22D0155001_1.csv
[20]:
x | y | filename | source | unit | monitoring_well | tube_nr | screen_top | screen_bottom | ground_level | tube_top | metadata_available | obs | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
name | |||||||||||||
B02H0092-001 | 219890.0 | 600030.0 | Grondwaterstanden_Put/B02H0092001_1.csv | dino | m NAP | B02H0092 | 1.0 | NaN | NaN | NaN | NaN | True | GroundwaterObs B02H0092-001 -----metadata-----... |
B02H1007-001 | 219661.0 | 600632.0 | Grondwaterstanden_Put/B02H1007001_1.csv | dino | m NAP | B02H1007 | 1.0 | NaN | NaN | 1.92 | NaN | True | GroundwaterObs B02H1007-001 -----metadata-----... |
B33F0080-001 | 213260.0 | 473900.0 | Grondwaterstanden_Put/B33F0080001_1.csv | dino | m NAP | B33F0080 | 1.0 | 3.85 | 2.85 | 6.92 | 7.18 | True | GroundwaterObs B33F0080-001 -----metadata-----... |
B33F0080-002 | 213260.0 | 473900.0 | Grondwaterstanden_Put/B33F0080002_1.csv | dino | m NAP | B33F0080 | 2.0 | -10.15 | -12.15 | 6.92 | 7.17 | True | GroundwaterObs B33F0080-002 -----metadata-----... |
B33F0133-001 | 210400.0 | 473366.0 | Grondwaterstanden_Put/B33F0133001_1.csv | dino | m NAP | B33F0133 | 1.0 | -67.50 | -70.00 | 6.50 | 7.14 | True | GroundwaterObs B33F0133-001 -----metadata-----... |
B33F0133-002 | 210400.0 | 473366.0 | Grondwaterstanden_Put/B33F0133002_1.csv | dino | m NAP | B33F0133 | 2.0 | -104.20 | -106.20 | 6.50 | 7.12 | True | GroundwaterObs B33F0133-002 -----metadata-----... |
B42B0003-001 | 38165.0 | 413785.0 | Grondwaterstanden_Put/B42B0003001_1.csv | dino | m NAP | B42B0003 | 1.0 | -2.00 | -3.00 | 6.50 | 6.99 | True | GroundwaterObs B42B0003-001 -----metadata-----... |
B42B0003-002 | 38165.0 | 413785.0 | Grondwaterstanden_Put/B42B0003002_1.csv | dino | m NAP | B42B0003 | 2.0 | -34.00 | -35.00 | 6.50 | 6.99 | True | GroundwaterObs B42B0003-002 -----metadata-----... |
B42B0003-003 | 38165.0 | 413785.0 | Grondwaterstanden_Put/B42B0003003_1.csv | dino | m NAP | B42B0003 | 3.0 | -60.00 | -61.00 | 6.50 | 6.95 | True | GroundwaterObs B42B0003-003 -----metadata-----... |
B42B0003-004 | 38165.0 | 413785.0 | Grondwaterstanden_Put/B42B0003004_1.csv | dino | m NAP | B42B0003 | 4.0 | -107.00 | -108.00 | 6.50 | 6.97 | True | GroundwaterObs B42B0003-004 -----metadata-----... |
B58A0092-004 | 186924.0 | 372026.0 | Grondwaterstanden_Put/B58A0092004_1.csv | dino | m NAP | B58A0092 | 4.0 | -115.23 | -117.23 | 29.85 | 29.61 | True | GroundwaterObs B58A0092-004 -----metadata-----... |
B58A0092-005 | 186924.0 | 372026.0 | Grondwaterstanden_Put/B58A0092005_1.csv | dino | m NAP | B58A0092 | 5.0 | -134.23 | -137.23 | 29.84 | 29.62 | True | GroundwaterObs B58A0092-005 -----metadata-----... |
B58A0102-001 | 187900.0 | 373025.0 | Grondwaterstanden_Put/B58A0102001_1.csv | dino | m NAP | B58A0102 | 1.0 | -3.35 | -8.35 | 29.65 | 29.73 | True | GroundwaterObs B58A0102-001 -----metadata-----... |
B58A0167-001 | 185745.0 | 371095.0 | Grondwaterstanden_Put/B58A0167001_1.csv | dino | m NAP | B58A0167 | 1.0 | 23.33 | 22.33 | 30.50 | 30.21 | True | GroundwaterObs B58A0167-001 -----metadata-----... |
B58A0212-001 | 183600.0 | 373020.0 | Grondwaterstanden_Put/B58A0212001_1.csv | dino | m NAP | B58A0212 | 1.0 | 26.03 | 25.53 | 28.49 | 28.53 | True | GroundwaterObs B58A0212-001 -----metadata-----... |
B22D0155-001 | 233830.0 | 502530.0 | Grondwaterstanden_Put/B22D0155001_1.csv | dino | m NAP | B22D0155 | 1.0 | 7.80 | 6.80 | 8.91 | 9.94 | True | GroundwaterObs B22D0155-001 -----metadata-----... |
[21]:
# read from bro using an extent (Schoonhoven zuid-west)
oc = hpd.read_bro(extent=(117850, 118180, 439550, 439900), keep_all_obs=False)
oc
0%| | 0/4 [00:00<?, ?it/s]
INFO:hydropandas.io.bro:reading bro_id GMW000000036319
INFO:hydropandas.io.bro:GLD000000012818 contains 720 duplicates (of 22303). Keeping only first values.
25%|██▌ | 1/4 [00:04<00:14, 4.84s/it]
INFO:hydropandas.io.bro:reading bro_id GMW000000036327
INFO:hydropandas.io.bro:GLD000000012821 contains 720 duplicates (of 17856). Keeping only first values.
50%|█████ | 2/4 [00:08<00:08, 4.02s/it]
INFO:hydropandas.io.bro:reading bro_id GMW000000036365
INFO:hydropandas.io.bro:GLD000000012908 contains 329 duplicates (of 12793). Keeping only first values.
75%|███████▌ | 3/4 [00:11<00:03, 3.58s/it]
INFO:hydropandas.io.bro:reading bro_id GMW000000049567
INFO:hydropandas.io.bro:no groundwater level dossier for GMW000000049567 and tube number 1
WARNING:hydropandas.io.bro:no measurements found for gmw_id GMW000000049567 and tube number1
INFO:hydropandas.io.bro:reading bro_id GMW000000049567
INFO:hydropandas.io.bro:no groundwater level dossier for GMW000000049567 and tube number 2
WARNING:hydropandas.io.bro:no measurements found for gmw_id GMW000000049567 and tube number2
100%|██████████| 4/4 [00:12<00:00, 3.03s/it]
[21]:
x | y | filename | source | unit | monitoring_well | tube_nr | screen_top | screen_bottom | ground_level | tube_top | metadata_available | obs | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
name | |||||||||||||
GMW000000036319_1 | 117957.010 | 439698.236 | BRO | m NAP | GMW000000036319 | 1 | -1.721 | -2.721 | -0.501 | -0.621 | True | GroundwaterObs GMW000000036319_1 -----metadata... | |
GMW000000036327_1 | 118064.196 | 439799.968 | BRO | m NAP | GMW000000036327 | 1 | -0.833 | -1.833 | 0.856 | 0.716 | True | GroundwaterObs GMW000000036327_1 -----metadata... | |
GMW000000036365_1 | 118127.470 | 439683.136 | BRO | m NAP | GMW000000036365 | 1 | -0.429 | -1.428 | 1.371 | 1.221 | True | GroundwaterObs GMW000000036365_1 -----metadata... |
[22]:
# plot wells, use x-coordinate in section plot
oc.plots.section_plot(section_colname_x="x", section_label_x="x coordinate [m]")
INFO:hydropandas.extensions.plots:created sectionplot -> GMW000000036365_1
Write ObsCollections
Sometimes reading ObsCollections can be time consuming, especially when you need to download a lot of data. It can be worth to save the ObsCollection to a file and read it later instead of going through the full read process again. There are two basic ways to do this: 1. Write the ObsCollection to a pickle 2. Write the ObsCollection to an excel file
pickle |
excel |
|
---|---|---|
extension |
.pklz |
.xlsx |
human readable |
No |
Yes |
data lost |
None |
Some metadata, see |
Pickling is used to store Python objects into a binary file that is not human readable. Writing and reading a pickle is fast and returns an exact copy of the ObsCollection. Exchanging pickles between machines can be troublesome because of machine settings and differences between package versions.
[23]:
oc.to_pickle("test.pklz")
[24]:
oc_pickled = hpd.read_pickle("test.pklz")
An ObsCollection can be written to Excel file. An Excel file with multiple sheets is created. One sheet with the metdata and another sheet for each observation in the ObsCollection. Writing to an excel file is considerably slower than writing a pickle but it does give you a human readable file format that can be easily exchanged between machines.
[25]:
oc.to_excel("test.xlsx")
WARNING:py.warnings:C:\Users\oebbe\02_python\hydropandas\hydropandas\obs_collection.py:1933: UserWarning: Pandas requires version '1.4.3' or newer of 'xlsxwriter' (version '1.3.8' currently installed).
with pd.ExcelWriter(path) as writer:
[26]:
oc_excelled = hpd.read_excel("test.xlsx")