hydropandas.extensions package
hydropandas.extensions.accessor module
Copied from ./pandas/core/accessor.py.
- class hydropandas.extensions.accessor.CachedAccessor(name, accessor)[source]
Bases:
object
Custom property-like object (descriptor) for caching accessors.
- Parameters
name (str) – The namespace this will be accessed under, e.g. df.foo
accessor (cls) – The class with the extension methods. The class’ __init__ method should expect one of a Series, DataFrame or Index as the single argument data
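A minimal sketch of the caching descriptor pattern described above, assuming the behaviour mirrors the pandas class this module was copied from. DemoAccessor and DemoCollection are hypothetical names used only for illustration; they are not part of hydropandas.

```python
class CachedAccessor:
    """Descriptor that builds an accessor once per instance and caches it."""

    def __init__(self, name, accessor):
        self._name = name
        self._accessor = accessor

    def __get__(self, obj, cls):
        if obj is None:
            # Accessed on the class itself: hand back the accessor class.
            return self._accessor
        accessor_obj = self._accessor(obj)
        # Store the accessor on the instance. Because this descriptor defines
        # no __set__, the instance attribute wins on every later lookup.
        object.__setattr__(obj, self._name, accessor_obj)
        return accessor_obj


class DemoAccessor:  # hypothetical accessor class
    def __init__(self, data):
        self._obj = data

    def count(self):
        return len(self._obj.values)


class DemoCollection:  # hypothetical host class
    demo = CachedAccessor("demo", DemoAccessor)

    def __init__(self, values):
        self.values = values


oc = DemoCollection([1, 2, 3])
first = oc.demo
second = oc.demo  # served from the instance, no new accessor is built
```

The accessor receives the host object as its single argument, which is why the accessor classes in this package take oc_obj or obs in their constructors.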
hydropandas.extensions.geo module
- class hydropandas.extensions.geo.GeoAccessor(oc_obj)[source]
Bases:
object
- get_bounding_box(xcol='x', ycol='y', buffer=0)[source]
returns the bounding box of all observations.
- Parameters
xcol (str, optional) – column name with x values
ycol (str, optional) – column name with y values
buffer (int or float, optional) – add a buffer around the bounding box from the observations
- Returns
coordinates of bounding box
- Return type
xmin, ymin, xmax, ymax
- get_distance_to_point(point, xcol='x', ycol='y')[source]
get distance of every observation to a point.
- Parameters
point (shapely.geometry.point.Point) – point geometry
xcol (str, optional) – x column in self._obj used to get x coordinates
ycol (str, optional) – y column in self._obj used to get y coordinates
- Returns
distance to the point for every observation in self._obj
- Return type
pd.Series
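As a rough illustration of the calculation, the sketch below computes the Euclidean distance from each observation to a point. It is an assumption-laden stand-in: the real method takes a shapely Point and returns a pd.Series, whereas this sketch uses a plain (x, y) tuple and dictionaries, and distance_to_point is a hypothetical helper name.

```python
import math


def distance_to_point(rows, point_xy, xcol="x", ycol="y"):
    """Return {observation name: distance to point_xy} (hypothetical helper)."""
    px, py = point_xy
    return {
        name: math.hypot(row[xcol] - px, row[ycol] - py)
        for name, row in rows.items()
    }


observations = {
    "well_a": {"x": 0.0, "y": 0.0},
    "well_b": {"x": 3.0, "y": 4.0},
}
distances = distance_to_point(observations, (0.0, 0.0))
```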
- get_extent(xcol='x', ycol='y', buffer=0)[source]
returns the extent of all observations.
- Parameters
xcol (str, optional) – column name with x values
ycol (str, optional) – column name with y values
buffer (int or float, optional) – add a buffer around the bounding box from the observations
- Returns
coordinates of bounding box
- Return type
xmin, xmax, ymin, ymax
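get_bounding_box and get_extent differ only in the order of the returned coordinates. The sketch below shows the two orderings with plain lists; bounding_box and extent are hypothetical helper names, and applying the buffer equally in all four directions is an assumption.

```python
def bounding_box(xs, ys, buffer=0):
    """Return (xmin, ymin, xmax, ymax), the get_bounding_box ordering."""
    return (min(xs) - buffer, min(ys) - buffer, max(xs) + buffer, max(ys) + buffer)


def extent(xs, ys, buffer=0):
    """Return (xmin, xmax, ymin, ymax), the get_extent ordering."""
    xmin, ymin, xmax, ymax = bounding_box(xs, ys, buffer)
    return (xmin, xmax, ymin, ymax)


xs = [100.0, 250.0, 175.0]
ys = [400.0, 450.0, 380.0]
bbox = bounding_box(xs, ys, buffer=10)
ext = extent(xs, ys, buffer=10)
```

The extent ordering is the one expected by within_extent.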
- get_lat_lon(in_epsg='epsg:28992', out_epsg='epsg:4326')[source]
get latitude and longitude from x and y attributes.
- Parameters
in_epsg (str, optional) – epsg code of current x and y attributes, default (RD new)
out_epsg (str, optional) – epsg code of desired output, default lat/lon
- Returns
with columns ‘lat’ and ‘lon’
- Return type
pandas.DataFrame
- get_nearest_line(gdf=None, xcol_obs='x', ycol_obs='y', multiple_lines='error')[source]
get nearest line for each point in the obs collection. Function calls the nearest_polygon function.
- Parameters
gdf (GeoDataFrame) – dataframe with line features
xcol_obs (str, optional) – x column in self._obj used to get geometry
ycol_obs (str, optional) – y column in self._obj used to get geometry
multiple_lines (str, optional) –
keyword on how to deal with multiple lines being nearest. Options are:
‘error’ -> raise a ValueError
‘keep_all’ -> return the indices of multiple lines as a string separated by a comma
‘keep_first’ -> return the index of the first line
- Returns
with columns ‘nearest polygon’ and ‘distance nearest polygon’
- Return type
pandas.DataFrame
- get_nearest_point(obs_collection2=None, gdf2=None, xcol_obs1='x', ycol_obs1='y', xcol_obs2='x', ycol_obs2='y')[source]
get nearest point of another obs collection for each point in the current obs collection.
- Parameters
obs_collection2 (ObsCollection, optional) – collection of observations of which the nearest point is found
gdf2 (GeoDataFrame, optional) – dataframe to look for nearest observation point
xcol_obs1 (str, optional) – x column in self._obj used to get geometry
ycol_obs1 (str, optional) – y column in self._obj used to get geometry
xcol_obs2 (str, optional) – x column in obs_collection2 used to get geometry
ycol_obs2 (str, optional) – y column in obs_collection2 used to get geometry
- Returns
with columns ‘nearest point’ and ‘distance nearest point’
- Return type
pandas.DataFrame
- get_nearest_polygon(gdf=None, xcol_obs='x', ycol_obs='y', multiple_polygons='error')[source]
get nearest polygon for each point in the obs collection. Function also works for lines instead of polygons.
- Parameters
gdf (GeoDataFrame) – dataframe with polygon features
xcol_obs (str, optional) – x column in self._obj used to get geometry
ycol_obs (str, optional) – y column in self._obj used to get geometry
multiple_polygons (str, optional) –
keyword on how to deal with multiple polygons being nearest. Options are:
‘error’ -> raise a ValueError
‘keep_all’ -> return the indices of multiple polygons as a string separated by a comma
‘keep_first’ -> return the index of the first polygon
- Returns
with columns ‘nearest polygon’ and ‘distance nearest polygon’
- Return type
pandas.DataFrame
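The three tie-handling options can be sketched as follows. This is not the library's implementation: pick_nearest is a hypothetical helper that works on a precomputed mapping of feature index to distance, and treating exactly equal distances as a tie is an assumption.

```python
def pick_nearest(distances, multiple="error"):
    """Pick the nearest feature from {feature index: distance} (hypothetical)."""
    dmin = min(distances.values())
    nearest = [idx for idx, d in distances.items() if d == dmin]
    if len(nearest) == 1 or multiple == "keep_first":
        # A single winner, or the first of several equally near features.
        return nearest[0], dmin
    if multiple == "keep_all":
        # All tied indices joined into one comma-separated string.
        return ",".join(str(idx) for idx in nearest), dmin
    if multiple == "error":
        raise ValueError(f"multiple features are nearest: {nearest}")
    raise ValueError(f"unknown option: {multiple!r}")


tied = {0: 12.5, 1: 3.0, 2: 3.0}
first = pick_nearest(tied, multiple="keep_first")
everything = pick_nearest(tied, multiple="keep_all")
```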
- set_lat_lon(in_epsg='epsg:28992', out_epsg='epsg:4326', add_to_meta=True)[source]
create columns with lat and lon values of the observation points.
- Parameters
in_epsg (str, optional) – epsg code of current x and y attributes, default (RD new)
out_epsg (str, optional) – epsg code of desired output, default lat/lon
add_to_meta (bool, optional) – if True the lat and lon values are added to the observation meta dictionary. The default is True.
- Return type
None.
- within_extent(extent, inplace=False)[source]
Slice ObsCollection by extent.
- Parameters
extent (tuple) – format (xmin, xmax, ymin, ymax), default dis.sr.get_extent() format
- Returns
new_oc – ObsCollection with observations within extent
- Return type
ObsCollection
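A minimal sketch of slicing by extent, using the (xmin, xmax, ymin, ymax) ordering given above. select_within_extent is a hypothetical helper operating on plain dictionaries, and treating the bounds as inclusive is an assumption.

```python
def select_within_extent(rows, extent):
    """Keep observations whose x and y fall inside extent (hypothetical)."""
    xmin, xmax, ymin, ymax = extent
    return {
        name: row
        for name, row in rows.items()
        if xmin <= row["x"] <= xmax and ymin <= row["y"] <= ymax
    }


observations = {
    "inside": {"x": 150.0, "y": 420.0},
    "outside": {"x": 900.0, "y": 420.0},
}
selected = select_within_extent(observations, (100.0, 250.0, 380.0, 450.0))
```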
- within_polygon(gdf=None, shapefile=None, inplace=False, **kwargs)[source]
Slice ObsCollection by checking if points are within a shapefile.
- Parameters
gdf (GeoDataFrame, optional) – geodataframe containing a single polygon
shapefile (str, optional) – Not yet implemented
inplace (bool, default False) – Modify the ObsCollection in place (do not create a new object).
**kwargs – kwargs will be passed to the self._obj.to_gdf() method
- Returns
new_oc – ObsCollection with observations within polygon
- Return type
ObsCollection
- class hydropandas.extensions.geo.GeoAccessorObs(obs)[source]
Bases:
object
- get_lat_lon(in_epsg='epsg:28992', out_epsg='epsg:4326')[source]
get latitude and longitude from x and y attributes.
- Parameters
in_epsg (str, optional) – epsg code of current x and y attributes, default (RD new)
out_epsg (str, optional) – epsg code of desired output, default lat/lon
- Returns
lon, lat
- Return type
longitude and latitude of x, y coordinates
hydropandas.extensions.gwobs module
- class hydropandas.extensions.gwobs.GeoAccessorObs(obs)[source]
Bases:
object
- get_modellayer_modflow(gwf=None, ds=None, left=-999, right=999)[source]
Add modellayer to meta dictionary.
- Parameters
gwf (flopy.mf6.modflow.mfgwf.ModflowGwf) – modflow model
ds (xarray.Dataset) – xarray Dataset with top and bottoms, must have dimensions ‘x’ and ‘y’ and variables ‘top’ and ‘bot’
- Returns
modellayer
- Return type
int
- class hydropandas.extensions.gwobs.GwObsAccessor(oc_obj)[source]
Bases:
object
- get_modellayers(gwf=None, ds=None)[source]
Get the modellayer per observation. The layers can be obtained from the modflow model or can be defined in zgr.
- Parameters
gwf (flopy.mf6.modflow.mfgwf.ModflowGwf) – modflow model
ds (xarray.Dataset) – xarray Dataset with top and bottoms, must have dimensions ‘x’ and ‘y’ and variables ‘top’ and ‘bot’
- Return type
pd.Series with the modellayers of each observation
- get_regis_layers()[source]
Get the regis layer per observation.
- Return type
pd.Series with the names of the regis layer of each observation
- set_tube_nr(radius=1, xcol='x', ycol='y', if_exists='error', add_to_meta=False)[source]
This method computes the tube numbers based on the location of the observations.
Then it sets the value of the tube number:
in the ObsCollection dataframe
as the attribute of an Obs object
in the meta dictionary of the Obs object (only if add_to_meta is True)
This method is useful for groundwater observations. If two or more observation points are close to each other they will be seen as one monitoring_well with multiple tubes. The tube_nr is based on the ‘screen_bottom’ attribute of the observations in such a way that the deepest tube has the highest tube number.
- Parameters
radius (int, optional) – max distance between two observations to be seen as one location, by default 1
xcol (str, optional) – column name with x coordinates, by default ‘x’
ycol (str, optional) – column name with y coordinates, by default ‘y’
if_exists (str, optional) – what to do if an observation point already has a tube_nr, options: ‘error’, ‘replace’ or ‘keep’, by default ‘error’
add_to_meta (bool, optional) – if True the tube_nr is added to the meta dictionary of an observation. The default is False.
- Raises
RuntimeError – if the column tube_nr exists and if_exists=’error’ an error is raised
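The numbering rule described above (observations within radius form one monitoring well, and the deepest screen gets the highest tube number) can be sketched as below. assign_tube_numbers is a hypothetical helper, not the library's code. It assumes numbering starts at 1, that a distance equal to radius still counts as the same location, and that groups are formed around the first unassigned observation encountered.

```python
import math


def assign_tube_numbers(obs, radius=1):
    """Return {name: tube_nr} for obs = {name: {'x', 'y', 'screen_bottom'}}."""
    tube_nr = {}
    for name, o in obs.items():
        if name in tube_nr:
            continue
        # Collect every unassigned observation within radius of this one.
        group = [
            other
            for other, p in obs.items()
            if other not in tube_nr
            and math.hypot(p["x"] - o["x"], p["y"] - o["y"]) <= radius
        ]
        # Shallowest screen first, so the deepest gets the highest number.
        group.sort(key=lambda n: obs[n]["screen_bottom"], reverse=True)
        for nr, member in enumerate(group, start=1):
            tube_nr[member] = nr
    return tube_nr


obs = {
    "B1-deep": {"x": 10.0, "y": 20.0, "screen_bottom": -25.0},
    "B1-shallow": {"x": 10.5, "y": 20.0, "screen_bottom": -5.0},
    "B2": {"x": 500.0, "y": 600.0, "screen_bottom": -12.0},
}
tube_numbers = assign_tube_numbers(obs, radius=1)
```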
- set_tube_nr_monitoring_well(loc_col, radius=1, xcol='x', ycol='y', if_exists='error', add_to_meta=False)[source]
This method sets the tube_nr and monitoring_well name of an observation point based on the location of the observations.
When two or more tubes are close to another, as defined by radius, they are set to the same monitoring_well and an increasing tube_nr based on depth.
The value of the tube_nr and the monitoring_well are set:
in the ObsCollection dataframe
as the attribute of an Obs object
in the meta dictionary of the Obs object (only if add_to_meta is True)
This method is useful for groundwater observations. If two or more observation points are close to each other they will be seen as one monitoring_well with multiple tubes. The tube_nr is based on the ‘screen_bottom’ attribute of the observations in such a way that the deepest tube has the highest tube number. The monitoring_well is based on the name in the loc_col of the screen with the lowest tube_nr.
- Parameters
loc_col (str) – the column name with the names to use for the monitoring_well
radius (int, optional) – max distance between two observations to be seen as one location, by default 1
xcol (str, optional) – column name with x coordinates, by default ‘x’
ycol (str, optional) – column name with y coordinates, by default ‘y’
if_exists (str, optional) – what to do if an observation point already has a tube_nr, options: ‘error’, ‘replace’ or ‘keep’, by default ‘error’
add_to_meta (bool, optional) – if True the tube_nr and location are added to the meta dictionary of an observation. The default is False.
- Raises
RuntimeError – if the column tube_nr exists and if_exists=’error’ an error is raised
- hydropandas.extensions.gwobs.get_model_layer_z(z, zvec, left=-999, right=999)[source]
Get index of model layer based on elevation.
Assumptions:
the highest value in zvec is the top of model layer 0.
if z is equal to the bottom of a layer, the model layer above that layer is assigned.
- Parameters
z (int, float) – elevation.
zvec (list, np.array) – elevations of model layers. shape is nlay + 1
left (int, optional) – if z is below the lowest value in zvec, this value is returned. The default is -999.
right (int, optional) – if z is above the highest value in zvec, this value is returned. The default is 999.
- Returns
int – model layer
- Return type
int
Examples
>>> zvec = [0, -10, -20, -30]
>>> get_model_layer_z(-5, zvec)
0
>>> get_model_layer_z(-25, zvec)
2
>>> get_model_layer_z(-50, zvec)
-999
>>> get_model_layer_z(100, zvec)
999
>>> get_model_layer_z(-20, zvec)
1
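The lookup rule can be sketched in a few lines. This is a standalone illustration that reproduces the five examples above, not the library's implementation; model_layer_index is a hypothetical name, zvec is assumed to be sorted from top to bottom, and the handling of z exactly equal to the top or the lowest value of zvec is an assumption.

```python
def model_layer_index(z, zvec, left=-999, right=999):
    """Return the index of the layer containing elevation z (hypothetical)."""
    if z > zvec[0]:
        return right  # above the top of layer 0
    if z < zvec[-1]:
        return left  # below the bottom of the lowest layer
    for lay in range(len(zvec) - 1):
        # zvec[lay + 1] is the bottom of layer lay. Using >= assigns an
        # elevation on a layer boundary to the layer above that boundary.
        if z >= zvec[lay + 1]:
            return lay


zvec = [0, -10, -20, -30]
layers = [model_layer_index(z, zvec) for z in (-5, -25, -50, 100, -20)]
```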
- hydropandas.extensions.gwobs.get_modellayer_from_screen_depth(ftop, fbot, zvec, left=-999, right=999)[source]
- Parameters
ftop (int or float) – top of screen.
fbot (int or float) – bottom of screen, has to be lower than ftop.
zvec (list or np.array) – elevations of the modellayers at the location of the tube.
left (int, optional) – value to return if tube screen is below the modellayers. The default is -999.
right (int, optional) – value to return if tube screen is above the modellayers. The default is 999.
- Raises
ValueError – raised if something unexpected happens.
- Returns
modellayer.
- Return type
int or np.nan
Examples
>>> zvec = [0, -10, -20, -30, -40]
>>> get_modellayer_from_screen_depth(-5, -7, zvec)
0
>>> get_modellayer_from_screen_depth(-25, -27, zvec)
2
>>> get_modellayer_from_screen_depth(-15, -27, zvec)
2
>>> get_modellayer_from_screen_depth(-5, -27, zvec)
1
>>> get_modellayer_from_screen_depth(-5, -37, zvec)
1
>>> get_modellayer_from_screen_depth(15, -5, zvec)
0
>>> get_modellayer_from_screen_depth(15, 5, zvec)
999
>>> get_modellayer_from_screen_depth(-55, -65, zvec)
-999
>>> get_modellayer_from_screen_depth(15, -65, zvec)
nan
>>> get_modellayer_from_screen_depth(None, -7, zvec)
0
>>> get_modellayer_from_screen_depth(None, None, zvec)
nan
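One set of rules that reproduces every example above is sketched below. It is a reconstruction from the listed outputs, not the library's code, and the names layer_of_elevation and layer_of_screen are hypothetical. The rules for a screen spanning several layers (largest share for two layers, middle layer otherwise), the tie-break toward the lower layer, and the handling of a missing fbot are all assumptions.

```python
import math


def layer_of_elevation(z, zvec, left=-999, right=999):
    """Layer index for a single elevation; boundaries go to the layer above."""
    if z > zvec[0]:
        return right
    if z < zvec[-1]:
        return left
    for lay in range(len(zvec) - 1):
        if z >= zvec[lay + 1]:
            return lay


def layer_of_screen(ftop, fbot, zvec, left=-999, right=999):
    """Layer index for a screen between ftop and fbot (hypothetical sketch)."""
    def missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    if missing(ftop) and missing(fbot):
        return math.nan
    if missing(ftop):
        return layer_of_elevation(fbot, zvec, left, right)
    if missing(fbot):  # assumption: mirror the missing-ftop case
        return layer_of_elevation(ftop, zvec, left, right)

    lay_top = layer_of_elevation(ftop, zvec, left, right)
    lay_bot = layer_of_elevation(fbot, zvec, left, right)
    if lay_top == lay_bot:
        return lay_top  # same layer, or entirely above or below the model
    if lay_top == right and lay_bot == left:
        return math.nan  # screen spans the whole model
    if lay_top == right:
        lay_top = 0  # clip a screen top above the model to layer 0
    if lay_bot == left:
        lay_bot = len(zvec) - 2  # clip a screen bottom to the lowest layer
    if lay_top == lay_bot:
        return lay_top

    span = lay_bot - lay_top
    if span == 1:
        # Two layers: choose the one holding the larger part of the screen.
        boundary = zvec[lay_bot]
        return lay_top if (ftop - boundary) > (boundary - fbot) else lay_bot
    if span == 2:
        return lay_top + 1  # three layers: take the middle one
    return (lay_top + lay_bot) // 2  # more layers: take a middle layer


zvec = [0, -10, -20, -30, -40]
two_layer_case = layer_of_screen(-15, -27, zvec)
```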
- hydropandas.extensions.gwobs.get_zvec(x, y, gwf=None, ds=None)[source]
get a list with the vertical layer boundaries at a point in the model.
- Parameters
x (int or float) – x coordinate.
y (int or float) – y coordinate.
gwf (flopy.mf6.modflow.mfgwf.ModflowGwf) – modflow model with top and bottoms
ds (xarray.Dataset) – xarray Dataset typically created in nlmod. Must have variables ‘top’ and ‘botm’.
- Raises
NotImplementedError – not all grid types are supported yet.
- Returns
zvec – list of vertical layer boundaries. length is nlay + 1.
- Return type
list
hydropandas.extensions.plots module
- class hydropandas.extensions.plots.CollectionPlots(oc_obj)[source]
Bases:
object
- interactive_map(plot_dir='figures', m=None, tiles='OpenStreetMap', fname=None, per_monitoring_well=True, color='blue', legend_name=None, add_legend=True, map_label='', map_label_size=20, col_name_lat='lat', col_name_lon='lon', zoom_start=13, create_interactive_plots=True, **kwargs)[source]
Create an interactive map with interactive plots using folium and bokeh.
Notes
Some notes on this method:
if you want to have multiple obs collections on one folium map, only the last one should have add_legend = True to create a correct legend
the color of the observation point on the map is now the same color as the line of the observation measurements. Also a built-in color cycle is used for different measurements at the same monitoring_well.
- Parameters
plot_dir (str) – directory used for the folium map and bokeh plots
m (folium.Map, str, optional) – current map to add observations to, if None a new map is created
tiles (str, optional) – background tiles, default is openstreetmap
fname (str, optional) – name of the folium map
per_monitoring_well (bool, optional) – if True plot multiple tubes at the same monitoring well in one figure
color (str, optional) – color of the observation points on the map
legend_name (str, optional) – the name of the observation points shown in the map legend
add_legend (boolean, optional) – add a legend to a plot
map_label (str, optional) – add a label to the monitoring wells on the map, the label should be 1. the attribute of an observation 2. the key in the meta attribute of the observation 3. a generic label for each observation in this collection. A label is only added if map_label is not ‘’. The default is ‘’.
map_label_size (int, optional) – label size of the map_label in pt.
col_name_lat (str, optional) – name of the column in the obs_collection dic with the lat values of the observation points
col_name_lon (str, optional) – see col_name_lat
zoom_start (int, optional) – start zoom level of the folium map
create_interactive_plots (boolean, optional) – if True interactive plots will be created, if False the iplot_fname in the meta dictionary of the observations is used.
**kwargs –
will be passed to the interactive_plots method options are:
cols : list of str or None
hoover_names : list of str
plot_legend_names : list of str
plot_freq : list of str
markers : list of str
plot_colors : list of str
ylabel : str
add_screen_to_legend : boolean
tmin : dt.datetime
tmax : dt.datetime
- Returns
m – the folium map
- Return type
folium.Map
- interactive_plots(savedir='figures', tmin=None, tmax=None, per_monitoring_well=True, **kwargs)[source]
Create interactive plots of the observations using bokeh.
- Parameters
savedir (str) – directory used for the folium map and bokeh plots
tmin (dt.datetime, optional) – start date for timeseries plot
tmax (dt.datetime, optional) – end date for timeseries plot
per_monitoring_well (bool, optional) – if True plot multiple tubes at the same monitoring_well in one figure
**kwargs –
will be passed to the Obs.interactive_plot method, options include:
cols : list of str or None
hoover_names : list of str
plot_freq : list of str
plot_legend_names : list of str
markers : list of str
plot_colors : list of str
ylabel : str
add_screen_to_legend : boolean
- section_plot(tmin=None, tmax=None, cols=(None,), section_colname_x=None, section_label_x=None, section_well_layout_color='gray', section_markersize=100, ylabel='auto', fn_save=None, check_obs_close_to_screen_bottom=True, plot_well_layout_markers=True, plot_obs=True)[source]
Create plot with well layout (left) and observations (right).
- Parameters
tmin (dt.datetime, optional) – start date for timeseries plot
tmax (dt.datetime, optional) – end date for timeseries plot
cols (tuple of str or None, optional) – the columns of the observation to plot. The first numeric column is used if cols is None, by default None.
section_colname_x (str, optional) – column used for x position on section plot, when None order collection is used
section_label_x (str, optional) – label applied to x-axis in section plot
section_well_layout_color (str, optional) – color of well layout, default is gray
section_markersize (int, optional) – size of markers in section plot
ylabel (str or list, optional) – when ‘auto’ column unit in collection is ylabel, otherwise first element of list is label of section plot, second element of observation plot
fn_save (str, optional) – filename to save plot
check_obs_close_to_screen_bottom (bool, optional) – plots a horizontal line when minimum observation is close to screen_bottom
plot_well_layout_markers (bool, optional) – plots ground level, top tube, screen levels and sandtrap via markers. Default is True
plot_obs (bool, optional) – Plots observation. Default is True
TODO –
specify colors via extra column in ObsCollection
- additional visual checks:
maximum observation is close to or above ground level
maximum observation is close to or above tube top
minimum observation is close to or below tube bottom (sand trap)
include some interactive Bokeh fancy
apply an offset when two wells are at same location
limit y-axis of section plot to observations only
remove the checking (if obs are near bottom) from this function
moving the legend outside the plot
set xlim of observation plot more tight when tmin is not specified
- series_per_group(plot_column, by=None, savefig=True, outputdir='.')[source]
Plot time series per group.
The default groupby is based on identical x, y coordinates, so plots unique time series per location.
- Parameters
plot_column (str) – name of column containing time series data
by ((list of) str or (list of) array-like) – groupby parameters, default is None which sets groupby to columns [“x”, “y”].
savefig (bool, optional) – save figures, by default True
outputdir (str, optional) – path to output directory, by default the current directory (“.”)
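The default grouping on identical x, y coordinates can be pictured with the sketch below. group_by_location is a hypothetical helper on plain dictionaries; the real method groups the collection itself and plots one figure per group.

```python
from collections import defaultdict


def group_by_location(rows, by=("x", "y")):
    """Return {(values of the by columns): [observation names]} (hypothetical)."""
    groups = defaultdict(list)
    for name, row in rows.items():
        key = tuple(row[col] for col in by)
        groups[key].append(name)
    return dict(groups)


observations = {
    "tube_1": {"x": 10.0, "y": 20.0},
    "tube_2": {"x": 10.0, "y": 20.0},
    "tube_3": {"x": 55.0, "y": 70.0},
}
groups = group_by_location(observations)
```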
- class hydropandas.extensions.plots.ObsPlots(obs)[source]
Bases:
object
- interactive_plot(savedir=None, cols=(None,), markers=('line',), p=None, plot_legend_names=('',), plot_freq=(None,), tmin=None, tmax=None, hoover_names=('Peil',), hoover_date_format='%Y-%m-%d', ylabel=None, plot_colors=('blue',), add_screen_to_legend=False, return_filename=False)[source]
Create an interactive plot of the observations using bokeh.
Todo:
add options for hovers, markers, linestyle
- Parameters
savedir (str, optional) – directory used for the folium map and bokeh plots
cols (tuple of str or None, optional) – the columns of the observation to plot. The first numeric column is used if cols is None, by default None.
markers (list of str, optional) – type of markers that can be used for plot, ‘line’ and ‘circle’ are supported
p (bokeh.plotting.figure, optional) – reference to existing figure, if p is None a new figure is created
plot_legend_names (list of str, optional) – legend in bokeh plot
plot_freq (list of str, optional) – bokeh plot is resampled with this frequency to reduce the size
tmin (dt.datetime, optional) – start date for timeseries plot
tmax (dt.datetime, optional) – end date for timeseries plot
hoover_names (list of str, optional) – names will be displayed together with the cols values when hovering over plot
hoover_date_format (str, optional) – date format to use when hovering over a plot
ylabel (str or None, optional) – label on the y-axis. If None the unit attribute of the observation is used.
plot_colors (list of str, optional) – plot_colors used for the plots
add_screen_to_legend (boolean, optional) – if True the attributes screen_top and screen_bottom are added to the legend name
return_filename (boolean, optional) – if True filename will be returned
- Returns
fname_plot – filename of the bokeh plot or reference to bokeh plot
- Return type
str or bokeh plot
hydropandas.extensions.stats module
- class hydropandas.extensions.stats.StatsAccessor(oc_obj)[source]
Bases:
object
- consecutive_obs_years(min_obs=12, col=None)[source]
get the number of consecutive years with more than a minimum of observations.
- Parameters
min_obs (int or str, optional) – if min_obs is an integer it is the minimum number of observations per year. If min_obs is a string it is the column name of the obs_collection with minimum number of observation per year per observation.
col (str or None, optional) – the column of the obs dataframe to get measurements from. The first numeric column is used if col is None, by default None.
- Returns
df – dataframe with the observations as column, the years as rows, and the values are the number of consecutive years.
- Return type
pd.DataFrame
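A sketch of the counting idea for a single observation series follows. It is not the library's implementation: count_consecutive_years is a hypothetical helper, the result is a running count that resets to zero in a year with too few observations, and counting a year with exactly min_obs observations as sufficient is an assumption.

```python
from collections import Counter
from datetime import date


def count_consecutive_years(dates, min_obs=12):
    """Return {year: running number of consecutive years with enough obs}."""
    counts = Counter(d.year for d in dates)
    result = {}
    run = 0
    for year in range(min(counts), max(counts) + 1):
        # Extend the run when the year has enough observations, else reset.
        run = run + 1 if counts.get(year, 0) >= min_obs else 0
        result[year] = run
    return result


# Monthly observations in 2018, 2019 and 2021; only three in 2020.
dates = [date(y, m, 1) for y in (2018, 2019, 2021) for m in range(1, 13)]
dates += [date(2020, m, 1) for m in (1, 2, 3)]
consecutive = count_consecutive_years(dates, min_obs=12)
```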
- property dates_first_obs
- property dates_last_obs
- get_first_last_obs_date()[source]
get the date of the first and the last measurement.
- Returns
DataFrame with 2 columns with the dates of the first and the last measurement
- get_max(tmin=None, tmax=None, col=None)[source]
get the maximum value of every obs object.
- Parameters
tmin (dt.datetime, optional) – get the maximum value after this date. If None all observations are used.
tmax (dt.datetime, optional) – get the maximum value before this date. If None all observations are used.
col (str or None, optional) – the column of the obs dataframe to get maximum from. The first numeric column is used if col is None, by default None.
- Returns
pandas series with the maximum of each observation in the obs collection.
- get_min(tmin=None, tmax=None, col=None)[source]
get the minimum value of every obs object.
- Parameters
tmin (dt.datetime, optional) – get the minimum value after this date. If None all observations are used.
tmax (dt.datetime, optional) – get the minimum value before this date. If None all observations are used.
col (str or None, optional) – the column of the obs dataframe to get minimum from. The first numeric column is used if col is None, by default None.
- Returns
pandas series with the minimum of each observation in the obs collection.
- get_no_of_observations(tmin=None, tmax=None, col=None)[source]
get number of non-nan values of a column in the observation df.
- Parameters
tmin (dt.datetime, optional) – get the number of observations after this date. If None all observations are used.
tmax (dt.datetime, optional) – get the number of observations before this date. If None all observations are used.
col (str or None, optional) – the column of the obs dataframe to get measurements from. The first numeric column is used if col is None, by default None.
- Returns
pandas series with the number of observations for each row in the obs collection.
- get_seasonal_stat(col=None, stat='mean', winter_months=(1, 2, 3, 4, 11, 12), summer_months=(5, 6, 7, 8, 9, 10))[source]
get statistics per season.
- Parameters
col (str or None, optional) – the column of the obs dataframe to get measurements from. The first numeric column is used if col is None, by default None.
stat (str, optional) – type of statistics, all statistics from df.describe() are available
winter_months (tuple of int, optional) – month number of winter months
summer_months (tuple of int, optional) – month number of summer months
- Return type
DataFrame with stats for summer and winter
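The seasonal split can be sketched for one series and the ‘mean’ statistic as below. seasonal_mean is a hypothetical helper working on (date, value) pairs; the real method works on the observation dataframes and supports the other describe() statistics as well.

```python
from datetime import date
from statistics import mean


def seasonal_mean(
    series,
    winter_months=(1, 2, 3, 4, 11, 12),
    summer_months=(5, 6, 7, 8, 9, 10),
):
    """Return the mean per season for a list of (date, value) pairs."""
    winter = [v for d, v in series if d.month in winter_months]
    summer = [v for d, v in series if d.month in summer_months]
    return {"winter": mean(winter), "summer": mean(summer)}


series = [
    (date(2020, 1, 15), 1.0),
    (date(2020, 2, 15), 2.0),
    (date(2020, 7, 15), -1.0),
    (date(2020, 8, 15), -3.0),
]
stats = seasonal_mean(series)
```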
- mean_in_period(tmin=None, tmax=None, col=None)[source]
get the mean value of one column (col) in all observations within a period defined by tmin and tmax. If both tmin and tmax are None the whole period in which there are observations is used.
- Parameters
tmin (datetime, optional) – start of averaging period. The default is None.
tmax (datetime, optional) – end of averaging period. The default is None.
col (str or None, optional) – the column of the obs dataframe to get measurements from. The first numeric column is used if col is None, by default None.
- Returns
mean values for each observation.
- Return type
pd.Series
- property n_observations
- property obs_periods
- class hydropandas.extensions.stats.StatsAccessorObs(obs)[source]
Bases:
object
- get_seasonal_stat(col=None, stat='mean', winter_months=(1, 2, 3, 4, 11, 12), summer_months=(5, 6, 7, 8, 9, 10))[source]
get statistics per season.
- Parameters
col (str or None, optional) – the column of the obs dataframe to get measurements from. The first numeric column is used if col is None, by default None.
stat (str, optional) – type of statistics, all statistics from df.describe() are available
winter_months (tuple of int, optional) – month number of winter months
summer_months (tuple of int, optional) – month number of summer months
- Returns
two lists with the statistics for the summer and the winter.
- Return type
winter_stats, summer_stats