Validation

The wind_validation.validation.validate() function operates on wind statistics in one of the formats used by the windkit package: windkit.time_series_wind_climate, windkit.binned_wind_climate, or windkit.weibull_wind_climate.

wind_validation.validation.validate(obs: xarray.core.dataset.Dataset, mod: xarray.core.dataset.Dataset, dtype: Optional[str] = None, stats: str = 'basic', metrics: str = 'basic', **kwargs) → xarray.core.dataset.Dataset

Function to validate modelled wind data against observations.

The validation calculates “stats” and “metrics” and outputs them as the results.

Stats are calculated on the modelled and observed data separately. Examples are the mean of the wind speed and the variance of the wind direction.
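To illustrate what "calculated separately" means, the following is a minimal sketch that computes a few stats independently for an observed and a modelled sample. It is not the wind_validation implementation: the helper names circular_mean_deg and basic_stats, the sample values, and the use of the population standard deviation are all assumptions made for the example.

```python
import math
import statistics


def circular_mean_deg(directions):
    """Circular mean of directions in degrees, returned in [0, 360)."""
    sin_sum = sum(math.sin(math.radians(d)) for d in directions)
    cos_sum = sum(math.cos(math.radians(d)) for d in directions)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


def basic_stats(speeds, directions):
    """Hypothetical helper: stats for ONE data set (observed or modelled)."""
    return {
        "wind_speed_mean": statistics.fmean(speeds),
        # Population standard deviation; the library may use another definition.
        "wind_speed_std": statistics.pstdev(speeds),
        "wind_direction_circular_mean": circular_mean_deg(directions),
    }


# Made-up sample values, not real measurements.
obs_speed = [5.0, 7.0, 9.0]
obs_dir = [350.0, 10.0, 0.0]
mod_speed = [6.0, 7.0, 11.0]
mod_dir = [0.0, 20.0, 10.0]

# Each data set is summarised on its own; no pairing of samples is needed.
obs_stats = basic_stats(obs_speed, obs_dir)
mod_stats = basic_stats(mod_speed, mod_dir)
```

The circular mean is used for direction because an arithmetic mean of 350° and 10° would give 180°, which points the opposite way.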

Metrics are measures that are calculated between the observed and modelled wind data. Examples are the Pearson correlation coefficient, the mean absolute error, and the mean wind direction error.
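In contrast to stats, metrics need the observed and modelled samples paired up. The sketch below shows a few such measures in plain Python. The function names and the sign convention (modelled minus observed) are assumptions for illustration and may differ from what wind_validation does internally.

```python
import math


def mean_error(obs, mod):
    """Mean of (mod - obs); the sign convention is an assumption."""
    return sum(m - o for o, m in zip(obs, mod)) / len(obs)


def mean_absolute_error(obs, mod):
    return sum(abs(m - o) for o, m in zip(obs, mod)) / len(obs)


def root_mean_square_error(obs, mod):
    return math.sqrt(sum((m - o) ** 2 for o, m in zip(obs, mod)) / len(obs))


def pearson_r(obs, mod):
    """Pearson correlation coefficient between paired samples."""
    obs_mean = sum(obs) / len(obs)
    mod_mean = sum(mod) / len(mod)
    cov = sum((o - obs_mean) * (m - mod_mean) for o, m in zip(obs, mod))
    obs_ss = sum((o - obs_mean) ** 2 for o in obs)
    mod_ss = sum((m - mod_mean) ** 2 for m in mod)
    return cov / math.sqrt(obs_ss * mod_ss)


def direction_error_deg(obs_dir, mod_dir):
    """Signed smallest angle from obs_dir to mod_dir, in [-180, 180)."""
    return (mod_dir - obs_dir + 180.0) % 360.0 - 180.0


# Made-up paired samples, not real measurements.
obs_speed = [5.0, 7.0, 9.0]
mod_speed = [6.0, 7.0, 11.0]

me = mean_error(obs_speed, mod_speed)
mae = mean_absolute_error(obs_speed, mod_speed)
rmse = root_mean_square_error(obs_speed, mod_speed)
r = pearson_r(obs_speed, mod_speed)
```

Direction errors are wrapped so that an observed 350° and a modelled 10° differ by 20°, not 340°.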

The data variables in the output xarray.Dataset are named with the convention obs_VARIABLE_STATISTIC for observations and mod_VARIABLE_STATISTIC for modelled data. The metrics are labelled as VARIABLE_METRIC.
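The naming convention can be spelled out with two small helpers. This is only a sketch of the pattern described above: stat_names and metric_name are hypothetical functions, and the exact variable, statistic, and metric identifiers used by the library may differ.

```python
def stat_names(variable, statistic):
    """Names of the observed and modelled stat variables for one stat."""
    return f"obs_{variable}_{statistic}", f"mod_{variable}_{statistic}"


def metric_name(variable, metric):
    """Name of a metric variable; metrics carry no obs_/mod_ prefix."""
    return f"{variable}_{metric}"


obs_name, mod_name = stat_names("wind_speed", "mean")
me_name = metric_name("wind_speed", "me")
```

Names built this way can be used to look up individual variables in the returned dataset, for example results[obs_name], assuming the requested stat was actually computed.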

Parameters
  • obs (xarray.Dataset) – Observed data to validate against.

  • mod (xarray.Dataset) – Modelled data to validate.

  • dtype (str, optional) – Explicitly state the data format. Possible options: ts, hist, weib. By default None.

  • stats (str or list, optional) –

    Stats to be calculated.

    If a string is used, it should be the name of a suite of stats.

    The available suites and their included stats are:

    "basic":
    • wind speed mean

    • wind speed standard deviation

    • wind direction circular mean

    • wind direction circular standard deviation

    • etc.

    "all":
    • every available stat (see documentation for stats)

    If a list is used, it must contain tuples of the form ('variable', 'stat'). For example: [('wind_speed', 'mean'), ('power_density', 'mean'), …]. All available options can be found under the respective format folder in the stats.py file.

  • metrics (str or list, optional) –

    Metrics to be calculated.

    The available suites and their included metrics are:

    "basic":
    • wind speed Pearson correlation coefficient

    • wind speed Spearman correlation coefficient

    • wind speed coefficient of determination

    • wind speed mean error

    • wind speed root-mean-square error

    • wind speed mean-absolute error

    • wind speed mean-absolute-percentage error

    • wind direction circular mean error

    • wind direction circular mean-absolute error

    "all":
    • every available metric (see documentation for metrics)

    If a list is used, it must contain tuples of the form (('variable', 'stat'), 'metric'). For example: [(('wind_speed', 'mean'), 'me'), (('power_density', 'mean'), 'mpe')]. All available options can be found under the respective format folder in the metrics.py file.

  • dim (str, optional) – Dimension along which to calculate time series stats and metrics. By default “time”. Only applicable to time series data.

  • by (numpy.array or xarray.DataArray, optional) – Optional grouper-array to calculate stats and metrics in groups based on the unique values in the array.
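As a sketch of the list form of the stats and metrics parameters, the snippet below builds both lists from the examples given above and checks their shape before they would be passed on. The helpers is_stat_spec and is_metric_spec are hypothetical and not part of wind_validation; the validate call is shown only as a comment because it needs real obs and mod datasets.

```python
# Stats are requested as (variable, stat) tuples.
stats = [("wind_speed", "mean"), ("power_density", "mean")]

# Metrics are requested as ((variable, stat), metric) tuples.
metrics = [
    (("wind_speed", "mean"), "me"),
    (("power_density", "mean"), "mpe"),
]


def is_stat_spec(item):
    """Hypothetical check: a pair of strings (variable, stat)."""
    return (
        isinstance(item, tuple)
        and len(item) == 2
        and all(isinstance(part, str) for part in item)
    )


def is_metric_spec(item):
    """Hypothetical check: ((variable, stat), metric)."""
    return (
        isinstance(item, tuple)
        and len(item) == 2
        and is_stat_spec(item[0])
        and isinstance(item[1], str)
    )


stats_ok = all(is_stat_spec(s) for s in stats)
metrics_ok = all(is_metric_spec(m) for m in metrics)

# With real windkit datasets, the lists would then be passed as:
# results = validate(obs, mod, dtype="ts", stats=stats, metrics=metrics)
```

Checking the shape up front is a design choice for this example only; it makes a misplaced parenthesis in a nested metric tuple easy to spot before any data is processed.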

Returns

Validation results of stats and metrics.

Return type

xarray.Dataset

Raises

ValueError – Raised when the data format (“hist”, “ts”, or “weib”) cannot be inferred automatically. In that case, provide it explicitly through dtype.

Examples

>>> results = validate(obs, mod, stats='all', metrics='all')