Downloader module

Note

It is important to use a cache folder for NEMED that is separate from the cache folders of any nempy or nemosis projects you may already have!
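One way to follow this note is to create a directory that only NEMED uses and pass its path as the `cache` argument of the functions on this page. The sketch below is a minimal illustration; the helper name `make_nemed_cache` and the folder name `nemed_cache` are assumptions made for this example and are not part of the package.

```python
from pathlib import Path


def make_nemed_cache(base_dir, name="nemed_cache"):
    """Create (if needed) a folder used only by NEMED and return its path.

    The helper and the default folder name are hypothetical.
    """
    cache_dir = Path(base_dir) / name
    # parents=True creates missing parent folders; exist_ok=True makes
    # repeated calls safe when the folder is already there.
    cache_dir.mkdir(parents=True, exist_ok=True)
    return str(cache_dir)
```

The returned string can then be supplied wherever a function below expects `cache`.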

Downloader functions for retrieving data from various sources

nemed.downloader.download_aemo_cdeii_summary(filter_start, filter_end, cache)

Downloads and combines selected AEMO CDEII Summary Files for a specified date range. The files processed here are available from: https://aemo.com.au/en/energy-systems/electricity/national-electricity-market-nem/market-operations/settlements-and-payments/settlements/carbon-dioxide-equivalent-intensity-index

Parameters
  • filter_start (str) – Data download period start, in the format: ‘yyyy/mm/dd HH:MM:SS’

  • filter_end (str) – Data download period end, in the format: ‘yyyy/mm/dd HH:MM:SS’

  • cache (str) – Raw data location in local directory

Returns

AEMO data containing columns=[‘SETTLEMENTDATE’, ‘REGIONID’, ‘TOTAL_SENT_OUT_ENERGY’, ‘TOTAL_EMISSIONS’, ‘CO2E_INTENSITY_INDEX’]

Return type

pandas.DataFrame

Raises

ValueError – If ‘filter_start’ is earlier than the earliest available CDEII data file supported by NEMED.
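As a hedged usage sketch, the date strings can be built from `datetime` objects so that they always match the documented ‘yyyy/mm/dd HH:MM:SS’ format. The helper names `cdeii_filter_bounds` and `fetch_cdeii_summary` are hypothetical; only `download_aemo_cdeii_summary` and its three parameters come from the documentation above.

```python
from datetime import datetime

# Format documented for filter_start / filter_end: 'yyyy/mm/dd HH:MM:SS'
CDEII_TIME_FORMAT = "%Y/%m/%d %H:%M:%S"


def cdeii_filter_bounds(start, end):
    """Return (filter_start, filter_end) strings for two datetime objects."""
    if end <= start:
        raise ValueError("end must be later than start")
    return start.strftime(CDEII_TIME_FORMAT), end.strftime(CDEII_TIME_FORMAT)


def fetch_cdeii_summary(start, end, cache):
    """Call the downloader; requires nemed to be installed and network access."""
    # Imported lazily so the formatting helper above works without nemed.
    from nemed.downloader import download_aemo_cdeii_summary

    filter_start, filter_end = cdeii_filter_bounds(start, end)
    return download_aemo_cdeii_summary(filter_start, filter_end, cache)
```

For example, `fetch_cdeii_summary(datetime(2022, 1, 1), datetime(2022, 2, 1), cache)` would request January 2022, where `cache` is the path to a dedicated cache folder.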

nemed.downloader.download_dudetailsummary(cache, asof_date=None)

Downloads the DUDETAILSUMMARY MMS table, which maps each DUID to its Dispatch Type and Region.

Parameters
  • cache (str) – Raw data location in local directory

  • asof_date (str, optional) – Date to retrieve the DUDETAILSUMMARY table as of, in the format: ‘yyyy/mm/dd HH:MM’, by default None, which retrieves the most recent data

Returns

Columns, given as name (type) – description:

  • DUID (str) – Dispatchable Unit Identifier as defined by AEMO.

  • START_DATE (str) – Date of data entry as defined by AEMO.

  • DISPATCHTYPE (str) – Dispatch Type of DUID as ‘GENERATOR’ or ‘LOAD’.

  • REGIONID (str) – Region of DUID.

Return type

pandas.DataFrame

Raises

Exception – If asof_date is earlier than the earliest available DUDETAILSUMMARY table in MMS
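To illustrate how the DISPATCHTYPE and REGIONID columns might be used, the sketch below groups generator DUIDs by region. Plain dictionaries stand in for DataFrame rows (for example the output of `df.to_dict("records")`) so that it runs without a download. The helper name and the sample rows, including the DUIDs and the START_DATE strings, are made up for illustration.

```python
def generator_duids_by_region(rows):
    """Group DUIDs whose DISPATCHTYPE is 'GENERATOR' by REGIONID.

    rows: iterable of dicts keyed by the documented column names.
    """
    by_region = {}
    for row in rows:
        if row["DISPATCHTYPE"] == "GENERATOR":
            by_region.setdefault(row["REGIONID"], set()).add(row["DUID"])
    return by_region


# Hypothetical rows, not real AEMO data.
sample_rows = [
    {"DUID": "UNIT_A", "START_DATE": "2020/01/01", "DISPATCHTYPE": "GENERATOR", "REGIONID": "NSW1"},
    {"DUID": "UNIT_B", "START_DATE": "2020/01/01", "DISPATCHTYPE": "LOAD", "REGIONID": "NSW1"},
    {"DUID": "UNIT_C", "START_DATE": "2021/06/01", "DISPATCHTYPE": "GENERATOR", "REGIONID": "QLD1"},
]

generators = generator_duids_by_region(sample_rows)
```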

nemed.downloader.download_generators_info(cache)

Retrieves the Generators and Scheduled Loads static table via NEMOSIS (published by AEMO in the NEM Registration and Exemption List file). The data reflects the most recent file uploaded by AEMO.

Warning

The Generators and Scheduled Loads table is a static file that contains only the most recent data.

Parameters

cache (str) – Raw data location in local directory.

Returns

AEMO data containing columns=[‘Participant’, ‘Station Name’, ‘Region’, ‘Dispatch Type’, ‘Category’, ‘Classification’, ‘Fuel Source - Primary’, ‘Fuel Source - Descriptor’, ‘Technology Type - Primary’, ‘Technology Type - Descriptor’, ‘Aggregation’, ‘DUID’, ‘Reg Cap (MW)’]

Return type

pandas.DataFrame

nemed.downloader.download_genset_map(cache, asof_date=None)

Downloads the GENSETID to DUID mapping from the DUALLOC MMS table.

Parameters
  • cache (str) – Raw data location in local directory

  • asof_date (str, optional) – Date to retrieve the DUALLOC table as of, in the format: ‘yyyy/mm/dd HH:MM’, by default None, which retrieves the most recent data

Returns

Columns, given as name (type) – description:

  • EFFECTIVEDATE (str) – Effective Date as defined by AEMO.

  • DUID (str) – Dispatchable Unit Identifier as defined by AEMO.

  • GENSETID (str) – Generator Set Identifier as defined by AEMO.

Return type

pandas.DataFrame

Raises

Exception – If asof_date is earlier than the earliest available DUALLOC table in MMS
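A DUID may be associated with more than one GENSETID, so a lookup from DUID to its generator sets can be handy. The following is a minimal sketch that uses plain dictionaries in place of DataFrame rows; the helper name and the sample identifiers and dates are hypothetical.

```python
from collections import defaultdict


def gensets_per_duid(rows):
    """Return {DUID: [GENSETID, ...]} keeping first-seen order, no repeats."""
    mapping = defaultdict(list)
    for row in rows:
        gensets = mapping[row["DUID"]]
        if row["GENSETID"] not in gensets:
            gensets.append(row["GENSETID"])
    return dict(mapping)


# Hypothetical rows, not real AEMO data.
sample_rows = [
    {"EFFECTIVEDATE": "2020/01/01", "DUID": "UNIT_A", "GENSETID": "UNIT_A1"},
    {"EFFECTIVEDATE": "2020/01/01", "DUID": "UNIT_A", "GENSETID": "UNIT_A2"},
    {"EFFECTIVEDATE": "2021/01/01", "DUID": "UNIT_A", "GENSETID": "UNIT_A1"},
    {"EFFECTIVEDATE": "2020/01/01", "DUID": "UNIT_B", "GENSETID": "UNIT_B"},
]

genset_lookup = gensets_per_duid(sample_rows)
```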

nemed.downloader.download_plant_emissions_factors(start_date, end_date, cache)

Retrieves CO2-equivalent emissions intensity factors (tCO2-e/MWh) for each generator. The metric reflects sent-out generation. The underlying data is sourced from the ‘GENUNITS’ table of AEMO MMS at monthly time resolution.

Parameters
  • cache (str) – Raw data location in local directory

  • start_date (str) – Data download period start, in the format: ‘yyyy/mm/dd HH:MM’

  • end_date (str) – Data download period end, in the format: ‘yyyy/mm/dd HH:MM’

Returns

Plant Emissions Factor Data with columns=[‘file_year’, ‘file_month’, ‘GENSETID’, ‘CO2E_EMISSIONS_FACTOR’, ‘CO2E_ENERGY_SOURCE’, ‘CO2E_DATA_SOURCE’]

Return type

pandas.DataFrame

Raises

Exception – Data is unavailable for dates prior to 05-2011
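Because requests before 05-2011 raise an exception, it can help to check the start date before calling the downloader. The sketch below assumes that "05-2011" means the first instant of May 2011; that reading, the constant and the helper name are assumptions for this example only.

```python
from datetime import datetime

# Format documented for start_date / end_date: 'yyyy/mm/dd HH:MM'
DATE_FORMAT = "%Y/%m/%d %H:%M"

# Assumption: "05-2011" is read as 2011/05/01 00:00.
EARLIEST_SUPPORTED = datetime(2011, 5, 1)


def is_supported_start(start_date):
    """Return True if start_date is not earlier than the assumed cutoff.

    Raises ValueError (from strptime) if the string is not in the
    documented format.
    """
    return datetime.strptime(start_date, DATE_FORMAT) >= EARLIEST_SUPPORTED
```

A caller could run this check first and skip or adjust the request when it returns False.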

nemed.downloader.download_pricesetter_files(start_time, end_time, cache)

Downloads NEM Price Setter files from the MMS table. The raw XML files are first cached as JSON, then read and returned as a pandas.DataFrame. The processed data only considers the marginal generator for the Energy market.

For further explanation of NEMPriceSetting refer to: https://aemo.com.au/-/media/files/electricity/nem/it-systems-and-change/nemde-queue/nemde_queue_users_guide.pdf?la=en

Parameters
  • cache (str) – Raw data location in local directory

  • start_time (str) – Start Time Period in format ‘yyyy/mm/dd HH:MM’

  • end_time (str) – End Time Period in format ‘yyyy/mm/dd HH:MM’

Returns

Columns, given as name (type) – description:

  • PeriodID (datetime) – The NEM market dispatch interval.

  • RegionID (str) – The NEM market region.

  • Price (float) – The market price for the dispatch interval.

  • Unit (str) – A DUID that contributes to setting the price (in most cases).

  • BandNo (int) – Trade band number of the unit’s contribution to price setting.

  • Increase (float) – The marginal increase (in MW) in the unit band for a 1 MW increase in energy demand for the region.

  • RRNBandPrice (float) – Unit band price as referred to the RRN (Regional Reference Node).

  • BandCost (float) – Amount in $/h (the Increase column multiplied by RRNBandPrice).

Return type

pandas.DataFrame
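The relationship between Increase, RRNBandPrice and BandCost can be checked with a small worked example. The rows below are made up for illustration, and the $/MWh unit for RRNBandPrice is inferred from BandCost being in $/h and Increase being in MW.

```python
def band_cost(increase_mw, rrn_band_price):
    """BandCost ($/h) = Increase (MW) x RRNBandPrice (assumed $/MWh)."""
    return increase_mw * rrn_band_price


# Hypothetical price-setter rows for one dispatch interval and region.
sample_rows = [
    {"Unit": "UNIT_A", "Increase": 0.75, "RRNBandPrice": 40.0},
    {"Unit": "UNIT_B", "Increase": 0.25, "RRNBandPrice": 80.0},
]

costs = [band_cost(r["Increase"], r["RRNBandPrice"]) for r in sample_rows]
# costs is [30.0, 20.0]
```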

nemed.downloader.download_unit_dispatch(start_time, end_time, cache, source_initialmw=False, source_scada=True, overwrite='scada', return_all=True, check=True, rm_negative=True)

Downloads historical generation dispatch data via NEMOSIS.

Parameters
  • start_time (str) – Start Time Period in format ‘yyyy/mm/dd HH:MM’

  • end_time (str) – End Time Period in format ‘yyyy/mm/dd HH:MM’

  • cache (str) – Raw data location in local directory

  • source_initialmw (bool) – Whether to download initialmw column from DISPATCHLOAD table, by default False

  • source_scada (bool) – Whether to download scada column from DISPATCH_UNIT_SCADA table, by default True

  • overwrite (str) – The data value to use in the returned ‘Dispatch’ column if there is a discrepancy between initialmw and scada. Must be one of [‘initialmw’, ‘scada’, ‘average’]. If either source_initialmw or source_scada is False, overwrite has no effect. By default ‘scada’.

  • return_all (bool) – Whether to return all columns or only [‘Time’, ‘DUID’, ‘Dispatch’], by default True.

  • check (bool) – Whether to check for, and remove duplicates after function is complete, by default True.

  • rm_negative (bool) – Whether to check for negative dispatch values in SCADA and replace them with zero, by default True.

Returns

Returns generation data as per NEMOSIS

Return type

pandas.DataFrame
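The interaction of `overwrite` and `rm_negative` described in the parameters can be sketched for a single pair of readings. This is an illustration of the documented behaviour under stated assumptions, not NEMED's implementation: the helper name is hypothetical, `None` stands for a source that was not downloaded, and negative SCADA values are assumed to be zeroed before the two sources are compared.

```python
def reconcile_dispatch(initialmw, scada, overwrite="scada", rm_negative=True):
    """Pick one 'Dispatch' value from an initialmw and a scada reading."""
    if overwrite not in ("initialmw", "scada", "average"):
        raise ValueError("overwrite must be 'initialmw', 'scada' or 'average'")

    # rm_negative: replace a negative SCADA reading with zero.
    if rm_negative and scada is not None and scada < 0:
        scada = 0.0

    # With only one source available, overwrite has no effect.
    if initialmw is None:
        return scada
    if scada is None:
        return initialmw

    # Both sources present: overwrite decides which value is used.
    if overwrite == "initialmw":
        return initialmw
    if overwrite == "scada":
        return scada
    return (initialmw + scada) / 2
```

For instance, readings of 100.0 (initialmw) and 98.0 (scada) give 98.0 with the default `overwrite='scada'` and 99.0 with `overwrite='average'`.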

nemed.downloader.read_plant_auxload_csv(select_columns=['EFFECTIVEFROM', 'DUID', 'PCT_AUXILIARY_LOAD'], coltype={'DUID': str, 'EFFECTIVEFROM': str, 'PCT_AUXILIARY_LOAD': float})

Reads the .csv file stored locally in the package, which maps auxiliary load data to each DUID. Users can update this .csv with custom or missing values if they wish.

Parameters
  • select_columns (list, optional) – Columns of the dataset to return, by default [‘EFFECTIVEFROM’, ‘DUID’, ‘PCT_AUXILIARY_LOAD’]

  • coltype (dict, optional) – Datatype corresponding to each field, by default {‘EFFECTIVEFROM’: str, ‘DUID’: str, ‘PCT_AUXILIARY_LOAD’: float}

Returns

Custom table containing columns=[‘EFFECTIVEFROM’, ‘DUID’, ‘PCT_AUXILIARY_LOAD’]

Return type

pandas.DataFrame
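The roles of `select_columns` and `coltype` can be pictured with a small standard-library analogue: keep only the requested columns and cast each one with the type given for it. This is a sketch of the idea only, not the package's code; the helper name, the CSV content, the date strings and the extra NOTE column are all made up, and the real file's layout may differ.

```python
import csv
import io


def read_typed_csv(text, select_columns, coltype):
    """Return rows as dicts limited to select_columns, cast via coltype."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {col: coltype[col](row[col]) for col in select_columns}
        for row in reader
    ]


# Hypothetical file content, not the data shipped with the package.
SAMPLE_CSV = """EFFECTIVEFROM,DUID,PCT_AUXILIARY_LOAD,NOTE
2020-01-01,UNIT_A,5.5,made-up
2020-01-01,UNIT_B,0.0,made-up
"""

aux_rows = read_typed_csv(
    SAMPLE_CSV,
    select_columns=["DUID", "PCT_AUXILIARY_LOAD"],
    coltype={"DUID": str, "PCT_AUXILIARY_LOAD": float},
)
```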