Process module

Process functions for calculations based on downloaded data

nemed.process.aggregate_data_by(data, by)

Aggregate the metrics of the total emissions dataset (Sent Out Generation, Total Emissions and Intensity Index) over the period specified by the by parameter.

Parameters
  • data (pandas.DataFrame) – Input dataframe; must correspond to the output from get_total_emissions with the by argument set to None.

  • by (str) – One of [‘interval’, ‘hour’, ‘day’, ‘month’, ‘year’]

Returns

Data is returned with the following columns (name, type, description):

  • TimeBeginning (datetime) – Timestamp for the start of the interval or aggregation period. Only returned if the by parameter is set.

  • TimeEnding (datetime) – Timestamp for the end of the interval or aggregation period.

  • Region (str) – The NEM region corresponding to the data. The ‘NEM’ value reflects all regions and is returned if filter_regions was None in get_total_emissions.

  • Energy (float) – The total energy for the corresponding region and time. This is sent-out energy if generation_sent_out was True in get_total_emissions.

  • Total_Emissions (float) – The total emissions for the corresponding region and time.

  • Intensity_Index (float) – The intensity index, calculated as the total emissions divided by the (sent-out) energy.

Return type

pandas.DataFrame

Raises

Exception – Invalid dataframe input.
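
The aggregation described above can be illustrated with a minimal pandas sketch. This is not the nemed implementation: the helper aggregate_sketch, the sample values, and the choice to assign end-stamped intervals to a period by shifting the timestamp back one second are all assumptions made for illustration. It sums Energy and Total_Emissions per period and region, then recomputes Intensity_Index from those sums, in line with the column description.

```python
import pandas as pd

# Hypothetical interval-level data using the column names documented above.
# All values are made up for illustration.
data = pd.DataFrame({
    "TimeEnding": pd.to_datetime([
        "2022-01-01 00:05", "2022-01-02 00:00",  # both fall within 1 January
        "2022-01-02 00:05", "2022-01-02 00:10",  # both fall within 2 January
    ]),
    "Region": ["NSW1"] * 4,
    "Energy": [100.0, 300.0, 200.0, 200.0],          # MWh
    "Total_Emissions": [80.0, 120.0, 100.0, 140.0],  # tCO2-e
})


def aggregate_sketch(df: pd.DataFrame, freq: str = "D") -> pd.DataFrame:
    """Sum energy and emissions per period and region, then recompute the index."""
    out = df.copy()
    # Intervals are stamped at their end, so shift back slightly before flooring.
    # This keeps an interval ending exactly at midnight in the preceding day.
    out["TimeBeginning"] = (out["TimeEnding"] - pd.Timedelta(seconds=1)).dt.floor(freq)
    result = out.groupby(["TimeBeginning", "Region"], as_index=False)[
        ["Energy", "Total_Emissions"]
    ].sum()
    # Intensity is total emissions divided by total energy for the period,
    # not the mean of the per-interval intensities.
    result["Intensity_Index"] = result["Total_Emissions"] / result["Energy"]
    return result


daily = aggregate_sketch(data)
print(daily)
```

For the first day in this sample, dividing the summed emissions by the summed energy gives 0.5, whereas averaging the two interval intensities (0.8 and 0.4) would give 0.6. Recomputing the index from the aggregated totals weights each interval by its energy.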

nemed.process.get_marginal_emitter(start_time, end_time, cache)

Retrieves the marginal emissions intensity for each dispatch interval and region. This factor is the weighted sum of the emissions intensities of the generators contributing to price-setting. Although not necessarily common, there may be times when multiple technology types contribute to the marginal emissions. Note, however, that the ‘DUID’ and ‘CO2E_ENERGY_SOURCE’ returned reflect only the plant that makes the greatest contribution towards price-setting.

Parameters
  • start_time (str) – Start Time Period in format ‘yyyy/mm/dd HH:MM’

  • end_time (str) – End Time Period in format ‘yyyy/mm/dd HH:MM’

  • cache (str) – Raw data location in local directory

Returns

Data is returned with the following columns (name, type, description):

  • Time (datetime) – Timestamp reported as the end of the dispatch interval.

  • Region (str) – The NEM region corresponding to the marginal emitter data.

  • Intensity_Index (float) – The intensity index [tCO2-e/MWh] of the price-setting generators, weighted by their contributions.

  • DUID (str) – Unit identifier of the generator with the largest contribution on the margin for that Time-Region.

  • CO2E_ENERGY_SOURCE (str) – Energy source of the unit with the largest contribution on the margin for that Time-Region.

Return type

pandas.DataFrame
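
The relationship between the weighted intensity and the reported DUID can be sketched for a single dispatch interval and region. This is an illustration only, not the nemed implementation: the function marginal_summary, the unit names, the contribution shares and the intensities are hypothetical, and normalising by the total contribution is an assumption.

```python
# Hypothetical price-setting units for one dispatch interval and region.
# "contribution" is each unit's share of price-setting (made-up values);
# "intensity" is the unit's emissions factor in tCO2-e/MWh (made-up values).
price_setters = [
    {"DUID": "COAL1", "CO2E_ENERGY_SOURCE": "Black coal",
     "contribution": 0.7, "intensity": 0.9},
    {"DUID": "GAS1", "CO2E_ENERGY_SOURCE": "Natural Gas",
     "contribution": 0.3, "intensity": 0.5},
]


def marginal_summary(units):
    """Return the contribution-weighted intensity and the dominant unit."""
    total_weight = sum(u["contribution"] for u in units)
    # Weighted sum of intensities, normalised in case the shares do not sum to 1.
    intensity_index = (
        sum(u["contribution"] * u["intensity"] for u in units) / total_weight
    )
    # Only the unit with the largest contribution is reported by name.
    dominant = max(units, key=lambda u: u["contribution"])
    return {
        "Intensity_Index": intensity_index,
        "DUID": dominant["DUID"],
        "CO2E_ENERGY_SOURCE": dominant["CO2E_ENERGY_SOURCE"],
    }


summary = marginal_summary(price_setters)
print(summary)
```

In this sample the intensity index (about 0.78) blends both units, while DUID and CO2E_ENERGY_SOURCE describe only the coal unit. The reported energy source therefore need not account for the whole index on its own.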

nemed.process.get_total_emissions_by_DI_DUID(start_time, end_time, cache, filter_regions=None, generation_sent_out=True, assume_energy_ramp=True, dropna_co2factors=True, return_all=False)

Retrieve the total emissions for each generation unit per dispatch interval.

Parameters
  • start_time (str) – Start Time Period in format ‘yyyy/mm/dd HH:MM’

  • end_time (str) – End Time Period in format ‘yyyy/mm/dd HH:MM’

  • cache (str) – Raw data location in local directory

  • filter_regions (list(str)) – NEM regions to filter for while retrieving the data, as a list, by default None to collect all region data

  • generation_sent_out (bool) – Considers ‘sent_out’ generation (accounting for auxiliary loads) as opposed to ‘as generated’ in calculations, by default True

  • assume_energy_ramp (bool) – Uses a linear ramp between dispatch SCADA points as opposed to a stepped function, by default True

  • dropna_co2factors (bool) – Removes data (generation) entries which do not have a CO2E_EMISSIONS_FACTOR mapped to them, by default True

  • return_all (bool) – Returns the entire table with all columns as opposed to a tidied-up table, by default False

Returns

When return_all = False and generation_sent_out = True, data is returned with the following columns (name, type, description):

  • DUID (str) – Generator identifier.

  • Time (datetime) – Timestamp for the end of the interval.

  • Region (str) – The NEM region corresponding to the data.

  • Plant_Emissions_Intensity (float) – The CO2E_EMISSIONS_FACTOR [tCO2-e/MWh] corresponding to the DUID.

  • Energy (float) – The energy [MWh] (as generated), calculated as a step or ramp depending on assume_energy_ramp.

  • PCT_AUXILIARY_LOAD (int) – The percentage of auxiliary load corresponding to the DUID.

  • Energy_SO (float) – The energy [MWh] (sent out), calculated from Energy and PCT_AUXILIARY_LOAD.

  • Total_Emissions (float) – The emissions [tCO2-e] for the DUID and Time, based on Energy_SO and Plant_Emissions_Intensity.

Return type

pandas.DataFrame
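
How the returned columns relate to one another can be sketched for a single unit and dispatch interval. The formulas here are assumptions made for illustration and are not confirmed as the nemed implementation: a 5-minute interval, a ramp taken as the average of the two SCADA readings, a step taken as the end-of-interval reading held constant, and sent-out energy taken as Energy reduced by PCT_AUXILIARY_LOAD percent. The function names and input values are hypothetical.

```python
INTERVAL_HOURS = 5 / 60  # assumption: 5-minute dispatch intervals


def interval_energy(mw_start, mw_end, assume_energy_ramp=True):
    """Energy [MWh] generated over one interval from two SCADA readings [MW]."""
    if assume_energy_ramp:
        # Linear ramp: average of the readings at the start and end of the interval.
        return (mw_start + mw_end) / 2 * INTERVAL_HOURS
    # Stepped function (assumed): the end-of-interval reading is held constant.
    return mw_end * INTERVAL_HOURS


def unit_emissions(mw_start, mw_end, pct_auxiliary_load, emissions_factor,
                   assume_energy_ramp=True):
    """Derive Energy, Energy_SO and Total_Emissions for one DUID and interval."""
    energy = interval_energy(mw_start, mw_end, assume_energy_ramp)
    # Assumed relationship: sent-out energy excludes the auxiliary load share.
    energy_so = energy * (1 - pct_auxiliary_load / 100)
    # Emissions [tCO2-e] = sent-out energy [MWh] x intensity [tCO2-e/MWh].
    total_emissions = energy_so * emissions_factor
    return {
        "Energy": energy,
        "Energy_SO": energy_so,
        "Total_Emissions": total_emissions,
    }


# Made-up inputs: output rises from 100 MW to 140 MW, 5% auxiliary load,
# emissions factor of 0.9 tCO2-e/MWh.
ramp = unit_emissions(100.0, 140.0, 5, 0.9, assume_energy_ramp=True)
step = unit_emissions(100.0, 140.0, 5, 0.9, assume_energy_ramp=False)
print(ramp)
print(step)
```

With these inputs the ramp case gives roughly 10 MWh generated, 9.5 MWh sent out and 8.55 tCO2-e. The stepped case gives a higher figure because, under the assumption used here, the higher end-of-interval reading is applied to the whole interval.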