cal_proc package

Submodules

cal_proc.cdp module

File containing all CDP instrument processor classes.

class cal_proc.cdp.CDP(ds)[source]

Bases: cal_proc.generic.Generic

Parses and processes calibration data files for instrument: CDP.

CDP: Cloud Droplet Probe

update(largs)[source]

Make any change to the object.

Parameters

largs (list) – List of lists of arbitrary arguments to apply to nc

Todo

This is not currently used and looks overly complicated. Change it so that it is more useful?

Examples

Warning

These usage examples are now out of date.

The largs list is from

$ python cal_ncgen.py --update option

and may be one of the following types;

  • A list [of lists] of cdl files of data to be written into the nc object. This option is chosen based on extension .cdl. If more than one cdl file is offered then it must be given as a single entry. eg;

    $ python cal_ncgen.py -u PCASP_20170725.cdl PCASP_20171114.cdl
    
  • A list [of lists] of PCASP diameter calibration files output from cstodconverter. This option is chosen based on filename ending with d.csv. If more than one calibration file is offered then it must be given as a single entry. eg;

    $ python cal_ncgen.py -u 20170725_P1_cal_results_PSLd.csv 20171114_P1_cal_results_PSLd.csv
    
  • A list of nc attribute/value or variable/value pairs. For attributes that are strings, the value is concatenated to the existing string with a delimiting comma (non-char attributes will probably throw an error). Variables will be appended to the end of the existing variable numpy array. Note that variable attributes cannot be appended to. The attribute/variable names must be given exactly as in the existing nc file. Any containing group/s are given with forward slashes, eg

    $ python cal_ncgen.py -u bin_cal/time 2769 2874 -u bin_cal/applies_to C027-C055 C057-C071 2818.5, 2864.5
    

Note that any filenames containing spaces must be enclosed in quotes. All files are assumed to be the same type as the first filename in the list.

update_bincal_from_file(cal_file, vars_d)[source]

Appends bin calibration data in calibration file to that in nc file.

Parameters
  • cal_file (str or pathlib) – Filename of the CDP calibration csv file to be read. The type of calibration file, a scattering cross-section or diameters file, is automatically determined. Diameter files are recognised as starting with the string ‘input file’ as well as possibly having ‘dia’ in the filename or ending with ‘d.csv’. The scattering cross-section files are recognised as containing ‘scs’ or ‘master_calibration’ in the filename.

  • vars_d (dict) – Dictionary of any additional variables associated with those contained within the datafile. At the very least this should contain any associated coordinate variables, eg time.
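The filename rules described for cal_file can be sketched as follows. This is a minimal illustration only: guess_cdp_calfile_type is a hypothetical helper and not part of cal_proc, the order in which the rules are checked is an assumption, and the check of the file contents (a diameter file starting with ‘input file’) is reduced to an optional first_line argument. The returned labels are the f_type values listed for cal_proc.reader.opc_calfile.

```python
from pathlib import Path


def guess_cdp_calfile_type(cal_file, first_line=""):
    """Guess whether a CDP calibration file holds diameters or cross-sections.

    Hypothetical helper; the precedence of the rules is an assumption.
    Returns 'cdp_d', 'cdp_cs', or None if no rule matches.
    """
    name = Path(cal_file).name.lower()

    # Diameter files: contents start with 'input file', or the filename
    # contains 'dia' or ends with 'd.csv'.
    if first_line.strip().lower().startswith("input file"):
        return "cdp_d"
    if "dia" in name or name.endswith("d.csv"):
        return "cdp_d"

    # Scattering cross-section files: 'scs' or 'master_calibration' in name.
    if "scs" in name or "master_calibration" in name:
        return "cdp_cs"

    return None
```

A file that matches none of the rules returns None here so that the caller can decide how to report it; the real method may behave differently.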

cal_proc.generic module

Generic instrument class.

cal_proc.generic.walk_dstree(ds)[source]

Recursive Dataset group generator.

from: http://unidata.github.io/netcdf4-python/netCDF4/index.html#section2

Parameters

ds (netCDF4.Dataset) – Dataset
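A recursive group generator of this kind can be sketched as below. To keep the example runnable without a netCDF file, FakeGroup is a hypothetical stand-in that only mimics the groups mapping of a netCDF4.Dataset. The name walk_groups is also illustrative, and what it yields (each group in depth-first order) may differ from what walk_dstree actually yields.

```python
class FakeGroup:
    """Hypothetical stand-in exposing only `.groups`, like a netCDF4 group."""

    def __init__(self, name, children=()):
        self.name = name
        self.groups = {child.name: child for child in children}


def walk_groups(top):
    """Yield every group below `top`, depth first."""
    for group in top.groups.values():
        yield group
        yield from walk_groups(group)


root = FakeGroup("/", [FakeGroup("bin_cal", [FakeGroup("raw")]),
                       FakeGroup("flow_cal")])
names = [g.name for g in walk_groups(root)]
```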

cal_proc.generic.append_time(otime, ntime, concat_axis=0)[source]

Appends time variable/s to existing time coordinate.

Note that increasing the size of the time coordinate also increases the size of all of the dependent variables.

Parameters
  • otime (netCDF4.Variable) – Original Dataset time coordinate. As it is a coordinate, its dimensionality is 1d.

  • ntime (netCDF4.Variable or iterable) – New time variables to append to otime. May either be a netCDF4 variable or a simple iterable of values. These values may be datetime objects. If the values are strings then an attempt is made to convert them. If they are numbers then it is assumed that units and calendar are compatible with those in otime.

Returns

A netCDF4 variable with the same units and calendar as otime.
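The handling of mixed ntime values can be illustrated with a standard-library sketch. Everything here is an assumption made for illustration: to_days_since is a hypothetical helper, the units are taken to be days since an arbitrary epoch, and strings are parsed as ISO 8601 only. The real function works with the units and calendar of otime and returns a netCDF4 variable rather than a list.

```python
from datetime import datetime


def to_days_since(values, epoch):
    """Convert mixed time values to float days since `epoch`.

    Hypothetical helper: datetimes are converted, strings are parsed as
    ISO 8601, and plain numbers are assumed to be in the target units.
    """
    converted = []
    for value in values:
        if isinstance(value, str):
            value = datetime.fromisoformat(value)
        if isinstance(value, datetime):
            value = (value - epoch).total_seconds() / 86400.0
        converted.append(float(value))
    return converted


epoch = datetime(2010, 1, 1)  # assumed reference date
old_times = [2769.0, 2874.0]
new_times = to_days_since([datetime(2018, 1, 1), "2018-01-02", 2925], epoch)
all_times = old_times + new_times
```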

class cal_proc.generic.Generic(ds)[source]

Bases: object

Parent class for general instrument parsing and processing.

Generic forms the basis for all specific instrument processor classes.

update_ver()[source]

Includes program version information as root attribute.

Version information is determined from cal_proc.__init__(). Any existing version strings shall be overwritten.

update_hist(update=None)[source]

Updates the global history attribute.

The history nc attribute is a single string of comma-delimited text.

Parameters

update (str or list) – Update for history string. If None (default) then auto-generate string based on today’s datetime. If given then append update/s to history attribute string. Any <now> or <today> strings are changed to today’s datetime.
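The placeholder substitution and appending described for update can be sketched as follows. append_history is a hypothetical helper operating on a plain string, and both the timestamp format and the wording of the auto-generated entry are assumptions; the actual method writes to the global history attribute of the dataset.

```python
from datetime import datetime


def append_history(history, update=None, now=None):
    """Return `history` with `update` appended, comma separated.

    Hypothetical helper; the timestamp format is an assumption.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%d %H:%M")

    if update is None:
        # No text given: fall back to an auto-generated entry (assumed wording).
        update = [f"{stamp} file updated"]
    elif isinstance(update, str):
        update = [update]

    # Replace the <now> and <today> placeholders with the timestamp.
    entries = [u.replace("<now>", stamp).replace("<today>", stamp)
               for u in update]
    return ", ".join([history] + entries) if history else ", ".join(entries)


hist = append_history("2017-07-25 initial creation",
                      "<today> added PSL calibration",
                      now=datetime(2017, 11, 14, 9, 30))
```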

update_user(update=None)[source]

Updates the global username attribute.

The username nc attribute is a single string of comma-delimited text.

Parameters

update (str or list) – Update for username string. If None (default) then auto-generate string based on previous entries in netCDF and ask user. String usually given as username <user@email>. If given then append username/s to existing attribute string.

update_attr(attr, update=None)[source]

Updates an attribute by appending update.

Root and group attributes are generally strings and should not be changed. However, they may be appended to; it is common to build up a comma-delimited string. If attr does not exist then it is not created by this method. If a new attribute is required then it is more sound to create a new nc file from scratch that includes this attribute.

Parameters
  • attr (str) – Name of attribute to update. If the attribute is in a group instead of the root then the full path of the attribute must be included with / separators. If attr does not exist within the dataset then do not create but return.

  • update (str or list) – Update for attribute. If None (default) then just return. If string then append to existing attr string with comma separator. If list of strings then append comma-delimited string generated from list.
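The rules for attr and update can be sketched with a nested dict standing in for the dataset. append_attr and the dict layout are hypothetical, and the exact separator (a comma followed by a space) is an assumption. The sketch shows the three cases: a None update returns immediately, a missing attribute or group is never created, and a list is joined before being appended.

```python
def append_attr(tree, attr, update=None, sep=", "):
    """Append `update` to an existing string attribute in a nested dict.

    Hypothetical stand-in: `tree` maps group names to sub-dicts and
    attribute names to strings, mimicking groups in a dataset.
    Returns True if the attribute was changed.
    """
    if update is None:
        return False
    if not isinstance(update, str):
        update = sep.join(update)

    # Split 'group/subgroup/name' into the group path and attribute name.
    *groups, name = attr.strip("/").split("/")
    node = tree
    for group in groups:
        node = node.get(group)
        if not isinstance(node, dict):
            return False  # Group path does not exist.

    if not isinstance(node.get(name), str):
        return False  # Missing attributes are never created.

    node[name] = node[name] + sep + update
    return True


ds = {"history": "created", "bin_cal": {"applies_to": "C027-C055"}}
changed = append_attr(ds, "bin_cal/applies_to", "C057-C071")
missing = append_attr(ds, "bin_cal/comment", "ignored")
```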

change_val(var, old_val, new_val)[source]

Changes a single variable/attribute value.

The variable or attribute name must be given along with the old value, old_val, that is to be changed to new_val. If old_val is not found then nothing is done.

Todo

Not implemented

append_var(var, coord_vals, var_vals)[source]

Appends to an existing netCDF4 variable and its associated coordinate.

Todo

Redundant/not currently used.

The extra coordinate values are appended to the coordinate of var in self. This automatically creates the same number of masked entries in all of the variables that depend on that coordinate.

Parameters
  • var (String) – String of path/name of variable. This can be found with os.path.join(self.ds[var].group().path,self.ds[var].name)

  • coord_vals (List or array) – Iterable of values to append to the end of unlimited coordinate of variable var.

  • var_vals (List or array) – Iterable of values to append to the end of var. Must be the same length as coord_vals in the unlimited dimension.

Note

This is a bit complicated/not sensible. It is possible/probable to add a variable (along with the coord) then append to a different variable that uses the same coordinate which then writes the same coordinate values into the coordinate again.

append_dict(var_d)[source]

Appends multiple variables with a single coordinate.

Multiple variable values that use the same coordinate can be appended to existing dataset variables in one go. Variables that do not already exist in the dataset are ignored. Use add instead…

Parameters

var_d (dict) – Dictionary of multiple variable values to be appended. Dictionary keys are the variable path+name strings and the dictionary values are either a netCDF variable or a sub-dictionary of values and attributes.

Todo

Currently does not accept netCDF4 variable values; only iterables of data are accepted.

var_d = {coord: netCDF4.Variable,
         var1:  [1,2,3,4,5],
         var2:  {'_data': [1,2,3,4,5],
                 'var2_attr1': 'var2 attribute 1',
                 'var2_attr2': 'var2 attribute 2', ...},
         var3:  'Fred'}

Note that all variables should be the same length if they are list-like. Any variables not the same length as the maximum length variable will be broadcast so that they are longer. This could well have unintended consequences; however, it does mean that variables that are the same thing repeated for all coordinate values (eg var3) will be replicated automatically.

Note

This is not done yet!

There is nothing special about the coordinate variable; the function identifies the coordinate as being the variable with the same name as its dimension.
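The intended broadcasting of var_d entries can be sketched with plain lists. broadcast_values is a hypothetical helper, and since the documentation notes that this behaviour is not finished, the sketch makes a conservative assumption borrowed from NumPy broadcasting: only single-valued entries (such as var3) are repeated, and any other length mismatch raises an error. Attributes in sub-dictionaries are ignored here.

```python
def broadcast_values(var_d):
    """Return {name: list} with single values repeated to the longest length.

    Hypothetical helper: scalars and strings are repeated, and sub-dicts
    contribute only their '_data' entry.
    """
    data = {}
    for name, value in var_d.items():
        if isinstance(value, dict):
            value = value["_data"]
        if isinstance(value, str) or not hasattr(value, "__iter__"):
            value = [value]
        data[name] = list(value)

    longest = max(len(v) for v in data.values())
    out = {}
    for name, v in data.items():
        if len(v) == 1:
            v = v * longest  # Repeat the single value for every coordinate.
        elif len(v) != longest:
            raise ValueError(f"cannot broadcast {name!r} to length {longest}")
        out[name] = v
    return out


result = broadcast_values({"time": [1, 2, 3],
                           "var2": {"_data": [4, 5, 6], "units": "um"},
                           "var3": "Fred"})
```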

append_dataset(ds, force_append=['username', 'history'], exclude=[])[source]

Adds groups, attributes, dimensions, and variables from ds.

Attributes of self.ds shall take priority over those of the same name in ds; such attribute values of ds shall be ignored. The exception is if the attribute key is included in force_append. In this case the resultant attribute shall be a comma-delimited string combining the individual attributes, with that from ds appended to that of self.ds.

Variables from ds are appended to the same variable in self. The variables are sorted by the unlimited dimension. Variables only in ds shall be added to self.

Any groups, attributes, or variables in ds that are not to be added or appended can be specified as a list with exclude.

Parameters
  • ds (netCDF4.Dataset) – Dataset whose groups, attributes, dimensions, and variables are added to self.ds.

  • force_append (list) – List of any root or group attribute strings that should always be appended to, even if they are identical. Default is ['username', 'history']. Group attribute strings must include full path.

  • exclude (list) – List of attribute or variable name strings (but not variable attributes) that are not to be added or appended to.
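The attribute priority rules can be sketched on plain dicts. merge_attrs is a hypothetical helper that handles a single level of string attributes only; groups, dimensions, variables, and the sorting by unlimited dimension are not modelled. That attributes found only in ds are added is an assumption based on the method summary.

```python
def merge_attrs(own, other, force_append=("username", "history"), exclude=()):
    """Merge attribute dict `other` into a copy of `own`.

    Hypothetical helper working on plain dicts of string attributes.
    """
    merged = dict(own)
    for key, value in other.items():
        if key in exclude:
            continue
        if key not in merged:
            merged[key] = value  # Only in `other`: add it (assumed).
        elif key in force_append:
            merged[key] = f"{merged[key]}, {value}"  # Always append.
        # Otherwise the existing value takes priority and `value` is ignored.
    return merged


own = {"title": "PCASP cal", "history": "created", "username": "A"}
other = {"title": "other title", "history": "updated", "username": "A",
         "comment": "new", "references": "skip me"}
merged = merge_attrs(own, other, exclude=["references"])
```

Note that username becomes "A, A" in this sketch, following the statement that force_append attributes are appended even if they are identical.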

cal_proc.pcasp module

File containing all PCASP instrument processor classes.

class cal_proc.pcasp.PCASP(ds)[source]

Bases: cal_proc.generic.Generic

Parses and processes calibration data files for instrument: PCASP.

PCASP: Passive Cavity Aerosol Spectrometer Probe

update(largs)[source]

Make any change to the pcasp object.

Parameters

largs (list) – List of lists of arbitrary arguments to apply to nc

Todo

This is not currently used and looks overly complicated. Change it so that it is more useful?

Examples

Warning

These usage examples are now out of date.

The largs list is from

$ python cal_ncgen.py --update option

and may be one of the following types;

  • A list [of lists] of cdl files of data to be written into the nc object. This option is chosen based on extension .cdl. If more than one cdl file is offered then it must be given as a single entry. eg;

    $ python cal_ncgen.py -u PCASP_20170725.cdl PCASP_20171114.cdl
    
  • A list [of lists] of PCASP diameter calibration files output from cstodconverter. This option is chosen based on filename ending with d.csv. If more than one calibration file is offered then it must be given as a single entry. eg;

    $ python cal_ncgen.py -u 20170725_P1_cal_results_PSLd.csv 20171114_P1_cal_results_PSLd.csv
    
  • A list of nc attribute/value or variable/value pairs. For attributes that are strings, the value is concatenated to the existing string with a delimiting comma (non-char attributes will probably throw an error). Variables will be appended to the end of the existing variable numpy array. Note that variable attributes cannot be appended to. The attribute/variable names must be given exactly as in the existing nc file. Any containing group/s are given with forward slashes, eg

    $ python cal_ncgen.py -u bin_cal/time 2769 2874 -u bin_cal/applies_to C027-C055 C057-C071 2818.5, 2864.5
    

Note that any filenames containing spaces must be enclosed in quotes. All files are assumed to be the same type as the first filename in the list.

update_bincal_from_file(cal_file, vars_d)[source]

Appends bin calibration data in calibration file to that in nc file.

Parameters
  • cal_file (str or pathlib) – Filename of the PCASP calibration csv file to be read. The type of calibration file, a scattering cross-section or diameters file, is automatically determined. Diameter files are recognised as starting with the string ‘input file’ as well as possibly having ‘dia’ in the filename or ending with ‘d.csv’. The scattering cross-section files are recognised as containing ‘scs’ in the filename or ending with ‘cs.csv’.

  • vars_d (dict) – Dictionary of any additional variables associated with those contained within the datafile. At the very least this should contain any associated coordinate variables, eg time.

cal_proc.reader module

File reader/parser utilities.

cal_proc.reader.opc_calfile(cal_file, f_type='pcasp_d', reject_bins=None, invalid=-9999)[source]

Parses calibration files output by various calibration programs.

Currently reads in csv files from Phil’s calibration programs along with Angela’s calibration program.

Note

This has been ripped straight from datafile_utils.py. Probably can do this better.

Parameters
  • cal_file (str or pathlib) – Filename of calibration file to be read.

  • f_type (str) –

    Type of calibration file to read. One of:

    'pcasp_d'

    output of CDtoDConverter.exe

    'pcasp_cs'

    output of pcaspcal.exe

    'cdp_d'

    output of CDtoDConverter.exe

    'cdp_cs'

    output from ADs calibration program

  • reject_bins (list) – List of integer bin numbers that are not returned. Default is None.

  • invalid (int or str) – Value of invalid values as used by input file. Default is -9999.

Returns

Dictionary of metadata and data masked arrays.

cal_proc.temperature module

File containing all primary temperature instrument processor classes.

Todo

This needs significant work

class cal_proc.temperature.NDIT(ds)[source]

Bases: cal_proc.generic.Generic

Parses and processes calibration data files for instrument: NDIT

NDIT: Non-deiced temperature probe

update(dvars)[source]

Updates attributes and variables of ndit nc object.

Parameters

dvars (dict) –

class cal_proc.temperature.DIT(ds)[source]

Bases: cal_proc.generic.Generic

Parses and processes calibration data files for instrument: DIT

DIT: Deiced temperature probe

update(dvars)[source]

Updates attributes and variables of dit nc object.

Parameters

dvars (dict) –

cal_proc.wcm2000 module

File containing all SEA WCM-2000 total water probe processor classes.

class cal_proc.wcm2000.WCM2000(ds)[source]

Bases: cal_proc.generic.Generic

Parses and processes calibration data files for instrument: WCM2000

WCM2000: SEA WCM-2000 total water probe

update(dvars, verbose=False)[source]

Updates attributes and variables of wcm2000 nc object.

Parameters
  • dvars (dict) – Dictionary of attribute/variable key and value pairs. The dict keys are the attribute/variable names, including group paths relative to the root. Each dict value is either a scalar or iterable data value or, if applied to a variable (but not an attribute), a sub-dict that must include at least the key ‘data’. Other keys in the sub-dicts are attributes associated with that variable. The data value is a scalar or iterable to append to that attribute/variable. If a variable is given that does not already exist then it is ignored.

  • verbose (bool) – If True then print info to stdout for each update. Default is False.

Example

If the netCDF4 object self has only the following variables:

float32 time(time)
    standard_name: time
    long_name: time of calibration

    time = 569., 669.;

group /TWC:
    float32 r100(time)
        long_name: TWC element resistance at 100deg C
        units: milliohm

    float32 dtdr(time)
        long_name: Change in TWC element resistance with temperature
        units: deg C / milliohm

    r100 = 31.4473, 31.3362;
    dtdr = 33.9276, 33.8165;

Then the following call,

self.update({'time': 769,
             'TWC/r100': {'data': 31.2251,
                          'comment': 'Added comment'},
             'TWC/dtdr': 33.7064})

Shall result in the following nc structure:

float32 time(time)
    standard_name: time
    long_name: time of calibration

    time = [569., 669., 769.];

group /TWC:
    float32 r100(time)
        long_name: 'TWC element resistance at 100deg C'
        units: 'milliohm'
        comment: 'Added comment'

    float32 dtdr(time)
        long_name: 'Change in TWC element resistance with temperature'
        units: 'deg C / milliohm'

    r100 = [31.4473, 31.3362, 31.2251];
    dtdr = [33.9276, 33.8165, 33.7064];

Attribute and variable names may be given as strings, as in the example above, or as the netCDF4 object variables. Efforts are made to coerce given variable data into the correct type; if this is impossible the update shall be ignored.
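The way dvars entries are interpreted can be sketched with a dict-based stand-in for the nc object. apply_update and the {'data': ..., 'attrs': ...} layout are hypothetical and exist only to make the example runnable without a netCDF file; type coercion and attribute (as opposed to variable) updates are not modelled.

```python
def apply_update(variables, dvars):
    """Append values from `dvars` to a dict-based stand-in for nc variables.

    Hypothetical helper: `variables` maps a path to
    {'data': list, 'attrs': dict}. Unknown variables are ignored.
    """
    for path, value in dvars.items():
        if path not in variables:
            continue  # Variables that do not already exist are ignored.
        attrs = {}
        if isinstance(value, dict):
            # Sub-dict: 'data' holds the values, other keys are attributes.
            attrs = {k: v for k, v in value.items() if k != "data"}
            value = value["data"]
        if isinstance(value, str) or not hasattr(value, "__iter__"):
            value = [value]
        variables[path]["data"].extend(value)
        variables[path]["attrs"].update(attrs)
    return variables


nc = {"time": {"data": [569.0, 669.0], "attrs": {}},
      "TWC/r100": {"data": [31.4473, 31.3362], "attrs": {"units": "milliohm"}},
      "TWC/dtdr": {"data": [33.9276, 33.8165], "attrs": {}}}
apply_update(nc, {"time": 769,
                  "TWC/r100": {"data": 31.2251, "comment": "Added comment"},
                  "TWC/dtdr": 33.7064,
                  "TWC/missing": 1.0})
```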

Module contents

Import all cal parser functions