
parameters

Module for LUTE parameter objects.

This module contains objects that define LUTE TaskParameters. It is separate from the pydantic model definitions included in lute.io.models. This allows LUTE first-party code to run without pydantic validation. Validation is still required to have occurred at some point to enter correct values into the database.

AnalysisHeader

Bases: ContainerBase

Header information for LUTE analysis runs.

Source code in lute/io/parameters.py
class AnalysisHeader(ContainerBase):
    """Header information for LUTE analysis runs."""

    _schema: Dict[str, Any] = {}
    _dict: Dict[str, Any] = {}

    title: str
    experiment: str
    run: Union[str, int]
    date: str
    lute_version: Union[float, str]
    task_timeout: int
    work_dir: str

    def __init__(self, schema: Dict[str, Any], *args, **kwargs):
        self._dict = {}
        self._schema = schema
        handle_field_attrs(self, *args, **kwargs)

ParameterConfig

Bases: ContainerBase

Configuration for parameters model.

The Config class holds Pydantic configuration. A number of LUTE-specific configuration options have also been placed here.

Attributes:

Name Type Description
version_specifier Optional[int]

An indicator of how to interpret the version information. An integer constructed from enumerators of the VersionSpecifier enum (lute.tasks.dataclasses), or a bitwise OR thereof. If None, no version information is available.

task_version Optional[str]

The version information. This field may be filled dynamically. Interpretation of the information (if present) is determined by the version_specifier. It may be, e.g., a JSON string containing a git commit hash, a git diff, or a plain version string (e.g. v2).

version_location Optional[str]

None. Indicate where the version info should be taken from. E.g. a repository. Can be filled by a validator dynamically if necessary. This is used by the IO infrastructure to determine how to record version.

version_diff_args Optional[List[str]]

None. Provide arguments to git diff if using a diff as part of the versioning strategy. This is used by the IO infrastructure to determine how to record version.

run_directory Optional[str]

None. If set, it should be a valid path. The Task will be run from this directory. This may be useful for some Tasks which rely on searching the working directory.

set_result bool

False. If True, the model has information about setting the TaskResult object from the parameters it contains. E.g. it has an output parameter which is marked as the result. The result can be set with a field value of is_result=True on a specific parameter, or using result_from_params and a validator.

result_from_params Optional[str]

None. Optionally used to define results from information available in the model using a custom validator. E.g. use an outdir and filename field to set result_from_params=f"{outdir}/{filename}", etc. Only used if set_result==True.

result_summary Optional[str]

None. Defines a result summary that can be known after processing the Pydantic model. Use of summary depends on the Executor running the Task. All summaries are stored in the database, however. Only used if set_result==True

impl_schemas Optional[str]

Specifies the schemas the output/results conform to. Only used if set_result==True.

short_flags_use_eq bool

False. If True, "short" command-line args are passed as -x=arg. ThirdPartyTask-specific.

long_flags_use_eq bool

False. If True, "long" command-line args are passed as --long=arg. ThirdPartyTask-specific.
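
The effect of the two `*_use_eq` options can be illustrated with a minimal sketch. The helper `format_flag` is hypothetical and not part of LUTE. It also assumes that when the option is False the flag and its value are passed as two separate tokens, which the description above does not state.

```python
from typing import List


def format_flag(name: str, value: str, use_eq: bool) -> List[str]:
    """Return the argv tokens for one flag (hypothetical helper, not LUTE API)."""
    # Single-character names are treated as "short" flags, all others as "long".
    prefix = "-" if len(name) == 1 else "--"
    if use_eq:
        # Flag and value are joined into one token, e.g. -x=arg or --long=arg.
        return [f"{prefix}{name}={value}"]
    # Assumption: without the equals sign, flag and value are separate tokens.
    return [f"{prefix}{name}", value]


print(format_flag("x", "arg", use_eq=True))
print(format_flag("long", "arg", use_eq=False))
```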

Source code in lute/io/parameters.py
class ParameterConfig(ContainerBase):
    """Configuration for parameters model.

    The Config class holds Pydantic configuration. A number of LUTE-specific
    configuration options have also been placed here.

    Attributes:
        version_specifier (Optional[int]): An indicator of how to interpret
            the version information. An integer constructed from enumerators
            of the VersionSpecifier enum (lute.tasks.dataclasses), or a bitwise
            OR thereof. If None, no version information is available.

        task_version (Optional[str]): The version information. This field may
            be filled dynamically. Interpretation of the information (if present)
            is determined by the version_specifier. It may be, e.g., a JSON string
            containing a git commit hash, a git diff, or a plain version string
            (e.g. v2).

        version_location (Optional[str]): None. Indicate where the version info
            should be taken from. E.g. a repository. Can be filled by a
            validator dynamically if necessary. This is used by the IO
            infrastructure to determine how to record version.

        version_diff_args (Optional[List[str]]): None. Provide arguments to git
            diff if using a diff as part of the versioning strategy. This is
            used by the IO infrastructure to determine how to record version.

        run_directory (Optional[str]): None. If set, it should be a valid
            path. The `Task` will be run from this directory. This may be
            useful for some `Task`s which rely on searching the working
            directory.

        set_result (bool): False. If True, the model has information about
            setting the TaskResult object from the parameters it contains.
            E.g. it has an `output` parameter which is marked as the result.
            The result can be set with a field value of `is_result=True` on
            a specific parameter, or using `result_from_params` and a
            validator.

        result_from_params (Optional[str]): None. Optionally used to define
            results from information available in the model using a custom
            validator. E.g. use an `outdir` and `filename` field to set
            `result_from_params=f"{outdir}/{filename}"`, etc. Only used if
            `set_result==True`.

        result_summary (Optional[str]): None. Defines a result summary that
            can be known after processing the Pydantic model. Use of summary
            depends on the Executor running the Task. All summaries are
            stored in the database, however. Only used if `set_result==True`

        impl_schemas (Optional[str]): Specifies the schemas the
            output/results conform to. Only used if `set_result==True`.

        -----------------------
        ThirdPartyTask-specific:

        short_flags_use_eq (bool): False. If True, "short" command-line args
            are passed as `-x=arg`. ThirdPartyTask-specific.

        long_flags_use_eq (bool): False. If True, "long" command-line args
            are passed as `--long=arg`. ThirdPartyTask-specific.
    """

    def __init__(self, *args, **kwargs) -> None:
        for k, v in kwargs.items():
            if k in LUTE_PARAMETER_CONFIG_KEYS:
                setattr(self, k, v)

    # All Tasks
    version_specifier: Optional[int] = None
    task_version: Optional[str] = None
    version_location: Optional[str] = None
    version_diff_args: Optional[List[str]] = None
    run_directory: Optional[str] = None
    set_result: Optional[bool] = None
    result_from_params: Optional[str] = None
    result_summary: Optional[str] = None
    impl_schemas: Optional[str] = None
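
The constructor above copies only recognised keys onto the instance, so any other keyword argument is silently dropped and the class-level default stays in place. A minimal sketch of that behaviour follows. `ConfigSketch` and `CONFIG_KEYS` are stand-ins invented for illustration; the real `LUTE_PARAMETER_CONFIG_KEYS` is defined elsewhere in LUTE and its contents are not shown here.

```python
from typing import Any, Optional, Set

# Assumed subset of keys, for illustration only.
CONFIG_KEYS: Set[str] = {"run_directory", "set_result", "result_summary"}


class ConfigSketch:
    """Stand-in for ParameterConfig: only recognised keys become attributes."""

    run_directory: Optional[str] = None
    set_result: Optional[bool] = None
    result_summary: Optional[str] = None

    def __init__(self, **kwargs: Any) -> None:
        for key, value in kwargs.items():
            # Unrecognised keys are skipped rather than raising an error.
            if key in CONFIG_KEYS:
                setattr(self, key, value)


cfg = ConfigSketch(run_directory="/tmp/work", set_result=True, extra="ignored")
print(cfg.run_directory)      # value passed in
print(cfg.result_summary)     # class-level default
print(hasattr(cfg, "extra"))  # the unknown key was never set
```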

TemplateConfig

Parameters used for templating of third party configuration files.

Attributes:

Name Type Description
template_name str

The name of the template to use. This template must live in config/templates.

output_path str

The FULL path, including filename, to write the rendered template to.

Source code in lute/io/parameters.py
class TemplateConfig:
    """Parameters used for templating of third party configuration files.

    Attributes:
        template_name (str): The name of the template to use. This template must
            live in `config/templates`.

        output_path (str): The FULL path, including filename, to write the
            rendered template to.
    """

    _schema: Dict[str, Any] = {}

    def __init__(self, schema: Dict[str, Any], *args, **kwargs):
        self._schema = schema
        self._dict: Dict[str, Any] = {}
        handle_field_attrs(self, *args, **kwargs)

TemplateParameters dataclass

Class for representing parameters for third party configuration files.

These parameters can represent arbitrary data types and are used in conjunction with templates for modifying third party configuration files from the single LUTE YAML. Due to the storage of arbitrary data types, and the use of a template file, a single instance of this class can hold from a single template variable to an entire configuration file. The data parsing is done by jinja using the complementary template. All data is stored in the single model variable params.

The pydantic "dataclass" is used over the BaseModel/Settings to allow positional argument instantiation of the params Field.

Source code in lute/io/parameters.py
@dataclass
class TemplateParameters:
    """Class for representing parameters for third party configuration files.

    These parameters can represent arbitrary data types and are used in
    conjunction with templates for modifying third party configuration files
    from the single LUTE YAML. Due to the storage of arbitrary data types, and
    the use of a template file, a single instance of this class can hold from a
    single template variable to an entire configuration file. The data parsing
    is done by jinja using the complementary template.
    All data is stored in the single model variable `params`.

    The pydantic "dataclass" is used over the BaseModel/Settings to allow
    positional argument instantiation of the `params` Field.
    """

    params: Any
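
Positional instantiation with arbitrary payloads can be sketched as follows. This uses the standard-library `dataclasses.dataclass` as a stand-in, and the class name `TemplateParametersSketch` and the example payloads are invented for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TemplateParametersSketch:
    """Stdlib stand-in for the dataclass documented above."""

    params: Any


# A single template variable, passed positionally.
single = TemplateParametersSketch(0.5)

# A nested structure that could describe a whole configuration file.
whole_file = TemplateParametersSketch({"detector": {"gain": 2, "mask": [1, 2, 3]}})

print(single.params)
print(whole_file.params["detector"]["gain"])
```

In both cases the payload is reachable through the single `params` attribute, which is what a template would consume.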

construct_task_parameters(schema, values)

Construct a TaskParameters object from a schema and parameter values.

This function will create a new TaskParameters object from a pydantic schema (usually retrieved from the database). This is a simplified container defined in this module, rather than the pydantic version. This allows its use in environments which do not have pydantic installed.

This function will recursively construct necessary internal objects as well, e.g., AnalysisHeader, TemplateParameters, etc.

NOTE: This function assumes that the values passed in have been validated, and that they conform to the schema. No validation will be done.

Parameters:

Name Type Description Default
schema Dict[str, Any]

The JSON schema for the PYDANTIC TaskParameters model. Usually this will be retrieved from the database.

required
values Dict[str, Any]

The values of the parameters for the TaskParameters object.

required

Returns:

Name Type Description
new_obj object

Usually this will be the TaskParameters instance (or a sub-class thereof), but this method recursively constructs all objects.
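
For parameters whose schema entry carries a plain `type` key, the construction amounts to casting the raw value to the matching Python type. The sketch below illustrates that step only. `TYPE_MAP` is an assumed mapping standing in for LUTE's `BASE_SCHEMA_TYPE_MAP`, whose actual contents may differ, and `cast_value` is a hypothetical helper.

```python
from typing import Any, Dict

# Assumed mapping from JSON schema type names to Python types.
TYPE_MAP: Dict[str, type] = {
    "string": str,
    "integer": int,
    "number": float,
}


def cast_value(param_info: Dict[str, Any], value: Any) -> Any:
    """Cast one value according to a schema entry that has a "type" key."""
    type_info = param_info["type"]
    if type_info == "array":
        # Arrays are cast element by element using the item type.
        item_type = TYPE_MAP[param_info["items"]["type"]]
        return [item_type(item) for item in value]
    if type_info == "null":
        return None
    if type_info == "object":
        # No model to check against: pass the value through unchanged.
        return value
    return TYPE_MAP[type_info](value)


print(cast_value({"type": "integer"}, "12"))
print(cast_value({"type": "array", "items": {"type": "number"}}, ["1", "2.5"]))
```

As with the real function, nothing here validates the input; a value that cannot be cast simply raises from the type constructor.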

Source code in lute/io/parameters.py
def construct_task_parameters(schema: Dict[str, Any], values: Dict[str, Any]) -> object:
    """Construct a TaskParameters object from a schema and parameter values.

    This function will create a new `TaskParameters` object from a pydantic schema
    (usually retrieved from the database). This is a simplified container defined in
    this module, rather than the pydantic version. This allows its use in environments
    which do not have pydantic installed.

    This function will recursively construct necessary internal objects as well, e.g.,
    `AnalysisHeader`, `TemplateParameters`, etc.

    NOTE: This function assumes that the values passed in have been validated, and
          that they conform to the schema. No validation will be done.

    Args:
        schema (Dict[str, Any]): The JSON schema for the **PYDANTIC** TaskParameters
            model. Usually this will be retrieved from the database.

        values (Dict[str, Any]): The values of the parameters for the TaskParameters
            object.

    Returns:
        new_obj (object): Usually this will be the `TaskParameters` instance (or a
            sub-class thereof), but this method recursively constructs all objects.
    """
    # Parameter may not be here but in properties
    fields_for_params: Dict[str, Any] = {}
    class_name: str = schema["title"]
    param_config_obj_vals: Dict[str, Any] = {}
    if "definitions" in schema and "Config" in schema["definitions"]:
        config_properties: Dict[str, Any] = schema["definitions"]["Config"][
            "properties"
        ]
        for config_prop in config_properties:
            if config_prop in LUTE_PARAMETER_CONFIG_KEYS:
                # We put the value of the Config option as a `const` field in the defn
                config_prop_val: Any = config_properties[config_prop]["const"]
                param_config_obj_vals[config_prop] = config_prop_val
        schema["definitions"].pop("Config")
    for param_name in values:
        try:
            param_info: Dict[str, Any] = schema["properties"][param_name]
        except KeyError:
            for _, defn in schema["definitions"].items():
                if param_name in defn["properties"]:
                    param_info = defn["properties"][param_name]
                    break
            else:
                raise RuntimeError(f"Cannot find {param_name} in schema")
        working_value: Any = values[param_name]
        if working_value.__class__.__name__ == "TemplateParameters":
            working_value = working_value.params
        if working_value is None:
            fields_for_params[param_name] = None
            continue
        if "type" in param_info:
            type_info: str = param_info["type"]
            # new_field: Field
            new_field: Any
            cast_as: type
            if type_info == "array":
                cast_as = BASE_SCHEMA_TYPE_MAP[param_info["items"]["type"]]
                new_field = list(map(cast_as, working_value))
            elif type_info == "null":
                new_field = None
            elif type_info == "object":
                # Ideally we shouldn't get here, but it can happen if there is a
                # complex object passed as a parameter but no model defined.
                # E.g. this will happen if a dict {"a":1, "b":2} is the parameter
                # without having a separate BaseModel defined for it.
                # We cannot type check, so we just hope json deserialization worked.
                new_field = working_value
            else:
                cast_as = BASE_SCHEMA_TYPE_MAP[param_info["type"]]
                new_field = cast_as(working_value)
            fields_for_params[param_name] = new_field
        else:
            # Look here for information:
            # https://json-schema.org/draft/2020-12/json-schema-core#section-10.2
            # See also:
            # https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.0.2.md#data-types
            # Have to look up definitions
            if "oneOf" in param_info:
                # Case of Union[Obj1,Obj2,...]
                ...
            elif "allOf" in param_info:
                # Will be a str like "#/definitions/ClassName"
                sub_schema: Dict[str, Any] = schema
                ref: str = param_info["allOf"][0]["$ref"]  # .split("/")
                ref_parts: List[str] = ref.split("/")
                for part in ref_parts:
                    if part == "#":
                        continue
                    else:
                        sub_schema = sub_schema[part]
                fields_for_params[param_name] = construct_task_parameters(
                    sub_schema, working_value
                )
            elif "anyOf" in param_info:
                for possibility in param_info["anyOf"]:
                    # If we can successfully cast on the first type we will.
                    if "type" in possibility:
                        type_info = possibility["type"]
                        if type_info == "array":
                            cast_as = BASE_SCHEMA_TYPE_MAP[possibility["items"]["type"]]
                            try:
                                fields_for_params[param_name] = list(
                                    map(cast_as, working_value)
                                )
                                break
                            except ValueError:
                                # Maybe the next type will work
                                continue
                        elif type_info == "null":
                            fields_for_params[param_name] = None
                        else:
                            if isinstance(
                                working_value, BASE_SCHEMA_TYPE_MAP[possibility["type"]]
                            ):
                                fields_for_params[param_name] = working_value
                                break
                            else:
                                try:
                                    cast_as = BASE_SCHEMA_TYPE_MAP[possibility["type"]]
                                    fields_for_params[param_name] = cast_as(
                                        working_value
                                    )
                                    break
                                except ValueError:
                                    # Maybe the next type will work
                                    continue
                else:
                    raise ValueError(
                        f"Could not construct Field for parameter: {param_name}"
                    )

    obj_type: type
    if class_name == "TaskParameters":
        obj_type = TaskParameters
    elif class_name == "AnalysisHeader":
        obj_type = AnalysisHeader
    elif class_name == "TemplateConfig":
        obj_type = TemplateConfig
    elif class_name == "TemplateParameters":
        obj_type = TemplateParameters
    else:
        base_classes: Tuple[type] = (TaskParameters,)
        class_attrs: Dict[str, Any] = dict(TaskParameters.__dict__)
        # Remove bad fields
        ignore_keys: Set[str] = {"__weakref__", "__dict__"}
        safe_class_attrs: Dict[str, Any] = {}
        for key in class_attrs:
            if key in ignore_keys:
                continue
            else:
                safe_class_attrs[key] = class_attrs[key]
        obj_type = type(class_name, base_classes, safe_class_attrs)

    if param_config_obj_vals:
        # We only have a non-empty dict if this type has a Config attr
        param_config = ParameterConfig(**param_config_obj_vals)
        assert hasattr(obj_type, "Config")
        obj_type.Config = param_config

    obj: Any
    if obj_type is TemplateParameters:
        # This is the only base class that doesn't retain a schema
        obj = obj_type(**fields_for_params)
    else:
        obj = obj_type(schema, **fields_for_params)

    if hasattr(obj, "_dict"):
        # This is messy - leftover from the object construction.
        # TODO: The above process should be cleaned up at some point.
        del obj._dict
    return obj
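
The `allOf` branch above resolves a reference such as `#/definitions/ClassName` by walking the schema one path component at a time. A minimal standalone sketch of that lookup is shown here. The function name `resolve_ref` and the example schema are invented for illustration and are not part of LUTE.

```python
from typing import Any, Dict


def resolve_ref(schema: Dict[str, Any], ref: str) -> Dict[str, Any]:
    """Follow a local JSON schema reference like "#/definitions/Name"."""
    sub_schema: Any = schema
    for part in ref.split("/"):
        if part == "#":
            # "#" denotes the root of the current schema document.
            continue
        sub_schema = sub_schema[part]
    return sub_schema


# Hypothetical schema with one nested model.
example_schema: Dict[str, Any] = {
    "title": "MyTaskParameters",
    "properties": {
        "lute_config": {"allOf": [{"$ref": "#/definitions/AnalysisHeader"}]},
    },
    "definitions": {
        "AnalysisHeader": {
            "title": "AnalysisHeader",
            "properties": {"run": {"type": "integer"}},
        },
    },
}

ref = example_schema["properties"]["lute_config"]["allOf"][0]["$ref"]
print(resolve_ref(example_schema, ref)["title"])
```

The resolved sub-schema is what gets passed to the recursive call, together with the nested values for that parameter.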

handle_field_attrs(self, *args, **kwargs)

Source code in lute/io/parameters.py
def handle_field_attrs(self, *args, **kwargs):
    """Set each keyword argument as an attribute and record it in `self._dict`."""
    for param_name, param_val in kwargs.items():
        setattr(self, param_name, param_val)
        self._dict[param_name] = param_val
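
A short sketch of how this helper behaves when called from a constructor, in the same way as the `__init__` methods on this page. The `Container` class and the field values are hypothetical. The helper is repeated so the example is self-contained.

```python
from typing import Any, Dict


def handle_field_attrs(self, *args, **kwargs):
    """Set each keyword argument as an attribute and record it in `self._dict`."""
    for param_name, param_val in kwargs.items():
        setattr(self, param_name, param_val)
        self._dict[param_name] = param_val


class Container:
    """Hypothetical container mirroring the constructor pattern on this page."""

    def __init__(self, schema: Dict[str, Any], **kwargs: Any) -> None:
        self._schema = schema
        # _dict must exist before the helper runs, since the helper writes to it.
        self._dict: Dict[str, Any] = {}
        handle_field_attrs(self, **kwargs)


header = Container({"title": "AnalysisHeader"}, experiment="abc123", run=7)
print(header.experiment)
print(header._dict)
```

Each field ends up stored twice: once as a regular attribute and once in `_dict`.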