Configuration

Config

Configuration for the high-level interface aindo.anonymize.AnonymizationPipeline.

from_dict classmethod

from_dict(value: dict[str, Any]) -> Config

Creates an instance of the class from a dictionary.

Parameters:

Name Type Description Default
value dict[str, Any] A dictionary where keys represent the attributes of the class and values are their corresponding values. required

Returns:

Type Description
Config An instance of the class populated with the data from the dictionary.
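As a minimal sketch of how from_dict might be called, the snippet below builds a one-step dictionary and passes it to the classmethod. The import path aindo.anonymize for Config is an assumption (this page only names aindo.anonymize.AnonymizationPipeline), so the import is guarded and the snippet still runs when the package is not installed.

```python
from typing import Any

# Minimal configuration dictionary: a single binning step on one column.
config_dict: dict[str, Any] = {
    "steps": [
        {"method": {"type": "binning", "bins": 10}, "columns": ["column_a"]},
    ]
}

try:
    # Assumption: Config is importable from aindo.anonymize; adjust the path
    # if your installed version exposes it elsewhere.
    from aindo.anonymize import Config

    config = Config.from_dict(config_dict)
except ImportError:
    # Package not available (or a different import path): skip the call.
    config = None

print(type(config).__name__)
```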

Configuration Schema

The Config.from_dict method accepts a Python dictionary following the schema below.

Key Type Description
steps list[dict] A list of anonymization steps.
steps[i].method dict Defines the anonymization technique and its parameters.
steps[i].method.type str The name of the anonymization technique in snake_case.
steps[i].method.<param> Any Additional key-value pairs for technique-specific parameters. See the list of anonymization techniques and their respective parameters.
steps[i].columns list[str] | None The list of column names to which the anonymization method applies.
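The layout in the table above can be checked with plain Python before the dictionary reaches Config.from_dict. The helper below, check_config_shape, is hypothetical and not part of the library. It only verifies the nesting and types listed in the table, and it treats a missing columns key the same as None, which is an assumption. It does not validate technique names or technique-specific parameters.

```python
from typing import Any


def check_config_shape(value: dict[str, Any]) -> list[str]:
    """Return shape problems; an empty list means the layout matches the table."""
    problems: list[str] = []
    steps = value.get("steps")
    if not isinstance(steps, list):
        return ["'steps' must be a list of dictionaries"]
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"steps[{i}] must be a dictionary")
            continue
        # steps[i].method must be a dict carrying a string "type".
        method = step.get("method")
        if not isinstance(method, dict):
            problems.append(f"steps[{i}].method must be a dictionary")
        elif not isinstance(method.get("type"), str):
            problems.append(f"steps[{i}].method.type must be a string")
        # steps[i].columns is a list of strings or None (missing is treated as None).
        columns = step.get("columns")
        if columns is not None and not (
            isinstance(columns, list) and all(isinstance(c, str) for c in columns)
        ):
            problems.append(f"steps[{i}].columns must be a list of strings or None")
    return problems


good = {"steps": [{"method": {"type": "binning", "bins": 10}, "columns": ["column_a"]}]}
bad = {"steps": [{"method": {"bins": 10}, "columns": "column_a"}]}

print(check_config_shape(good))
print(check_config_shape(bad))
```

Running the check first gives a readable message for structural mistakes such as a missing type key or a bare string in place of a column list.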

The example below shows a full configuration. Because some parameters are mutually exclusive, it does not include every available parameter.
For a complete reference of technique-specific parameters, see the API reference - Techniques.

Full configuration example
config.json
{
  "steps": [
    {
      "method": {
        "type": "binning",
        "bins": 10
      },
      "columns": ["column_a"]
    },
    {
      "method": {
        "type": "character_masking",
        "mask_length": 3,
        "symbol": "*",
        "starting_direction": "left"
      },
      "columns": ["column_b"]
    },
    {
      "method": {
        "type": "data_nulling",
        "constant_value": "BLANK"
      },
      "columns": ["column_c"]
    },
    {
      "method": {
        "type": "key_hashing",
        "key": "my key",
        "salt": "my salt",
        "hash_name": "sha256"
      },
      "columns": ["column_d"]
    },
    {
      "method": {
        "type": "mocking",
        "data_generator": "name"
      },
      "columns": ["column_e"]
    },
    {
      "method": {
        "type": "perturbation_categorical",
        "alpha": 0.8,
        "sampling_mode": "uniform",
        "frequencies": [
          {"A": 0.5},
          {"B": 0.5}
        ],
        "seed": 42
      },
      "columns": ["column_f"]
    },
    {
      "method": {
        "type": "perturbation_numerical",
        "alpha": 0.8,
        "sampling_mode": "weighted",
        "perturbation_range": [1, 10],
        "seed": 42
      },
      "columns": ["column_g"]
    },
    {
      "method": {
        "type": "swapping",
        "alpha": 0.8,
        "seed": 42
      },
      "columns": ["column_h"]
    },
    {
      "method": {
        "type": "top_bottom_coding_categorical",
        "q": 0.8,
        "other_label": "OTHER"
      },
      "columns": ["column_i"]
    },
    {
      "method": {
        "type": "top_bottom_coding_numerical",
        "q": 0.3
      },
      "columns": ["column_l"]
    }
  ]
}
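Since from_dict takes a Python dictionary, a JSON file like the one above would first be parsed with the standard json module. The sketch below writes a shortened stand-in for config.json (two of the steps above) to a temporary directory so that it runs on its own. In practice the file would already exist on disk. The final from_dict call is left as a comment because it needs the library installed.

```python
import json
import tempfile
from pathlib import Path

# Shortened stand-in for config.json: two of the steps from the full example.
config_text = """
{
  "steps": [
    {"method": {"type": "binning", "bins": 10}, "columns": ["column_a"]},
    {"method": {"type": "swapping", "alpha": 0.8, "seed": 42}, "columns": ["column_h"]}
  ]
}
"""

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(config_text, encoding="utf-8")
    # Parse the JSON file into a plain dictionary.
    with path.open(encoding="utf-8") as f:
        config_dict = json.load(f)

# List each step's technique together with the columns it targets.
summary = [(step["method"]["type"], step["columns"]) for step in config_dict["steps"]]
print(summary)

# With the library installed, the parsed dictionary can be passed on:
# config = Config.from_dict(config_dict)
```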