Configuration
Config
Configuration for the high-level interface aindo.anonymize.AnonymizationPipeline.
from_dict
classmethod
Creates an instance of the class from a dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
value | dict[str, Any] | A dictionary where keys represent the attributes of the class and values are their corresponding values. | required |
Returns:
Type | Description |
---|---|
Config | An instance of the class populated with the data from the dictionary. |
Configuration Schema
The Config.from_dict method accepts a Python dictionary following the schema below.
Key | Type | Description |
---|---|---|
steps | list[dict] | A list of anonymization steps. |
steps[i].method | dict | Defines the anonymization technique and its parameters. |
steps[i].method.type | str | The name of the anonymization technique in snake_case. |
steps[i].method.<param> | dict | Additional key-value pairs for technique-specific parameters. See the list of anonymization techniques and their respective parameters. |
steps[i].columns | list[str] \| None | The list of column names to which the anonymization method applies. |
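The schema above can be checked with a few lines of plain Python before the dictionary is handed to Config.from_dict. The following is a minimal sketch, not part of the library: check_config_shape is a hypothetical helper, and treating a missing "columns" key the same as None is an assumption. It only checks the structure described in the table, not technique-specific parameters.

```python
from typing import Any


def check_config_shape(value: dict[str, Any]) -> list[str]:
    """Return the problems found; an empty list means the shape matches the table."""
    problems: list[str] = []
    steps = value.get("steps")
    if not isinstance(steps, list):
        return ["'steps' must be a list of dicts"]
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"steps[{i}] must be a dict")
            continue
        method = step.get("method")
        if not isinstance(method, dict):
            problems.append(f"steps[{i}].method must be a dict")
        elif not isinstance(method.get("type"), str):
            problems.append(f"steps[{i}].method.type must be a str")
        columns = step.get("columns")
        # Assumption: a missing "columns" key is treated like None.
        if columns is not None and not (
            isinstance(columns, list) and all(isinstance(c, str) for c in columns)
        ):
            problems.append(f"steps[{i}].columns must be a list of str or None")
    return problems


# A one-step configuration using the binning technique
config_dict = {
    "steps": [
        {"method": {"type": "binning", "bins": 10}, "columns": ["column_a"]},
    ]
}
print(check_config_shape(config_dict))
```

A check like this catches structural mistakes, such as a missing type key or a bare string in place of a list of column names, and reports where in the dictionary they occur.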
For a full configuration example, see the code below.
Note that some parameters may be mutually exclusive and are therefore not included in this example.
For a complete reference of technique-specific parameters, see the API reference - Techniques.
Full configuration example
config.json
{
  "steps": [
    {
      "method": {
        "type": "binning",
        "bins": 10
      },
      "columns": ["column_a"]
    },
    {
      "method": {
        "type": "character_masking",
        "mask_length": 3,
        "symbol": "*",
        "starting_direction": "left"
      },
      "columns": ["column_b"]
    },
    {
      "method": {
        "type": "data_nulling",
        "constant_value": "BLANK"
      },
      "columns": ["column_c"]
    },
    {
      "method": {
        "type": "key_hashing",
        "key": "my key",
        "salt": "my salt",
        "hash_name": "sha256"
      },
      "columns": ["column_d"]
    },
    {
      "method": {
        "type": "mocking",
        "data_generator": "name"
      },
      "columns": ["column_e"]
    },
    {
      "method": {
        "type": "perturbation_categorical",
        "alpha": 0.8,
        "sampling_mode": "uniform",
        "frequencies": [
          {"A": 0.5},
          {"B": 0.5}
        ],
        "seed": 42
      },
      "columns": ["column_f"]
    },
    {
      "method": {
        "type": "perturbation_numerical",
        "alpha": 0.8,
        "sampling_mode": "weighted",
        "perturbation_range": [1, 10],
        "seed": 42
      },
      "columns": ["column_g"]
    },
    {
      "method": {
        "type": "swapping",
        "alpha": 0.8,
        "seed": 42
      },
      "columns": ["column_h"]
    },
    {
      "method": {
        "type": "top_bottom_coding_categorical",
        "q": 0.8,
        "other_label": "OTHER"
      },
      "columns": ["column_i"]
    },
    {
      "method": {
        "type": "top_bottom_coding_numerical",
        "q": 0.3
      },
      "columns": ["column_l"]
    }
  ]
}
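A JSON file like the one above can be read with the standard json module and the resulting dictionary passed to Config.from_dict. The sketch below is illustrative only: it writes a shortened stand-in for config.json to a temporary directory so that it runs on its own, and the import path `from aindo.anonymize import Config` is an assumption, so the import is guarded and the rest of the code works without the library installed.

```python
import json
import tempfile
from pathlib import Path

# A shortened stand-in for config.json, written to a temporary directory
example = {
    "steps": [
        {"method": {"type": "binning", "bins": 10}, "columns": ["column_a"]},
        {"method": {"type": "swapping", "alpha": 0.8, "seed": 42}, "columns": ["column_h"]},
    ]
}
config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(example, indent=2), encoding="utf-8")

# json.load returns a plain dict, the type that Config.from_dict accepts
with config_path.open(encoding="utf-8") as f:
    config_dict = json.load(f)

# List the technique configured in each step, in order
techniques = [step["method"]["type"] for step in config_dict["steps"]]
print(techniques)

try:
    # Assumption: Config is importable from the top-level package
    from aindo.anonymize import Config
except ImportError:
    config = None  # library not installed; the dict above is still usable
else:
    config = Config.from_dict(config_dict)
```

Keeping the configuration in a JSON file separates the choice of techniques and columns from the code that runs the pipeline.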