fatf.utils.data.transformation.dataset_row_masking

fatf.utils.data.transformation.dataset_row_masking(dataset: numpy.ndarray, data_row: Union[numpy.ndarray, numpy.void]) → numpy.ndarray

Creates a binary representation of the dataset by masking its rows.

New in version 0.0.2.

The rows of the dataset array are compared against the specified data_row to determine which feature values are the same and which are different. Matching values are represented as 1 in the binary output and differing values as 0.

For a ['a', 'b'] data_row and a [['x', 'b'], ['a', 'b'], ['a', 'x']] dataset, the binary representation would be [[0, 1], [1, 1], [1, 0]].

- Parameters
- dataset : numpy.ndarray
  A 2-dimensional numpy array used to generate the binary representation.
- data_row : Union[numpy.ndarray, numpy.void]
  A 1-dimensional numpy array (for unstructured arrays) or a numpy void (for structured rows) containing the feature values that will be compared against the dataset rows.
- Returns
- binary_representation : numpy.ndarray
  A binary representation of the dataset (0’s and 1’s in an array of numpy.int8 type), with the same shape as dataset, achieved by “masking” it with the data_row.
- Raises
- IncorrectShapeError
  The dataset is not a 2-dimensional array or data_row is not a 1-dimensional array. Also raised when the length of the data_row differs from the number of columns in the dataset.
- TypeError
  The dataset is not of a base type or the data_row’s dtype is too different from the dataset’s dtype.
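
The masking behaviour described above can be illustrated with a NumPy-only sketch. This is not the fatf implementation: row_mask_sketch is a hypothetical helper written for this example, it handles unstructured arrays only, and it raises a plain ValueError where the library documents IncorrectShapeError. It omits the dtype checks that lead to TypeError in the documented function.

```python
import numpy as np


def row_mask_sketch(dataset, data_row):
    """Hypothetical sketch of row masking for unstructured arrays only."""
    dataset = np.asarray(dataset)
    data_row = np.asarray(data_row)

    # Shape checks mirroring the documented conditions
    # (ValueError stands in for fatf's IncorrectShapeError).
    if dataset.ndim != 2:
        raise ValueError('The dataset must be a 2-dimensional array.')
    if data_row.ndim != 1:
        raise ValueError('The data_row must be a 1-dimensional array.')
    if data_row.shape[0] != dataset.shape[1]:
        raise ValueError('The data_row length must equal the number of '
                         'columns in the dataset.')

    # Broadcasting compares data_row against every row of the dataset;
    # True (equal) becomes 1 and False (different) becomes 0.
    return (dataset == data_row).astype(np.int8)


dataset = np.array([['x', 'b'], ['a', 'b'], ['a', 'x']])
data_row = np.array(['a', 'b'])

mask = row_mask_sketch(dataset, data_row)
print(mask.tolist())  # [[0, 1], [1, 1], [1, 0]]
```

The result has the same shape as the input dataset and a numpy.int8 dtype, matching the documented return value for the example given in the description.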