I have a dataset with parallel time series. The column ‘A’ depends on columns ‘B’ and ‘C’. The order (and the number) of dependent columns can change. For example:
            A   B    C
2022-07-23  1  10  100
2022-07-24  2  20  200
2022-07-25  3  30  300
How should I transform this data, or how should I build the model, so that the order of columns ‘B’ and ‘C’ (‘A’, ‘B’, ‘C’ vs. ‘A’, ‘C’, ‘B’) doesn’t change the result? I know about GCNs, but I don’t know how to implement one. Maybe there are other ways to achieve this.
I want to generalize my question and give one more example. Let’s say we have a matrix as a single observation (no time series data):
   col1 col2  target
0     1    a      20
1     2    a      30
2     3    b      30
3     4    b      40
I would like to predict one ‘target’ value for each row/instance. Each instance depends on the other instances. The order of rows is irrelevant, and the number of rows in each observation can change.
You are looking for a permutation invariant operation on the columns.
One way of achieving this is to apply a column-wise operation, followed by a global pooling operation.
How that achieves your goal:
- Column-wise operations are permutation equivariant: applying the operation to the columns and then permuting the output is the same as permuting the columns and then applying the operation.
- A global pooling operation (e.g., max-pool, avg-pool) across the columns is permutation invariant: the result of an average pool does not depend on the order of the columns.
- Applying a permutation invariant operation on top of a permutation equivariant one results in an overall permutation invariant function.
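The three points above can be checked numerically. The following is only a minimal sketch in NumPy, not a trained model: the weight matrix `W` and the helpers `encode_column` and `invariant_features` are hypothetical names introduced for illustration, and the weights are random rather than learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared weights: the same map is applied to every column.
# Small scale so tanh does not saturate on the example values.
W = rng.normal(size=(3, 8)) * 0.01   # 3 time steps -> 8 features

def encode_column(col):
    """Column-wise (shared) operation: one series of length 3 -> 8 features."""
    return np.tanh(col @ W)

def invariant_features(columns):
    """Encode each column independently, then average-pool across columns."""
    encoded = np.stack([encode_column(c) for c in columns])  # (n_columns, 8)
    return encoded.mean(axis=0)                               # (8,)

B = np.array([10.0, 20.0, 30.0])
C = np.array([100.0, 200.0, 300.0])
D = np.array([5.0, 6.0, 7.0])        # an extra, made-up column

features_bc = invariant_features([B, C])
features_cb = invariant_features([C, B])
features_bcd = invariant_features([B, C, D])

print(np.allclose(features_bc, features_cb))  # same features for either order
print(features_bcd.shape)                     # output size is independent of the column count
```

Because the pooling runs over the column axis, the same function also accepts a different number of dependent columns without any change to the weights.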
Additionally, you should look at self-attention layers, which are also permutation equivariant (as long as no positional encoding is added across the columns).
What I would try is:
- Learn a representation (RNN/Transformer) for a single time series. Apply this representation to ‘A’, ‘B’ and ‘C’.
- Learn a transformer between the representation of ‘A’ and those of ‘B’ and ‘C’: that is, use the representation of ‘A’ as "query" and those of ‘B’ and ‘C’ as "keys" and "values".
This will give you a representation of ‘A’ that is permutation invariant in ‘B’ and ‘C’.
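As a rough illustration of the second step, here is a minimal NumPy sketch of single-head scaled dot-product cross-attention. It makes several assumptions: the per-series representations are random stand-ins rather than the output of a learned RNN/Transformer, the learned query/key/value projections are omitted for brevity, and `cross_attention` is a hypothetical helper name.

```python
import numpy as np

def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention with one query vector.

    query: (d,), keys: (n, d), values: (n, d_v) -> output of shape (d_v,)
    """
    scores = keys @ query / np.sqrt(query.shape[0])  # (n,)
    weights = np.exp(scores - scores.max())          # numerically stable softmax
    weights /= weights.sum()
    return weights @ values                          # weighted sum over the n series

rng = np.random.default_rng(0)

# Stand-ins for the learned per-series representations (random here).
rep_A = rng.normal(size=4)
rep_B = rng.normal(size=4)
rep_C = rng.normal(size=4)

kv_bc = np.stack([rep_B, rep_C])
kv_cb = np.stack([rep_C, rep_B])

out_bc = cross_attention(rep_A, kv_bc, kv_bc)
out_cb = cross_attention(rep_A, kv_cb, kv_cb)

print(np.allclose(out_bc, out_cb))  # the order of B and C does not matter
```

The output is a softmax-weighted sum over the key/value set, so reordering that set only reorders the terms of the sum. Adding a positional encoding over ‘B’ and ‘C’ would break this property.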
Update (Aug 3rd, 2022):
For the case of "observations" with a varying number of rows and a fixed number of columns:
I think you can treat each row as a "token" (with a fixed dimension = number of columns), and apply a Transformer encoder to predict the target for each "token", from the encoded tokens.
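A minimal sketch of the row-as-token idea, using one self-attention layer in NumPy in place of a full Transformer encoder. The assumptions here are mine: `col2` is one-hot encoded as `[is_a, is_b]`, the weights `W_q`, `W_k`, `W_v` and `w_out` are random instead of learned, no positional encoding is used, and `predict_per_row` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_model = 3, 8   # 3 input features: col1 plus a one-hot of col2

# Hypothetical, randomly initialised weights; in practice these are learned.
W_q, W_k, W_v = (rng.normal(size=(d_in, d_model)) * 0.1 for _ in range(3))
w_out = rng.normal(size=d_model)

def predict_per_row(tokens):
    """tokens: (n_rows, d_in) -> one prediction per row, shape (n_rows,)."""
    q, k, v = tokens @ W_q, tokens @ W_k, tokens @ W_v
    scores = q @ k.T / np.sqrt(d_model)              # (n_rows, n_rows)
    scores -= scores.max(axis=1, keepdims=True)      # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ v) @ w_out                     # each row attends to all rows

# The example observation: col1, then col2 encoded as [is_a, is_b].
X = np.array([[1, 1, 0],
              [2, 1, 0],
              [3, 0, 1],
              [4, 0, 1]], dtype=float)

perm = np.array([2, 0, 3, 1])
pred = predict_per_row(X)
pred_perm = predict_per_row(X[perm])

print(np.allclose(pred[perm], pred_perm))  # shuffling rows only shuffles the predictions
print(predict_per_row(X[:3]).shape)        # works with a different number of rows
```

Each row's prediction depends on all other rows through the attention weights, yet shuffling the rows only shuffles the predictions in the same way, which matches the requirement that row order be irrelevant.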
Answered By – Shai