Order-independent Deep Learning Model

Issue

I have a dataset with parallel time series. The column ‘A’ depends on columns ‘B’ and ‘C’. The order (and the number) of dependent columns can change. For example:

            A   B    C
2022-07-23  1  10  100
2022-07-24  2  20  200
2022-07-25  3  30  300

How should I transform this data, or how should I build the model, so that the order of columns ‘B’ and ‘C’ (‘A’, ‘B’, ‘C’ vs. ‘A’, ‘C’, ‘B’) doesn’t change the result? I know about GCN, but I don’t know how to implement it. Maybe there are other ways to achieve this.

UPDATE:

I want to generalize my question and give one more example. Let’s say we have a matrix as a single observation (no time series data):

   col1 col2  target
0     1    a      20
1     2    a      30
2     3    b      30
3     4    b      40

I would like to predict one ‘target’ value for each row/instance. Each instance depends on the other instances. The order of rows is irrelevant, and the number of rows in each observation can change.

Solution

You are looking for a permutation invariant operation on the columns.

One way of achieving this is to apply a column-wise operation (the same operation, with shared weights, on every column), followed by a global pooling operation across the columns.
How that achieves your goal:

  • Column-wise operations are permutation equivariant; that is, applying the operation to the columns and then permuting the output is the same as permuting the columns and then applying the operation.
  • A global pooling operation (e.g., max-pool, avg-pool) across the columns is permutation invariant: the result of an average pool does not depend on the order of the columns.
  • Applying a permutation invariant operation on top of a permutation equivariant one results in an overall permutation invariant function.
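
To make the three points above concrete, here is a minimal NumPy sketch. It is only an illustration under my own assumptions: the names `phi` and `predict`, the sizes `T` and `H`, the scaling of the inputs and the random (untrained) weights are all hypothetical stand-ins for whatever column-wise network you would actually train.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: T time steps per series, H hidden units.
T, H = 3, 8
W_phi = rng.normal(size=(T, H))  # shared weights, reused for every column
w_rho = rng.normal(size=H)       # readout applied after pooling


def phi(columns):
    """Column-wise step: the same map is applied to each column.

    columns has shape (n_columns, T); the result has shape (n_columns, H).
    """
    return np.tanh(columns @ W_phi)


def predict(columns):
    """Shared column-wise map, mean pool over the columns, then a readout."""
    pooled = phi(columns).mean(axis=0)  # pooling discards the column order
    return float(pooled @ w_rho)


# Columns B and C from the question (scaled down), one row per column.
B = np.array([10.0, 20.0, 30.0]) / 100.0
C = np.array([100.0, 200.0, 300.0]) / 100.0

cols = np.stack([B, C])
out_bc = predict(cols)
out_cb = predict(np.stack([C, B]))
print(np.isclose(out_bc, out_cb))
```

Because the pooling runs over the column axis, the same `predict` also accepts a different number of dependent columns without any change to the weights.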

Additionally, you should look at self-attention layers, which are also permutation equivariant as long as no positional encoding is added to the inputs.

What I would try is:

  1. Learn a representation (RNN/Transformer) for a single time series. Apply the same encoder to each of A, B and C.
  2. Learn a cross-attention (transformer) layer from the representation of A to those of B and C: that is, use the representation of A as the "query" and those of B and C as the "keys" and "values".

This will give you a representation of A that is permutation invariant in B and C.
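
The two steps above can be sketched roughly as follows. This is a single-head attention written in plain NumPy with random, untrained weights; `embed` is a hypothetical stand-in for the learned RNN/Transformer encoder, and `cross_attend`, `T` and `D` are names I made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 3, 4  # hypothetical: T time steps per series, D embedding size

W_embed = rng.normal(size=(T, D)) * 0.01
W_q, W_k, W_v = (rng.normal(size=(D, D)) for _ in range(3))


def embed(series):
    """Stand-in for the shared per-series encoder (step 1)."""
    return np.tanh(series @ W_embed)


def cross_attend(query_series, context_series):
    """Step 2: A is the query; the other series supply keys and values."""
    q = embed(query_series) @ W_q        # shape (D,)
    ctx = embed(context_series)          # shape (n_series, D)
    k, v = ctx @ W_k, ctx @ W_v
    scores = k @ q / np.sqrt(D)          # one score per context series
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()             # softmax over the context series
    return weights @ v                   # weighted sum, shape (D,)


A = np.array([1.0, 2.0, 3.0])
B = np.array([10.0, 20.0, 30.0])
C = np.array([100.0, 200.0, 300.0])

rep_bc = cross_attend(A, np.stack([B, C]))
rep_cb = cross_attend(A, np.stack([C, B]))
print(np.allclose(rep_bc, rep_cb))
```

The output is a softmax-weighted sum over the context series, and each weight depends only on the content of the query and of that series, not on its position, which is why swapping B and C leaves the result unchanged.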


Update (Aug 3rd, 2022):

For the case of "observations" with a varying number of rows and a fixed number of columns:
I think you can treat each row as a "token" (with a fixed dimension = number of columns) and apply a Transformer encoder, without positional encodings, to predict the target for each "token" from the encoded tokens.
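
As a rough illustration of the row-as-token idea, here is one self-attention layer in NumPy with random, untrained weights. I am assuming that the categorical `col2` is one-hot encoded, so each row becomes three numbers; `predict_rows` and all the sizes are hypothetical. A real model would stack several encoder layers and be trained on the targets.

```python
import numpy as np

rng = np.random.default_rng(0)
F, D = 3, 8  # F features per row (col1 plus one-hot col2), D model width

W_in = rng.normal(size=(F, D)) * 0.5
W_q, W_k, W_v = (rng.normal(size=(D, D)) * 0.5 for _ in range(3))
w_out = rng.normal(size=D)


def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def predict_rows(rows):
    """One prediction per row; every row attends to all rows of the observation."""
    h = rows @ W_in                                        # (n, D), no positional encoding
    attn = softmax((h @ W_q) @ (h @ W_k).T / np.sqrt(D))   # (n, n) attention weights
    h = h + attn @ (h @ W_v)                               # residual connection
    return h @ w_out                                       # (n,)


# The observation from the question: [col1, col2 == 'a', col2 == 'b'].
obs = np.array([[1, 1, 0],
                [2, 1, 0],
                [3, 0, 1],
                [4, 0, 1]], dtype=float)

perm = np.array([2, 0, 3, 1])
preds = predict_rows(obs)
preds_perm = predict_rows(obs[perm])
print(np.allclose(preds_perm, preds[perm]))
```

Shuffling the rows only shuffles the predictions in the same way, so each row keeps its own prediction, and the same weights work for an observation with any number of rows.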

Answered By – Shai

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
