Multi-output/Multi-label Regression
Dr. Ashish Tendulkar
Machine Learning Techniques
IIT Madras
In the case of multi-output regression, there is more than one output label, and all of the labels are real numbers.
Examples:
- Predicting the number of runs scored by a batsman in the next 5 innings.
- Predicting the daily average temperature for each of the next 15 days.
Training Data

\( \mathbf{D} = (\mathbf{X}, \mathbf{Y}) = \{\left(\mathbf{x}^{(i)}, \mathbf{y}^{(i)}\right)\}_{i=1}^{n} \)

where
- \(\mathbf{X}\) is an \(n \times m\) input matrix.
- \(\mathbf{Y}\) is an \(n \times k\) label matrix.
- \(\mathbf{x}^{(i)}\) is an \(m \times 1\) feature vector.
- \(\mathbf{y}^{(i)} \in \mathbb{R}^k\), where \(k\) is the number of output labels, i.e., \(\mathbf{y}^{(i)}\) has \(k\) real-valued components.

After a dummy feature \(x_0 = 1\) is appended to each example for the bias term, \(\mathbf{X}\) becomes an \(n \times (m+1)\) matrix; this augmented form is used in the model below.
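A minimal NumPy sketch of this data setup (the sizes \(n=100\), \(m=5\), \(k=3\), the noise level, and all variable names are illustrative assumptions, not from the slides):

```python
import numpy as np

n, m, k = 100, 5, 3  # illustrative sizes: 100 examples, 5 features, 3 outputs
rng = np.random.default_rng(0)

X = rng.normal(size=(n, m))           # n x m input matrix
W_true = rng.normal(size=(m + 1, k))  # hidden ground-truth weights (incl. bias row)

# Append the dummy feature x0 = 1 for the bias term: X becomes n x (m+1)
X_aug = np.hstack([np.ones((n, 1)), X])

# n x k label matrix, generated with small additive noise
Y = X_aug @ W_true + 0.1 * rng.normal(size=(n, k))

print(X_aug.shape, Y.shape)  # (100, 6) (100, 3)
```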
- The output labels are collected in the matrix \(\mathbf{Y}\). In order to generate multiple outputs, we need one weight vector per output.
- Hence, there are a total of \(k\) weight vectors corresponding to the \(k\) outputs; they form the columns of the weight matrix \(\mathbf{W}\).
Model

\( \mathbf{Y}_{n \times k} = \mathbf{X}_{n \times (m+1)} \, \mathbf{W}_{(m+1) \times k} \)
There are two options for modeling this problem (both are illustrated in the sketch after this list):
- Solve \(k\) independent linear regression problems, one per output. This gives the flexibility to use a different feature representation for each output.
- Solve a single joint learning problem, as in the matrix equation above.
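Both options can be tried directly in scikit-learn, whose LinearRegression accepts a two-dimensional label matrix and fits one weight column per output. A minimal sketch (the data here is random and purely illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # n = 100 examples, m = 5 features
Y = rng.normal(size=(100, 3))  # k = 3 outputs (random, for illustration only)

# Option 1: k independent regression problems, one model per output column
per_output = [LinearRegression().fit(X, Y[:, j]) for j in range(Y.shape[1])]

# Option 2: a single joint fit over the full n x k label matrix
joint = LinearRegression().fit(X, Y)

# With squared-error loss, the columns of W are decoupled, so both
# approaches recover the same least-squares solution for each output.
print(np.allclose(joint.coef_[0], per_output[0].coef_))  # True
```

The equivalence holds because the sum-of-squared-errors loss separates into \(k\) independent per-output terms; the joint formulation is mainly a computational convenience unless the outputs share a representation.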
Loss
- Sum of squared errors:

\( J(\mathbf{W}) = \frac{1}{2} \operatorname{tr}\left( (\mathbf{X} \mathbf{W} - \mathbf{Y})^T (\mathbf{X} \mathbf{W} - \mathbf{Y}) \right) \)

The trace reduces the \(k \times k\) product \((\mathbf{X}\mathbf{W} - \mathbf{Y})^T(\mathbf{X}\mathbf{W} - \mathbf{Y})\) to a scalar; equivalently, \(J(\mathbf{W}) = \frac{1}{2} \lVert \mathbf{X}\mathbf{W} - \mathbf{Y} \rVert_F^2\), the squared Frobenius norm of the residual.
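A quick numerical check that the trace and Frobenius-norm forms of the loss agree (shapes and variable names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))  # n x (m+1), dummy feature already included
W = rng.normal(size=(6, 3))    # (m+1) x k weight matrix
Y = rng.normal(size=(100, 3))  # n x k label matrix

E = X @ W - Y  # n x k residual matrix

# Trace form and Frobenius-norm form of the sum-of-squared-errors loss
J_trace = 0.5 * np.trace(E.T @ E)
J_frob = 0.5 * np.linalg.norm(E, 'fro') ** 2
print(np.isclose(J_trace, J_frob))  # True
```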
Optimization
- Normal equation: \(\mathbf{W} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y}\), assuming \(\mathbf{X}^T \mathbf{X}\) is invertible.
- Gradient descent and its variations, using the gradient \(\nabla_{\mathbf{W}} J = \mathbf{X}^T (\mathbf{X} \mathbf{W} - \mathbf{Y})\). Both approaches are illustrated in the sketch below.
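A minimal sketch of both optimizers on synthetic data (the sizes, learning rate, and iteration count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))                      # n x (m+1), bias column included
W_true = rng.normal(size=(6, 3))                   # (m+1) x k
Y = X @ W_true + 0.01 * rng.normal(size=(100, 3))  # n x k noisy labels

# Normal equation: solve (X^T X) W = X^T Y
# (np.linalg.solve is preferred over forming the explicit inverse)
W_ne = np.linalg.solve(X.T @ X, X.T @ Y)

# Batch gradient descent using the gradient X^T (XW - Y)
W_gd = np.zeros((6, 3))
lr = 0.001
for _ in range(5000):
    W_gd -= lr * (X.T @ (X @ W_gd - Y))

print(np.allclose(W_ne, W_gd, atol=1e-3))  # both reach the same solution
```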