kernel limitations

NNGP experiment on CIFAR

kernel methods on larger datasets

data samples needed: Damian work with polynomials

Question: Is there any work showing that if a neural network is a universal approximator, then so are its NNGP and NTK? I think this says it for NTK, at least for ReLU networks. This might also be related.
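To make the two kernels in the question concrete, here is a minimal sketch of the infinite-width NNGP and NTK for the simplest ReLU case. It does not address universality; it only shows what the objects are. Assumptions (mine, not from the notes above): one hidden layer, no biases, f(x) = (1/sqrt(m)) * sum_i a_i * relu(w_i . x) with w_i ~ N(0, I) and a_i ~ N(0, 1), and nonzero inputs. The function names are hypothetical. The closed form is the degree-1 arc-cosine kernel (up to a factor of 2), and the Monte Carlo version is a finite-width sanity check.

```python
import numpy as np


def relu_nngp_ntk(x, y):
    """Closed-form infinite-width NNGP and NTK for a one-hidden-layer ReLU net.

    Assumed parameterization (an illustration, not the only convention):
    f(x) = (1/sqrt(m)) * sum_i a_i * relu(w_i . x), w_i ~ N(0, I), a_i ~ N(0, 1).
    Inputs must be nonzero vectors.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    cos_theta = np.clip(x @ y / (nx * ny), -1.0, 1.0)
    theta = np.arccos(cos_theta)

    # NNGP: E_w[relu(w.x) * relu(w.y)]
    nngp = nx * ny * (np.sin(theta) + (np.pi - theta) * cos_theta) / (2 * np.pi)

    # E_w[1{w.x > 0} * 1{w.y > 0}], the covariance of the ReLU derivatives
    deriv_cov = (np.pi - theta) / (2 * np.pi)

    # NTK: output-layer gradients give the NNGP term,
    # hidden-layer gradients give (x . y) * deriv_cov.
    ntk = nngp + (x @ y) * deriv_cov
    return nngp, ntk


def monte_carlo_nngp_ntk(x, y, width=200_000, seed=0):
    """Finite-width estimate of the same two kernels at random initialization."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((width, x.shape[0]))  # hidden-layer weights
    a = rng.standard_normal(width)                # output-layer weights
    px, py = W @ x, W @ y                         # pre-activations

    # Inner product of gradients with respect to the output weights a_i
    nngp = np.mean(np.maximum(px, 0.0) * np.maximum(py, 0.0))

    # Inner product of gradients with respect to the hidden weights w_i
    hidden = (x @ y) * np.mean(a**2 * ((px > 0) & (py > 0)))
    return nngp, nngp + hidden


if __name__ == "__main__":
    x = np.array([1.0, 0.0, 0.0])
    z = np.array([0.6, 0.8, 0.0])
    print("closed form:", relu_nngp_ntk(x, z))
    print("monte carlo:", monte_carlo_nngp_ntk(x, z))
```

Both kernels here depend on the inputs only through their norms and the angle between them, and nothing in them changes with the training data.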

A key limitation of these kernel models is that they can’t build up hierarchical features the way neural networks do. This matters