The initialization determines whether in-context learning is gradient descent