Nov-04-2019, 02:33 PM
This was my first solution. This will already give you a speed boost.
https://docs.scipy.org/doc/numpy/referen...ordot.html
import numpy as np

def forward_path_half_vectorized(x, w, b):
    """
    Inputs:
    - x: A numpy array of images of shape (N, H, W)
    - w: A numpy array of weights of shape (M, H, W)
    - b: A numpy vector of biases of size M
    Outputs:
    - cout: a numpy array of shape (N, M)
    """
    N, _, _ = x.shape
    M, _, _ = w.shape
    cout = np.zeros((N, M))
    # Loop over every (image, weight) pair; only the H x W product is vectorized.
    for ni in range(N):
        for mi in range(M):
            cout[ni, mi] = np.sum(x[ni] * w[mi])
    return cout + b

But I thought there must be a better way, and I found it while looking through the NumPy documentation.
https://docs.scipy.org/doc/numpy/referen...ordot.html
def forward_path_full_vectorized(x, w, b):
    """
    Inputs:
    - x: A numpy array of images of shape (N, H, W)
    - w: A numpy array of weights of shape (M, H, W)
    - b: A numpy vector of biases of size M
    Outputs:
    - cout: a numpy array of shape (N, M)
    """
    # Sum over the H and W axes of both arrays in one call.
    return np.tensordot(x, w, axes=([1, 2], [1, 2])) + b

The fully vectorized version is about 90 times faster!
X = np.ones((100, 64, 64), dtype=np.float64) * 0.3
W = np.ones((200, 64, 64), dtype=np.float64) * 1.5
B = np.ones((200), dtype=np.float64) * 3.3

%timeit forward_path_half_vectorized(X, W, B)
-> 408 ms ± 2.49 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit forward_path_full_vectorized(X, W, B)
-> 4.63 ms ± 125 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Life can be easy knowing where to look. :-)
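If you want to convince yourself that the tensordot call really computes the same thing as the double loop, a quick check like the one below should do. This is only a sketch: the small array sizes, the random seed, and the variable names are my own choices for illustration, and the reshape/matmul and einsum lines are just alternative ways to write the same contraction, not something benchmarked above.

```python
import numpy as np

# Small illustrative sizes and an arbitrary seed (assumptions, not the benchmark setup).
rng = np.random.default_rng(0)
N, M, H, W = 4, 3, 5, 6
x = rng.standard_normal((N, H, W))
w = rng.standard_normal((M, H, W))
b = rng.standard_normal(M)

# Reference result: explicit loop over every (image, weight) pair.
loop_out = np.zeros((N, M))
for ni in range(N):
    for mi in range(M):
        loop_out[ni, mi] = np.sum(x[ni] * w[mi])
loop_out = loop_out + b

# Contract the H and W axes of both arrays in one call.
tensordot_out = np.tensordot(x, w, axes=([1, 2], [1, 2])) + b

# Same contraction written as a matrix product on flattened images.
matmul_out = x.reshape(N, -1) @ w.reshape(M, -1).T + b

# Same contraction written with explicit index labels.
einsum_out = np.einsum("nhw,mhw->nm", x, w) + b

results_agree = (
    np.allclose(loop_out, tensordot_out)
    and np.allclose(loop_out, matmul_out)
    and np.allclose(loop_out, einsum_out)
)
print("all variants agree:", results_agree)
```

Using random values instead of np.ones makes the check more meaningful, since with constant arrays every output entry is identical and an axis mix-up could go unnoticed.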