TDSM 9.15


Based on the textbook, the concept would be:

1) randomize the order of the training examples once,

2) march through the training examples in batches of the given size,

3) before each weight update, compute the average loss over the current batch,

4) after several iterations, these recorded loss values indicate how the stochastic gradient descent is performing and whether it is converging (see the sketch below).

Credit: [1]
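
As a rough illustration of this procedure, here is a minimal mini-batch SGD sketch. The toy linear-regression objective, data, learning rate, and batch size are my own assumptions for illustration, not taken from the textbook.

```python
import numpy as np

def minibatch_sgd(X, y, lr=0.01, batch_size=32, epochs=20, seed=0):
    # Assumed setup: least-squares loss on a linear model (illustrative only).
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)

    # 1) Randomize the order of the training examples once.
    order = rng.permutation(n)
    X, y = X[order], y[order]

    batch_losses = []
    for epoch in range(epochs):
        # 2) March through the training examples in batches of the given size.
        for start in range(0, n, batch_size):
            Xb = X[start:start + batch_size]
            yb = y[start:start + batch_size]

            # 3) Before updating the weights, record the average loss on this batch.
            residual = Xb @ w - yb
            batch_losses.append(np.mean(residual ** 2))

            # Gradient of the mean squared error on this batch, then the update.
            grad = 2.0 * Xb.T @ residual / len(yb)
            w -= lr * grad

    # 4) The recorded losses show how SGD is progressing and whether it converges.
    return w, batch_losses

# Toy usage: y is roughly 3*x1 - 2*x2 plus noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
y = X @ np.array([3.0, -2.0]) + 0.1 * rng.normal(size=500)
w, losses = minibatch_sgd(X, y)
print("learned weights:", w)
print("first vs. last batch loss:", losses[0], losses[-1])
```

In practice the individual batch losses are noisy, so one would typically look at a running average (or the per-epoch mean) of these values to judge convergence.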