hpar.Rd
List of Neural Network parameters and hyperparameters to train with gradient descent or particle swarm optimization
Not mandatory (the list is preset and every argument is initialized with a default value), but adjusting some important arguments is advisable for performance reasons (including processing time).
Argument | Description |
---|---|
modexec | ‘trainwgrad’ (the default value) to train with gradient descent (suitable for all volumes of data) |
learningrate | learning rate alpha (default value 0.001) |
beta1 | see beta2 |
beta2 | together with beta1, selects the optimizer: ‘Momentum’ if beta1 is different from 0 and beta2 equals 0 |
lrdecayrate | learning rate decay value (default value 0, no learning rate decay, 1e-6 should be a good value to start with) |
chkgradevery | epoch interval to run gradient check function (default value 0, for debug only) |
chkgradepsilon | epsilon value for derivative calculations and threshold test in gradient check function (default 0.0000001) |
psoxxx | see pso for details of the PSO-specific arguments |
costcustformul | custom cost formula (default ‘’, no custom cost function) |
numiterations | number of training epochs (default value 50) |
seed | seed for reproducibility (default 4) |
minibatchsize | mini batch size, 2 to the power 0 (1 sample) for stochastic gradient descent (default 2 to the power 5, i.e. 32 samples) #tuning priority 3 |
layersshape | number of nodes per layer; each node count initializes a hidden layer |
layersacttype | activation function for each layer: ‘linear’ for no activation, or ‘sigmoid’, ‘relu’, ‘reluleaky’, ‘tanh’ or ‘softmax’ (softmax for output layer, only supported in trainwpso exec mode) |
layersdropoprob | dropout probability for each layer, a continuous value from 0 to less than 1 (gives the proportion of weight matrix values to drop out randomly) |
printcostevery | epoch interval to test and print costs (train and cross validation cost: default value 10, for 1 test every 10 epochs) |
testcvsize | size of cross validation sample, 0 for no cross validation sample (default 10, for 10 percent) |
testgainunder | threshold to stop the training when the gain between the last train or cross validation costs is smaller than the threshold, 0 for no stop test (default 0.000001) |
costtype | name of the cost function type: ‘mse’, ‘crossentropy’ or ‘custom’ |
lambda | regularization term added to cost function (default value 0, no regularization) |
batchnor_mom | batch normalization momentum for j and B (default 0, no batch normalization, may be set to 0.9 for deep neural net) |
epsil | small epsilon value used to avoid dividing by 0 or taking log(0) in the cost function, etc. (default value 1e-12) |
verbose | whether to display the costs and the shapes (default TRUE) |
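
As an illustration of how the arguments above fit together, the following is a minimal sketch of an hpar list written in plain R. The argument names are taken from the table; the chosen values are only examples (documented defaults where the table states one, otherwise assumptions), and `n_samples` is a hypothetical data set size used only for the arithmetic. Since the list is preset, it is assumed here that only the entries to override need to be supplied.

```r
# Sketch of an hpar list; values are illustrative, not recommendations.
hpar <- list(
  modexec        = "trainwgrad", # train with gradient descent (documented default)
  learningrate   = 0.001,        # learning rate alpha (documented default)
  lrdecayrate    = 1e-6,         # suggested starting value; 0 disables decay
  numiterations  = 50,           # number of training epochs (documented default)
  seed           = 4,            # seed for reproducibility (documented default)
  minibatchsize  = 2^5,          # 32 samples per mini batch; 2^0 gives SGD
  printcostevery = 10,           # test and print costs every 10 epochs
  testcvsize     = 10,           # 10 percent cross validation sample
  testgainunder  = 0.000001,     # stop threshold on cost gain; 0 disables it
  costtype       = "mse",        # one of 'mse', 'crossentropy', 'custom'
  lambda         = 0,            # no regularization
  verbose        = TRUE          # display costs and shapes
)

# Hypothetical data set size, only to show what the mini batch size implies.
n_samples <- 1000
batches_per_epoch <- ceiling(n_samples / hpar$minibatchsize)

# Check that the values respect the ranges described in the table.
stopifnot(
  hpar$costtype %in% c("mse", "crossentropy", "custom"),
  hpar$testcvsize >= 0,
  hpar$minibatchsize >= 1
)
```

The list would then be passed as the hpar argument of the package's training function. The layer arguments (layersshape, layersacttype, layersdropoprob) are left out of the sketch because their expected vector lengths are not specified in the table.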
Deep Learning specialization from Andrew Ng on Coursera