Strada
Exported
ActivationLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
Activation layers compute an element-wise function, taking one bottom blob as input and producing one top blob of the same size. The keyword parameters are
- activation (default Sigmoid): The nonlinear function applied. Can be ReLU, Sigmoid or TanH.
Both input and output are of shape n * c * h * w.
source: Strada/src/layers.jl:182
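To illustrate what "element-wise" means for the ActivationLayer, here is a minimal sketch in plain Python. It does not use Strada; the names activate and ACTIVATIONS are hypothetical, and a blob is modelled as a nested list.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical lookup table keyed by the names listed for `activation`.
ACTIVATIONS = {"ReLU": relu, "Sigmoid": sigmoid, "TanH": math.tanh}

def activate(blob, activation="Sigmoid"):
    """Apply the chosen function to every element of a nested-list blob."""
    if isinstance(blob, list):
        return [activate(item, activation) for item in blob]
    return ACTIVATIONS[activation](blob)

bottom = [[-1.0, 0.0], [2.0, 3.0]]
top = activate(bottom, "ReLU")  # same nesting (shape) as `bottom`
```

The output keeps the nesting of the input, which mirrors the statement that the top blob has the same size as the bottom blob.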
ConcatLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The Concat layer is a utility layer that concatenates its multiple input blobs into a single output blob. It takes one keyword parameter, axis. The input shapes of the bottoms are n_i * c_i * h * w for i = 1, ..., K and the output shape is
- (n_1 + n_2 + ... + n_K) * c_1 * h * w if axis = 0, in which case all c_i should be the same, and
- n_1 * (c_1 + c_2 + ... + c_K) * h * w if axis = 1, in which case all n_i should be the same.
source: Strada/src/layers.jl:284
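The shape rule for the ConcatLayer can be sketched as a small shape calculator. This is plain Python, independent of Strada; concat_shape is a hypothetical helper that only covers the two axes described above.

```python
def concat_shape(shapes, axis):
    """Return the output shape for bottoms of shape (n_i, c_i, h, w)."""
    if axis not in (0, 1):
        raise ValueError("this sketch only covers axis 0 and axis 1")
    first = shapes[0]
    for shape in shapes[1:]:
        # Every dimension except the concatenation axis must agree.
        for dim in range(4):
            if dim != axis and shape[dim] != first[dim]:
                raise ValueError("shapes differ outside the concat axis")
    out = list(first)
    out[axis] = sum(shape[axis] for shape in shapes)
    return tuple(out)

batch_concat = concat_shape([(2, 3, 4, 4), (5, 3, 4, 4)], axis=0)
channel_concat = concat_shape([(2, 3, 4, 4), (2, 5, 4, 4)], axis=1)
```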
ConvLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The Convolution layer convolves the input image with a set of learnable filters, each producing one feature map in the output image. The keyword parameters are
- n_filter: The number of filters (required)
- kernel: A tuple specifying the height and width of each filter (required)
- stride: A tuple which specifies the intervals at which to apply the filters to the input (horizontally and vertically)
- pad: A tuple which specifies the number of pixels to (implicitly) add to each side of the input
- group (default 1): If g > 1, we restrict the connectivity of each filter to a subset of the input. Specifically, the input and output channels are separated into g groups, and the ith output group channels will be only connected to the ith input group channels.
The input is of shape n * c_i * h_i * w_i and the output is of shape n * c_o * h_o * w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1 and w_o likewise.
source: Strada/src/layers.jl:96
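As a worked example of the ConvLayer output-size formula, the sketch below computes h_o and w_o in plain Python. It makes several assumptions that the text above does not state: the division rounds down, stride defaults to (1, 1), pad defaults to (0, 0), and c_o equals n_filter. The function names are hypothetical.

```python
def conv_output_size(size_in, kernel, stride=1, pad=0):
    # Assumption: integer (floor) division, as is common for convolutions.
    return (size_in + 2 * pad - kernel) // stride + 1

def conv_output_shape(input_shape, n_filter, kernel, stride=(1, 1), pad=(0, 0)):
    """Map (n, c_i, h_i, w_i) to (n, c_o, h_o, w_o), assuming c_o = n_filter."""
    n, _, h_i, w_i = input_shape
    h_o = conv_output_size(h_i, kernel[0], stride[0], pad[0])
    w_o = conv_output_size(w_i, kernel[1], stride[1], pad[1])
    return (n, n_filter, h_o, w_o)

plain = conv_output_shape((8, 3, 32, 32), 16, (5, 5))
strided = conv_output_shape((1, 3, 227, 227), 96, (11, 11), stride=(4, 4))
padded = conv_output_shape((8, 3, 32, 32), 16, (3, 3), pad=(1, 1))
```

With a 3 x 3 kernel and one pixel of padding the spatial size is preserved, which is why that combination is a common choice.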
DataLayer(name::ASCIIString) ¶
The DataLayer makes it easy to propagate data through the network while doing computation. The data is stored in Google Protocol Buffers and transferred to Caffe in this way. Its only keyword argument is data, which is an array that will be presented to the next layer through the top blob. It is meant to be used only with ApolloNets.
source: Strada/src/layers.jl:351
DropoutLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The dropout layer is a regularizer that randomly sets input values to zero.
source: Strada/src/layers.jl:217
EuclideanLoss(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The Euclidean loss layer computes the sum of squares of differences of its two inputs.
source: Strada/src/layers.jl:332
LRNLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The local response normalization layer performs a kind of “lateral inhibition” by normalizing over local input regions. In ACROSS_CHANNELS mode, the local regions extend across nearby channels, but have no spatial extent (i.e., they have shape local_size x 1 x 1). In WITHIN_CHANNEL mode, the local regions extend spatially, but are in separate channels (i.e., they have shape 1 x local_size x local_size). Each input value is divided by (1 + (α/n) sum_i x_i^2)^β, where n is the size of each local region, and the sum is taken over the region centered at that value (zero padding is added where necessary). It accepts the following keyword arguments:
- local_size (default 3): Size of the region the normalization is computed over
- alpha (default 5e-5): Value of the parameter α
- beta (default 0.75): Value of the parameter β
- norm_region: Mode of the local response normalization. Can be ACROSS_CHANNELS or WITHIN_CHANNEL.
source: Strada/src/layers.jl:202
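The ACROSS_CHANNELS case of the LRNLayer formula can be sketched for a single spatial position as follows. This is plain Python rather than Strada code; lrn_across_channels is a hypothetical name, and the sketch assumes an odd local_size and takes n to be local_size even at the borders, where zero padding fills the window.

```python
def lrn_across_channels(values, local_size=3, alpha=5e-5, beta=0.75):
    """Normalize one spatial position; `values` holds one number per channel."""
    half = local_size // 2
    out = []
    for c, x in enumerate(values):
        # Channels outside the valid range act as zero padding,
        # so they simply do not contribute to the sum.
        lo = max(0, c - half)
        hi = min(len(values), c + half + 1)
        sum_sq = sum(v * v for v in values[lo:hi])
        scale = (1.0 + (alpha / local_size) * sum_sq) ** beta
        out.append(x / scale)
    return out

# Exaggerated alpha and beta = 1 make the effect easy to check by hand.
normalized = lrn_across_channels([1.0, 1.0, 1.0], alpha=3.0, beta=1.0)
```

The middle channel sees three neighbours (itself included) and is divided by 1 + (3/3) * 3 = 4, while the border channels see two and are divided by 3.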
LinearLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The InnerProduct layer (also usually referred to as the fully connected layer) treats the input as a simple vector and produces an output in the form of a single vector (with the blob’s height and width set to 1). The keyword parameters are
- n_filter: The number of filters (required)
The input is of shape n * c_i * h_i * w_i and the output is of shape n * c_o * 1 * 1.
source: Strada/src/layers.jl:125
LstmLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The LstmLayer is an LSTM unit. It takes two blobs as input, the current LSTM input and the previous memory cell content. It outputs the new hidden state and the updated memory cell.
source: Strada/src/layers.jl:240
LstmLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}, tops::Array{ASCIIString, 1}) ¶
The LstmLayer is an LSTM unit. It takes two blobs as input, the current LSTM input and the previous memory cell content. It outputs the new hidden state and the updated memory cell.
source: Strada/src/layers.jl:240
MemoryLayer(name::ASCIIString) ¶
The MemoryLayer presents data to Caffe through a pointer (it is implemented as a new Caffe Layer called PointerData), which can be set using the set_data! method of CaffeNet. It is the preferred way to fill a CaffeNet with data. As each MemoryLayer provides exactly one top blob, you will typically have several of these (in the supervised setting, for example, one for labels and one for images). In set_data!, you can specify with an integer index which of the layers will be filled with the data provided.
source: Strada/src/layers.jl:339
Net(name::ASCIIString) ¶
Create an empty ApolloNet. A log_level of 0 prints full caffe debug information; a log_level of 3 prints nothing.
source: Strada/src/apollonet.jl:11
Net(name::ASCIIString, layers::Array{Layer, 1}) ¶
Load a model from a caffe compatible .caffemodel file (for example from the caffe model zoo).
source: Strada/src/caffenet.jl:22
PoolLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The PoolLayer partitions the input image into a set of non-overlapping rectangles and, for each such sub-region, outputs the maximum or average value. The keyword parameters are
- method (default MAX): The pooling method. Can be MAX, AVE or STOCHASTIC.
- pad (default 0): Specifies the number of pixels to (implicitly) add to each side of the input
- stride (default 1): Specifies the intervals at which to apply the filters to the input
The input is of shape n * c * h_i * w_i and the output is of shape n * c * h_o * w_o, where h_o and w_o are computed in the same way as for the convolution.
source: Strada/src/layers.jl:151
Softmax(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
Computes the softmax of the input. The parameter axis specifies which axis the softmax is computed over.
source: Strada/src/layers.jl:316
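For reference, the softmax along one axis can be sketched in plain Python as below. This is not Strada's implementation; it uses the standard trick of subtracting the maximum before exponentiating so that large scores do not overflow.

```python
import math

def softmax(scores):
    """Softmax of a flat list of scores (one slice along the chosen axis)."""
    shift = max(scores)  # subtracting the maximum leaves the result unchanged
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
large = softmax([1000.0, 1000.0])  # would overflow without the shift
```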
SoftmaxWithLoss(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The softmax loss layer computes the multinomial logistic loss of the softmax of its inputs. It’s conceptually identical to a softmax layer followed by a multinomial logistic loss layer, but provides a more numerically stable gradient. Its parameters are
- ignore_label (default -1): Examples with this label do not contribute to the loss
This layer expects two bottom blobs, the actual data of size n * c * h * w and a label of size n * 1 * 1 * 1.
source: Strada/src/layers.jl:299
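The sketch below shows one way the loss described for SoftmaxWithLoss could be computed for a batch of score vectors, in plain Python. It is an illustration under assumptions, not the layer's code: softmax_loss is a hypothetical name, the scores are given as one list of c values per example, and the loss is averaged over the examples that are not ignored (the text above does not say how the loss is normalized).

```python
import math

def softmax_loss(scores, labels, ignore_label=-1):
    """Mean negative log-probability of the labels, skipping ignored ones."""
    total, count = 0.0, 0
    for row, label in zip(scores, labels):
        if label == ignore_label:
            continue
        # Log-sum-exp with the maximum subtracted for numerical stability.
        shift = max(row)
        log_norm = shift + math.log(sum(math.exp(s - shift) for s in row))
        total += log_norm - row[label]
        count += 1
    # Assumption: average over counted examples; 0.0 if all are ignored.
    return total / count if count else 0.0

loss = softmax_loss([[0.0, 0.0], [5.0, 1.0]], [1, -1])
```

Working with log-sum-exp directly, rather than taking the logarithm of a separately computed softmax, is what makes the fused formulation numerically stable.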
WordvecLayer(name::ASCIIString, bottoms::Array{ASCIIString, 1}) ¶
The WordvecLayer turns non-negative integers (indexes) between 0 and vocab_size - 1 into dense vectors of fixed size dimension. The input is of size n, where n is the batch size, and the output is of size n * dimension.
source: Strada/src/layers.jl:256
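Conceptually, the WordvecLayer is a table lookup. A minimal plain-Python sketch, with a hypothetical wordvec_lookup helper and a hand-written table standing in for the learned parameters:

```python
def wordvec_lookup(indices, table):
    """Map n indices to n rows of a vocab_size x dimension table."""
    vocab_size = len(table)
    for index in indices:
        if not 0 <= index < vocab_size:
            raise IndexError("index %d outside 0..%d" % (index, vocab_size - 1))
    return [list(table[index]) for index in indices]

# vocab_size = 3, dimension = 2 (values chosen only for illustration)
table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]
vectors = wordvec_lookup([2, 0, 2], table)  # n = 3 input indices
```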
backward(net::ApolloNet) ¶
Run a backward pass through the whole network.
source: Strada/src/apollonet.jl:34
backward(net::CaffeNet) ¶
Run a backward pass through the whole network.
source: Strada/src/caffenet.jl:79
copy!(output::ApolloDict, input::Array{T, 1}) ¶
Copy a flat parameter vector into a binary blob.
source: Strada/src/blobs.jl:160
copy!(output::Array{T, 1}, input::ApolloDict) ¶
Copy a binary blob into a flat parameter vector.
source: Strada/src/blobs.jl:150
copy!(output::Array{T, 1}, input::CaffeDict) ¶
Copy a binary blob into a flat parameter vector.
source: Strada/src/blobs.jl:30
copy!(output::CaffeDict, input::Array{T, 1}) ¶
Copy a flat parameter vector into a binary blob.
source: Strada/src/blobs.jl:42
filler(name::Symbol) ¶
Fillers are random number generators that fill a blob using the specified algorithm. The algorithm is specified by
- name: Can be :gaussian, :uniform, :xavier or :constant
The parameters are given by keyword arguments:
- value: Gives the value for a constant filler
- min and max: Range for a uniform filler
- mean and std: Mean and standard deviation of a Gaussian filler
source: Strada/src/layers.jl:64
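To make the role of the filler keyword arguments concrete, here is a plain-Python sketch of three of the four algorithms. The function fill and its signature are hypothetical and not part of Strada. The :xavier filler is left out because its scale depends on the shape of the blob being filled, which this sketch does not model.

```python
import random

def fill(count, name, rng, **params):
    """Return `count` values produced by the named filling algorithm."""
    if name == "constant":
        return [params["value"]] * count
    if name == "uniform":
        return [rng.uniform(params["min"], params["max"]) for _ in range(count)]
    if name == "gaussian":
        return [rng.gauss(params["mean"], params["std"]) for _ in range(count)]
    raise ValueError("unsupported filler: " + name)

rng = random.Random(0)  # seeded so that runs are repeatable
biases = fill(3, "constant", rng, value=0.5)
weights = fill(4, "uniform", rng, min=-0.1, max=0.1)
noise = fill(5, "gaussian", rng, mean=0.0, std=0.01)
```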
forward(net::ApolloNet, layer::Layer) ¶
Run a forward pass of a single layer.
source: Strada/src/apollonet.jl:25
forward(net::CaffeNet) ¶
Run a forward pass through the whole network.
source: Strada/src/caffenet.jl:74
get_batchsize(str::MinibatchStream) ¶
Batchsize of the MinibatchStream
source: Strada/src/stream.jl:70
getminibatch(str::MinibatchStream) ¶
Get a random minibatch from the MinibatchStream
source: Strada/src/stream.jl:75
grad_check{F}(objective::Function, theta::Array{F, 1}, data, epsilon::Float64) ¶
Check gradients using symmetric finite differences. See the tests for an example of how to run it.
source: Strada/src/gradcheck.jl:11
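The idea behind a symmetric finite-difference check can be sketched in plain Python as follows. This is not the signature of Strada's grad_check: the sketch assumes an objective that takes only theta and returns a (value, gradient) pair, and it returns the largest deviation instead of reporting it.

```python
def grad_check(objective, theta, epsilon=1e-4):
    """Return the largest gap between analytic and numeric gradients."""
    _, analytic = objective(theta)
    worst = 0.0
    for i in range(len(theta)):
        plus = list(theta)
        plus[i] += epsilon
        minus = list(theta)
        minus[i] -= epsilon
        # Symmetric difference: (f(theta + e_i) - f(theta - e_i)) / (2 * eps)
        numeric = (objective(plus)[0] - objective(minus)[0]) / (2 * epsilon)
        worst = max(worst, abs(numeric - analytic[i]))
    return worst

def quadratic(theta):
    """Toy objective: sum of squares with its exact gradient."""
    return sum(t * t for t in theta), [2 * t for t in theta]

gap = grad_check(quadratic, [1.0, -2.0, 0.5])
```

A gap close to zero suggests that the analytic gradient is consistent with the objective; a large gap points to a bug in one of the two.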
length(blob::ApolloDict) ¶
The total number of variables in a binary blob.
source: Strada/src/blobs.jl:141
length(blob::CaffeDict) ¶
Number of parameters in the blob.
source: Strada/src/blobs.jl:19
load_caffemodel(net::CaffeNet, filename::String) ¶
Load a model from a caffe compatible .caffemodel file (for example from the caffe model zoo).
source: Strada/src/caffenet.jl:56
minibatch_stream(args::AbstractArray{T, N}...) ¶
Construct a MinibatchStream from a tuple of data arrays with full batch size.
source: Strada/src/stream.jl:46
num_batches(str::MinibatchStream) ¶
Number of minibatches in the MinibatchStream
source: Strada/src/stream.jl:60
read_svmlight(filename::String) ¶
Load a dataset from the libsvm compatible svmlight file format into a sparse matrix.
source: Strada/src/svmlight.jl:3
read_svmlight(filename::String, Dtype::DataType) ¶
Load a dataset from the libsvm compatible svmlight file format into a sparse matrix.
source: Strada/src/svmlight.jl:3
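For readers unfamiliar with the svmlight format that read_svmlight loads, each line holds a label followed by index:value pairs with 1-based indices. The plain-Python sketch below parses such lines into labels and sparse rows; parse_svmlight is a hypothetical helper, rows are represented as dictionaries instead of a sparse matrix, and optional qid: tokens are not handled.

```python
def parse_svmlight(lines):
    """Parse svmlight lines into (labels, rows of {column: value})."""
    labels, rows = [], []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split()
        labels.append(float(parts[0]))
        row = {}
        for item in parts[1:]:
            index, value = item.split(":")
            row[int(index) - 1] = float(value)  # convert to 0-based columns
        rows.append(row)
    return labels, rows

labels, rows = parse_svmlight(["1 1:0.5 3:2.0", "-1 2:1.5 # note", ""])
```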
reset(net::ApolloNet) ¶
Clear the active layers and active parameters of the net so a new forward pass can be run.
source: Strada/src/apollonet.jl:19
set_mode_cpu(net::ApolloNet) ¶
Activate CPU mode.
source: Strada/src/apollonet.jl:39
set_mode_cpu(net::CaffeNet) ¶
Activate CPU mode.
source: Strada/src/caffenet.jl:44
set_mode_gpu(net::ApolloNet) ¶
Activate GPU mode.
source: Strada/src/apollonet.jl:44
set_mode_gpu(net::CaffeNet) ¶
Activate GPU mode.
source: Strada/src/caffenet.jl:50
sgd{F}(objective!::Function, data::DataStream, theta::Array{F, 1}) ¶
Run the stochastic gradient descent method on the objective. If a testset is provided, generalization performance will also periodically be evaluated.
source: Strada/src/sgd.jl:22
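The basic loop behind stochastic gradient descent can be sketched in plain Python as below. It is a simplified illustration, not Strada's sgd: the hypothetical objective returns a (value, gradient) pair instead of writing the gradient in place, the learning rate and epoch count are made-up parameters, and there is no test-set evaluation.

```python
def sgd(objective, batches, theta, lr=0.05, epochs=50):
    """Plain SGD: one gradient step per minibatch, repeated for `epochs`."""
    theta = list(theta)
    for _ in range(epochs):
        for batch in batches:
            _, grad = objective(batch, theta)
            theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def least_squares(batch, theta):
    """Toy objective: fit y = w * x with a single parameter w."""
    xs, ys = batch
    residuals = [theta[0] * x - y for x, y in zip(xs, ys)]
    value = sum(r * r for r in residuals) / (2 * len(xs))
    grad = [sum(r * x for r, x in zip(residuals, xs)) / len(xs)]
    return value, grad

# Data generated from y = 2 * x, split into two minibatches.
batches = [([1.0, 2.0], [2.0, 4.0]), ([3.0, 4.0], [6.0, 8.0])]
theta = sgd(least_squares, batches, [0.0])
```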
zero!(A::ApolloDict) ¶
Fill a binary blob with zeros.
source: Strada/src/blobs.jl:170
zero!(A::CaffeDict) ¶
Fill a binary blob with zeros.
source: Strada/src/blobs.jl:54
DataStream ¶
A data stream represents a data source for a neural network. It could be data held in memory, in a database on disk, or a network socket for example.
source: Strada/src/stream.jl:9
Data ¶
Data to be fed into a CaffeNet is kept in a Data{F,N} structure, where F is the type of floating point number used to store the data (Float32 or Float64) and N is the number of data layers of the network. We represent Data{F,N} as a tuple, where component i holds the data that will be fed into data layer i of the network. A canonical example is supervised learning, where N is 2, the first component representing the image (say) and the second component representing the label. Each component of the tuple typically holds an array whose last dimension corresponds to the index in the minibatch.
source: Strada/src/stream.jl:4
Internal
calc_full_gradient{F}(objective!::Function, data::DataStream, theta::Array{F, 1}, grad::Array{F, 1}) ¶
Calculate the full gradient of the model at parameters theta over the dataset data. The gradient will be stored in grad.
source: Strada/src/utils.jl:43
calc_full_prediction{F}(predictor::Function, data::DataStream, theta::Array{F, 1}) ¶
Calculate prediction performance of the model with parameters theta over a whole dataset data.
source: Strada/src/utils.jl:57
size(str::MinibatchStream) ¶
Number of datapoints in the MinibatchStream
source: Strada/src/stream.jl:65
ApolloDict ¶
An ApolloDict is a collection of named blobs, where each name is associated with one floating point array. Example: The name 'ip_weights' could map to the weights of a linear layer.
source: Strada/src/blobs.jl:71
CaffeDict ¶
A CaffeDict is a collection of named blobs, where each name is associated with an arbitrary number of floating point arrays. Example: The name 'conv1' could map to a vector containing the biases and weights of a convolution.
source: Strada/src/blobs.jl:7
EmptyStream ¶
An empty stream represents a data source with no data.
source: Strada/src/stream.jl:14
MinibatchStream ¶
A MinibatchStream is a collection of data represented in memory that has been partitioned into minibatches.
source: Strada/src/stream.jl:25
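The partitioning that a MinibatchStream represents can be sketched in plain Python as follows. The helper make_minibatches is hypothetical and is not how Strada stores its data; it slices Python lists along their first index and lets the last batch be smaller when the data does not divide evenly.

```python
def make_minibatches(batchsize, *arrays):
    """Split parallel arrays into a list of per-batch tuples."""
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays need the same number of datapoints")
    return [
        tuple(a[start:start + batchsize] for a in arrays)
        for start in range(0, n, batchsize)
    ]

images = [10, 11, 12, 13, 14]       # stand-ins for image data
labels = ["a", "b", "c", "d", "e"]  # stand-ins for labels
stream = make_minibatches(2, images, labels)
```

Each element of stream is a tuple with one component per data array, matching the supervised setting where one component holds the inputs and the other the labels.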
NetData{D} ¶
A collection of blobs in a network. Grouped into 'data' (the actual parameters) and 'diff' (the gradients) so they can be treated as vectors that can be added together.
source: Strada/src/blobs.jl:64