Comments on: Defining Your Own Network Layer https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/?s_tid=feedtopost
Johanna specializes in deep learning and computer vision. Her goal is to give insight into deep learning through code examples, developer Q&As, and tips and tricks using MATLAB.

By: Kookmin University https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/#comment-531 Thu, 18 Oct 2018 12:02:36 +0000 https://blogs.mathworks.com/deep-learning/?p=88#comment-531 Hi Steve. Thank you so much for your explanation.
I am trying to use this to build my own fully connected layer.
But I have run into a problem: how can I retrieve the output of the previous layer? In Keras, a flatten layer follows the convolution layer.
In this case, how does MATLAB determine the shape of the previous layer's output?
Could you please help me?
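A rough sketch of the mechanics, under illustrative assumptions (the reshape below mimics a Keras-style flatten; nothing here is taken from the post): a custom layer's predict method receives the previous layer's output directly as its X argument, so its shape can be queried at run time with size.

    function Z = predict(layer, X)
        % X is the previous layer's output, H-by-W-by-C-by-N for image data
        [h, w, c, n] = size(X);
        % Flatten the spatial and channel dimensions into one, Keras-flatten style
        Z = reshape(X, 1, 1, h*w*c, n);
    end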

By: Sunny Arokia Swamy Bellary https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/#comment-471 Mon, 24 Sep 2018 23:08:54 +0000 https://blogs.mathworks.com/deep-learning/?p=88#comment-471 Hi Steve… Thanks for the explanation… I tried to implement a prediction problem using an LSTM… How can I add a custom-defined layer to a model imported from Keras?

Thanks and Regards,
Sunny
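A hedged sketch of one way this can be wired together, assuming a sufficiently recent release (the file name, the placeholder index, and myCustomLayer are illustrative; importKerasLayers, findPlaceholderLayers, and replaceLayer are the relevant Deep Learning Toolbox functions):

    % Import the Keras architecture; unsupported layers arrive as placeholders
    lgraph = importKerasLayers('myModel.h5', 'ImportWeights', true);
    % Locate the placeholder layers that need a MATLAB implementation
    placeholders = findPlaceholderLayers(lgraph);
    % Swap one of them (or any named layer) for your custom layer
    lgraph = replaceLayer(lgraph, placeholders(1).Name, myCustomLayer);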

By: Jack Xiao https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/#comment-323 Thu, 19 Jul 2018 09:02:04 +0000 https://blogs.mathworks.com/deep-learning/?p=88#comment-323 Hi Steve,
Why is the backward function (the derivative of the loss function) defined this way?
I think the backward used in this example is the derivative of the activation function, not the derivative of the loss function.
Maybe we should choose the loss (such as MAE or MSE) first, and then we can get the final backward in terms of the activation function and the loss function. Is that right?
Another question:
Why does the example at https://ww2.mathworks.cn/help/nnet/ug/define-custom-regression-output-layer.html use backwardLoss and forwardLoss rather than backward and forward? Is there a difference?
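A quick sketch of the relationship in question (a general chain-rule statement, not a quote from the post): an intermediate layer's backward method receives dLdZ from the layer above and propagates it through the layer's own local derivative,

    dL/dX = dL/dZ .* dZ/dX

so only dZ/dX, the activation derivative, appears inside the layer; the choice of loss enters only through the incoming dLdZ. An output layer is different: it defines the loss itself, so it implements forwardLoss (compute the loss from predictions and targets) and backwardLoss (compute the derivative of the loss with respect to the predictions), and that is where a choice such as MAE or MSE would live.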

By: Steve Eddins https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/#comment-142 Wed, 24 Jan 2018 15:53:03 +0000 https://blogs.mathworks.com/deep-learning/?p=88#comment-142 Guillaume—Thanks for the idea!

By: guillaume godin https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/#comment-140 Wed, 24 Jan 2018 09:10:15 +0000 https://blogs.mathworks.com/deep-learning/?p=88#comment-140 Hi Steve,

I implemented the ELU like this, to optimize the computation by reusing a memory value from forward:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        function Z = predict(layer, X)
            % Forward input data through the layer and output the result
            Z = max(0, X) + layer.Alpha .* (exp(min(0, X)) - 1);
        end

        function [Z, memory] = forward(layer, X)
            % (Optional) Forward input data through the layer at training
            % time and output the result and a memory value
            %
            % Inputs:
            %         layer  - Layer to forward propagate through
            %         X      - Input data
            % Outputs:
            %         Z      - Output of layer forward function
            %         memory - Memory value which can be used for
            %                  backward propagation

            % Store exp(min(0,X)) - 1 so that backward can reuse it
            memory = exp(min(0, X)) - 1;
            Z = max(0, X) + layer.Alpha .* memory;
        end

        function [dLdX, dLdAlpha] = backward(layer, X, Z, dLdZ, memory)
            % Backward propagate the derivative of the loss function
            % through the layer; dLdZ comes from the next layer

            % Negative part: dZ/dX = Alpha*exp(X) = Alpha + Z,
            % so dL/dX = (Alpha + Z) .* dLdZ
            dLdX = (layer.Alpha + Z) .* dLdZ;
            % Positive part: dZ/dX = 1, so dL/dX = dLdZ
            dLdX(X > 0) = dLdZ(X > 0);

            % Derivative with respect to Alpha: dZ/dAlpha = exp(X) - 1
            % for the negative part (zero elsewhere), which is exactly
            % the stored memory value
            dLdAlpha = memory .* dLdZ;
            % Sum over the spatial dimensions
            dLdAlpha = sum(sum(dLdAlpha, 1), 2);

            % Sum over all observations in mini-batch
            dLdAlpha = sum(dLdAlpha, 4);
        end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
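For anyone assembling these methods into a complete layer, a minimal class skeleton they could live in might look like the following (the class name, constructor, and per-channel Alpha initialization are illustrative assumptions in the spirit of the post, not Guillaume's actual file):

    classdef eluLayer < nnet.layer.Layer
        properties (Learnable)
            % Scaling coefficient for the negative part of the ELU
            Alpha
        end
        methods
            function layer = eluLayer(numChannels, name)
                % Name the layer and initialize one Alpha per channel
                layer.Name = name;
                layer.Description = "Exponential linear unit (ELU)";
                layer.Alpha = rand(1, 1, numChannels);
            end
            % The predict, forward, and backward methods above go here
        end
    end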

My results are similar to yours:

|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |   Accuracy   |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:00 |        8.59% |       2.6340 |          0.0100 |
|       2 |          50 |       00:00:04 |       80.47% |       0.5700 |          0.0100 |
|       3 |         100 |       00:00:07 |       96.88% |       0.1470 |          0.0100 |
|       4 |         150 |       00:00:11 |       97.66% |       0.1322 |          0.0100 |
|       6 |         200 |       00:00:14 |       99.22% |       0.0621 |          0.0100 |
|       7 |         250 |       00:00:18 |       99.22% |       0.0395 |          0.0100 |
|       8 |         300 |       00:00:21 |      100.00% |       0.0212 |          0.0100 |
|       9 |         350 |       00:00:24 |      100.00% |       0.0191 |          0.0100 |
|      11 |         400 |       00:00:28 |      100.00% |       0.0170 |          0.0100 |
|      12 |         450 |       00:00:31 |      100.00% |       0.0119 |          0.0100 |
|      13 |         500 |       00:00:35 |      100.00% |       0.0116 |          0.0100 |
|      15 |         550 |       00:00:38 |      100.00% |       0.0056 |          0.0100 |
|      16 |         600 |       00:00:42 |      100.00% |       0.0099 |          0.0100 |
|      17 |         650 |       00:00:45 |      100.00% |       0.0080 |          0.0100 |
|      18 |         700 |       00:00:49 |      100.00% |       0.0058 |          0.0100 |
|      20 |         750 |       00:00:52 |      100.00% |       0.0063 |          0.0100 |
|      21 |         800 |       00:00:56 |      100.00% |       0.0055 |          0.0100 |
|      22 |         850 |       00:01:00 |      100.00% |       0.0060 |          0.0100 |
|      24 |         900 |       00:01:03 |      100.00% |       0.0045 |          0.0100 |
|      25 |         950 |       00:01:06 |      100.00% |       0.0039 |          0.0100 |
|      26 |        1000 |       00:01:10 |      100.00% |       0.0033 |          0.0100 |
|      27 |        1050 |       00:01:13 |      100.00% |       0.0046 |          0.0100 |
|      29 |        1100 |       00:01:17 |      100.00% |       0.0042 |          0.0100 |
|      30 |        1150 |       00:01:20 |      100.00% |       0.0040 |          0.0100 |
|      30 |        1170 |       00:01:21 |      100.00% |       0.0042 |          0.0100 |
|========================================================================================|
>> [XTest, YTest] = digitTest4DArrayData;
YPred = classify(net, XTest);
accuracy = sum(YTest==YPred)/numel(YTest)

accuracy =

    0.9896

BR,

Guillaume

By: Binbin Qi https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/#comment-138 Wed, 17 Jan 2018 02:58:51 +0000 https://blogs.mathworks.com/deep-learning/?p=88#comment-138 Sorry, it was my fault; I set a wrong parameter.

By: Binbin Qi https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/#comment-136 Wed, 17 Jan 2018 01:32:24 +0000 https://blogs.mathworks.com/deep-learning/?p=88#comment-136 When I use this layer with images that have 3 channels, it does not work.

By: Eric Shields https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/#comment-132 Fri, 05 Jan 2018 15:37:04 +0000 https://blogs.mathworks.com/deep-learning/?p=88#comment-132 Batch normalization may not be necessary with ELUs. Clevert et al. indicate that “Batch normalization improved ReLU and LReLU networks, but did not improve ELU and SReLU networks.” On the example code, I get 10% faster performance for the same accuracy by removing the batch normalization layer.
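To make the modification concrete, here is an illustrative layer array without batchNormalizationLayer; the layer sizes and the eluLayer constructor follow the hypothetical sketch above and the 28x28x1 digit data used elsewhere in this thread, not the post's exact architecture:

    layers = [
        imageInputLayer([28 28 1])
        convolution2dLayer(5, 20)
        eluLayer(20, 'elu')          % custom layer; no batchNormalizationLayer around it
        fullyConnectedLayer(10)
        softmaxLayer
        classificationLayer];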

By: Eric https://blogs.mathworks.com/deep-learning/2018/01/05/defining-your-own-network-layer/#comment-130 Fri, 05 Jan 2018 15:16:08 +0000 https://blogs.mathworks.com/deep-learning/?p=88#comment-130 Thanks for a great blog post. Is there an easy way to modify this code so that the user can determine at run-time whether alpha is learned or fixed? Or does a separate class need to be defined with alpha outside of the Learnable properties block?
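One possible workaround, sketched under the assumption that setLearnRateFactor accepts the learnable parameters of custom layers in your release (worth checking in the documentation for your version): keep Alpha in the Learnable block and freeze it at network-assembly time by zeroing its learn-rate factor. The learnAlpha flag and the eluLayer constructor below are illustrative:

    layer = eluLayer(20, 'elu');
    if ~learnAlpha
        % A zero learn-rate factor means Alpha receives no updates during training
        layer = setLearnRateFactor(layer, 'Alpha', 0);
    end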
