how to deal with skip layers
fawda123 committed Mar 5, 2015
1 parent 041ea9f commit 552eac7
Showing 7 changed files with 40 additions and 4 deletions.
6 changes: 3 additions & 3 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Package: NeuralNetTools
Type: Package
Title: Visualization and Analysis Tools for Neural Networks
-Version: 1.0.1
-Date: 2014-01-27
+Version: 1.0.2
+Date: 2015-03-05
Author: Marcus W. Beck [aut, cre]
Maintainer: Marcus W. Beck <[email protected]>
Description: Visualization and analysis tools to aid in the interpretation of
@@ -20,7 +20,7 @@ Imports:
RSNNS,
scales
Depends:
-R (>= 3.1.2)
+R (>= 3.1.1)
Authors@R: person(given = "Marcus W.", family = "Beck",
role = c("aut","cre"),
email = "[email protected]")
12 changes: 12 additions & 0 deletions R/NeuralNetTools_gar.R
@@ -11,6 +11,8 @@
#'
#' A method described in Garson 1991 (also see Goh 1995) identifies the relative importance of explanatory variables for specific response variables in a supervised neural network by deconstructing the model weights. The basic idea is that the relative importance (or strength of association) of a specific explanatory variable for a specific response variable can be determined by identifying all weighted connections between the nodes of interest. That is, all weights connecting the specific input node that pass through the hidden layer to the specific response variable are identified. This is repeated for all other explanatory variables until the analyst has a list of all weights that are specific to each input variable. The connections are tallied for each input node and scaled relative to all other inputs. A single value is obtained for each explanatory variable that describes the relationship with the response variable in the model (see the appendix in Goh 1995 for a more detailed description). The original algorithm presented in Garson 1991 indicated relative importance as the absolute magnitude from zero to one such that the direction of the response could not be determined.
#'
#' Misleading results may be produced if the neural network was created with skip-layer connections, i.e., using \code{skip = TRUE} with the \code{\link[nnet]{nnet}} function. Garson's algorithm does not describe the effects of skip-layer connections on estimates of variable importance. As such, these weights are removed prior to estimating variable importance.
#'
#' @export
#'
#' @import ggplot2 neuralnet nnet RSNNS
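
The documented behavior is easy to reproduce. The sketch below assumes the neuraldat dataset bundled with the package and the variable names used in the package examples; it is illustrative, not taken from this diff.

library(nnet)
library(NeuralNetTools)

# fit a model with direct input-to-output (skip-layer) connections
data(neuraldat)
set.seed(123)
mod <- nnet(Y1 ~ X1 + X2 + X3, data = neuraldat, size = 5, skip = TRUE,
  trace = FALSE)

# the skip-layer weights are dropped and a warning is issued before the
# importance values are estimated
garson(mod, 'Y1')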
@@ -179,6 +181,11 @@ garson.nnet <- function(mod_in, out_var, bar_plot = TRUE, x_lab = NULL, y_lab =
struct <- best_wts$struct
best_wts <- best_wts$wts

# check for skip layers
chk <- grepl('skip-layer', capture.output(mod_in))
if(any(chk))
warning('Skip layer used, results may be inaccurate because input and output connections are removed')

# weights only if TRUE
if(wts_only) return(best_wts)

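The grepl test above works because the print method for nnet objects lists the fitting options, including skip-layer connections. A minimal sketch of the idea, with illustrative output for the hypothetical 3-5-1 fit from the earlier example:

# capture.output() returns the printed representation, roughly:
#   "a 3-5-1 network with 29 weights"
#   "options were - skip-layer connections"
out <- capture.output(mod)

# matching on the string flags models fit with skip = TRUE
any(grepl('skip-layer', out))  # TRUE for the model above
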
@@ -450,6 +457,11 @@ garson.train <- function(mod_in, out_var, bar_plot = TRUE, x_lab = NULL, y_lab =
struct <- best_wts$struct
best_wts <- best_wts$wts

# check for skip layers
chk <- grepl('skip-layer', capture.output(mod_in))
if(any(chk))
warning('Skip layer used, results may be inaccurate because input and output connections are removed')

# weights only if TRUE
if(wts_only) return(best_wts)

10 changes: 10 additions & 0 deletions R/NeuralNetTools_plot.R
@@ -116,6 +116,11 @@ plotnet.nnet <- function(mod_in, nid = TRUE, all_out = TRUE, all_in = TRUE, bias
struct <- wts$struct
wts <- wts$wts

# check for skip layers
chk <- grepl('skip-layer', capture.output(mod_in))
if(any(chk))
warning('Skip layer used, results may be inaccurate because input and output connections are removed')

if(wts_only) return(wts)

#circle colors for input, if desired, must be two-vector list, first vector is for input layer
@@ -1081,6 +1086,11 @@ plotnet.train <- function(mod_in, nid = TRUE, all_out = TRUE, all_in = TRUE, bia
struct <- wts$struct
wts <- wts$wts

# check for skip layers
chk <- grepl('skip-layer', capture.output(mod_in))
if(any(chk))
warning('Skip layer used, results may be inaccurate because input and output connections are removed')

if(wts_only) return(wts)

#circle colors for input, if desired, must be two-vector list, first vector is for input layer
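The plotting methods carry the same guard. A short sketch, reusing the hypothetical mod object from the earlier example, of what a caller sees:

# the warning is issued and the network is drawn without the direct
# input-to-output connections
plotnet(mod)
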
11 changes: 10 additions & 1 deletion R/NeuralNetTools_utils.R
@@ -12,6 +12,7 @@
#' @return Returns a two-element list with the first element being a vector indicating the number of nodes in each layer of the neural network and the second element being a named list of weight values for the input model.
#'
#' @details Each element of the returned list is named using the construct 'layer node', e.g. 'out 1' is the first node of the output layer. Hidden layers are named using three values for instances with more than one hidden layer, e.g., 'hidden 1 1' is the first node in the first hidden layer, 'hidden 1 2' is the second node in the first hidden layer, 'hidden 2 1' is the first node in the second hidden layer, etc. The values in each element of the list represent the weights entering the specific node from the preceding layer in sequential order, starting with the bias layer if applicable.
#' The function will remove direct weight connections between the input and output layers if the neural network was created with skip-layer connections, i.e., using \code{skip = TRUE} with the \code{\link[nnet]{nnet}} function. This removal may produce misleading results when evaluating variable importance with the \code{\link{garson}} function.
#'
#' @examples
#'
@@ -103,7 +104,14 @@ neuralweights.nnet <- function(mod_in, rel_rsc = NULL, ...){
struct <- mod_in$n
wts <- mod_in$wts

-if(!is.null(rel_rsc)) wts <- scales::rescale(abs(wts), c(1, rel_rsc))
+if(!is.null(rel_rsc)) wts <- scales::rescale(abs(wts), c(1, rel_rsc))

# remove wts from input to output if skip layers present
chk <- grepl('skip-layer', capture.output(mod_in))
if(any(chk)){
skips <- struct[1] * struct[length(struct)]
wts <- wts[-c((length(wts) - skips + 1):length(wts))]
}

#convert wts to list with appropriate names
hid_struct <- struct[ -c(length(struct))]
@@ -116,6 +124,7 @@ neuralweights.nnet <- function(mod_in, rel_rsc = NULL, ...){
row_nms,
rep(paste('out', seq(1:struct[length(struct)])), each = 1 + struct[length(struct) - 1])
)

out_ls <- data.frame(wts, row_nms)
out_ls$row_nms <- factor(row_nms, levels = unique(row_nms), labels = unique(row_nms))
out_ls <- split(out_ls$wts, f = out_ls$row_nms)
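The indexing above assumes nnet stores the direct input-to-output weights at the tail of the wts vector, an assumption the commit relies on. The arithmetic for a hypothetical 3-5-1 network:

# hypothetical structure: 3 inputs, 5 hidden nodes, 1 output
struct <- c(3, 5, 1)

# with skip = TRUE, nnet fits (3+1)*5 + (5+1)*1 + 3*1 = 29 weights
wts <- seq_len(29)  # stand-in for mod_in$wts

# direct input-to-output weights to strip from the tail
skips <- struct[1] * struct[length(struct)]  # 3

wts <- wts[-c((length(wts) - skips + 1):length(wts))]
length(wts)  # 26, the weight count of a 3-5-1 network without skip connections
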
2 changes: 2 additions & 0 deletions cran-comments.md
@@ -3,6 +3,8 @@ This is a resubmission for an update that was submitted earlier today (Jan. 27 2

I think I have fixed the NOTE that was received on submission regarding the export of unregistered S3 methods. This included changes to NAMESPACE by adding S3method(print, foo) for a print method for class foo, repeated for each S3 method in my package.

All test environments were also checked again before resubmission.

## Test environments
* local Windows 7 install, R 3.1.2
* local Windows 7 install, Current r-devel (2015-01-27 r67627)
2 changes: 2 additions & 0 deletions man/garson.Rd
@@ -53,6 +53,8 @@ Relative importance of input variables in neural networks using Garson's algorit
The weights that connect variables in a neural network are partially analogous to parameter coefficients in a standard regression model and can be used to describe relationships between variables. The weights dictate the relative influence of information that is processed in the network such that input variables that are not relevant in their correlation with a response variable are suppressed by the weights. The opposite effect is seen for weights assigned to explanatory variables that have strong, positive associations with a response variable. An obvious difference between a neural network and a regression model is that the number of weights is excessive in the former case. This characteristic is advantageous in that it makes neural networks very flexible for modeling non-linear functions with multiple interactions, although interpretation of the effects of specific variables is of course challenging.
A method described in Garson 1991 (also see Goh 1995) identifies the relative importance of explanatory variables for specific response variables in a supervised neural network by deconstructing the model weights. The basic idea is that the relative importance (or strength of association) of a specific explanatory variable for a specific response variable can be determined by identifying all weighted connections between the nodes of interest. That is, all weights connecting the specific input node that pass through the hidden layer to the specific response variable are identified. This is repeated for all other explanatory variables until the analyst has a list of all weights that are specific to each input variable. The connections are tallied for each input node and scaled relative to all other inputs. A single value is obtained for each explanatory variable that describes the relationship with the response variable in the model (see the appendix in Goh 1995 for a more detailed description). The original algorithm presented in Garson 1991 indicated relative importance as the absolute magnitude from zero to one such that the direction of the response could not be determined.
Misleading results may be produced if the neural network was created with skip-layer connections, i.e., using \code{skip = TRUE} with the \code{\link[nnet]{nnet}} function. Garson's algorithm does not describe the effects of skip-layer connections on estimates of variable importance. As such, these weights are removed prior to estimating variable importance.
}
\examples{
## using numeric input
1 change: 1 addition & 0 deletions man/neuralweights.Rd
@@ -35,6 +35,7 @@ Get weights for a neural network in an organized list by extracting values from
}
\details{
Each element of the returned list is named using the construct 'layer node', e.g. 'out 1' is the first node of the output layer. Hidden layers are named using three values for instances with more than one hidden layer, e.g., 'hidden 1 1' is the first node in the first hidden layer, 'hidden 1 2' is the second node in the first hidden layer, 'hidden 2 1' is the first node in the second hidden layer, etc. The values in each element of the list represent the weights entering the specific node from the preceding layer in sequential order, starting with the bias layer if applicable.
The function will remove direct weight connections between the input and output layers if the neural network was created with skip-layer connections, i.e., using \code{skip = TRUE} with the \code{\link[nnet]{nnet}} function. This removal may produce misleading results when evaluating variable importance with the \code{\link{garson}} function.
}
\examples{
data(neuraldat)
