R function for Dynamic Ordinary Least Squares regression











SHORT VERSION: Made a workflow function, want a better output with methods.



General Background



I have been working on a package for some time to eliminate the tedium of preliminary time series econometrics (unit root testing, cointegration, model building and lag/lead selection). I am aware of many alternatives out there, but I am still happy with the direction I am going for now. The approach is based on Stock & Watson.



Function Workflow (simple version)



1. Takes a user-specified cointegrating relationship (written as a formula): $$Y_t = \alpha_t + X_t$$ in R: Y ~ 1 + X
where the dependent and independent variables are all nonstationary and alpha is the (optional) constant.



2. Creates the formula
$$Y_t = \alpha_t + X_t + \sum_{i = -k}^{k} \Delta X_{t-i}$$
in R: Y ~ 1 + X + L(diff(X),-k:k), using dynlm's handy L() capabilities (although I use diff() rather than dynlm's d() for differencing), where k is the maximum lag/lead value.



3. The rest of the function fits the model for different values of k and selects the best one using the BIC function (any other model selection function can be passed in as an argument). It also computes HAC standard errors using the Newey-West estimator.
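To make the steps above concrete, here is a minimal base-R sketch of how the formula text and the default k can be produced. The helper names `dols_formula_text` and `default_k` are made up for illustration and are not part of the posted function; unlike that function, the sketch writes the numeric value of k straight into the formula text.

```r
# Hypothetical helper: build the DOLS formula text for a given k.
dols_formula_text <- function(y, x, k, intercept = TRUE) {
  stopifnot(is.character(y), length(y) == 1L, is.character(x), k >= 0)
  k <- as.integer(k)
  # one lead/lag term of the differenced series per regressor
  lead_lag <- sprintf("L(diff(%s), -%d:%d)", x, k, k)
  rhs <- c(if (intercept) "1" else "-1", x, lead_lag)
  paste(y, "~", paste(rhs, collapse = " + "))
}

# Default maximum k described in the post: half the cube root of the
# number of observations, rounded down.
default_k <- function(n_obs) floor(n_obs^(1/3) / 2)

ff <- dols_formula_text("Y", c("X1", "X2"), k = 2)
print(ff)
# "Y ~ 1 + X1 + X2 + L(diff(X1), -2:2) + L(diff(X2), -2:2)"
print(default_k(71))
# 2
```

The text parses with `as.formula(ff)` in plain R; fitting it still needs dynlm, because `L()` is only understood there.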



Code
Here is the function. I am very open to criticism/critiques/comments.



Requires packages:



library(dynlm)
library(lmtest)
library(sandwich)

buildDOLS <- function(coint_formula, data, fixedk = NULL, robusterrors = TRUE, selection = BIC) {
  # checks
  stopifnot(is.ts(data))                            # time series data
  stopifnot(is.null(fixedk) || is.numeric(fixedk))  # fixed k is either NULL or numeric
  stopifnot(is.function(selection))                 # selection method must be a function
  # Formula creation
  ff <- coint_formula
  all_names <- dimnames(attr(terms(ff), "factors")) # X and Y variables
  y_names <- all_names[[1]][!(all_names[[1]] %in% all_names[[2]])]
  x_names <- all_names[[2]][all_names[[2]] %in% colnames(data)]
  # Dynamic Ordinary Least Squares formulation
  ff_LHS <- y_names
  ff_RHS <- paste(c(if (attr(terms(ff), "intercept") == 1) "1" else "-1", # constant
                    x_names,                                # input variables
                    paste0("L(diff(", x_names, "),-k:k)")), # sum of lead/lagged differences of x variables
                  collapse = " + ")
  ff_k <- paste(ff_LHS, "~", ff_RHS)
  # if k (the maximum number of lags/leads) was not fixed, use a default value
  k <- if (is.null(fixedk)) floor(nrow(data)^(1/3) / 2) else fixedk
  # run the model. If k was fixed, this is the final model:
  DOLS_k <- dynlm(formula(ff_k), data = data)
  # If k was not fixed, DOLS_k is used to hold the start and end dates constant during model selection
  if (is.null(fixedk)) {
    k_max <- k # remember the largest candidate, since k is overwritten below
    # Use whichever selection function was passed in the selection argument
    k_select <- sapply(seq_len(k_max), function(k) {
      selection(dynlm(formula(ff_k), data = data,
                      start = start(DOLS_k), end = end(DOLS_k)))
    })
    # only re-estimate the model if the selected k differs from k_max
    if (k != which.min(k_select)) {
      k <- which.min(k_select)
      DOLS_k <- dynlm(formula(ff_k), data = data)
    }
    # save the selection results inside the model, one row per candidate k
    # (#Obs, StartDate and EndDate describe the final model)
    DOLS_k$selection <- cbind(seq_len(k_max), k_select,
                              DOLS_k$df.residual + length(coef(DOLS_k)),
                              start(DOLS_k)[1], end(DOLS_k)[1])
    colnames(DOLS_k$selection) <- c("# of lags/leads (k)", deparse(substitute(selection)),
                                    "#Obs", "StartDate", "EndDate")
  }
  DOLS_k$k <- k # save the lag used inside the model
  # save the HAC estimated errors inside the model
  if (robusterrors) DOLS_k$HAC <- lmtest::coeftest(DOLS_k, vcov = sandwich::NeweyWest(DOLS_k, lag = k))
  # rewrite the call so that it can be run on its own
  DOLS_k$call <- as.call(list(quote(dynlm),
                              formula = formula(gsub("-k:k", paste0("-", k, ":", k), ff_k, fixed = TRUE)),
                              data = substitute(data)))
  # put "workflow" first so that print() and summary() dispatch to the workflow methods
  class(DOLS_k) <- c("workflow", class(DOLS_k))
  DOLS_k
}
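The selection step can also be sketched without dynlm, which makes it easier to see why every candidate k has to be fitted on the same sample. This is only an illustration under stated assumptions: the data are simulated, the helper `fit_dols` is hypothetical, and plain `lm()` with hand-built leads and lags stands in for the `dynlm()` call in the function above.

```r
set.seed(1)
n <- 120
x <- cumsum(rnorm(n))        # simulated I(1) regressor
y <- 1 + 0.5 * x + rnorm(n)  # simulated cointegrated response

# Hypothetical helper: DOLS with k leads/lags of diff(x), always fitted on
# the rows that are usable for k_max so that the BIC values are comparable.
fit_dols <- function(y, x, k, k_max) {
  n <- length(y)
  dx <- c(NA, diff(x))              # dx[t] = x[t] - x[t - 1]
  t_idx <- (k_max + 2):(n - k_max)  # rows where every lead and lag exists
  # column for shift i holds dx[t - i]: i < 0 is a lead, i > 0 is a lag
  Z <- sapply(-k:k, function(i) dx[t_idx - i])
  lm(y[t_idx] ~ x[t_idx] + Z)
}

k_max <- floor(n^(1/3) / 2)  # default rule from the post
bic <- sapply(seq_len(k_max), function(k) BIC(fit_dols(y, x, k, k_max)))
best_k <- which.min(bic)

print(cbind(k = seq_len(k_max), BIC = bic))
print(best_k)
```

Because each fit uses the same `t_idx`, differences in BIC come only from the extra lead/lag terms and not from a changing number of observations.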


Example Case



Say we have the cointegrating relationship: MB ~ RTPS from lmtest::valueofstocks. (The interpretation is not important here).



data("valueofstocks", package = "lmtest") # load the example data
dols <- buildDOLS(coint_formula = MB ~ RTPS, data = valueofstocks, fixedk = NULL, robusterrors = TRUE, selection = BIC)
dols
summary(dols)  # the non-HAC, biased standard errors, but also the model fit results
dols$selection # shows the model selection process
dols$k         # the result of that process
dols$call      # a stand-alone call to replicate the results with dynlm
dols$HAC       # the HAC estimated standard errors/significance values


Output



The output is not ideal as it is hidden within the model. I have written a few methods but would like some feedback. Does it clearly show what is interesting to the user? Is it too long? etc.



Print Method: shows model selection and a cleaner eval



print.workflow <- function(x, ...) {
  if (!is.null(x$selection)) {
    cat("\n Selecting k for Model\n")
    cat("______________________________\n")
    print(x$selection)
  }
  cat("\n Model with k =", x$k, "\n")
  cat("______________________________")
  print(eval(x$call)) # re-runs the stand-alone dynlm call and prints the plain model
  invisible(x)
}
dols #** Is it too long? Does it give the important information?
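One thing worth checking with methods like this is where the extra class sits in the class vector. S3 dispatch tries the classes from left to right, so a method for a class at the end is only reached if no earlier class has its own method. The toy example below uses made-up class names (`base_model`, `workflow_demo`) rather than the real dynlm classes.

```r
obj <- structure(list(value = 1), class = "base_model")

print.base_model <- function(x, ...) {
  cat("base print\n")
  invisible(x)
}
print.workflow_demo <- function(x, ...) {
  cat("workflow print\n")
  NextMethod() # hand over to the next class in the vector
}

appended <- obj
class(appended) <- append(class(obj), "workflow_demo")  # c("base_model", "workflow_demo")
prepended <- obj
class(prepended) <- c("workflow_demo", class(obj))      # c("workflow_demo", "base_model")

out_appended <- capture.output(print(appended))
out_prepended <- capture.output(print(prepended))
print(out_appended)
print(out_prepended)
```

With the class appended, only the base method runs; with it prepended, the workflow method runs first and `NextMethod()` then falls through to the base method.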


Summary Method: shows HAC estimated errors automatically



summary.workflow <- function(object, ...) {
  if ("HAC" %in% names(object)) { # TRUE when buildDOLS was run with robusterrors = TRUE
    addHAC <- NextMethod() # the usual summary, whose coefficient table is replaced below
    addHAC$coefficients <- object$HAC
    cat("*Summary table depicts HAC estimated errors found by:\n")
    cat(paste0("lmtest::coeftest(model, vcov = sandwich::NeweyWest(model, lag = ", object$k, "))\n"))
    addHAC
  } else {
    warning("\tSummary table does not depict HAC estimated errors\n\tPlease set the buildDOLS robusterrors argument to TRUE")
    NextMethod()
  }
}
summary(dols)
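For readers who want to see what a Newey-West correction computes, here is a hand-rolled base-R sketch with Bartlett weights. It is an illustration only: `nw_vcov` is a hypothetical helper, the data are simulated, and the sketch applies neither prewhitening nor a small-sample adjustment, so its numbers should not be expected to match `sandwich::NeweyWest()` exactly.

```r
# Hypothetical helper: Newey-West (Bartlett kernel) covariance for an lm fit.
nw_vcov <- function(fit, lag) {
  X <- model.matrix(fit)
  u <- residuals(fit)
  n <- nrow(X)
  stopifnot(lag >= 0, lag < n)
  scores <- X * u          # row t is x_t * u_t
  S <- crossprod(scores)   # lag-0 term
  for (j in seq_len(lag)) {
    w <- 1 - j / (lag + 1) # Bartlett weight
    # sum over t of (x_t u_t)(x_{t-j} u_{t-j})'
    G <- crossprod(scores[(j + 1):n, , drop = FALSE],
                   scores[1:(n - j), , drop = FALSE])
    S <- S + w * (G + t(G))
  }
  bread <- solve(crossprod(X))
  bread %*% S %*% bread    # sandwich: (X'X)^-1 S (X'X)^-1
}

set.seed(42)
n <- 100
x <- cumsum(rnorm(n))
e <- as.numeric(arima.sim(list(ar = 0.5), n = n)) # autocorrelated errors
y <- 1 + 0.5 * x + e
fit <- lm(y ~ x)

V <- nw_vcov(fit, lag = 2)
hac_se <- sqrt(diag(V))
print(cbind(ols_se = sqrt(diag(vcov(fit))), hac_se = hac_se))
```

With `lag = 0` the loop is skipped and the result reduces to the heteroskedasticity-only (White) sandwich, which is a convenient sanity check.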


Please let me know what you think!










      object-oriented r statistics






      edited Jul 1 '17 at 17:31

























      asked May 30 '17 at 19:59









      Evan Friedland























          1 Answer
Nice package, looks pretty solid.

The choice of k, half the cube root of the number of observations, is not obvious. Please cite http://www.ssc.wisc.edu/~kwest/publications/1990/ Automatic Lag Selection, eqn. 2.1 (though I still found gamma hat and the parameter a bit mysterious).

The "maximum number of lags/leads" comment is helpful. Could we please promote it to a @param appearing near the function signature, so Roxygen2 / devtools::document() will find it?
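As a rough idea of what that promotion could look like, here is a roxygen2-style header placed directly above the signature. The parameter descriptions are paraphrased from the comments in the posted code, the title line is an assumption, and the function body is left out on purpose.

```r
#' Build a Dynamic Ordinary Least Squares (DOLS) model
#'
#' @param coint_formula Cointegrating relationship written as a formula,
#'   for example `Y ~ 1 + X`.
#' @param data A `ts` object containing the variables in `coint_formula`.
#' @param fixedk Maximum number of lags/leads (k). If `NULL`, k starts at
#'   `floor(nrow(data)^(1/3) / 2)` and is then chosen by `selection`.
#' @param robusterrors If `TRUE`, Newey-West HAC coefficient tests are
#'   stored in the `HAC` element of the result.
#' @param selection Model selection function that is minimised over k,
#'   `BIC` by default.
#' @return The fitted dynlm model with the extra elements described above.
#' @export
buildDOLS <- function(coint_formula, data, fixedk = NULL,
                      robusterrors = TRUE, selection = BIC) {
  # body omitted here: same as in the question
}
```

Running `devtools::document()` on a package that contains such a header generates the corresponding help page, so `?buildDOLS` would show the explanation of k.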






• Hi J_H, thanks for looking. I wonder whether you are talking about the package I put this in or just the code above? Also, thank you for tracking down a citation (I have not read this paper myself). Assuming it provides an optimal maximum number of lags to start selecting from: because this is a lag AND lead model, the cube root would tell me 4 lags, when the model is really looking for 2 lags and 2 leads (k). Does this make sense? And where would you propose promoting it? I have not yet adapted my package to be built with Roxygen2; I only change items manually on GitHub (efriedland/friedland).
  – Evan Friedland
  Sep 10 '17 at 21:23












• I was just looking at the posted code. The paper is not completely transparent, but it does make clear that the desired exponent changes with the kernel type, and 1/3 (or 2/9) is specifically for Bartlett. It sounds like you got the 0.5 coefficient from elsewhere. I suspect 2 + 2 leads/lags is a lot like 4 lags, if there is enough data that we are not worried about the edge cases. By "promote" I meant listing it at the top where Help can find it, similar to javadoc or Python docstrings. Basically I was calling it a useful comment, worthy of being more easily accessed.
  – J_H
  Sep 10 '17 at 21:27













          Your Answer





          StackExchange.ifUsing("editor", function () {
          return StackExchange.using("mathjaxEditing", function () {
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
          });
          });
          }, "mathjax-editing");

          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "196"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f164554%2fr-function-for-dynamic-ordinary-least-squares-regression%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          0
          down vote













          Nice package, looks pretty solid.



          The choice of k, half the cube root, is not obvious. Please cite http://www.ssc.wisc.edu/~kwest/publications/1990/ Automatic Lag Selection eqn. 2.1 (though I still found gamma hat & parameter a bit mysterious).



          The "maximum number of lags/leads" comment is helpful. Could we please promote it to a @param appearing near the function signature, so Roxygen2 / devtools::document() will find it?






          share|improve this answer





















          • Hi J H thanks for looking. I I wonder if you are talking about the package I put this in or just the above code? Also, thank you for tracking down a citation (I myself have not read this paper). Assuming it provides an optimal number of maximum lags to start selecting from, because this is a lag AND lead model, the cube root would tell me 4 lags, when the model is really looking for 2 lags 2 leads (k). Does this make sense? And where would you propose promoting*? I currently have not adapted my package to be built by Roxygen2, only manually changing items on Github (efriedland/friedland)
            – Evan Friedland
            Sep 10 '17 at 21:23












          • I was just looking at the posted code. The paper is not completely transparent, but it does make it clear that the desired exponent will change with kernel type, and 1/3 (or 2/9) is specifically for Bartlett. It sounds like you got the 0.5 coefficient from elsewhere. I suspect 2 + 2 leads / lags is a lot like 4 lags, if there's enough data that we're not worried about the edge cases. By promote I meant list at top where Help can find it, similar to javadoc or python docstrings. Basically I was calling it a useful comment, worthy of being more easily accessed.
            – J_H
            Sep 10 '17 at 21:27

















          up vote
          0
          down vote













          Nice package, looks pretty solid.



          The choice of k, half the cube root, is not obvious. Please cite http://www.ssc.wisc.edu/~kwest/publications/1990/ Automatic Lag Selection eqn. 2.1 (though I still found gamma hat & parameter a bit mysterious).



          The "maximum number of lags/leads" comment is helpful. Could we please promote it to a @param appearing near the function signature, so Roxygen2 / devtools::document() will find it?






          share|improve this answer





















          • Hi J H thanks for looking. I I wonder if you are talking about the package I put this in or just the above code? Also, thank you for tracking down a citation (I myself have not read this paper). Assuming it provides an optimal number of maximum lags to start selecting from, because this is a lag AND lead model, the cube root would tell me 4 lags, when the model is really looking for 2 lags 2 leads (k). Does this make sense? And where would you propose promoting*? I currently have not adapted my package to be built by Roxygen2, only manually changing items on Github (efriedland/friedland)
            – Evan Friedland
            Sep 10 '17 at 21:23












          • I was just looking at the posted code. The paper is not completely transparent, but it does make it clear that the desired exponent changes with kernel type, and 1/3 (or 2/9) is specifically for Bartlett. It sounds like you got the 0.5 coefficient from elsewhere. I suspect 2 + 2 leads / lags is a lot like 4 lags, if there's enough data that we're not worried about the edge cases. By "promote" I meant list it at the top, where Help can find it, similar to Javadoc or Python docstrings. Basically I was calling it a useful comment, worthy of being more easily accessed.
            – J_H
            Sep 10 '17 at 21:27
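
To make the lag/lead count in this exchange concrete, here is a small base-R sketch. It only counts the shifts implied by the `-k:k` range used in the question's `L(diff(X), -k:k)` term; the variable names are made up for the example and nothing here calls dynlm.

```r
k <- 2                        # 2 lags and 2 leads, as in the comment above
shifts <- -k:k                # the range passed to L(diff(X), -k:k)
n_terms <- length(shifts)     # all differenced regressors, including shift 0
n_lagged <- sum(shifts != 0)  # lags plus leads only
print(c(total = n_terms, lags_and_leads = n_lagged))
```

So k = 2 adds 2k + 1 differenced terms, of which 2k are genuine lags or leads.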















          answered Sep 10 '17 at 15:55









          J_H




















































































