This article lists models that you can use with NetCoupler, along with some published real-world examples of its use.
There are some caveats to most of these examples, except for when a section explicitly indicates otherwise:
Across the different models, there are some general features, which are described in more depth in the Getting Started article.
First load up the packages.
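A minimal setup sketch, assuming the examples in this article rely on NetCoupler for the nc_*() functions and the simulated_data dataset, and on dplyr for the pipe, mutate(), and starts_with():

```r
# Assumed setup: NetCoupler supplies the nc_*() functions and simulated_data;
# dplyr supplies %>%, mutate(), and starts_with().
library(NetCoupler)
library(dplyr)
```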
Pre-process the data by standardizing the variables, since large differences in the values of variables can affect the results of the algorithm.
standardized_data <- simulated_data %>% nc_standardize(starts_with("metabolite"))
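To illustrate why scale matters, here is a small base R sketch of z-scoring on hypothetical toy data. It shows only the general idea of putting variables on a common scale; nc_standardize() may apply additional transformations, so this is not a description of its internals.

```r
# Hypothetical toy data: two variables on very different scales.
set.seed(1)
toy <- data.frame(
  metabolite_1 = rnorm(50, mean = 1000, sd = 200),
  metabolite_2 = rnorm(50, mean = 0.5, sd = 0.1)
)

# z-score each column: subtract the mean, then divide by the standard deviation.
toy_scaled <- as.data.frame(scale(toy))

# After scaling, each column has a mean of (numerically) zero and a
# standard deviation of one, so no variable dominates because of its units.
colMeans(toy_scaled)
apply(toy_scaled, 2, sd)
```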
Estimate the network structure to identify links between the metabolic variables. This network needs to be converted into an edge table that has two columns, one for the source_node and another for the node it is linked to.
metabolite_network <- simulated_data %>%
  nc_standardize(starts_with("metabolite")) %>%
  nc_estimate_network(starts_with("metabolite"))
edge_table <- as_edge_tbl(metabolite_network)
edge_table
If you adjust for confounders in the main models, they also need to be accounted for when estimating the network. To do this, regress the metabolic variables on the confounders during standardization by using the regressed_on argument.
metabolite_network <- simulated_data %>% nc_standardize(starts_with("metabolite"), regressed_on = "age") %>% nc_estimate_network(starts_with("metabolite"))
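The idea of regressing a variable on a confounder can be sketched in base R with hypothetical toy data: fit a linear model and keep the residuals, which are the part of the variable not linearly explained by the confounder. This is an illustration of the concept only and is not taken from the implementation of nc_standardize().

```r
# Hypothetical toy data in which the metabolite depends on age.
set.seed(2)
toy <- data.frame(age = runif(100, min = 30, max = 70))
toy$metabolite_1 <- 0.05 * toy$age + rnorm(100)

# Regress the metabolite on the confounder, keep the residuals,
# and scale them to mean zero and unit standard deviation.
age_model <- lm(metabolite_1 ~ age, data = toy)
toy$metabolite_1_adj <- as.numeric(scale(residuals(age_model)))

# The raw variable is correlated with age; the adjusted one is not.
cor(toy$metabolite_1, toy$age)
cor(toy$metabolite_1_adj, toy$age)
```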
Let’s revisit this image:
All types of models can be used for either the left-hand side of this graph (the exposure side) or the right-hand side (the outcome side). If we want to estimate the links on the exposure side, where we're interested in how a variable might influence the network, we would use the nc_estimate_exposure_links() function.
standardized_data %>% nc_estimate_exposure_links( edge_tbl = edge_table, exposure = "exposure", model_function = lm )
If we are interested in the outcome side, where we want to know how the network might influence the outcome, we would use the nc_estimate_outcome_links() function.
standardized_data %>% nc_estimate_outcome_links( edge_tbl = edge_table, outcome = "outcome_continuous", model_function = lm )
Linear regression is the easiest and will probably be the most commonly used modeling method when running NetCoupler. Additional arguments can be passed to the lm() (or glm()) function by using the model_arg_list argument.
lm_results <- standardized_data %>% nc_estimate_outcome_links( edge_tbl = edge_table, outcome = "outcome_continuous", model_function = lm )
Probably the second most common model is the classic binary logistic regression. Unlike the linear regression modeling above, we need to use the model_arg_list argument to tell glm() to use the binomial family for model estimation.
glm_bin_results <- standardized_data %>% nc_estimate_outcome_links( edge_tbl = edge_table, outcome = "outcome_binary", model_function = glm, model_arg_list = list(family = binomial), exponentiate = TRUE )
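As a rough sketch of the general pattern behind an argument list like model_arg_list, the base R example below forwards a list of extra arguments to a model function with do.call() and then exponentiates the coefficients to obtain odds ratios. The toy data and the use of do.call() are assumptions made for illustration; this is not a description of how NetCoupler forwards arguments internally.

```r
# Hypothetical toy data with a binary outcome.
set.seed(3)
toy <- data.frame(metabolite_1 = rnorm(200))
toy$outcome_binary <- rbinom(200, size = 1, prob = plogis(0.8 * toy$metabolite_1))

# Extra arguments for the model function, mirroring model_arg_list above.
model_arg_list <- list(family = binomial)

# Combine the fixed arguments with the extra ones and call the model function.
fit <- do.call(
  glm,
  c(list(formula = outcome_binary ~ metabolite_1, data = toy), model_arg_list)
)

# Logistic regression coefficients are log odds ratios, so
# exponentiating them gives odds ratios.
exp(coef(fit))
```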
With Cox models, the response/y variable usually needs to be a survival::Surv() object. While you can use this function directly in the outcome or exposure argument of the nc_estimate_outcome_links() or nc_estimate_exposure_links() functions, we recommend creating the survival object beforehand with mutate() to keep the code and output a bit cleaner.
library(survival)
cox_surv_data <- standardized_data %>%
  mutate(surv_object = Surv(
    time = age,
    time2 = age + outcome_event_time,
    event = outcome_binary
  ))
coxph_results <- cox_surv_data %>%
  nc_estimate_outcome_links(
    edge_tbl = edge_table,
    outcome = "surv_object",
    # Can also use Surv directly.
    # outcome = "Surv(time = time_start, time2 = time_end, event = outcome_binary)",
    model_function = survival::coxph
  )
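The start/stop form of Surv() used above can be tried on its own with the survival package. The sketch below uses hypothetical toy data (only the column names are borrowed from the example) and fits a plain coxph() model outside of NetCoupler, with age as the time scale so that follow-up starts at the age at entry.

```r
library(survival)

# Hypothetical toy data; values are simulated for illustration only.
set.seed(4)
n <- 200
toy <- data.frame(
  age = runif(n, min = 40, max = 60),
  metabolite_1 = rnorm(n),
  outcome_event_time = rexp(n, rate = 0.1),
  outcome_binary = rbinom(n, size = 1, prob = 0.4)
)

# Counting-process (start, stop] form: each person enters at `age`
# and leaves at `age + outcome_event_time`.
toy$surv_object <- with(
  toy,
  Surv(time = age, time2 = age + outcome_event_time, event = outcome_binary)
)

# The pre-built survival object can then be used as the response.
fit <- coxph(surv_object ~ metabolite_1, data = toy)
summary(fit)$coefficients
```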
You might want to add a cluster term to calculate robust standard errors, or to add a strata variable. Add these directly to the adjustment_vars argument.
coxph_results_cluster <- cox_surv_data %>% mutate(age = as.integer(age)) %>% nc_estimate_outcome_links( edge_tbl = edge_table, outcome = "surv_object", adjustment_vars = c("strata(age)", "cluster(id)"), model_function = survival::coxph )
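For reference, this is roughly what strata() and cluster() terms look like in a plain survival::coxph() formula. The toy data, including the age_group and id columns, are hypothetical; the sketch only shows that a cluster() term leads to robust standard errors being reported.

```r
library(survival)

# Hypothetical toy data with two rows per participant.
set.seed(5)
n <- 300
toy <- data.frame(
  id = rep(seq_len(n / 2), each = 2),
  age_group = sample(c("40-49", "50-59", "60-69"), n, replace = TRUE),
  metabolite_1 = rnorm(n),
  time = rexp(n, rate = 0.1),
  event = rbinom(n, size = 1, prob = 0.5)
)

# strata() gives each age group its own baseline hazard;
# cluster() requests robust (sandwich) standard errors grouped by id.
fit <- coxph(
  Surv(time, event) ~ metabolite_1 + strata(age_group) + cluster(id),
  data = toy
)

# With a cluster() term, the coefficient table gains a "robust se" column.
summary(fit)$coefficients
```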
Wittenbecher, C., R. Cuadrat, L. Johnston, F. Eichelmann, S. Jäger, O. Kuxhaus, M. Prada, et al. 2022. “Dihydroceramide- and Ceramide-Profiling Provides Insights into Human Cardiometabolic Disease Etiology.” Nature Communications 13 (1). https://doi.org/10.1038/s41467-022-28496-1.
Wittenbecher, Clemens. 2017. “Linking Whole-Grain Bread, Coffee, and Red Meat to the Risk of Type 2 Diabetes.” Doctoral thesis, Universität Potsdam. https://nbn-resolving.org/urn:nbn:de:kobv:517-opus4-404592.