This article lists the models that you can use in NetCoupler, along with some published real-world examples of its use.
There are some caveats to most of these examples, except for when a section explicitly indicates otherwise:
Across the different models, there are some general features, which are described in more depth in the Getting Started article.
First load up the packages.
Pre-process the data by standardizing the metabolic variables, since large differences in the scale of the variables can affect the results of the algorithm.
standardized_data <- simulated_data %>%
  nc_standardize(starts_with("metabolite"))
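To make the idea of standardizing concrete, here is a minimal base R sketch. It is not NetCoupler's implementation: it assumes that standardizing means log-transforming each variable and then centering and scaling it to unit standard deviation. The toy data and the helper `standardize_column()` are hypothetical.

```r
# Hypothetical toy data: two metabolites measured on very different scales.
set.seed(123)
toy_data <- data.frame(
  metabolite_1 = rlnorm(50, meanlog = 0, sdlog = 0.5),
  metabolite_2 = rlnorm(50, meanlog = 5, sdlog = 0.5)
)

# Hypothetical helper (an assumption, not a NetCoupler function):
# log-transform, then center on the mean and scale to unit SD.
standardize_column <- function(x) {
  as.numeric(scale(log(x)))
}

toy_standardized <- as.data.frame(lapply(toy_data, standardize_column))

# Both columns are now on the same scale: mean close to 0 and SD of 1.
round(colMeans(toy_standardized), 10)
apply(toy_standardized, 2, sd)
```

After this step, neither variable dominates simply because it was measured in larger units.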
Estimate the network structure to identify links between the metabolic variables. This network needs to be converted into an edge table that has two columns, one for the source_node and another for the target_node.
metabolite_network <- simulated_data %>%
  nc_standardize(starts_with("metabolite")) %>%
  nc_estimate_network(starts_with("metabolite"))
edge_table <- as_edge_tbl(metabolite_network)
edge_table
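As a rough illustration of the two-column layout, the sketch below builds a small edge table by hand and looks up the neighbors of one node. The rows and the helper `neighbors_of()` are hypothetical. The sketch makes no assumption about whether the real edge table lists each link once or in both directions, which is why the helper checks both columns.

```r
# Hypothetical edge table: each row is one link in the network.
toy_edge_table <- data.frame(
  source_node = c("metabolite_1", "metabolite_1", "metabolite_2"),
  target_node = c("metabolite_2", "metabolite_3", "metabolite_3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collect every node linked to `node`,
# whichever column `node` appears in.
neighbors_of <- function(edges, node) {
  linked <- c(
    edges$target_node[edges$source_node == node],
    edges$source_node[edges$target_node == node]
  )
  sort(unique(linked))
}

neighbors_of(toy_edge_table, "metabolite_2")
```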
If you adjust for confounders in the main models, they also need to be accounted for when estimating the network. To do this, pass the confounders to the regressed_on argument of nc_standardize(), so that the metabolic variables are regressed on the confounders as part of the standardization.
metabolite_network <- simulated_data %>%
  nc_standardize(starts_with("metabolite"),
                 regressed_on = "age") %>%
  nc_estimate_network(starts_with("metabolite"))
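The idea behind regressing on a confounder can be sketched in base R. This is an illustration under assumptions, not the package's code: it assumes the adjustment amounts to regressing the log-transformed metabolite on the confounder, keeping the residuals, and scaling them. The simulated variables are hypothetical.

```r
# Hypothetical data: a metabolite whose log-level rises with age.
set.seed(42)
n <- 100
age <- runif(n, min = 30, max = 70)
metabolite_1 <- exp(0.03 * age + rnorm(n, sd = 0.3))
toy_data <- data.frame(age, metabolite_1)

# Regress the log-metabolite on the confounder, keep the residuals
# (the variation not explained by age), then scale them.
fit <- lm(log(metabolite_1) ~ age, data = toy_data)
toy_data$metabolite_1_adjusted <- as.numeric(scale(residuals(fit)))

# The raw metabolite correlates with age; the adjusted version does not.
cor(log(toy_data$metabolite_1), toy_data$age)
cor(toy_data$metabolite_1_adjusted, toy_data$age)
```

Because the residuals of a linear regression are uncorrelated with its predictors, links estimated from the adjusted values cannot simply reflect a shared dependence on age.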
Let’s revisit this image:
All types of models can be used for either the left-hand side of this graph (the exposure side) or the right-hand side (the outcome side). If we want to estimate the links on the exposure side, where we're interested in how a variable might influence the network, we would use the nc_estimate_exposure_links() function.
standardized_data %>%
  nc_estimate_exposure_links(
    edge_tbl = edge_table,
    exposure = "exposure",
    model_function = lm
  )
If we are interested in the outcome side, where we want to know how the network might influence the outcome, we would use the nc_estimate_outcome_links() function.
standardized_data %>%
  nc_estimate_outcome_links(
    edge_tbl = edge_table,
    outcome = "outcome_continuous",
    model_function = lm
  )
Linear regression is the easiest and will probably be the most commonly used modeling method when running NetCoupler. Additional arguments can be passed to the lm() (or glm()) function through the model_arg_list argument.
lm_results <- standardized_data %>%
  nc_estimate_outcome_links(
    edge_tbl = edge_table,
    outcome = "outcome_continuous",
    model_function = lm
  )
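To show how a list of extra arguments can reach a model function in general, here is a small base R sketch using do.call(). It is not NetCoupler's internal code, which may forward arguments differently. The wrapper `fit_with_args()` and the toy data are hypothetical.

```r
# Hypothetical toy data with one missing predictor value.
set.seed(1)
toy_data <- data.frame(
  outcome_continuous = rnorm(30),
  metabolite_1 = rnorm(30)
)
toy_data$metabolite_1[5] <- NA

# Hypothetical wrapper: combine the fixed arguments with the optional
# list of extra arguments and call the model function with all of them.
fit_with_args <- function(model_function, formula, data, model_arg_list = NULL) {
  base_args <- list(formula = formula, data = data)
  do.call(model_function, c(base_args, model_arg_list))
}

plain_fit <- fit_with_args(lm, outcome_continuous ~ metabolite_1, toy_data)
padded_fit <- fit_with_args(
  lm, outcome_continuous ~ metabolite_1, toy_data,
  model_arg_list = list(na.action = na.exclude)
)

# With the default na.action the incomplete row is dropped (29 residuals);
# with na.exclude the residuals are padded back to 30 with an NA.
length(residuals(plain_fit))
length(residuals(padded_fit))
```

Any argument that lm() or glm() accepts by name could be supplied in the same way.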
Probably the second most common model is classic binary logistic regression. Unlike the linear regression modeling above, we need to use the model_arg_list argument to tell glm() to use the binomial family for model estimation. Setting exponentiate = TRUE converts the estimates from log-odds to odds ratios.
glm_bin_results <- standardized_data %>%
  nc_estimate_outcome_links(
    edge_tbl = edge_table,
    outcome = "outcome_binary",
    model_function = glm,
    model_arg_list = list(family = binomial),
    exponentiate = TRUE
  )
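For readers less familiar with glm(), the sketch below fits a single logistic model in base R and exponentiates its coefficients by hand. The simulated data are hypothetical, and the sketch only illustrates what the binomial family and exponentiation mean, not how NetCoupler computes its estimates.

```r
# Hypothetical data: a binary outcome whose log-odds rise with a metabolite.
set.seed(2024)
n <- 200
metabolite_1 <- rnorm(n)
outcome_binary <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.8 * metabolite_1))
toy_data <- data.frame(outcome_binary, metabolite_1)

# family = binomial gives logistic regression (logit link by default).
fit <- glm(outcome_binary ~ metabolite_1, data = toy_data, family = binomial)

# Coefficients are on the log-odds scale; exponentiating gives odds ratios.
log_odds <- coef(fit)
odds_ratios <- exp(log_odds)
odds_ratios["metabolite_1"]
```

An odds ratio above 1 means the odds of the outcome increase per unit increase in the (standardized) metabolite, and a value below 1 means they decrease.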
With Cox models, the response (y) variable usually needs to be a survival::Surv() object. While you can use this function directly in the outcome or exposure argument of the nc_estimate_outcome_links() or nc_estimate_exposure_links() functions, we recommend creating the survival object beforehand with mutate() to keep the code and output a bit cleaner.
library(survival)
cox_surv_data <- standardized_data %>%
  mutate(surv_object = Surv(
    time = age,
    time2 = age + outcome_event_time,
    event = outcome_binary
  ))
coxph_results <- cox_surv_data %>%
  nc_estimate_outcome_links(
    edge_tbl = edge_table,
    outcome = "surv_object",
    # Can also use Surv directly.
    # outcome = "Surv(time = age, time2 = age + outcome_event_time, event = outcome_binary)",
    model_function = survival::coxph
  )
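The equivalence between a pre-built survival object and an inline Surv() call can be checked outside NetCoupler. The sketch below uses the heart dataset that ships with the survival package, which has start, stop, and event columns. Using this dataset and these covariates is purely an illustrative choice.

```r
library(survival)

# `heart` has (start, stop] intervals plus an event indicator.
heart_data <- survival::heart

# Build the survival object once and store it as a column.
heart_data$surv_object <- with(
  heart_data,
  Surv(time = start, time2 = stop, event = event)
)

# Fit with the stored object ...
fit_stored <- coxph(surv_object ~ age + surgery, data = heart_data)

# ... and with Surv() written directly in the formula.
fit_inline <- coxph(Surv(start, stop, event) ~ age + surgery, data = heart_data)

# Both approaches give the same coefficients.
all.equal(coef(fit_stored), coef(fit_inline))
```

Supplying both time and time2 creates a counting-process style survival object, the same form that the age and age + outcome_event_time pair produces in the example above.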
You might want to add clustering to calculate robust standard errors, or add a strata variable. Add these terms directly to the adjustment_vars argument.
coxph_results_cluster <- cox_surv_data %>%
  mutate(age = as.integer(age)) %>%
  nc_estimate_outcome_links(
    edge_tbl = edge_table,
    outcome = "surv_object",
    adjustment_vars = c("strata(age)", "cluster(id)"),
    model_function = survival::coxph
  )
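To see what strata() and cluster() terms do once they end up in a model formula, here is a standalone sketch with the survival package's heart dataset. Building the formula with reformulate() is an assumption made for illustration; NetCoupler may construct its formulas differently. Stratifying on surgery is likewise just an example choice.

```r
library(survival)

heart_data <- survival::heart
heart_data$surv_object <- with(heart_data, Surv(start, stop, event))

# Character terms, in the same style as the adjustment_vars argument.
adjustment_vars <- c("strata(surgery)", "cluster(id)")

# Illustrative only: join the predictor and adjustment terms into a formula.
model_formula <- reformulate(c("age", adjustment_vars), response = "surv_object")
model_formula

fit <- coxph(model_formula, data = heart_data)

# strata() gives each stratum its own baseline hazard, so it has no coefficient;
# cluster() adds a robust standard error column to the coefficient table.
summary(fit)$coefficients
```

Only age receives a coefficient here, since the strata and cluster terms change how the model is estimated rather than adding predictors.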
Wittenbecher, C., R. Cuadrat, L. Johnston, F. Eichelmann, S. Jäger, O. Kuxhaus, M. Prada, et al. 2022. “Dihydroceramide- and Ceramide-Profiling Provides Insights into Human Cardiometabolic Disease Etiology.” Nature Communications 13 (1). https://doi.org/10.1038/s41467-022-28496-1.
Wittenbecher, Clemens. 2017. “Linking Whole-Grain Bread, Coffee, and Red Meat to the Risk of Type 2 Diabetes.” Doctoral thesis, Universität Potsdam. https://nbn-resolving.org/urn:nbn:de:kobv:517-opus4-404592.