Compute model estimates between an external (exposure or outcome) variable and a network.
Source: R/estimate-links.R
This is the main function that identifies potential links between external factors and the network. There are two functions to estimate and classify links:
- nc_estimate_exposure_links(): Computes the model estimates for the exposure side.
- nc_estimate_outcome_links(): Computes the model estimates for the outcome side.
Usage
nc_estimate_exposure_links(
  data,
  edge_tbl,
  exposure,
  adjustment_vars = NA,
  model_function,
  model_arg_list = NULL,
  exponentiate = FALSE,
  classify_option_list = classify_options()
)

nc_estimate_outcome_links(
  data,
  edge_tbl,
  outcome,
  adjustment_vars = NA,
  model_function,
  model_arg_list = NULL,
  exponentiate = FALSE,
  classify_option_list = classify_options()
)
Arguments
- data
The data.frame or tibble that contains the variables of interest, including the variables used to make the network.
- edge_tbl
Output graph object from nc_estimate_network(), converted to an edge table using as_edge_tbl().
- exposure, outcome
Character. The exposure or outcome variable of interest.
- adjustment_vars
Optional. Variables to adjust for in the models.
- model_function
A function for the model to use (e.g. stats::lm(), stats::glm(), survival::coxph()). Can be any model as long as the function has the arguments formula and data. Type in the model function as a bare object (without (), for instance as lm).
- model_arg_list
Optional. A list containing the named arguments that will be passed to the model function. A simple example would be list(family = binomial(link = "logit")) to specify that the glm model is a logistic model and not a linear one. See the examples for more on the usage.
- exponentiate
Logical. Whether to exponentiate the log estimates, as computed with e.g. logistic regression models.
- classify_option_list
A list with classification options for direct, ambiguous, or no effects. Create it with the classify_options() function, which has the arguments:
  - single_metabolite_threshold: Default of 0.05. P-values from models with only the index metabolite (no neighbour adjustment) are classified as effects if below this threshold. For larger sample sizes and networks, we recommend lowering the threshold to reduce the risk of false positives.
  - network_threshold: Default of 0.1. P-values from any models that have direct neighbour adjustments are classified as effects if below this threshold. This is assumed to be a one-sided p-value threshold. Like the threshold above, a lower value should be used for larger sample sizes and networks.
  - direct_effect_adjustment: Default is NA. After running the algorithm once, it is sometimes useful to adjust for the direct effects identified to confirm whether other links exist.
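The model_function and model_arg_list arguments described above can be pictured with a small base R sketch. It is an illustration only, not NetCoupler's internal code: fit_model() is a hypothetical helper, and the toy data and variable names are invented for the example. It shows how a bare model function that accepts formula and data can be combined with a named list of extra arguments, and what exponentiating a logistic estimate yields.

```r
# Hypothetical helper (not part of NetCoupler): combine a bare model function
# with a named list of extra arguments into a single call.
fit_model <- function(data, formula, model_function, model_arg_list = NULL) {
  args <- c(list(formula = formula, data = data), model_arg_list)
  do.call(model_function, args)
}

# Invented toy data: one continuous predictor and one binary response.
set.seed(1)
toy <- data.frame(x = rnorm(200))
toy$y <- rbinom(200, size = 1, prob = plogis(0.5 * toy$x))

# The model function is passed as a bare object, without parentheses.
linear_fit <- fit_model(toy, y ~ x, lm)

# The extra arguments turn glm into a logistic model rather than a linear one.
logistic_fit <- fit_model(
  toy, y ~ x, glm,
  model_arg_list = list(family = binomial(link = "logit"))
)

# Exponentiating a log-odds estimate gives an odds ratio, which is what
# exponentiate = TRUE is meant for with this kind of model.
log_odds <- coef(logistic_fit)[["x"]]
odds_ratio <- exp(log_odds)
```

Passing the function itself rather than a call is what lets the same interface serve lm, glm, or any other function with formula and data arguments.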
Value
Outputs a tibble that contains the model estimates from either the exposure or outcome side of the network as well as the effect classification. Each row represents the "no neighbour node adjusted" model and has the results for the outcome/exposure to index node pathway. Columns of the output are:
- outcome or exposure: The name of the variable used as the external variable.
- index_node: The name of the metabolite used as the index node from the network. In combination with the outcome/exposure variable, it identifies the individual model used for the classification.
- estimate: The estimate from the outcome/exposure and index node model.
- std_error: The standard error from the outcome/exposure and index node model.
- fdr_p_value: The False Discovery Rate-adjusted p-value from the outcome/exposure and index node model.
- effect: The NetCoupler classified effect between the index node and the outcome/exposure. Effects are classified as "direct" (there is a probable link based on the given thresholds), "ambiguous" (there is a potential link but not all thresholds were passed), and "none" (no potential link seen).
The tibble output also has an attribute that contains all the models generated before classification. Access it with attr(output, "all_models_df").
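As a rough sketch of working with the returned object, the base R snippet below builds a small stand-in data frame with the documented columns (three rows copied from the example output on this page), keeps only the links classified as "direct", and reads the attached attribute. The content assigned to "all_models_df" here is a made-up placeholder; the real attribute holds the models produced by the function, and its columns are not described in this section.

```r
# Stand-in for the returned tibble, using the documented column names.
results <- data.frame(
  exposure = "exposure",
  index_node = c("metabolite_1", "metabolite_11", "metabolite_12"),
  estimate = c(0.173, 0.0543, 0.0242),
  std_error = c(0.0228, 0.0232, 0.0231),
  fdr_p_value = c(0, 0.0409, 0.380),
  effect = c("direct", "ambiguous", "none")
)

# Placeholder attribute so the snippet runs without NetCoupler installed;
# the actual structure of all_models_df is an assumption here.
attr(results, "all_models_df") <- data.frame(
  index_node = "metabolite_1",
  note = "placeholder row"
)

# Keep only the links classified as direct effects.
direct_links <- results[results$effect == "direct", ]

# Retrieve every model generated before classification.
all_models <- attr(results, "all_models_df")
```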
See also
The vignette("examples") article has more details on how to use NetCoupler with different models.
Examples
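library(NetCoupler)

# Standardize the metabolite variables used on the modelling side
standardized_data <- simulated_data %>%
  nc_standardize(starts_with("metabolite"))

# Estimate the metabolite network from data standardized and regressed on age
metabolite_network <- simulated_data %>%
  nc_standardize(starts_with("metabolite"),
                 regressed_on = "age") %>%
  nc_estimate_network(starts_with("metabolite"))

# Convert the network into the edge table expected by the link functions
edge_table <- as_edge_tbl(metabolite_network)

# Estimate and classify exposure-side links with linear models
results <- standardized_data %>%
  nc_estimate_exposure_links(
    edge_tbl = edge_table,
    exposure = "exposure",
    model_function = lm
  )

results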
#> # A tibble: 12 × 6
#> exposure index_node estimate std_error fdr_p_value effect
#> * <chr> <chr> <dbl> <dbl> <dbl> <chr>
#> 1 exposure metabolite_1 0.173 0.0228 0 direct
#> 2 exposure metabolite_10 0.318 0.0219 0 direct
#> 3 exposure metabolite_11 0.0543 0.0232 0.0409 ambiguous
#> 4 exposure metabolite_12 0.0242 0.0231 0.380 none
#> 5 exposure metabolite_2 -0.0430 0.0231 0.106 ambiguous
#> 6 exposure metabolite_3 0.0411 0.0231 0.123 ambiguous
#> 7 exposure metabolite_4 0.00344 0.0232 0.920 none
#> 8 exposure metabolite_5 0.0479 0.0232 0.0717 ambiguous
#> 9 exposure metabolite_6 -0.0189 0.0230 0.506 none
#> 10 exposure metabolite_7 -0.162 0.0229 0 direct
#> 11 exposure metabolite_8 -0.355 0.0216 0 direct
#> 12 exposure metabolite_9 0.0571 0.0230 0.0292 ambiguous
# Get results of all models used prior to classification