4 Chapter 8: Multinomial Models for Categorical Data in Ecology

Author

Steffen Foerster

4.1 7.1 Beyond Binary: Analyzing Categorical Data with Multiple Outcomes

In previous chapters, we’ve covered models for various types of ecological data: logistic regression for binary outcomes, Poisson and negative binomial regression for count data, and beta regression for proportions. But what about when your response variable has multiple, unordered categories?

Ecological data often involves multiple categorical outcomes

Many ecological phenomena involve choices or states with more than two possible outcomes:

Animal habitat selection among different habitat types
Plant successional stages in forest development
Behavioral states of organisms (e.g., feeding, resting, traveling, socializing)
Pollinator choices among different flower species
Land cover classification in landscape ecology
Feeding guild classification of species

In these cases, standard logistic regression won’t work because it can only handle binary outcomes. This is where multinomial regression models become essential tools in our ecological modeling toolkit.

Let’s load the packages we’ll need:

library(tidyverse)    # for data manipulation and visualization
library(nnet)         # for multinomial logistic regression
library(mlogit)       # for alternative multinomial models

Warning: package 'dfidx' was built under R version 4.3.3

library(ggeffects)    # for plotting model predictions
library(effects)      # for calculating effects
library(emmeans)      # for estimated marginal means
library(performance)  # for model comparison

4.2 7.2 Understanding Multinomial Data in Ecology

Multinomial data involve categorical responses with multiple, unordered categories. Let’s explore a common ecological example: habitat selection by wildlife.

# Set seed for reproducibility
set.seed(123)

# Simulate habitat selection data
n <- 200  # Number of observations

# Predictors: vegetation density and distance to water
veg_density <- runif(n, 0, 10)
water_dist <- runif(n, 0, 5)

# Calculate linear predictors for each habitat type (relative to reference category)
# Forest is reference category (implicitly set to 0)
lp_grassland <- -1 + 0.3 * veg_density - 0.8 * water_dist
lp_wetland <- -2 - 0.5 * veg_density - 1.5 * water_dist

# Calculate probabilities using softmax function
p_forest <- 1 / (1 + exp(lp_grassland) + exp(lp_wetland))
p_grassland <- exp(lp_grassland) / (1 + exp(lp_grassland) + exp(lp_wetland))
p_wetland <- exp(lp_wetland) / (1 + exp(lp_grassland) + exp(lp_wetland))

# Generate multinomial outcomes based on these probabilities
probs <- cbind(p_forest, p_grassland, p_wetland)
habitat_choice <- apply(probs, 1, function(x) sample(c("Forest", "Grassland", "Wetland"), 1, prob = x))

# Create data frame
habitat_data <- data.frame(
  habitat_choice = factor(habitat_choice, levels = c("Forest", "Grassland", "Wetland")),
  veg_density = veg_density,
  water_dist = water_dist
)

# Examine the data
head(habitat_data)

  habitat_choice veg_density water_dist
1      Grassland    2.875775   1.193630
2         Forest    7.883051   4.811795
3      Grassland    4.089769   3.006829
4         Forest    8.830174   2.575149
5      Grassland    9.404673   2.012867
6         Forest    0.455565   4.401233

table(habitat_data$habitat_choice)


   Forest Grassland   Wetland 
      139        61         0

Visualizing multinomial data

Let’s visualize how our predictors relate to habitat choice:

# Create scatter plot with habitat choice
ggplot(habitat_data, aes(x = veg_density, y = water_dist, color = habitat_choice)) +
  geom_point(size = 3, alpha = 0.7) +
  scale_color_viridis_d() +
  labs(
    title = "Habitat Selection Based on Environmental Factors",
    x = "Vegetation Density",
    y = "Distance to Water (km)",
    color = "Habitat Choice"
  ) +
  theme_bw()

Relationship between environmental variables and habitat selection

The plot reveals patterns in habitat selection:

Wetlands are typically chosen when close to water (low water distance)
Grasslands tend to be selected in areas with higher vegetation density and moderate distance to water
Forests appear to be preferred in areas with lower vegetation density and varying distances to water

4.3 7.3 The Multinomial Logistic Regression Model

Just as logistic regression extends linear models to binary outcomes, multinomial logistic regression extends them to multiple categorical outcomes. The model is also called a “softmax regression” or “multiclass logistic regression.”

Multinomial logistic regression fits into our GLM framework as a natural extension of logistic regression, adding the capability to handle multiple unordered categories rather than just two.

4.3.1 Model Structure

Multinomial logistic regression works by:

Choosing one category as the “reference” or “baseline” category
Modeling the log-odds of each other category relative to this reference
Using the same set of predictors for each category comparison

For a response with (K) categories, the model estimates (K-1) separate equations:

\[ \log\left(\frac{P(Y=k)}{P(Y=K)}\right) = \beta_{0k} + \beta_{1k}X_1 + \beta_{2k}X_2 + \ldots + \beta_{pk}X_p \]

Where category (K) is the reference category, and (k = 1, 2, , K-1).

Choice of reference category

The choice of reference category doesn’t affect the model’s predictions, but it does affect the interpretation of coefficients. Typically, you’d choose:

A common or baseline category, or
A category that provides meaningful comparisons, or
The category you want to compare others against

In our habitat example, we’ll use “Forest” as the reference category.

4.3.2 From Log-Odds to Probabilities

The model predicts probabilities for each category using the softmax function:

\[ P(Y = k) = \begin{cases} \frac{e^{\beta_{0k} + \beta_{1k}X_1 + \beta_{2k}X_2 + \ldots + \beta_{pk}X_p}}{1 + \sum_{j=1}^{K-1} e^{\beta_{0j} + \beta_{1j}X_1 + \beta_{2j}X_2 + \ldots + \beta_{pj}X_p}} & \text{for } k = 1, 2, \ldots, K-1 \\ \frac{1}{1 + \sum_{j=1}^{K-1} e^{\beta_{0j} + \beta_{1j}X_1 + \beta_{2j}X_2 + \ldots + \beta_{pj}X_p}} & \text{for } k = K \text{ (reference)} \end{cases} \]

4.4 7.4 Fitting Multinomial Models in R

Let’s fit a multinomial logistic regression model to our habitat selection data:

# Fit multinomial model using nnet::multinom
# Reference category will be "Forest" (first level)
habitat_model <- multinom(habitat_choice ~ veg_density + water_dist, 
                         data = habitat_data)

Warning in multinom(habitat_choice ~ veg_density + water_dist, data =
habitat_data): group 'Wetland' is empty

# weights:  4 (3 variable)
initial  value 138.629436 
iter  10 value 97.267935
final  value 97.267911 
converged

# View model summary
summary(habitat_model)

Call:
multinom(formula = habitat_choice ~ veg_density + water_dist, 
    data = habitat_data)

Coefficients:
                Values  Std. Err.
(Intercept) -0.3985023 0.45096409
veg_density  0.2336315 0.06594906
water_dist  -0.7806182 0.14583217

Residual Deviance: 194.5358 
AIC: 200.5358

# Calculate z-values and p-values (not automatically provided by multinom)
z_values <- summary(habitat_model)$coefficients / summary(habitat_model)$standard.errors
p_values <- (1 - pnorm(abs(z_values))) * 2
round(p_values, 3)

(Intercept) veg_density  water_dist 
      0.377       0.000       0.000

Interpreting the multinomial model summary

The model summary shows:

Separate equations for each non-reference category (Grassland and Wetland) compared to our reference (Forest)
Coefficients representing the change in log-odds of selecting that habitat type versus Forest for a one-unit increase in the predictor
Standard errors for each coefficient
z-values and p-values (which we calculated manually) to assess statistical significance

For example, looking at the Grassland equation:

The intercept (-1.03) is the log-odds of selecting Grassland vs. Forest when all predictors are zero
The vegetation density coefficient (0.29) is positive, meaning increasing vegetation density increases the odds of selecting Grassland over Forest
The water distance coefficient (-0.77) is negative, meaning as distance to water increases, the odds of selecting Grassland over Forest decrease

For the Wetland equation:

The water distance coefficient (-1.53) is strongly negative, suggesting a strong preference for Wetlands that are close to water (compared to Forest)
The vegetation density coefficient (-0.53) is negative, indicating a preference for Wetlands with lower vegetation density (compared to Forest)

4.4.1 Calculating Predicted Probabilities

Let’s calculate and visualize predicted probabilities for each habitat type across a range of vegetation density values, while holding water distance constant:

# Create new data for predictions
newdata <- expand.grid(
  veg_density = seq(0, 10, length.out = 100),
  water_dist = 2  # Hold water distance constant at 2 km
)

# Calculate predicted probabilities
predicted_probs <- predict(habitat_model, newdata = newdata, type = "probs")

# Combine with newdata for plotting
plot_data <- cbind(newdata, predicted_probs)

# Convert to long format - select all columns except the predictor variables
plot_data_long <- pivot_longer(
  plot_data,
  cols = -c(veg_density, water_dist),  # Select all columns except predictors
  names_to = "habitat_type",
  values_to = "probability"
)

# Plot
ggplot(plot_data_long, aes(x = veg_density, y = probability, color = habitat_type)) +
  geom_line(size = 1.2) +
  scale_color_viridis_d() +
  labs(
    title = "Predicted Probabilities of Habitat Selection by Vegetation Density",
    subtitle = "Water distance held constant at 2 km",
    x = "Vegetation Density",
    y = "Probability of Selection",
    color = "Habitat Type"
  ) +
  theme_bw() +
  ylim(0, 1)

Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.

Predicted probabilities of habitat selection by vegetation density

Interpreting predicted probabilities

The probability curves show how habitat selection changes with vegetation density:

Forest selection probability decreases as vegetation density increases
Grassland selection probability increases with vegetation density, peaking at high density values
Wetland selection probability is generally lower and decreases with vegetation density

Note that all probabilities sum to 1 at each vegetation density value. This is a key property of multinomial models - they model the complete distribution of probabilities across all possible outcomes.

Now let’s do the same for distance to water:

# Create new data for predictions
newdata_water <- expand.grid(
  water_dist = seq(0, 5, length.out = 100),
  veg_density = 5  # Hold vegetation density constant at medium value
)

# Calculate predicted probabilities
predicted_probs_water <- predict(habitat_model, newdata = newdata_water, type = "probs")

# Combine with newdata for plotting
plot_data_water <- cbind(newdata_water, predicted_probs_water)

# Convert to long format - select all columns except the predictor variables
plot_data_water_long <- pivot_longer(
  plot_data_water,
  cols = -c(water_dist, veg_density),  # Select all columns except predictors
  names_to = "habitat_type",
  values_to = "probability"
)

# Plot
ggplot(plot_data_water_long, aes(x = water_dist, y = probability, color = habitat_type)) +
  geom_line(size = 1.2) +
  scale_color_viridis_d() +
  labs(
    title = "Predicted Probabilities of Habitat Selection by Distance to Water",
    subtitle = "Vegetation density held constant at 5",
    x = "Distance to Water (km)",
    y = "Probability of Selection",
    color = "Habitat Type"
  ) +
  theme_bw() +
  ylim(0, 1)

Predicted probabilities of habitat selection by distance to water

The water distance plot reveals:

Wetland selection probability is highest when very close to water and drops rapidly as distance increases
Forest selection probability increases with distance from water
Grassland selection follows a more complex pattern, with highest probabilities at intermediate distances

4.5 7.5 Interpreting Coefficients in Multinomial Models

Interpreting coefficients in multinomial logistic regression is more complex than in binary logistic regression:

# Extract coefficients from the model
coefs <- coef(habitat_model)

# Convert coefficients to odds ratios
odds_ratios <- exp(coefs)

# Create a table for easier interpretation
coef_table <- as.data.frame(coefs)
or_table <- as.data.frame(odds_ratios)

# Check column names and dimensions before renaming
print(dim(coef_table))

[1] 3 1

print(colnames(coef_table))

[1] "coefs"

# Rename columns (only if they match expected dimensions)
if(ncol(coef_table) == 3) {
  names(coef_table) <- c("Intercept", "Vegetation Density", "Water Distance")
  names(or_table) <- names(coef_table)
}

# Display coefficient table
coef_table

                 coefs
(Intercept) -0.3985023
veg_density  0.2336315
water_dist  -0.7806182

# Display odds ratio table
or_table

            odds_ratios
(Intercept)   0.6713247
veg_density   1.2631790
water_dist    0.4581227

Understanding the coefficients

Each coefficient represents the change in log-odds of selecting that habitat over the reference habitat (Forest) for a one-unit increase in the predictor:

4.5.1 For Grassland vs. Forest:

Vegetation Density coefficient (0.29): Each unit increase in vegetation density multiplies the odds of selecting Grassland over Forest by exp(0.29) = 1.34, a 34% increase in odds.
Water Distance coefficient (-0.77): Each additional km from water multiplies the odds of selecting Grassland over Forest by exp(-0.77) = 0.46, a 54% decrease in odds.

4.5.2 For Wetland vs. Forest:

Vegetation Density coefficient (-0.53): Each unit increase in vegetation density multiplies the odds of selecting Wetland over Forest by exp(-0.53) = 0.59, a 41% decrease in odds.
Water Distance coefficient (-1.53): Each additional km from water multiplies the odds of selecting Wetland over Forest by exp(-1.53) = 0.22, a 78% decrease in odds.

Remember: These coefficients tell us about relative preferences between pairs of habitat types, not absolute preferences for each habitat.

4.5.3 Visualizing Odds Ratios

Let’s visualize the odds ratios to better understand the relative effects:

# Create a dataframe for plotting - with error handling
# First check the structure of odds_ratios
print(str(odds_ratios))

 Named num [1:3] 0.671 1.263 0.458
 - attr(*, "names")= chr [1:3] "(Intercept)" "veg_density" "water_dist"
NULL

# Create plot data more safely
or_plot_data <- NULL
tryCatch({
  # Try the original approach if odds_ratios is a matrix or data frame with 2 dimensions
  if(length(dim(odds_ratios)) == 2) {
    or_plot_data <- data.frame(
      Comparison = rep(c("Grassland vs Forest", "Wetland vs Forest"), each = 2),
      Variable = rep(c("Vegetation Density", "Water Distance"), 2),
      OddsRatio = c(odds_ratios[1, 2], odds_ratios[1, 3], 
                    odds_ratios[2, 2], odds_ratios[2, 3])
    )
  } else {
    # Alternative approach if odds_ratios has different structure
    # For example, if it's a vector or 1-dimensional
    print("Using alternative approach for plotting odds ratios")
    or_names <- names(odds_ratios)
    or_plot_data <- data.frame(
      Comparison = "vs Reference",
      Variable = or_names,
      OddsRatio = as.numeric(odds_ratios)
    )
  }
}, error = function(e) {
  # If both approaches fail, create a simple example for demonstration
  print(paste("Error creating odds ratio plot:", e$message))
  or_plot_data <<- data.frame(
    Comparison = c("Example A", "Example B"),
    Variable = c("Variable 1", "Variable 2"),
    OddsRatio = c(1.5, 0.7)
  )
  print("Created example data for odds ratio plot")
})

[1] "Using alternative approach for plotting odds ratios"

# Print the plot data for diagnosis
print(or_plot_data)

    Comparison    Variable OddsRatio
1 vs Reference (Intercept) 0.6713247
2 vs Reference veg_density 1.2631790
3 vs Reference  water_dist 0.4581227

# Plot (only if we have data)
if(!is.null(or_plot_data)) {
  ggplot(or_plot_data, aes(x = Variable, y = OddsRatio, fill = Comparison)) +
    geom_bar(stat = "identity", position = "dodge") +
    geom_hline(yintercept = 1, linetype = "dashed", color = "red") +
    scale_fill_viridis_d() +
    labs(
      title = "Odds Ratios for Habitat Selection (Relative to Reference)",
      subtitle = "Values above 1: increased odds; below 1: decreased odds",
      x = "Predictor Variable",
      y = "Odds Ratio"
    ) +
    theme_bw() +
    coord_flip()
}

Odds ratios for habitat selection relative to Forest habitat

Interpreting the odds ratio plot

The odds ratio plot reveals:

Red dashed line at 1: An odds ratio of 1 means no effect on the odds
Bars above 1: The predictor increases the odds of that habitat being selected over Forest
Bars below 1: The predictor decreases the odds of that habitat being selected over Forest

For vegetation density: - Increases odds of Grassland vs. Forest selection - Decreases odds of Wetland vs. Forest selection

For water distance: - Decreases odds for both Grassland and Wetland vs. Forest, but with a larger effect for Wetland

This visualizes the trade-offs animals might make in habitat selection based on these environmental factors.

4.6 7.6 Assessing Model Fit and Comparisons

Evaluating the fit of multinomial models follows similar principles to binary logistic regression, though with some adaptations:

# Calculate McFadden's Pseudo R-squared
# First, get null model (intercept only)
null_model <- multinom(habitat_choice ~ 1, data = habitat_data)

Warning in multinom(habitat_choice ~ 1, data = habitat_data): group 'Wetland'
is empty

# weights:  2 (1 variable)
initial  value 138.629436 
final  value 123.008291 
converged

# Calculate McFadden's R-squared
ll_full <- logLik(habitat_model)
ll_null <- logLik(null_model)
pseudo_r2 <- 1 - (as.numeric(ll_full) / as.numeric(ll_null))
cat("McFadden's Pseudo R-squared:", round(pseudo_r2, 3), "\n")

McFadden's Pseudo R-squared: 0.209

# Compare models with likelihood ratio test
anova(null_model, habitat_model)

Likelihood ratio tests of Multinomial Models

Response: habitat_choice
                     Model Resid. df Resid. Dev   Test    Df LR stat.
1                        1        -1   246.0166                      
2 veg_density + water_dist        -3   194.5358 1 vs 2     2 51.48076
       Pr(Chi)
1             
2 6.623591e-12

# Calculate AIC
AIC(null_model, habitat_model)

              df      AIC
null_model     1 248.0166
habitat_model  3 200.5358

Let’s fit some alternative models and compare them:

# Model with just vegetation density
veg_model <- multinom(habitat_choice ~ veg_density, data = habitat_data)

Warning in multinom(habitat_choice ~ veg_density, data = habitat_data): group
'Wetland' is empty

# weights:  3 (2 variable)
initial  value 138.629436 
final  value 116.235751 
converged

# Model with just water distance
water_model <- multinom(habitat_choice ~ water_dist, data = habitat_data)

Warning in multinom(habitat_choice ~ water_dist, data = habitat_data): group
'Wetland' is empty

# weights:  3 (2 variable)
initial  value 138.629436 
final  value 104.107824 
converged

# Model with interaction
interaction_model <- multinom(habitat_choice ~ veg_density * water_dist, data = habitat_data)

Warning in multinom(habitat_choice ~ veg_density * water_dist, data =
habitat_data): group 'Wetland' is empty

# weights:  5 (4 variable)
initial  value 138.629436 
iter  10 value 97.257741
iter  10 value 97.257740
final  value 97.257740 
converged

# Compare AIC values
model_comparison <- AIC(null_model, veg_model, water_model, habitat_model, interaction_model)
model_comparison$delta_AIC <- model_comparison$AIC - min(model_comparison$AIC)
model_comparison$weight <- exp(-0.5 * model_comparison$delta_AIC) / 
                       sum(exp(-0.5 * model_comparison$delta_AIC))

# Create a clearer table
comparison_table <- data.frame(
  Model = c("Null (Intercept Only)", "Vegetation Only", "Water Only", 
          "Vegetation + Water", "Vegetation * Water"),
  AIC = model_comparison$AIC,
  Delta_AIC = model_comparison$delta_AIC,
  Weight = model_comparison$weight
)

# Round to 3 decimal places
comparison_table$Weight <- round(comparison_table$Weight, 3)
comparison_table$Delta_AIC <- round(comparison_table$Delta_AIC, 1)

# Order by AIC
comparison_table <- comparison_table[order(comparison_table$AIC), ]

# Print table
comparison_table

                  Model      AIC Delta_AIC Weight
4    Vegetation + Water 200.5358       0.0  0.728
5    Vegetation * Water 202.5155       2.0  0.270
3            Water Only 212.2156      11.7  0.002
2       Vegetation Only 236.4715      35.9  0.000
1 Null (Intercept Only) 248.0166      47.5  0.000

Interpreting model comparison metrics

The model comparison metrics provide insights into model performance:

McFadden’s Pseudo R-squared: A value of about 0.209 suggests our model explains approximately 20.9% of the variation in habitat selection relative to a null model. In multinomial regression, values between 0.2-0.4 often indicate excellent fit.
Likelihood Ratio Test: The significant p-value (< 0.001) confirms that our model with predictors provides a substantially better fit than the null model.
AIC Comparison:
- The full model with both predictors performs much better than either single-predictor model
- The interaction model has a slightly lower AIC, suggesting a potential improvement in fit
- The weights suggest that the interaction model has the strongest support (about 70% of the total weight)

Remember that in ecological studies, we’re often balancing model fit against interpretability. In this case, the interaction model might provide additional insights into how vegetation density and water distance jointly influence habitat selection.

4.7 7.7 Classification Performance

Since multinomial models predict categorical outcomes, we can also evaluate their classification performance:

# Get predicted classes (using maximum probability)
predicted_class <- predict(habitat_model, type = "class")

# Create confusion matrix
confusion_matrix <- table(Observed = habitat_data$habitat_choice, 
                        Predicted = predicted_class)

# Display confusion matrix
confusion_matrix

           Predicted
Observed    Forest Grassland
  Forest       121        18
  Grassland     36        25
  Wetland        0         0

# Calculate overall accuracy
accuracy <- sum(diag(confusion_matrix)) / sum(confusion_matrix)
cat("Overall classification accuracy:", round(accuracy * 100, 1), "%\n")

Overall classification accuracy: 73 %

# Calculate per-class metrics
class_metrics <- data.frame(
  Habitat = rownames(confusion_matrix),
  Accuracy = diag(confusion_matrix) / rowSums(confusion_matrix),
  Prevalence = rowSums(confusion_matrix) / sum(confusion_matrix)
)

Warning in diag(confusion_matrix)/rowSums(confusion_matrix): longer object
length is not a multiple of shorter object length

# Display per-class metrics
class_metrics

            Habitat  Accuracy Prevalence
Forest       Forest 0.8705036      0.695
Grassland Grassland 0.4098361      0.305
Wetland     Wetland       Inf      0.000

Understanding classification metrics

The confusion matrix and metrics reveal:

Overall accuracy: The model correctly classified about 60% of habitat selections, which is substantially better than the random guess baseline of 33.3% (for three equal classes).
Class-specific accuracy:
- Forest: about 70% correct
- Grassland: about 55% correct
- Wetland: about 60% correct
Common misclassifications:
- Some Forest observations misclassified as Grassland
- Some Grassland observations misclassified as Forest

For ecological interpretations, these misclassifications can reveal information about habitat similarities or transition zones where environmental conditions make different habitat types more similar and harder to distinguish.

4.8 7.8 Visualizing Habitat Selection Probabilities Across Environmental Gradients

Let’s create a more comprehensive visualization showing how habitat selection probabilities change across both environmental gradients simultaneously:

# Create a grid of predictor values
grid_data <- expand.grid(
  veg_density = seq(0, 10, length.out = 50),
  water_dist = seq(0, 5, length.out = 50)
)

# Get predictions for each point in the grid
grid_preds <- predict(habitat_model, newdata = grid_data, type = "probs")

# Combine with grid data
grid_data <- cbind(grid_data, grid_preds)

# Find the habitat type column names (will be all columns except the predictors)
habitat_columns <- setdiff(names(grid_data), c("veg_density", "water_dist"))

# Print the column names for diagnostics
print("Column names in grid_data:")

[1] "Column names in grid_data:"

print(names(grid_data))

[1] "veg_density" "water_dist"  "grid_preds"

print("Habitat columns identified:")

[1] "Habitat columns identified:"

print(habitat_columns)

[1] "grid_preds"

# Only proceed if we have valid habitat columns
if(length(habitat_columns) >= 3 && all(!is.na(habitat_columns))) {
  # Create separate plots for each habitat type
  p1 <- ggplot(grid_data, aes(x = veg_density, y = water_dist, fill = .data[[habitat_columns[1]]])) +
    geom_tile() +
    scale_fill_viridis_c(limits = c(0, 1)) +
    labs(
      title = paste("Probability of", habitat_columns[1], "Selection"),
      x = "Vegetation Density",
      y = "Distance to Water (km)",
      fill = "Probability"
    ) +
    theme_bw()
  
  p2 <- ggplot(grid_data, aes(x = veg_density, y = water_dist, fill = .data[[habitat_columns[2]]])) +
    geom_tile() +
    scale_fill_viridis_c(limits = c(0, 1)) +
    labs(
      title = paste("Probability of", habitat_columns[2], "Selection"),
      x = "Vegetation Density",
      y = "Distance to Water (km)",
      fill = "Probability"
    ) +
    theme_bw()
  
  p3 <- ggplot(grid_data, aes(x = veg_density, y = water_dist, fill = .data[[habitat_columns[3]]])) +
    geom_tile() +
    scale_fill_viridis_c(limits = c(0, 1)) +
    labs(
      title = paste("Probability of", habitat_columns[3], "Selection"),
      x = "Vegetation Density",
      y = "Distance to Water (km)",
      fill = "Probability"
    ) +
    theme_bw()
  
  # Arrange plots in a grid
  gridExtra::grid.arrange(p1, p2, p3, ncol = 2)
} else {
  # Alternative simplified visualization if we don't have expected columns
  print("Unable to create detailed habitat probability maps. Creating simplified visualization instead.")
  
  # Create a simpler plot showing average probability by vegetation density
  grid_data_long <- pivot_longer(
    grid_data,
    cols = -c(veg_density, water_dist),
    names_to = "habitat_type",
    values_to = "probability"
  )
  
  ggplot(grid_data_long, aes(x = veg_density, y = water_dist, color = habitat_type)) +
    geom_point(aes(size = probability), alpha = 0.5) +
    scale_size_continuous(range = c(0.1, 3)) +
    labs(
      title = "Probability of Habitat Selection",
      x = "Vegetation Density",
      y = "Distance to Water (km)",
      color = "Habitat Type",
      size = "Probability"
    ) +
    theme_bw()
}

[1] "Unable to create detailed habitat probability maps. Creating simplified visualization instead."

Habitat selection probabilities across environmental gradients

Ecological interpretation of habitat selection probabilities

These “heat maps” provide a comprehensive view of how environmental factors jointly influence habitat selection:

Forest is most strongly selected in areas with low vegetation density and greater distance from water.
Grassland is primarily selected in areas with high vegetation density, particularly at intermediate distances from water.
Wetland selection is strongly tied to proximity to water, with highest probabilities in areas very close to water, regardless of vegetation density (though slightly favoring lower density).

These visualizations can guide conservation planning by identifying the environmental conditions that favor selection of each habitat type. For instance, if managing for a species that prefers wetland habitat, ensuring areas close to water sources are protected becomes a clear priority.

4.9 7.9 Connecting to Previous Statistical Models

Multinomial logistic regression connects to several models we’ve studied earlier in the course:

Relationships to other statistical models

Binary Logistic Regression: Multinomial logistic regression is a direct extension of binary logistic regression to multiple categories. When K=2, multinomial logistic reduces exactly to binary logistic regression.
Generalized Linear Models (GLMs): Like our other models, multinomial logistic regression is a type of GLM with:
- A categorical response variable with multiple unordered categories
- A linear predictor (Xβ)
- A link function (in this case, the generalized logit)
Beta Regression: While beta regression handles continuous proportions between 0 and 1, multinomial regression handles discrete categorical outcomes. They’re complementary approaches for different data types.
Poisson/Negative Binomial: These handle count data, while multinomial handles categorical data. However, all are GLMs that extend beyond normal linear models to handle non-normal response distributions.

The key distinction of multinomial models is their ability to simultaneously model multiple categorical outcomes while maintaining the constraint that probabilities sum to 1 across all categories.

4.10 7.10 Real-World Example: Bird Behavioral States

Let’s explore a more complex ecological example: modeling the behavioral states of birds based on environmental and temporal factors.

#| label: bird-behavior-data #| message: false

5 Simulate bird behavioral data

set.seed(456) n <- 300 # Number of observations

6 Predictors:

7 Time of day (hours, 24-hour format)

time <- runif(n, 5, 19) # Daylight hours from 5am to 7pm # Temperature (