Creel surveys allow fisheries scientists and managers to collect data on catch and harvest, on the angler population (including effort expended), and, depending on survey design, biological data on fish populations. Although they are an important method of collecting data on the user base of a fishery, creel surveys are difficult to implement and receive little attention in graduate fisheries programs. As a result, fisheries managers (the first job for many fisheries-program graduates) often inherit old surveys or are told to institute new ones with little knowledge of how to do so.
Fisheries can cover large spatial extents: large reservoirs, coastlines, and river systems. A creel survey has to be statistically valid, adaptable to the geographic challenges of the fishery, and cost efficient. Limited budgets can prevent agencies from implementing creel surveys; the AnglerCreelSurveySimulation package was designed to help managers explore the type of creel survey that is most appropriate for their fishery, including fisheries with multiple access points, access points that are more popular than others, variation in catch rate, varying numbers of surveyors, and seasonal variation in day length.
The AnglerCreelSurveySimulation package does require
that users know something about their fishery and the human dimensions
of that fishery. A priori knowledge includes the mean trip length
for a party (or individual) and the mean catch rate of the fishery.
The AnglerCreelSurveySimulation package is simple but
powerful. Four functions provide the means for users to create a
population of anglers, limit the length of the fishing day to any value,
and provide a mean trip length for the population. Ultimately, the user
only needs to know the final function,
conduct_multiple_surveys(), but because I’d rather this
not be a black box of functions, this brief
introduction walks through the package step by step.
This tutorial assumes that we have a very simple, small fishery with only one access point that, on any given day, is visited by 100 anglers. The fishing day length for our theoretical fishery is 12 hours (say, from 6 am to 6 pm) and all anglers are required to have completed their trip by 6 pm. Lastly, the mean trip length is known to be 3.5 hours.
For the purposes of this package, all times are expressed as hours
elapsed since the start of the fishing day. In other words, if the
fishing day is 12 hours long (e.g., from 6 am to 6 pm) and an angler
starts their trip at 2 and ends at 4, they started their trip at 8 am
and ended at 10 am.
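To make the time convention concrete, here is a minimal sketch that converts hours into the fishing day back to clock time. The helper to_clock_time() and its day_start_hour argument are hypothetical names used only for illustration; they are not part of the package.

```r
# Convert hours elapsed since the start of the fishing day into clock time.
# `day_start_hour` is a hypothetical argument on the 24-hour clock (6 = 6 am).
to_clock_time <- function(hours_into_day, day_start_hour = 6) {
  total_minutes <- as.integer(round((day_start_hour + hours_into_day) * 60))
  sprintf("%02d:%02d", (total_minutes %/% 60L) %% 24L, total_minutes %% 60L)
}

# An angler who starts at 2 and ends at 4 on a day that begins at 6 am
to_clock_time(c(2, 4))
```

With a 6 am start, the two values map to 8 am and 10 am, matching the example in the text.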
The make_anglers() function builds a population of
anglers:
library(AnglerCreelSurveySimulation)
anglers <- make_anglers(n_anglers = 100, mean_trip_length = 3.5, fishing_day_length = 12)
make_anglers() returns a data frame with
start_time, trip_length, and
departure_time for all anglers.
head(anglers)
#> start_time trip_length departure_time
#> 1 0.02782571 5.583119 5.6109447
#> 2 0.33442276 0.526787 0.8612097
#> 3 7.62885286 1.859801 9.4886534
#> 4 0.60615188 2.747347 3.3534985
#> 5 4.94777591 1.880994 6.8287702
#> 6 6.99811419 2.463352 9.4614660
In the output of head(anglers), you can see that
start_time, trip_length, and
departure_time are all available for each angler. The first
angler started their trip roughly 0.03 hours into the fishing day,
continued to fish for 5.58 hours, and left the access point at 5.61
hours into the fishing day. Angler start times are drawn from the
uniform distribution and trip lengths are drawn from the
gamma distribution. To get the true effort of
this angler population, summing trip_length is all
that’s needed, e.g., sum(anglers$trip_length).
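For readers who want to see the idea without the package, the following base-R sketch builds a toy angler population from uniform start times and gamma trip lengths. The gamma parameterization (shape equal to the mean trip length, rate of 1) and the truncation that keeps every trip inside the fishing day are my assumptions for illustration, not necessarily what make_anglers() does internally.

```r
set.seed(42)  # for a reproducible illustration

n_anglers          <- 100
mean_trip_length   <- 3.5
fishing_day_length <- 12

# Assumption: gamma trip lengths with shape = mean and rate = 1 (mean of 3.5),
# capped so that no trip is longer than the fishing day.
trip_length <- pmin(rgamma(n_anglers, shape = mean_trip_length, rate = 1),
                    fishing_day_length)

# Assumption: uniform start times, limited so every trip ends within the day.
start_time <- runif(n_anglers, min = 0, max = fishing_day_length - trip_length)

toy_anglers <- data.frame(start_time     = start_time,
                          trip_length    = trip_length,
                          departure_time = start_time + trip_length)

# True effort is the sum of all trip lengths (angler-hours).
true_effort <- sum(toy_anglers$trip_length)
```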
The distribution of angler trip lengths can be easily visualized:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
# Histogram overlaid with kernel density curve
anglers %>%
  ggplot(aes(x = trip_length)) +
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 0.1,
                 colour = "black", fill = "white") +
  geom_density(alpha = 0.2, fill = "#FF6666")
Once the population of anglers has been created, the next function to
apply is the get_total_values() function. In
get_total_values(), the user specifies the start time of
the creel surveyor, the end time of the surveyor, and the wait time of
the surveyor. Here is where the user also specifies the sampling
probability of the anglers (in most cases, equal to \(\frac{\text{wait\_time}}{\text{fishing\_day\_length}}\)) and
the mean catch rate of the fishery. There are a number of default
settings in the get_total_values() function; see
?get_total_values for a description of how the function
handles NULL values for start_time,
end_time, and wait_time. In the output, start_time
and wait_time are the times that the surveyor started and
waited at the access point. total_catch and
true_effort are the total (or real) values for
catch and effort. mean_lambda is the mean catch rate for all
anglers. Even though we passed mean_catch_rate to
get_total_values(), individual mean catch rates are
simulated by rgamma() with shape equal to
mean_catch_rate and rate equal to 1.
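A short sketch shows why mean_lambda in the output will be close to, but rarely exactly equal to, the mean catch rate that was supplied. It also computes the usual single-site sampling probability. The variable names are chosen here for illustration.

```r
set.seed(1)  # for a reproducible illustration

mean_catch_rate    <- 2.5
wait_time          <- 8
fishing_day_length <- 12

# One catch rate (lambda) per angler; a gamma with shape = k and rate = 1 has mean k.
lambda <- rgamma(100, shape = mean_catch_rate, rate = 1)
mean_lambda <- mean(lambda)  # varies from run to run around 2.5

# The usual sampling probability for a single access point.
sampling_prob <- wait_time / fishing_day_length
```

Because each simulation draws a fresh set of catch rates, the realized mean differs slightly from 2.5 every time.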
For this walkthrough, we’ll schedule the surveyor to work for a total of eight hours at the sole access point in our fishery:
anglers %>%
get_total_values(start_time = 0, wait_time = 8, circuit_time = 8, mean_catch_rate = 2.5,
fishing_day_length = 12)
#> n_observed_trips total_observed_trip_effort n_completed_trips
#> 1 90 243.4483 57
#> total_completed_trip_effort total_completed_trip_catch start_time wait_time
#> 1 170.3483 410.1601 0 8
#> total_catch true_effort mean_lambda
#> 1 838.2736 314.7418 2.734755
get_total_values() returns a single-row data frame with
several columns. The output of get_total_values() is the
catch and effort data observed by the surveyor during their wait at the
access point along with the “true” values for catch and effort.
(Obviously, we can’t simulate biological data but, if an agency’s
protocol directed the surveyor to collect biological data, that could be
analyzed with other R functions.)
In the output from get_total_values(),
n_observed_trips is the number of trips that the surveyor
observed, including anglers that arrived after she started her day and
anglers that were there for the duration of her time at the access
point. total_observed_trip_effort is the effort expended by
those parties; because the observed trips were not complete, she did not
count their catch. n_completed_trips is the number of
anglers that completed their trips while she was onsite,
total_completed_trip_effort is the effort expended by those
anglers, and total_completed_trip_catch is the number of
fish caught by those parties. Catch is both the number of fish harvested
and those caught and released.
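The distinction between observed and completed trips can be sketched in base R with the six anglers printed by head(anglers) earlier. The classification rules below are my reading of the description above, not the package's source code: a trip is observed if it overlaps the surveyor's time on site, and completed if the angler also departs before the surveyor does.

```r
# The six anglers printed by head(anglers) earlier in this vignette.
toy <- data.frame(
  start_time     = c(0.02782571, 0.33442276, 7.62885286,
                     0.60615188, 4.94777591, 6.99811419),
  departure_time = c(5.6109447, 0.8612097, 9.4886534,
                     3.3534985, 6.8287702, 9.4614660)
)

surveyor_start <- 0
surveyor_end   <- 8   # start_time + wait_time

# Observed: the trip overlaps the surveyor's time at the access point.
observed  <- toy$start_time < surveyor_end & toy$departure_time > surveyor_start
# Completed: the angler also leaves before the surveyor does.
completed <- observed & toy$departure_time <= surveyor_end

n_observed_trips  <- sum(observed)
n_completed_trips <- sum(completed)

# Effort visible to the surveyor: the part of each observed trip inside her window.
observed_effort <- sum(pmin(toy$departure_time[observed], surveyor_end) -
                       pmax(toy$start_time[observed], surveyor_start))
```

In this toy subset, two anglers are still fishing when the surveyor leaves at hour 8, so only part of their effort is counted and no catch is recorded for them.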
Effort and catch are estimated from the Bus Route Estimator:
\[ \widehat{E} = T\sum\limits_{i=1}^n{\frac{1}{w_{i}}}\sum\limits_{j=1}^m{\frac{e_{ij}}{\pi_{j}}} \]
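where
* \(\widehat{E}\) is the estimated total party-hours of effort;
* \(T\) is the total time to complete a full circuit of the route, including travelling and waiting;
* \(w_i\) is the waiting time at the \(i^{th}\) site (where \(i = 1, \ldots, n\) sites);
* \(e_{ij}\) is the total time that the \(j^{th}\) party is at the \(i^{th}\) site while the surveyor is at that site (where \(j = 1, \ldots, m\) parties); and
* \(\pi_j\) is the sampling probability of the \(j^{th}\) party.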
Catch rate is calculated from the Ratio of Means equation:
\[ \widehat{R_1} = \frac{\sum\limits_{i=1}^n{c_i/n}}{\sum\limits_{i=1}^n{L_i/n}} \]
where
* \(c_i\) is the catch for the \(i^{th}\) sampling unit and
* \(L_i\) is the length of the fishing trip at the time of
the interview.
For incomplete surveys, \(L_i\) represents an incomplete trip.
simulate_bus_route() calculates effort and catch based
upon these equations. See ?simulate_bus_route for
references that include a more detailed discussion of these
equations.
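The two equations translate almost directly into R. The functions below are an illustrative sketch, not the package's implementation; bus_route_effort() and ratio_of_means() are hypothetical names, the toy numbers are made up for the example, and a single sampling probability is assumed for all parties.

```r
# Bus Route Estimator: E_hat = T * sum_i( (1 / w_i) * sum_j( e_ij / pi_j ) )
# `e` is a list with one numeric vector of observed party-hours per site.
# Assumption: the same sampling probability applies to every party.
bus_route_effort <- function(circuit_time, wait_time, e, sampling_prob = 1) {
  per_site <- mapply(function(w_i, e_i) sum(e_i / sampling_prob) / w_i,
                     wait_time, e)
  circuit_time * sum(per_site)
}

# Ratio of Means catch rate: mean catch divided by mean trip length.
ratio_of_means <- function(catch, trip_length) {
  mean(catch) / mean(trip_length)
}

# Hypothetical single-site day: an 8-hour circuit spent entirely at one site,
# with four interviewed parties.
e_hat <- bus_route_effort(circuit_time = 8, wait_time = 8,
                          e = list(c(2.5, 4, 1.5, 3)))
rom   <- ratio_of_means(catch = c(5, 9, 2, 8), trip_length = c(2.5, 4, 1.5, 3))
est_catch <- e_hat * rom
```

When the circuit time equals the wait time at a single site and every party is sampled, the estimator reduces to the sum of the observed party-hours, which is a useful sanity check on the formula.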
simulate_bus_route() calls make_anglers()
and get_total_values(), so many of the same arguments we
passed to the previous functions also need to be passed to
simulate_bus_route(). The new argument,
n_sites, is the number of sites visited by the surveyor. In
more advanced simulations (see the examples in
?simulate_bus_route), you can pass vectors of values for
start_time, wait_time, n_sites, and
n_anglers to simulate a bus route-type survey rather than
just a single access-point survey.
sim <- simulate_bus_route(start_time = 0, wait_time = 8, n_sites = 1, n_anglers = 100,
mean_catch_rate = 2.5, fishing_day_length = 12)
sim
#> Ehat catch_rate_ROM true_catch true_effort mean_lambda
#> 1 254.1131 2.609279 849.2175 332.9894 2.554031
The output from simulate_bus_route() is a data frame with
values for Ehat, catch_rate_ROM (the ratio of
means catch rate), true_catch, true_effort, and
mean_lambda. Ehat is the estimated total effort
from the Bus Route Estimator above and catch_rate_ROM is
the catch rate estimated from the Ratio of Means equation.
true_catch, true_effort, and
mean_lambda are the same as before. Multiplying
Ehat by catch_rate_ROM gives an estimate of
total catch: 663.0519874.
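The same arithmetic can be reproduced by hand and compared against the true catch. The values are copied (rounded as printed) from the simulate_bus_route() output above; the relative-error calculation is an addition for illustration.

```r
# Values copied from the simulate_bus_route() output shown above.
Ehat           <- 254.1131
catch_rate_ROM <- 2.609279
true_catch     <- 849.2175

est_catch <- Ehat * catch_rate_ROM                  # estimated total catch
rel_error <- (est_catch - true_catch) / true_catch  # negative = underestimate
```

Because the printed values are rounded, est_catch agrees with the figure in the text only to a few decimal places.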
With information about the fishery, the start and wait times of the
surveyor, the sampling probability, mean catch rate, and fishing day
length, we can run multiple simulations with
conduct_multiple_surveys().
conduct_multiple_surveys() is a wrapper that calls the
other three functions in turn and compiles the values into a data frame
for easy plotting or analysis. The only additional argument needed is
n_sims, which tells the function how many
simulations to conduct. For the sake of this simple simulation, let’s
assume that the creel surveyor works five days a week for four weeks
(i.e., 20 days):
sim <- conduct_multiple_surveys(n_sims = 20, start_time = 0, wait_time = 8, n_sites = 1,
n_anglers = 100,
mean_catch_rate = 2.5, fishing_day_length = 12)
sim
#> Ehat catch_rate_ROM true_catch true_effort mean_lambda
#> 1 283.9901 2.624168 957.2048 371.8270 2.591832
#> 2 257.3739 2.274775 865.4242 359.7312 2.422283
#> 3 263.3519 2.531531 840.7083 341.1700 2.502158
#> 4 286.3035 2.636459 847.5561 358.2680 2.493317
#> 5 241.0059 2.091205 746.8896 320.3706 2.360773
#> 6 283.4753 2.035783 745.4144 365.9348 2.136867
#> 7 231.0606 2.425965 858.2096 341.9932 2.431244
#> 8 256.9581 2.441059 845.8885 346.7909 2.536444
#> 9 245.2889 2.665434 820.1221 326.3658 2.419219
#> 10 272.9459 2.622623 939.6975 362.7497 2.639500
#> 11 242.4129 2.455266 940.5788 341.4352 2.726100
#> 12 244.3567 2.472028 872.7614 339.9551 2.442740
#> 13 258.6222 2.548895 894.0612 348.0127 2.602706
#> 14 256.5682 3.472415 1068.3938 361.1228 2.832822
#> 15 249.7969 2.632862 1023.9092 359.4603 2.679270
#> 16 223.9953 3.015796 867.6795 311.0071 2.816511
#> 17 261.3882 2.189277 762.6096 351.2963 2.163427
#> 18 277.8554 2.414402 872.5733 351.8118 2.570118
#> 19 270.4510 2.458694 826.1457 352.3857 2.411958
#> 20 275.3377 2.704482 842.6900 332.3657 2.399357
With the output from multiple simulations, an analyst can evaluate
how closely the creel survey they’ve designed mirrors reality. An
lm() of estimated catch as a function of
true_catch can tell us whether the survey will over- or
underestimate reality:
mod <-
sim %>%
lm((Ehat * catch_rate_ROM) ~ true_catch, data = .)
summary(mod)
#>
#> Call:
#> lm(formula = (Ehat * catch_rate_ROM) ~ true_catch, data = .)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -107.88 -56.21 10.17 31.93 115.13
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 62.3735 163.8891 0.381 0.70797
#> true_catch 0.6812 0.1872 3.639 0.00188 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 67.61 on 18 degrees of freedom
#> Multiple R-squared: 0.4239, Adjusted R-squared: 0.3919
#> F-statistic: 13.25 on 1 and 18 DF, p-value: 0.001875
Plotting the data and the model provides a good visual means of evaluating how close our estimates are to reality:
# Create a new column: estimated effort multiplied by estimated catch rate
sim <-
  sim %>%
  mutate(est_catch = Ehat * catch_rate_ROM)
sim %>%
  ggplot(aes(x = true_catch, y = est_catch)) +
  geom_point() +
  geom_abline(intercept = mod$coefficients[1], slope = mod$coefficients[2],
              colour = "red", linewidth = 1.01)
The closer the slope parameter estimate is to 1 and the intercept parameter estimate is to 0, the closer our estimate of catch is to reality.
We can create a model and plot of our effort estimates, too:
mod <-
sim %>%
lm(Ehat ~ true_effort, data = .)
summary(mod)
#>
#> Call:
#> lm(formula = Ehat ~ true_effort, data = .)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -23.9880 -9.7125 0.3522 7.6868 27.8261
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -12.6866 66.9324 -0.190 0.851788
#> true_effort 0.7829 0.1926 4.065 0.000727 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 13.23 on 18 degrees of freedom
#> Multiple R-squared: 0.4786, Adjusted R-squared: 0.4497
#> F-statistic: 16.52 on 1 and 18 DF, p-value: 0.0007267
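One way to make the "slope near 1, intercept near 0" check more formal is to look at confidence intervals for the coefficients. The sketch below refits the effort model in base R using the Ehat and true_effort values copied from the 20 simulated surveys printed above. Using confint() for this purpose is my suggestion, not something the package prescribes.

```r
# Ehat and true_effort copied from the 20 simulated surveys printed above.
effort <- data.frame(
  Ehat = c(283.9901, 257.3739, 263.3519, 286.3035, 241.0059, 283.4753, 231.0606,
           256.9581, 245.2889, 272.9459, 242.4129, 244.3567, 258.6222, 256.5682,
           249.7969, 223.9953, 261.3882, 277.8554, 270.4510, 275.3377),
  true_effort = c(371.8270, 359.7312, 341.1700, 358.2680, 320.3706, 365.9348,
                  341.9932, 346.7909, 326.3658, 362.7497, 341.4352, 339.9551,
                  348.0127, 361.1228, 359.4603, 311.0071, 351.2963, 351.8118,
                  352.3857, 332.3657)
)

fit <- lm(Ehat ~ true_effort, data = effort)
ci  <- confint(fit, level = 0.95)

# TRUE when the interval contains the "perfect survey" value.
slope_covers_1     <- ci["true_effort", 1] <= 1 && ci["true_effort", 2] >= 1
intercept_covers_0 <- ci["(Intercept)", 1] <= 0 && ci["(Intercept)", 2] >= 0
```

With only 20 simulated days the intervals are wide, so an interval that covers 1 (or 0) says more about low precision than about an unbiased survey; running more simulations narrows them.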
# Plot estimated effort against true effort with the fitted regression line
sim %>%
  ggplot(aes(x = true_effort, y = Ehat)) +
  geom_point() +
  geom_abline(intercept = mod$coefficients[1], slope = mod$coefficients[2],
              colour = "red", linewidth = 1.01)
If the start time and wait time equal 0 and the length of the fishing day, respectively, the creel surveyor can observe all completed trips, though she’d likely be unhappy having to work 12 hours. The inputs have to be adjusted to allow her to arrive at time 0, stay for all 12 hours, and have a probability of 1.0 of intercepting everyone:
start_time <- 0
wait_time <- 12
sampling_prob <- 1
sim <- conduct_multiple_surveys(n_sims = 20, start_time = start_time, wait_time = wait_time,
n_sites = 1, n_anglers = 100,
mean_catch_rate = 2.5, fishing_day_length = wait_time)
sim
#> Ehat catch_rate_ROM true_catch true_effort mean_lambda
#> 1 338.4188 2.526191 854.9105 338.4188 2.492846
#> 2 335.3774 2.533732 849.7564 335.3774 2.662725
#> 3 322.1603 2.704045 871.1359 322.1603 2.683248
#> 4 332.8148 2.244621 747.0431 332.8148 2.211713
#> 5 344.1474 2.375886 817.6551 344.1474 2.504725
#> 6 330.3224 2.228176 736.0166 330.3224 2.327166
#> 7 351.7334 2.439408 858.0215 351.7334 2.428026
#> 8 350.9096 2.672565 937.8286 350.9096 2.691987
#> 9 317.5808 2.575393 817.8952 317.5808 2.586919
#> 10 343.8947 2.563955 881.7305 343.8947 2.565945
#> 11 305.1314 2.320839 708.1607 305.1314 2.409243
#> 12 347.5091 2.661144 924.7719 347.5091 2.552094
#> 13 324.4478 2.470399 801.5155 324.4478 2.453708
#> 14 305.9623 2.464212 753.9557 305.9623 2.464376
#> 15 341.9831 2.595890 887.7505 341.9831 2.469572
#> 16 340.1130 2.283058 776.4978 340.1130 2.316221
#> 17 336.3245 2.378349 799.8971 336.3245 2.438951
#> 18 340.3461 2.556218 869.9990 340.3461 2.511172
#> 19 359.5779 2.360543 848.7990 359.5779 2.399526
#> 20 345.1977 2.205940 761.4852 345.1977 2.363665
#> Warning in summary.lm(mod): essentially perfect fit: summary may be unreliable
#>
#> Call:
#> lm(formula = Ehat ~ true_effort, data = .)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -1.847e-13 3.250e-16 1.398e-14 2.117e-14 7.051e-14
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -2.542e-13 2.754e-13 -9.23e-01 0.368
#> true_effort 1.000e+00 8.195e-16 1.22e+15 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 5.189e-14 on 18 degrees of freedom
#> Multiple R-squared: 1, Adjusted R-squared: 1
#> F-statistic: 1.489e+30 on 1 and 18 DF, p-value: < 2.2e-16
If our hypothetical fishery suddenly gained another access point and the original 100 anglers were split between the two access points equally, what kind of information would a creel survey capture? We could ask our surveyor to split her eight-hour work day between both access points, but she’ll have to drive for 0.5 hours to get from one to another. Of course, that 0.5 hour of drive time will be a part of her work day so she’ll effectively have 7.5 hours to spend at access points counting anglers and collecting data.
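Before running a multi-site simulation, it can help to check that the surveyor's schedule is internally consistent. The sketch below uses the start and wait times from this scenario; travel_time and the derived quantities are names introduced here for illustration.

```r
start_time  <- c(0, 4.5)   # arrival time at each access point
wait_time   <- c(4, 3.5)   # hours spent at each access point
travel_time <- 0.5         # drive between the two access points

departure <- start_time + wait_time

# Each arrival must equal the previous departure plus the drive time.
schedule_ok <- isTRUE(all.equal(start_time[-1],
                                departure[-length(departure)] + travel_time))

total_work_day <- departure[length(departure)] - start_time[1]  # hours on the clock
total_on_site  <- sum(wait_time)                                # hours at access points
```

Here the surveyor is on the clock for eight hours but only 7.5 of them are spent at access points, as described above.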
start_time <- c(0, 4.5)
wait_time <- c(4, 3.5)
n_sites <- 2
n_anglers <- c(50, 50)
fishing_day_length <- 12
# sampling_prob <- sum(wait_time)/fishing_day_length
sim <- conduct_multiple_surveys(n_sims = 20, start_time = start_time, wait_time = wait_time,
n_sites = n_sites, n_anglers = n_anglers,
mean_catch_rate = 2.5,
fishing_day_length = fishing_day_length)
sim
#> Ehat catch_rate_ROM true_catch true_effort mean_lambda
#> 1 1249.4907 2.180083 828.6670 341.1333 2.408367
#> 2 866.1782 2.623012 738.3780 303.4722 2.491475
#> 3 1007.4818 2.081340 752.3658 320.7494 2.440070
#> 4 1099.9839 2.597177 852.4705 325.7298 2.557588
#> 5 923.0335 2.045153 895.2873 336.2774 2.582202
#> 6 1203.7533 2.392358 917.6269 360.8331 2.564378
#> 7 1037.8636 2.696484 912.5469 345.1254 2.702024
#> 8 1012.5147 2.777855 958.4809 350.9344 2.620072
#> 9 919.5822 2.982438 826.7231 352.0860 2.347174
#> 10 999.7040 2.516509 763.0466 317.6327 2.440649
#> 11 1076.5503 2.149722 831.2845 356.9924 2.316774
#> 12 877.6668 3.112014 876.8068 348.6745 2.507499
#> 13 1101.8551 2.378960 824.6176 350.8859 2.336300
#> 14 1054.7562 2.108180 768.8328 336.9540 2.289195
#> 15 998.6178 2.404390 816.3672 324.1411 2.502753
#> 16 1116.4112 2.412224 859.0256 350.7947 2.325477
#> 17 909.0606 3.029545 846.3970 324.6751 2.503748
#> 18 1034.8630 2.217336 869.4471 353.8973 2.529287
#> 19 974.7500 1.803500 778.4843 334.9460 2.426595
#> 20 907.3551 2.342040 709.5490 315.6153 2.200094
#>
#> Call:
#> lm(formula = Ehat ~ true_effort, data = .)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -174.873 -47.208 -2.631 42.270 220.033
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -14.701 451.845 -0.033 0.9744
#> true_effort 3.061 1.337 2.289 0.0344 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 93.98 on 18 degrees of freedom
#> Multiple R-squared: 0.2255, Adjusted R-squared: 0.1825
#> F-statistic: 5.241 on 1 and 18 DF, p-value: 0.03437
Ultimately, the creel survey simulation can be as complicated as a creel survey. If a survey requires multiple clerks, several simulations can be coupled together to act as multiple surveyors. To accommodate weekends or holidays (i.e., increased angler pressure), additional simulations with different wait times and more anglers (to simulate higher pressure) can be built into the simulation. For example, if we know that angler pressure is 50% higher at the two access points on weekends, we can hire a second clerk to sample 8 hours a day on the weekends–one day at each access point–and add the weekend data to the weekday data.
#Weekend clerks
start_time_w <- 2
wait_time_w <- 10
n_sites <- 1
n_anglers_w <- 75
fishing_day_length <- 12
sampling_prob <- 8/12
sim_w <- conduct_multiple_surveys(n_sims = 8, start_time = start_time_w,
wait_time = wait_time_w, n_sites = n_sites,
n_anglers = n_anglers_w,
mean_catch_rate = 2.5,
fishing_day_length = fishing_day_length)
sim_w
#> Ehat catch_rate_ROM true_catch true_effort mean_lambda
#> 1 339.0032 2.373315 561.7161 236.2135 2.431810
#> 2 358.9239 2.818435 708.0025 251.7850 2.844418
#> 3 339.4585 2.168630 513.7869 239.0527 2.231830
#> 4 340.7627 2.849649 685.8428 240.4456 2.642581
#> 5 359.4840 2.714199 677.5771 249.6417 2.636589
#> 6 362.2580 2.436514 615.1291 252.7900 2.448829
#> 7 350.5191 2.848595 700.9075 246.6987 2.799248
#> 8 390.8831 2.607164 707.7058 271.4466 2.621865
#Add the weekday survey and weekend surveys to the same data frame
mon_survey <-
sim_w %>%
bind_rows(sim)
mod <-
mon_survey %>%
lm(Ehat ~ true_effort, data = .)
summary(mod)
#>
#> Call:
#> lm(formula = Ehat ~ true_effort, data = .)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -204.421 -57.684 -1.193 53.919 219.623
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -1332.3482 142.0247 -9.381 7.88e-10 ***
#> true_effort 6.9246 0.4508 15.360 1.48e-14 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 101.9 on 26 degrees of freedom
#> Multiple R-squared: 0.9007, Adjusted R-squared: 0.8969
#> F-statistic: 235.9 on 1 and 26 DF, p-value: 1.477e-14
Hopefully, this vignette has shown you how to build and simulate your
own creel survey. It’s flexible enough to estimate monthly or seasonal
changes in fishing day length, changes in the mean catch rate, increased
angler pressure on weekends, and any number of access sites, start
times, wait times, and sampling probabilities. The output from
conduct_multiple_surveys() allows the user to estimate
variability in the catch and effort estimates (e.g., relative standard
error) to evaluate the most efficient creel survey for their
fishery.
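As a closing illustration, relative standard error can be computed from the simulation output in a couple of lines. The definition used here (standard error of the mean divided by the mean) is one common choice and is my assumption. The Ehat values are copied from the weekend simulation (sim_w) printed above, and relative_se() is a hypothetical helper.

```r
# Ehat values copied from the weekend simulation output (sim_w) shown above.
Ehat_w <- c(339.0032, 358.9239, 339.4585, 340.7627,
            359.4840, 362.2580, 350.5191, 390.8831)

# Relative standard error: standard error of the mean divided by the mean.
relative_se <- function(x) {
  (sd(x) / sqrt(length(x))) / mean(x)
}

rse_effort <- relative_se(Ehat_w)
```

Comparing this value across candidate designs (different wait times, numbers of clerks, or numbers of sampled days) gives one simple yardstick for choosing among them.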