Errors thrown by camera trap data set #83

@erex

From a message to the list on 09Feb21 regarding challenges with CTDS analysis.

I think there are 3 errors generated by a data set with these characteristics:

  • flatfile without an object field
  • and truncation that causes transects with detections to become transects without detections

The first problem is easily remedied by adding an 'object' field to the flatfile. However, I don't understand why this line of code is needed in safetruncate; I presume object is not a mandatory field in a flatfile.

  • Incidentally, this line could also be simplified, because the last expression is equivalent to fsl, defined 5 lines earlier. This is just a simplification, unrelated to the problem at hand.

The second problem seems deeper, but is possibly related to the handling of the "phantom transects" that arise when truncation robs them of their detections. After assigning an object field to the offending flatfile, the code runs without generating the previous warnings, but dht2 returns this result:

> deer_dht_act
Summary statistics:
 .Label Area CoveredArea    Effort   n  k ER se.ER cv.ER
  Total   16    4395.157 196934400 666 57  0   NaN   NaN

ER is reported as zero, but it really is not. This is characteristic of CTDS analyses: when effort is measured in seconds, encounter rates are on the order of 10^-6 and hence do not format well. The more worrying matter is the NaN for SE(ER).
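To make the formatting point concrete, here is a minimal base R sketch using n and Effort from the summary table above. The rounding to three decimal places is only my assumption about how the printed table is produced, not something I checked in the print method.

```r
# n and Effort copied from the summary table above
n      <- 666
effort <- 196934400   # seconds

er <- n / effort

# Rounded to a few decimal places (assumed here), the rate prints as 0
print(round(er, 3))

# Scientific notation shows the actual order of magnitude
print(format(er, scientific = TRUE, digits = 3))
```

So the zero is a display matter only; the NaN is the real concern.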

The flow of function calls is

dht2 -> er_var_f -> varn

er_var_f is using the "classic" encounter rate variance formula "P2" by default for CTDS analysis. Hence, er_var_f is calling varn at this location in this manner

        mutate(ER_var = varn(.data$Effort, .data$transect_n_observations,
                             type=er_est)) %>%

My (unproven) hypothesis is that the NaN in the encounter rate variance comes from the way .data$transect_n_observations is computed for the "phantom" transects here:

             transect_n_observations = length(na.omit(unique(.data$object))),

My guess is that from this line of code, transect_n_observations receives a value that might create problems for varn, but that is only a guess. Maybe this is not where the problem lies, but I didn't chase the rabbit any further down the hole than this.
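As a small check on that guess, the expression can be tried in isolation on a toy flatfile. This is only a sketch in base R: the data frame, its values, and the helper count_obs are made up for illustration, and tapply stands in for the grouped summary used inside the package.

```r
# Toy flatfile (hypothetical values): station B3 has lost all of its
# detections and survives as one placeholder row with object = NA
toy <- data.frame(
  Sample.Label = c("A1", "A1", "A2", "B3"),
  object       = c(1, 2, 3, NA)
)

# Same expression as in the quoted line, wrapped in a helper
count_obs <- function(obj) length(na.omit(unique(obj)))

# Apply it per transect
n_by_transect <- tapply(toy$object, toy$Sample.Label, count_obs)
print(n_by_transect)
```

On this toy input the phantom transect gets a count of 0 rather than NA, so if the NaN does originate around here, it may depend on something other than the object column being NA for those rows.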


The final problem I encountered while working through this analysis is a surprising change in the degrees of freedom associated with the abundance estimate confidence intervals computed by dht2:

> deer_dht <- dht2(deer_hr, flatfile=deer,strat_formula=~1,sample_fraction = 0.111, convert_units = conunits)
> deer_dht
Summary statistics:
 .Label Area CoveredArea    Effort   n  k ER se.ER cv.ER
  Total   16    4395.157 196934400 666 57  0   NaN   NaN

Abundance estimates:
 .Label Estimate    se    cv LCI UCI  df
  Total       10 1.096 0.109   8  12 663

Component percentages of variance:
 .Label Detection  ER
  Total       NaN NaN

> deer$activity <- 0.46734925
> deer$activity.SE <- 0.03099745
> activity <- unique(deer[ , c("activity","activity.SE")])
> names(activity) <- c("rate", "SE")
> (mult <- list(creation=activity))

> deer_dht_act <- dht2(deer_hr, flatfile=deer,strat_formula=~1,
+                      sample_fraction = 0.111,multipliers = mult, convert_units = conunits)
> deer_dht_act
Summary statistics:
 .Label Area CoveredArea    Effort   n  k ER se.ER cv.ER
  Total   16    4395.157 196934400 666 57  0   NaN   NaN

Abundance estimates:
 .Label Estimate    se    cv LCI UCI       df
  Total       21 2.742 0.128  17  28 1240.099

These are two calls to dht2: the first without a multiplier, the second with a multiplier whose degrees of freedom are unspecified. Notice that the df is 663 without the multiplier and 1240 when the multiplier is included. Perhaps that is correct, and it matters little in this situation where the number of detections is so large, but it struck me as suspicious.
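One possible explanation can be sketched numerically. This assumes (I have not verified it in the code) that dht2 combines components with a Satterthwaite-type approximation and treats a multiplier with unspecified df as having infinite df. The function satterthwaite_df below is my own illustration, not a function from the package, and the inputs are the rounded values printed above.

```r
# Satterthwaite-type combination of independent CV components:
#   df = (sum cv_i^2)^2 / sum(cv_i^4 / df_i)
satterthwaite_df <- function(cv, df) {
  sum(cv^2)^2 / sum(cv^4 / df)
}

# Values from the run without a multiplier (rounded as printed)
cv_no_mult <- 0.109
df_no_mult <- 663

# CV of the activity multiplier from the rate and SE given above
cv_act <- 0.03099745 / 0.46734925

# Assumption: unspecified multiplier df is treated as infinite, so the
# multiplier adds to the numerator but contributes 0 to the denominator
df_combined <- satterthwaite_df(c(cv_no_mult, cv_act),
                                c(df_no_mult, Inf))
print(df_combined)
```

Under those assumptions the combined df comes out well above 663 and in the same region as the 1240 reported, which would make the increase a property of the approximation rather than a bug. Whether an unspecified multiplier df should be treated that way is a separate question.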


Of the three, the second is the most troubling. I do not include the data causing the problem as they were shared by the user. I think the phenomenon can be duplicated through the use of the DuikerCameraTraps data set in the Distance package if truncation is sufficiently extreme, as in

safetruncation(DuikerCameraTraps, 15, 3)

Strong right truncation causes camera station B3 to lose all 8 of its detections, triggering the warning:

Warning message:
In `[<-.data.frame`(`*tmp*`, flatfile$Sample.Label %in% sl_diff,  :
  provided 7 variables to replace 6 variables

because this line of code tries to assign NA to the object field, which does not exist.
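The same kind of warning can be produced in base R without the package. The sketch below is not the package code; the toy data frame and the replacement values are hypothetical. It only shows that assigning a list with more elements than the data frame has columns triggers a warning of this form.

```r
# Toy flatfile with two columns and no 'object' column (hypothetical)
toy <- data.frame(
  Sample.Label = c("A1", "B3"),
  distance     = c(4.2, 11.0)
)

# Replace the B3 row with three values (label, distance, object)
# although only two columns exist; capture the warning message
msg <- tryCatch({
  toy[toy$Sample.Label == "B3", ] <- list("B3", NA, NA)
  NA_character_
}, warning = function(w) conditionMessage(w))

print(msg)
```

Guarding the assignment so that object is only set when the column is present in the flatfile would presumably avoid the warning.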
