Identifies absolute outliers and their proportions for a single species.
Source: R/similaritytests.R
ocindex.Rd
Usage
ocindex(
x,
sp = NULL,
threshold = NULL,
absolute = FALSE,
props = FALSE,
warn = FALSE,
autothreshold = FALSE
)
Arguments
- x
datacleaner class for each method used to identify outliers in the multidetect function.
- sp
string. Species name or index if multiple species are considered during outlier detection.
- threshold
numeric. Maximum value to denote an absolute outlier. The threshold ranges from 0, which indicates that a point has not been flagged as an outlier by any outlier detection method, to 1, which means the record is an absolute or true outlier since all methods have identified it. At low threshold values, many records are classified as absolute outliers, which may be due to individual method weakness or strength and data distribution. At higher threshold values, true outliers may be retained in the data. For example, if ten methods are considered and nine of them flag a record as an outlier, a cutoff of 1 means that record is retained. Therefore, the default cutoff is 0.6, but autothreshold can be used to select the appropriate threshold.
- absolute
logical. To output absolute outliers for a species.
- props
logical. If TRUE, outputs a data frame with the proportional absoluteness for each outlier.
- warn
logical. If TRUE, warns when absolute outliers are obtained at a low threshold. Default FALSE.
- autothreshold
vector. Identifies the threshold with the mean number of absolute outliers. The search is limited to the range 0.51 to 1, since thresholds below this range are deemed inappropriate for identifying absolute outliers. The autothreshold is used when threshold is set to NULL.
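The threshold argument described above can be illustrated with a small base R sketch. It is not the package's implementation: the flags matrix, the absolute_outliers() helper, and the use of >= as the comparison are all assumptions made for illustration. It reproduces the case in the text where a record flagged by nine of ten methods is no longer reported once the cutoff is 1.

```r
# Hypothetical flags: rows are records, columns are detection methods.
# TRUE means that method flagged that record as an outlier.
flags <- matrix(FALSE, nrow = 4, ncol = 10,
                dimnames = list(paste0("rec", 1:4), paste0("method", 1:10)))
flags["rec1", ] <- TRUE      # flagged by all 10 methods
flags["rec2", 1:9] <- TRUE   # flagged by 9 of 10 methods
flags["rec3", 1:3] <- TRUE   # flagged by 3 of 10 methods
                             # rec4 is flagged by no method

# Proportion of methods that flagged each record, between 0 and 1.
absoluteness <- rowMeans(flags)

# Hypothetical helper (assumption: a record counts as an absolute outlier
# when its proportion is at least the threshold).
absolute_outliers <- function(prop, threshold) {
  names(prop)[prop >= threshold]
}

at_default <- absolute_outliers(absoluteness, threshold = 0.6)
at_strict  <- absolute_outliers(absoluteness, threshold = 1)
at_low     <- absolute_outliers(absoluteness, threshold = 0.2)

print(at_default)
print(at_strict)
print(at_low)
```

Under these assumptions, rec2 is reported at a cutoff of 0.6 but dropped at a cutoff of 1, while a low cutoff such as 0.2 also pulls in rec3, which only three methods flagged.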
Value
vector or dataframe of absolute outliers, the best outlier detection method, or a data frame of absolute outliers and their proportions.
Examples
if (FALSE) { # \dontrun{
data(efidata)
db <- sf::read_sf(system.file('extdata/danube/basinfinal.shp', package = "specleanr"), quiet = TRUE)
wcd <- terra::rast(system.file('extdata/worldclim.tiff', package = "specleanr"))
checkname <- check_names(data = efidata, colsp = 'scientificName', pct = 90, merge = TRUE)
extdf <- pred_extract(data = checkname, raster = wcd,
                      lat = 'decimalLatitude', lon = 'decimalLongitude',
                      colsp = 'speciescheck',
                      list = TRUE, verbose = FALSE,
                      minpts = 6, merge = FALSE) # basin removed
# outlier detection
outliersdf <- multidetect(data = extdf, output = 'outlier', var = 'bio6',
                          exclude = c('x', 'y'), multiple = TRUE,
                          methods = c('mixediqr', "iqr", "mahal", "iqr", "logboxplot"),
                          showErrors = FALSE, warn = TRUE, verbose = FALSE, sdm = TRUE)
ociss <- ocindex(x = outliersdf, sp = 8, threshold = 0.2, absolute = TRUE)
# No outliers detected in more than two methods
} # }