Check for environmental outliers using species optimal ranges.
Source: R/outliermethods.R
Check for environmental outliers using species optimal ranges.
Usage
ecological_ranges(
  data,
  var,
  output = "outlier",
  species = NULL,
  optimumSettings = list(optdf = NULL, optspcol = NULL, mincol = NULL, maxcol = NULL,
    ecoparam = NULL, direction = NULL),
  minval = NULL,
  maxval = NULL,
  lat = NULL,
  lon = NULL,
  ecoparam = NULL,
  direction = NULL,
  pct = 80,
  checkfishbase = FALSE,
  mode = NULL,
  warn = TRUE
)
Arguments
- data
Dataframe with environmental predictors for a species or multiple species.
- var
Environmental parameter considered in flagging suspicious outliers.
- output
Either clean, to return a dataframe with no suspicious outliers, or outlier, to return a dataframe with only the outliers.
- species
The species should be indicated if the minimum (minval) and maximum (maxval) values are provided.
- optimumSettings
A list of optimal parameters, provided mostly when multiple species are examined.
optdf: the dataframe with species optimal values (min, max, or ecoparam). This dataset can be compiled from the literature for different species.
optspcol: the column with species names in the optdf dataset.
mincol: the column name in optdf with minimum values.
maxcol: the column name in optdf with maximum values.
ecoparam: if optdf has no minimum and maximum values, then the column with ecoparam values should be provided.
direction: if ecoparam is provided in optdf, then the column for direction should also be provided.
- minval, maxval
Minimum and maximum values (ranges) for a particular species, used to flag values that fall outside the range.
- lat, lon
If checkfishbase and mode are set, then the columns for latitude and longitude should be provided.
- ecoparam
Used only when the lower bound (minimum) and upper bound (maximum) of the range are not both available. For example, if only the minimum value is known for a particular species, then ecoparam is set to that value and direction states how values are compared with it: less, greater, equal, less or equal, or greater or equal.
- direction
Indicates how values are compared with the provided ecological threshold (ecoparam): greater than (greater), less than (less), equal (equal), less than or equal (le), or greater than or equal (ge). If the minimum and maximum values are known, then ecoparam and direction should not be used.
- pct
The percentage similarity between the species name provided by the user and the one in FishBase. Only fish species names are checked with FishBase, but other taxa can be checked using the taxize package.
- checkfishbase
If TRUE, records are checked against FishBase ranges: temperature ranges (temp) or latitudinal/geographical ranges (geo). If checkfishbase is set to TRUE, then the mode parameter must be set to either geo or temp. This check applies only to fish species.
- mode
Either geo or temp, for latitudinal ranges or temperature ranges respectively. See thermal_ranges or geo_ranges on how to obtain the data.
- warn
Either TRUE to return warning messages or FALSE for no warning messages. The default is TRUE.
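The range and threshold arguments above can be illustrated with a minimal base R sketch. It is not the package's implementation: flag_range and flag_threshold are hypothetical helpers, the temperature values are made up, and the reading of direction (the relation an acceptable value has with ecoparam) is an assumption based on the examples on this page, where direction = 'greater' treats values below the threshold as outliers.

```r
# Hypothetical helper (not part of specleanr): TRUE where a value lies
# outside the closed interval [minval, maxval].
flag_range <- function(x, minval, maxval) {
  x < minval | x > maxval
}

# Hypothetical helper: `direction` is ASSUMED to name the relation an
# acceptable value has with the threshold; anything else is flagged.
flag_threshold <- function(x, ecoparam, direction) {
  ok <- switch(direction,
    greater = x > ecoparam,
    less    = x < ecoparam,
    equal   = x == ecoparam,
    le      = x <= ecoparam,
    ge      = x >= ecoparam,
    stop("unknown direction: ", direction)
  )
  !ok
}

# Made-up annual mean temperatures, for illustration only.
bio1 <- c(1.5, 6.0, 8.2, 19.9, 25.3)

range_flags     <- flag_range(bio1, minval = 2, maxval = 24)
threshold_flags <- flag_threshold(bio1, ecoparam = 6, direction = "greater")

bio1[range_flags]      # values outside 2 to 24
bio1[threshold_flags]  # values not strictly greater than 6
```

Under this assumed reading, a value exactly equal to the threshold is flagged with greater but kept with ge, which is one reason to choose the direction deliberately.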
Examples
if (FALSE) { # \dontrun{
data("efidata")
data("jdsdata")
datafinal <- match_datasets(datasets = list(jds = jdsdata, efi = efidata),
                            lats = 'lat',
                            lons = 'lon',
                            species = c('speciesname', 'scientificName'),
                            date = c('Date', 'sampling_date'),
                            country = c('JDS4_site_ID'))
efidata <- check_names(data = datafinal, colsp = 'species', pct = 90, merge = TRUE)
danube <- system.file('extdata/danube.shp.zip', package = 'specleanr')
db <- sf::st_read(danube, quiet = TRUE)
wcd <- terra::rast(system.file('extdata/worldclim.tiff', package = 'specleanr'))
refdata <- pred_extract(data = efidata, raster = wcd,
                        lat = 'decimalLatitude', lon = 'decimalLongitude',
                        colsp = "scientificName",
                        bbox = db,
                        minpts = 10)
saldata <- refdata[["Thymallus thymallus"]]
#1. check that the annual mean temperature (bio1) values are within the ranges in FishBase
salmotherange <- thermal_ranges(x = "Salmo trutta")
sdatatemp <- ecological_ranges(data = saldata, var = 'bio1', species = "Salmo trutta",
                               checkfishbase = TRUE, mode = 'temp', output = 'outlier')
#zero records returned means no outliers
#====
#2. geographical ranges: latitude longitude
#geo ranges in fishbase
salgeorange <- geo_ranges(data = "Salmo trutta")
sdatageo <- ecological_ranges(data = saldata, lat = 'y', lon = 'x', output = 'outlier',
                              species = "Salmo trutta",
                              checkfishbase = TRUE, mode = 'geo')
#3. GENERAL LITERATURE RANGES
#======
#1. when the min and max are provided
#multiple FALSE should be set
#3.1: If only the minimum value is present: assuming minimum temperature is 6, variable: bio1
#with direction 'greater', values less than 6.0 are outliers and greater values are not
sdata <- ecological_ranges(data = saldata, ecoparam = 6.0, var = 'bio1',
                           direction = 'greater')
#3.2
sdata2 <- ecological_ranges(data = saldata, var = 'bio1', minval = 2,
                            maxval = 24, species = "Salmo trutta")
#4. Multiple TRUE
#the optimal parameters should be provided in a dataframe format with min max, or ecoparam
#4.1 optimal dataset
optdata <- data.frame(species = c("Salmo trutta", "Abramis brama"),
                      mintemp = c(6, 1.6), maxtemp = c(20, 21),
                      meantemp = c(8.5, 10.4), #ecoparam
                      direction = c('greater', 'greater'))
#parameter used is annual mean temperature (WORLDCLIM)
#provide the column with species names in the environment dataset
#set optimal list parameter
#
#optimal parameters
sdata3 <- ecological_ranges(data = saldata, species = 'Salmo trutta',
                            var = 'bio1', output = "outlier",
                            optimumSettings = list(optdf = optdata, maxcol = "maxtemp",
                                                   mincol = "mintemp", optspcol = "species"))
#
#only one ecological parameter (ecoparam is provided) and direction
sdata4 <- ecological_ranges(data = saldata, species = 'Salmo trutta', var = 'bio1',
                            output = "outlier",
                            optimumSettings = list(optdf = optdata,
                                                   ecoparam = "meantemp",
                                                   optspcol = "species",
                                                   direction = "direction"))
} # }
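The idea behind optimumSettings, where each record is compared against the optimal range of its own species, can be sketched in base R as follows. This is an illustration under stated assumptions rather than the function's internals: opt_ranges and records are made-up data frames, and the column names only mirror those used for optdata above.

```r
# Made-up optimal ranges, laid out like `optdata` in the examples above.
opt_ranges <- data.frame(
  species = c("Salmo trutta", "Abramis brama"),
  mintemp = c(6, 1.6),
  maxtemp = c(20, 21)
)

# Made-up environmental records for two species (illustrative values).
records <- data.frame(
  species = c("Salmo trutta", "Salmo trutta", "Abramis brama", "Abramis brama"),
  bio1    = c(4.0, 12.5, 10.0, 22.3)
)

# Look up the row of opt_ranges that belongs to each record's species.
idx <- match(records$species, opt_ranges$species)

# Flag records outside their own species' [mintemp, maxtemp] interval.
records$outlier <- records$bio1 < opt_ranges$mintemp[idx] |
  records$bio1 > opt_ranges$maxtemp[idx]

outliers <- records[records$outlier, ]   # analogous to output = "outlier"
clean    <- records[!records$outlier, ]  # analogous to output = "clean"
```

A species missing from opt_ranges would produce NA in idx and therefore NA flags, so a real workflow would need to decide how to handle unmatched names.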