Check for environmental outliers using species optimal ranges.
Source: R/outliermethods.R
ecological_ranges.Rd
Usage
ecological_ranges(
  data,
  var,
  output = "outlier",
  species = NULL,
  optimumSettings = list(optdf = NULL, optspcol = NULL, mincol = NULL, maxcol = NULL,
    ecoparam = NULL, direction = NULL),
  minval = NULL,
  maxval = NULL,
  lat = NULL,
  lon = NULL,
  ecoparam = NULL,
  direction = NULL,
  pct = 80,
  checkfishbase = FALSE,
  mode = NULL,
  warn = TRUE
)
Arguments
- data
Dataframe with environmental predictors for a species or multiple species.
- var
Environmental parameter considered in flagging suspicious outliers.
- output
Either clean, to return a dataframe with no suspicious outliers, or outlier, to return a dataframe with only the outliers.
- species
The species name should be indicated if the minimum (minval) and maximum (maxval) values are provided.
- optimumSettings
A list of optimal parameters, provided mostly when multiple species are examined.
optdf: the dataframe with species optimal values (min, max, or ecoparam). This dataset can be compiled from the literature for different species.
optspcol: the column with species names in optdf.
mincol: the column name in optdf with minimum values.
maxcol: the column name in optdf with maximum values.
ecoparam: if the minimum and maximum values are not available in optdf, then the column with ecoparam values should be provided.
direction: if ecoparam is provided in optdf, then the column for direction should also be provided.
- minval, maxval
Minimum and maximum values (the range) for a particular species, used to flag values that fall outside the range.
- lat, lon
If checkfishbase and mode are set, then the columns for latitude and longitude should be provided.
- ecoparam
This parameter is used only when the lower bound (minimum) and upper bound (maximum), that is the range, are not both available. For example, if only the minimum value is known for a particular species, then ecoparam is set to that value and direction indicates the comparison to apply against it: less than, greater than, equal to, less than or equal to, or greater than or equal to.
- direction
Indicates how the provided ecological threshold (ecoparam) is applied: greater than (greater), less than (less), equal (equal), less than or equal (le), or greater than or equal (ge). If the minimum and maximum values are known, then ecoparam and direction should not be used.
- pct
The percentage similarity between the species name provided by the user and the one in FishBase. Only fish species names are checked against FishBase, but other taxa can be checked using the taxize package.
- checkfishbase
Set to TRUE to check values against FishBase, either for temperature ranges (temp) or for latitudinal or geographical ranges (geo). If checkfishbase is set to TRUE, then the mode parameter must be set to either geo or temp. This option applies only to fish species.
- mode
Either geo or temp, for latitudinal ranges or temperature ranges respectively. See thermal_ranges or geo_ranges on how to obtain the data.
- warn
Either TRUE to return warning messages or FALSE for no warning messages. The default is TRUE.
Examples
if (FALSE) { # \dontrun{
data("efidata")
data("jdsdata")
datafinal <- match_datasets(datasets = list(jds = jdsdata, efi = efidata),
                            lats = 'lat',
                            lons = 'lon',
                            species = c('speciesname', 'scientificName'),
                            date = c('Date', 'sampling_date'),
                            country = c('JDS4_site_ID'))
gbd <- check_names(data = datafinal, colsp = 'species', pct = 90, merge = TRUE)
danube <- system.file('extdata/danube.shp.zip', package = 'specleanr')
db <- sf::st_read(danube, quiet = TRUE)
wcd <- terra::rast(system.file('extdata/worldclim.tiff', package = 'specleanr'))
refdata <- pred_extract(data = gbd, raster = wcd, lat = 'decimalLatitude',
                        lon = 'decimalLongitude',
                        colsp = 'speciescheck',
                        bbox = db,
                        minpts = 10)
saldata <- refdata[['Salmo trutta']]
#1. Check whether the annual mean temperature (bio1) values are within the ranges in FishBase
salmotherange <- thermal_ranges(x = "Salmo trutta")
sdatatemp <- ecological_ranges(data = saldata, var = 'bio1', species = "Salmo trutta",
                               checkfishbase = TRUE, mode = 'temp', output = 'outlier')
#zero records: no outliers
#====
#2. Geographical ranges: latitude and longitude
#geographical ranges in FishBase
salgeorange <- geo_ranges(data = "Salmo trutta")
sdatageo <- ecological_ranges(data = saldata, lat = 'y', lon = 'x', output = 'outlier',
                              species = "Salmo trutta",
                              checkfishbase = TRUE, mode = 'geo')
#3. General literature ranges (single species)
#======
#3.1 Only the minimum value is known: assuming a minimum temperature of 6, variable: bio1
#direction = 'greater': values less than 6.0 are outliers, greater values are not
sdata <- ecological_ranges(data = saldata, ecoparam = 6.0, var = 'bio1',
                           direction = 'greater')
#3.2 Both the minimum and maximum values are provided
sdata2 <- ecological_ranges(data = saldata, var = 'bio1', minval = 2,
                            maxval = 24, species = "Salmo trutta")
#4. Multiple species
#the optimal parameters should be provided in a dataframe with min and max, or ecoparam
#4.1 optimal dataset
optdata <- data.frame(species = c("Salmo trutta", "Abramis brama"),
                      mintemp = c(6, 1.6), maxtemp = c(20, 21),
                      meantemp = c(8.5, 10.4), #ecoparam
                      direction = c('greater', 'greater'))
#parameter used is annual mean temperature (WorldClim)
#provide the column with species names in the environment dataset
#set the optimal list parameters
#
#optimal parameters: minimum and maximum columns
sdata3 <- ecological_ranges(data = saldata, species = 'Salmo trutta',
                            var = 'bio1', output = "outlier",
                            optimumSettings = list(optdf = optdata, maxcol = "maxtemp",
                                                   mincol = "mintemp", optspcol = "species"))
#
#
#only one ecological parameter (ecoparam) and its direction are provided
sdata4 <- ecological_ranges(data = saldata, species = 'Salmo trutta', var = 'bio1',
                            output = "outlier",
                            optimumSettings = list(optdf = optdata,
                                                   ecoparam = "meantemp",
                                                   optspcol = "species",
                                                   direction = "direction"))
} # }
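The flagging rule described under minval, maxval, ecoparam and direction can be sketched in base R as follows. This is only an illustration and not the package's implementation: flag_range is a hypothetical helper, the bio1 values are made up, and the reading of direction (values satisfying the comparison are kept, all others are flagged) is an assumption taken from the comment in example 3.1 above.

```r
# Hypothetical helper, not part of specleanr: returns TRUE for values that
# would be flagged as outliers under the assumed rule.
flag_range <- function(x, minval = NULL, maxval = NULL,
                       ecoparam = NULL, direction = NULL) {
  # Case 1: both bounds are known, so anything outside [minval, maxval] is flagged.
  if (!is.null(minval) && !is.null(maxval)) {
    return(x < minval | x > maxval)
  }
  # Case 2: a single threshold plus a direction.
  if (is.null(ecoparam) || is.null(direction)) {
    stop("Provide either minval and maxval, or ecoparam and direction.")
  }
  # Assumption: `direction` describes the acceptable side of the threshold.
  ok <- switch(direction,
    greater = x > ecoparam,
    less    = x < ecoparam,
    equal   = x == ecoparam,
    le      = x <= ecoparam,
    ge      = x >= ecoparam,
    stop("Unknown direction: ", direction)
  )
  !ok
}

# Made-up annual mean temperatures, for illustration only.
bio1 <- c(1.5, 6.0, 8.2, 19.9, 25.1)

# Known range 2 to 24: the first and last values fall outside it.
out_range <- flag_range(bio1, minval = 2, maxval = 24)

# Only a minimum of 6 is known: values not greater than 6 are flagged.
out_threshold <- flag_range(bio1, ecoparam = 6, direction = "greater")
```

Under this reading, a value exactly equal to the threshold is flagged when direction is greater and kept when direction is ge, which is the practical difference between the two options.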