All functions

abdata
Alburnoides bipunctatus species data from GBIF and iNaturalist
adjustboxplots()
Adjusts the boxplot bounding fences using the medcouple to flag suspicious outliers.
bestmethod()
Identifies the best method for outlier detection for a single species.
boots()
Implements bootstrapping procedures (sampling with replacement).
broad_classify()
Outlier detection method broad classification.
check.exclude()
Indicates excluded columns.
check_names()
Check species names for inconsistencies
check_packages()
Check for packages to install and respond to the user
classify_data()
Extract final clean data using either absolute or best method generated outliers.
cosine()
Cosine similarity index based on Gautam & Kulkarni (2014) and Joy & Renumol (2020)
datacleaner-class
Outlier detection class for multiple methods
distboxplot()
Distribution boxplot
ecological_ranges()
Check for environmental outliers using species optimal ranges.
efidata
EFIPLUS data used to develop ecological sensitivity parameters for riverine species in European streams and rivers.
eif()
Computes the empirical influence function for each value in the dataset
extentvalues()
Checks for a bounding box
extractMethods()
List of outlier detection methods implemented in this package.
extract_clean_data()
Extract final clean data using either absolute or best method generated outliers.
extractoutliers()
Extract outliers for one species
geo_ranges()
Checks for geographic ranges from FishBase
getdata()
Download species records from online database.
getdiff()
Get a data frame from the large data frame.
ggenvironmentalspace()
Plot the quality-controlled data in environmental space.
ggoutlieraccum()
Identify if enough methods are selected for the outlier detection.
ggoutliers()
Visualize the outliers identified by each method
hamming()
Identify best outlier detection method using Hamming distance.
hampel()
Flag suspicious outliers based on the Hampel filter method.
handle_true_errors()
Catch errors during methods implementation.
interquartile()
Computes interquartile range to flag environmental outliers
isoforest()
Identify outliers using isolation forest model.
jaccard()
Identifies the best outlier detection method using Jaccard coefficient.
jdsdata
Joint Danube Survey Data
jknife()
Identifies outliers using the reverse jackknifing method based on Chapman et al. (2005).
kdat
Sequential fences constants
logboxplot()
Log boxplot method for outlier detection.
mahal()
Flags outliers based on Mahalanobis distance matrix for all records.
match.argc()
Customized match function
match_datasets()
Harmonizes offline data based on Darwin Core terms.
medianrule()
Median rule method
mixediqr()
Mixed interquartile range and semi-interquartile range (Walker et al., 2018)
mth
mth dataset with constants at each confidence interval level.
multiabsolute()
Identifies absolute outliers for multiple species.
multibestmethod()
Identify best method for outlier removal for multiple species using majority votes.
multidetect()
Ensemble multiple outlier detection methods.
ocindex()
Identifies absolute outliers and their proportions for a single species.
onesvm()
Identify outliers using One Class Support Vector Machines
optimal_threshold()
Optimize threshold for clean data extraction.
overlap()
Identifies best outlier detection method using Overlap coefficient.
pca()
Implement principal component analysis for dimension reduction
pcboot()
Packages both principal component analysis and bootstrapping.
pred_extract()
Preliminary data cleaning including removing duplicates, records outside a particular basin, and NAs.
search_threshold()
Determine the threshold using Locally estimated or weighted Scatterplot Smoothing.
semiIQR()
Computes the semi-interquartile range to flag suspicious outliers
seqfences()
Sequential fences method
show(<datacleaner>)
Set method for displaying output details after outlier detection.
smc()
Identify best outlier detection method using simple matching coefficient.
sorensen()
Identifies best outlier detection method using Sorensen Similarity Index.
thermal_ranges()
Collates minimum, maximum, and preferable temperatures from FishBase.
xglosh()
Global-Local Outlier Score from Hierarchies
xkmeans()
Flags outliers using kmeans clustering method
xknn()
k-nearest neighbors for outlier detection
xlof()
Flags suspicious records using the local outlier factor or Density-Based Spatial Clustering of Applications with Noise.
zscore()
Computes z-scores to flag environmental outliers.
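
Several entries above describe simple univariate rules (zscore(), interquartile(), hampel()) and ways of comparing or combining methods (jaccard(), multidetect()). The following is a minimal, standalone Python sketch of those general ideas, not the package's R API. Every function name, default threshold, the voting rule, and the sample values are assumptions made for illustration; the package's own implementations and defaults may differ.

```python
import statistics
from collections import Counter


def zscore_flags(values, threshold=3.0):
    """Indices whose absolute z-score exceeds `threshold` (hypothetical default)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if sd == 0:
        return set()
    return {i for i, v in enumerate(values) if abs(v - mean) / sd > threshold}


def iqr_flags(values, k=1.5):
    """Indices outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return {i for i, v in enumerate(values) if v < lower or v > upper}


def hampel_flags(values, k=3.0):
    """Indices further than k scaled MADs from the median (Hampel-style rule)."""
    med = statistics.median(values)
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    mad = 1.4826 * statistics.median([abs(v - med) for v in values])
    return {i for i, v in enumerate(values) if abs(v - med) > k * mad}


def jaccard(a, b):
    """Jaccard coefficient of two index sets; two empty sets count as identical."""
    union = a | b
    return 1.0 if not union else len(a & b) / len(union)


def consensus(flag_sets, min_share=0.5):
    """Indices flagged by at least `min_share` of the methods (assumed voting rule)."""
    votes = Counter(i for flags in flag_sets for i in flags)
    return {i for i, n in votes.items() if n / len(flag_sets) >= min_share}


# Illustrative values only: seven similar records and one suspicious one.
values = [10.1, 10.4, 9.8, 10.0, 10.3, 9.9, 10.2, 25.0]

flags = {
    # With only 8 records the sample z-score cannot reach 3, so a lower
    # cut-off is used here purely for the demonstration.
    "zscore": zscore_flags(values, threshold=2.0),
    "iqr": iqr_flags(values),
    "hampel": hampel_flags(values),
}
agreed = consensus(list(flags.values()))

print(flags)
print("flagged by at least half of the methods:", agreed)
print("Jaccard(iqr, hampel):", jaccard(flags["iqr"], flags["hampel"]))
```

The sketch also shows why median-based rules are often preferred on small samples: a single extreme value inflates the mean and standard deviation enough to mask itself under the usual z-score cut-off of 3, while the median and MAD are barely affected.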