-
abdata
- Alburnoides bipunctatus species data from GBIF and iNaturalist
-
adjustboxplots()
- Adjusts the boxplot fences using the medcouple to flag suspicious outliers.
-
bestmethod()
- Identifies the best method for outlier detection for a single species.
-
boots()
- Implements bootstrapping procedures (sampling with replacement).
-
broad_classify()
- Broad classification of outlier detection methods.
-
check.exclude()
- Indicates excluded columns.
-
check_names()
- Check species names for inconsistencies
-
check_packages()
- Check for packages to install and respond to the user
-
classify_data()
- Extract final clean data using either absolute or best method generated outliers.
-
cosine()
- Cosine similarity index (Gautam & Kulkarni 2014; Joy & Renumol 2020)
-
datacleaner-class
- Outlier detection class for multiple methods
-
distboxplot()
- Distribution boxplot
-
ecological_ranges()
- Check for environmental outliers using species optimal ranges.
-
efidata
- EFIPLUS data used to develop ecological sensitivity parameters for riverine species in European streams and rivers.
-
eif()
- Computes the empirical influence function for each value in the dataset
-
extentvalues()
- Checks for a bounding box
-
extractMethods()
- List of outlier detection methods implemented in this package.
-
extract_clean_data()
- Extract final clean data using either absolute or best method generated outliers.
-
extractoutliers()
- Extract outliers for one species
-
geo_ranges()
- Checks for geographic ranges from FishBase
-
getdata()
- Download species records from online database.
-
getdiff()
- Get a dataframe from the large dataframe.
-
ggenvironmentalspace()
- Plots the quality-controlled data in environmental space.
-
ggoutlieraccum()
- Identify if enough methods are selected for the outlier detection.
-
ggoutliers()
- Visualize the outliers identified by each method
-
hamming()
- Identify best outlier detection method using Hamming distance.
-
hampel()
- Flag suspicious outliers based on the Hampel filter method.
-
handle_true_errors()
- Catch errors during methods implementation.
-
interquartile()
- Computes interquartile range to flag environmental outliers
-
isoforest()
- Identify outliers using isolation forest model.
-
jaccard()
- Identifies the best outlier detection method using Jaccard coefficient.
-
jdsdata
- Joint Danube Survey Data
-
jknife()
- Identifies outliers using the reverse jackknifing method based on Chapman et al. (2005).
-
kdat
- Sequential fences constants
-
logboxplot()
- Log boxplot-based outlier detection.
-
mahal()
- Flags outliers based on Mahalanobis distance matrix for all records.
-
match.argc()
- Customized match function
-
match_datasets()
- Data harmonizing for offline data based on Darwin Core terms.
-
medianrule()
- Median rule method
-
mixediqr()
- Mixed interquartile range and semi-interquartile range (Walker et al., 2018)
-
mth
- mth dataset with constants at each confidence interval level.
-
multiabsolute()
- Identifies absolute outliers for multiple species.
-
multibestmethod()
- Identify best method for outlier removal for multiple species using majority votes.
-
multidetect()
- Ensemble multiple outlier detection methods.
-
ocindex()
- Identifies absolute outliers and their proportions for a single species.
-
onesvm()
- Identify outliers using One Class Support Vector Machines
-
optimal_threshold()
- Optimize threshold for clean data extraction.
-
overlap()
- Identifies best outlier detection method using Overlap coefficient.
-
pca()
- Implement principal component analysis for dimension reduction
-
pcboot()
- Combines principal component analysis and bootstrapping.
-
pred_extract()
- Preliminary data cleaning including removing duplicates, records outside a particular basin, and NAs.
-
search_threshold()
- Determine the threshold using locally estimated or weighted scatterplot smoothing.
-
semiIQR()
- Computes semi-interquartile range to flag suspicious outliers
-
seqfences()
- Sequential fences method
-
show(<datacleaner>)
- Set method for displaying output details after outlier detection.
-
smc()
- Identify best outlier detection method using simple matching coefficient.
-
sorensen()
- Identifies best outlier detection method using Sorensen Similarity Index.
-
thermal_ranges()
- Collates minimum, maximum, and preferable temperatures from FishBase.
-
xglosh()
- Global-Local Outlier Score from Hierarchies
-
xkmeans()
- Flags outliers using the k-means clustering method
-
xknn()
- k-nearest neighbors for outlier detection
-
xlof()
- Flags suspicious outliers using the local outlier factor or Density-Based Spatial Clustering of Applications with Noise.
-
zscore()
- Computes z-scores to flag environmental outliers.
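
Several entries above describe univariate rules (z-scores, interquartile range, Hampel filter) and similarity indices used to compare what different methods flag (for example the Jaccard coefficient). The base R sketch below illustrates those general ideas on a made-up temperature vector. It is not the package's implementation: the helper names `flag_zscore`, `flag_iqr`, `flag_hampel`, and `jaccard_index`, their default cutoffs, and the `temps` data are all assumptions for illustration, and the package's own functions may take different arguments.

```r
# Illustrative base R helpers; hypothetical names, not the package's API.

# Flag values whose absolute z-score exceeds a cutoff (assumed default: 3).
flag_zscore <- function(x, cutoff = 3) {
  abs((x - mean(x)) / sd(x)) > cutoff
}

# Flag values outside the quartiles by more than k times the IQR (assumed k: 1.5).
flag_iqr <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- k * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Hampel-style rule: distance from the median measured in scaled MADs.
# stats::mad() applies the 1.4826 consistency constant by default.
flag_hampel <- function(x, k = 3) {
  abs(x - median(x)) > k * mad(x)
}

# Jaccard coefficient between two logical flag vectors.
jaccard_index <- function(a, b) {
  union_n <- sum(a | b)
  if (union_n == 0) return(1)  # neither method flagged anything
  sum(a & b) / union_n
}

# Made-up water temperatures with one implausible record at position 8.
temps <- c(10.1, 10.4, 9.8, 10.0, 10.3, 9.9, 10.2, 25.0)

flags <- list(
  zscore = flag_zscore(temps),
  iqr    = flag_iqr(temps),
  hampel = flag_hampel(temps)
)

# Positions flagged by each rule, and agreement between two of them.
lapply(flags, which)
jaccard_index(flags$iqr, flags$hampel)
jaccard_index(flags$zscore, flags$iqr)
```

On this small sample the median-based rules flag the last record while the z-score rule with a cutoff of 3 flags nothing, because the extreme value inflates the mean and standard deviation it is measured against. Disagreements of this kind are one reason to compare or combine several methods rather than rely on a single rule.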