Tools for Spell Checking in R
Spell checking common document formats including latex, markdown, manual pages, and description files. Includes utilities to automate checking of documentation and vignettes as a unit test during R CMD check. Both British and American English are supported out of the box and other languages can be added. In addition, packages may define a wordlist to allow custom terminology without having to abuse punctuation.
Full Text of Scholarly Articles Across Many Data Sources
Provides a single interface to many sources of full text scholarly data, including BioMed Central, Public Library of Science, PubMed Central, eLife, F1000Research, PeerJ, Pensoft, Hindawi, arXiv preprints, and more. Includes functionality for searching for articles, downloading full or partial text, downloading supplementary materials, and converting to various data formats.
Create Lightweight Schema.org Descriptions of Data
The goal of dataspice is to make it easier for researchers to create basic, lightweight, and concise metadata files for their datasets. These basic files can then be used to make useful information available during analysis, create a helpful dataset “README” webpage, and produce more complex metadata formats to aid dataset discovery. Metadata fields are based on the Schema.org and Ecological Metadata Language standards.
Handling Taxonomic Lists
Handling taxonomic lists through objects of class taxlist. This package provides functions to import species lists from Turboveg (https://www.synbiosys.alterra.nl/turboveg/) and the possibility to create backups from resulting R-objects. Also quick displays are implemented as summary-methods.
Advanced Graphics and Image-Processing in R
Bindings to ImageMagick: the most comprehensive open-source image processing library available. Supports many common formats (png, jpeg, tiff, pdf, etc) and manipulations (rotate, scale, crop, trim, flip, blur, etc). All operations are vectorized via the Magick++ STL meaning they operate either on a single frame or a series of frames for working with layers, collages, or animation. In RStudio images are automatically previewed when printed to the console, resulting in an interactive editing environment. The latest version of the package includes a native graphics device for creating in-memory graphics or drawing onto images using pixel coordinates.
Comprehensive TIFF I/O with Full Support for ImageJ TIFF Files
General purpose TIFF file I/O for R users. Currently the only such package with read and write support for TIFF files with floating point (real-numbered) pixels, and the only package that can correctly import TIFF files that were saved from ImageJ and write TIFF files that can be correctly read by ImageJ (https://imagej.nih.gov/ij/). Also supports text image I/O.
Download Weather Data from Environment and Climate Change Canada
Provides means for downloading historical weather data from the Environment and Climate Change Canada website (https://climate.weather.gc.ca/historical_data/search_historic_data_e.html). Data can be downloaded from multiple stations and over large date ranges and automatically processed into a single dataset. Tools are also provided to identify stations either by name or proximity to a location.
Casts (R)Markdown files to XML and back
Casts (R)Markdown files to XML and back to allow their editing via XPath.
Download Time Series Data from Waterinfo.be
wateRinfo facilitates access to waterinfo.be (https://www.waterinfo.be), a website managed by the Flanders Environment Agency (VMM) and Flanders Hydraulics Research. The website provides access to real-time water and weather related environmental variables for Flanders (Belgium), such as rainfall, air pressure, discharge, and water level. The package provides functions to search for stations and variables, and download time series.
Call BEAST2
BEAST2 (https://www.beast2.org) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. BEAST2 is a command-line tool. This package provides a way to call BEAST2 from an R function call.
Automatic Package Testing
Automatic testing of R packages via a simple YAML schema.
Model Comparison Using babette
BEAST2 (https://www.beast2.org) is a widely used Bayesian phylogenetic tool that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. mcbette performs a Bayesian model comparison over some site and clock models, using babette (https://www.github.com/ropensci/babette/).
Install BEAST2 Packages
BEAST2 (https://www.beast2.org) is a widely used Bayesian phylogenetic tool that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. BEAST2 is commonly accompanied by BEAUti 2 (https://www.beast2.org), which, among other things, allows one to install BEAST2 packages. This package allows one to install BEAST2 packages from R.
BEAUti from R
BEAST2 (https://www.beast2.org) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. BEAUti 2 (which is part of BEAST2) is a GUI tool that allows users to specify the many possible setups and generates the XML file BEAST2 needs to run. This package provides a way to create BEAST2 input files without active user input, but using R function calls instead.
An R Client for the ODK Central API
Utilities to access and tidy up data from ODK Central’s API. ODK Central is OpenDataKit’s clearinghouse for digitally captured data (https://docs.opendatakit.org/central-intro/). ODK Central’s API is documented at https://odkcentral.docs.apiary.io/.
Catalogue of Life Plus Client
Client for the Catalogue of Life Plus (CoL+) webservice (https://github.com/CatalogueOfLife/general). The CoL+ webservice is a new interface to the Catalogue of Life. Includes functions for each of the API methods, including searching for names, and more.
Archive and Unarchive Databases Using Flat Files
Flat text files provide a robust, compressible, and portable way to store tables from databases. This package provides convenient functions for exporting tables from relational database connections into compressed text files and streaming those text files back into a database without requiring the whole table to fit in working memory.
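The chunked streaming idea described above can be illustrated with a minimal base-R sketch. This is not the package's API: the table, the helper `read_chunks()`, and the chunk size are all hypothetical, and a real workflow would append each chunk to a database table rather than just count rows.

```r
# Base-R sketch (hypothetical names, not the package's functions):
# export a table to a compressed text file, then stream it back in chunks.
tbl <- data.frame(id = 1:10, value = letters[1:10], stringsAsFactors = FALSE)

# Export: write a tab-separated file through a gzip connection.
path <- tempfile(fileext = ".tsv.gz")
con <- gzfile(path, open = "wt")
write.table(tbl, con, sep = "\t", row.names = FALSE, quote = FALSE)
close(con)

# Import: read the header once, then parse a fixed number of lines at a time,
# so only one chunk is held in memory at any moment.
read_chunks <- function(path, chunk_size, on_chunk) {
  con <- gzfile(path, open = "rt")
  on.exit(close(con))
  header <- strsplit(readLines(con, n = 1), "\t", fixed = TRUE)[[1]]
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break
    chunk <- read.table(text = lines, sep = "\t", col.names = header,
                        stringsAsFactors = FALSE)
    on_chunk(chunk)  # e.g. append the chunk to a database table
  }
  invisible(NULL)
}

# Here the callback only records how many rows each chunk contained.
rows_seen <- 0L
chunk_sizes <- integer(0)
read_chunks(path, chunk_size = 4, on_chunk = function(chunk) {
  rows_seen <<- rows_seen + nrow(chunk)
  chunk_sizes <<- c(chunk_sizes, nrow(chunk))
})
```

With ten rows and a chunk size of four, the callback is invoked three times; memory use is bounded by the chunk size rather than by the table size.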
Easily Download and Visualise Climate Data from CliFlo
CliFlo is a web portal to the New Zealand National Climate Database and provides public access (via subscription) to around 6,500 various climate stations (see https://cliflo.niwa.co.nz/ for more information). Collating and manipulating data from CliFlo (hence clifro) and importing into R for further analysis, exploration and visualisation is now straightforward and coherent. The user is required to have an internet connection, and a current CliFlo subscription (free) if data from stations, other than the public Reefton electronic weather station, is sought.
CI-Agnostic Workflow Definitions
Provides a way to describe common build and deployment workflows for R-based projects: packages, websites (e.g. blogdown, pkgdown), or data processing (e.g. research compendia). The recipe is described independent of the continuous integration tool used for processing the workflow (e.g. Travis CI or AppVeyor). This package has been peer-reviewed by rOpenSci (v. 0.3.0.9004).
A Pipeline Toolkit for Reproducible Computation at Scale
A general-purpose computational engine for data analysis, drake rebuilds intermediate data objects when their dependencies change, and it skips work when the results are already up to date. Not every execution starts from scratch. There is native support for parallel and distributed computing, and completed projects have tangible evidence that they are reproducible. Extensive documentation, from beginner-friendly tutorials to practical examples and more, is available at the reference website https://docs.ropensci.org/drake/ and the online manual https://books.ropensci.org/drake/.
IUCN Red List Client
IUCN Red List (http://apiv3.iucnredlist.org/api/v3/docs) client. The IUCN Red List is a global list of threatened and endangered species. Functions cover all of the Red List API routes. An API key is required.
A Flexible Container to Transport and Manipulate Data and Associated Resources
Provides a flexible container to transport and manipulate complex sets of data. These data may consist of multiple data files and associated metadata and ancillary files. Individual data objects have associated system-level metadata, and data files are linked together using the OAI-ORE standard resource map, which describes the relationships between the files. The OAI-ORE standard is described at https://www.openarchives.org/ore. Data packages can be serialized and transported as structured files that have been created following the BagIt specification. The BagIt specification is described at https://tools.ietf.org/html/draft-kunze-bagit-08.
Convert Among Citation Formats
Converts among many citation formats, including BibTeX, Citeproc, Codemeta, RDF XML, RIS, Schema.org, and Citation File Format. A low level R6 class is provided, as well as stand-alone functions for each citation format for both read and write.
Simple Git Client for R
Simple git client for R based on libgit2 with support for SSH and HTTPS remotes. All functions in gert use basic R data types (such as vectors and data-frames) for their arguments and return values. User credentials are shared with command line git through the git-credential store and ssh keys stored on disk or ssh-agent. On Linux, a somewhat recent version of libgit2 is required; we provide a PPA for older Ubuntu LTS versions.
Automated Cleaning of Occurrence Records from Biological Collections
Automated flagging of common spatial and temporal errors in biological and paleontological collection data, for use in conservation, ecology and paleontology. Includes automated tests to easily flag (and exclude) records assigned to country or province centroids, the open ocean, the headquarters of the Global Biodiversity Information Facility, urban areas or the location of biodiversity institutions (museums, zoos, botanical gardens, universities). It also identifies per-species outlier coordinates, zero coordinates, identical latitude/longitude and invalid coordinates, and implements an algorithm to identify data sets with a significant proportion of rounded coordinates. Especially suited for large data sets. The reference for the methodology is: Zizka et al. (2019) doi:10.1111/2041-210X.13152.
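Three of the simpler checks named above (zero coordinates, identical latitude/longitude, and invalid coordinates) can be sketched in base R. The data frame and its column names are hypothetical, and this is only an illustration of the flagging idea, not the package's implementation or the published methodology.

```r
# Hypothetical occurrence records; column names are assumptions.
occ <- data.frame(
  species = c("a", "b", "c", "d", "e"),
  lon = c(12.5, 0, 45.2, 200, 33.3),
  lat = c(41.9, 0, 45.2, 10, -95),
  stringsAsFactors = FALSE
)

# Zero coordinates: both values exactly 0, often a placeholder for "missing".
flag_zero <- occ$lon == 0 & occ$lat == 0

# Identical latitude and longitude: usually a data-entry error.
flag_equal <- occ$lon == occ$lat

# Invalid coordinates: outside the valid longitude/latitude ranges.
flag_invalid <- abs(occ$lon) > 180 | abs(occ$lat) > 90

# A record is flagged if any test fires; flagged records can then be
# inspected or excluded.
occ$flagged <- flag_zero | flag_equal | flag_invalid
clean <- occ[!occ$flagged, ]
```

Keeping the flags as separate logical vectors makes it possible to report which test caught each record before deciding whether to drop it.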
Parse Scientific Names
Parse scientific names using gnparser (https://gitlab.com/gogna/gnparser), written in Go. gnparser parses scientific names into their component parts; it utilizes a Parsing Expression Grammar specifically for scientific names.
NOAA Weather Data from R
Client for many NOAA data sources including the NCDC climate API at https://www.ncdc.noaa.gov/cdo-web/webservices/v2, with functions for each of the API endpoints: data, data categories, data sets, data types, locations, location categories, and stations. In addition, we have an interface for NOAA sea ice data, the NOAA severe weather inventory, NOAA Historical Observing Metadata Repository (HOMR) data, NOAA storm data via IBTrACS, tornado data via the NOAA storm prediction center, and more.
eBird Data Extraction and Processing in R
Extract and process bird sightings records from eBird (http://ebird.org), an online tool for recording bird observations. Public access to the full eBird database is via the eBird Basic Dataset (EBD; see http://ebird.org/ebird/data/download for access), a downloadable text file. This package is an interface to AWK for extracting data from the EBD based on taxonomic, spatial, or temporal filters, to produce a manageable file size that can be imported into R.
A Tool for Automating Download and Preprocessing of MODIS Land Products Data
Automates the creation of time series of rasters derived from MODIS Satellite Land Products data. It performs several typical preprocessing steps such as download, mosaicking, reprojection and resizing of data acquired over a specified time period. All processing parameters can be set using a user-friendly GUI. Users can select which layers of the original MODIS HDF files they want to process, which additional Quality Indicators should be extracted from aggregated MODIS Quality Assurance layers and, in the case of Surface Reflectance products, which Spectral Indexes should be computed from the original reflectance bands. For each output layer, outputs are saved as single-band raster files corresponding to each available acquisition date. Virtual files allowing access to the entire time series as a single file are also created. Command-line execution using a previously saved processing options file is also possible, so that the time series for a MODIS product can be updated automatically whenever a new image is available.
Interactive, Complex Heatmaps
Make complex, interactive heatmaps. iheatmapr includes a modular system for iteratively building up complex heatmaps, as well as the iheatmap() function for making relatively standard heatmaps.
Interface to the Global Biodiversity Information Facility API
A programmatic interface to the Web Service methods provided by the Global Biodiversity Information Facility (GBIF; https://www.gbif.org/developer/summary). GBIF is a database of species occurrence records from sources all over the globe. rgbif includes functions for searching for taxonomic names, retrieving information on data providers, getting species occurrence records, getting counts of occurrence records, and using the GBIF tile map service to make rasters summarizing huge amounts of data.
Access iNaturalist Data Through APIs
A programmatic interface to the API provided by the iNaturalist website https://www.inaturalist.org/ to download species occurrence data submitted by citizen scientists.
Bielefeld Academic Search Engine (BASE) Client
Interface to the API for the Bielefeld Academic Search Engine (BASE) (https://www.base-search.net/). BASE is a search engine for more than 150 million scholarly documents from more than 7000 sources. Methods are provided for searching for documents, as well as getting information on higher level groupings of documents: collections and repositories within collections. Search includes faceting, so you can get a high level overview of number of documents across a given variable (e.g., year). BASE asks users to respect a rate limit, but does not enforce it themselves; we enforce that rate limit.
Parse Messy Geographic Coordinates
Parse geographic coordinates from various formats to decimal degree numeric values. Parse coordinates into their parts (degrees, minutes, seconds); calculate hemisphere from coordinates; pull out degrees, minutes, or seconds individually; add and subtract degrees, minutes, and seconds. The C++ code was originally inspired by code written by Jeffrey D. Bogan, but has since been completely rewritten.
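The arithmetic behind converting between degrees/minutes/seconds and decimal degrees can be sketched in base R. The helper names `dms_to_decimal()` and `decimal_to_dms()` are hypothetical and are not functions from the package, which additionally handles parsing of messy text input.

```r
# Hypothetical helpers illustrating the conversion arithmetic only.

# Degrees, minutes, seconds plus a hemisphere letter -> signed decimal degrees.
dms_to_decimal <- function(degrees, minutes = 0, seconds = 0, hemisphere = "N") {
  # Southern and western hemispheres are negative by convention.
  sgn <- ifelse(toupper(hemisphere) %in% c("S", "W"), -1, 1)
  sgn * (abs(degrees) + minutes / 60 + seconds / 3600)
}

# Decimal degrees -> unsigned degrees, minutes, seconds parts.
decimal_to_dms <- function(x) {
  a <- abs(x)
  deg <- floor(a)
  mins <- floor((a - deg) * 60)
  secs <- (a - deg - mins / 60) * 3600
  list(degrees = deg, minutes = mins, seconds = secs)
}

north <- dms_to_decimal(45, 30, 0, "N")   # 45 degrees 30 minutes north
west <- dms_to_decimal(120, 15, 36, "W")  # a western longitude, so negative
parts <- decimal_to_dms(north)
```

One degree is 60 minutes and one minute is 60 seconds, so minutes are divided by 60 and seconds by 3600 before summing.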
Web scraper for Atlantic and east Pacific hurricanes and tropical storms
Get archived data of past and current hurricanes and tropical storms for the Atlantic and eastern Pacific oceans. Data is available for storms since 1998. Datasets are updated via the rrricanesdata package. Currently, this package is about 6MB of datasets. See the README or view vignette("drat") for more information.
Visualize Species Occurrence Data
Utilities for visualizing species occurrence data. Includes functions to visualize occurrence data from spocc, rgbif, and other packages. Mapping options included for base R plots, ggplot2, leaflet and GitHub gists.
Clean Biological Occurrence Records
Clean biological occurrence records. Includes functionality for cleaning based on various aspects of spatial coordinates, unlikely values due to political centroids, coordinates based on where collections of specimens are held, and more.
Mangal Client
An interface to the Mangal database - a collection of ecological networks. This package includes functions to work with the Mangal RESTful API methods (https://mangal.io/doc/api/).
Taxonomic Information from Around the Web
Interacts with a suite of web APIs for taxonomic tasks, such as getting database specific taxonomic identifiers, verifying species names, getting taxonomic hierarchies, fetching downstream and upstream taxonomic names, getting taxonomic synonyms, converting scientific to common names and vice versa, and more.
Accesses Weather Data from the Iowa Environment Mesonet
Retrieves weather data from Automated Surface Observing System (ASOS) stations (airports) around the world via the Iowa Environment Mesonet website.
Australian Government Bureau of Meteorology (BOM) Data Client
Provides functions to interface with Australian Government Bureau of Meteorology (BOM) data, fetching data and returning a data frame of precis forecasts, historical and current weather data from stations, agriculture bulletin data, BOM 0900 or 1500 weather bulletins and downloading and importing radar and satellite imagery files. Data (c) Australian Government Bureau of Meteorology Creative Commons (CC) Attribution 3.0 licence or Public Access Licence (PAL) as appropriate. See http://www.bom.gov.au/other/copyright.shtml for further details.
NASA POWER API Client
Client for NASA POWER global meteorology, surface solar energy and climatology data API. POWER (Prediction Of Worldwide Energy Resource) data are freely available global meteorology and surface solar energy climatology data for download with a resolution of 1/2 by 1/2 arc degree longitude and latitude and are funded through the NASA Earth Science Directorate Applied Science Program. For more on the data themselves, a web-based data viewer and web access, please see https://power.larc.nasa.gov/.
Global Surface Summary of the Day (GSOD) Weather Data Client
Provides automated downloading, parsing, cleaning, unit conversion and formatting of Global Surface Summary of the Day (GSOD) weather data from the USA National Centers for Environmental Information (NCEI). Units are converted from United States Customary System (USCS) units to International System of Units (SI). Stations may be individually checked for a user-defined number of missing days, and stations with too many missing observations are omitted. Only stations with valid reported latitude and longitude values are permitted in the final data. Additional useful elements, saturation vapour pressure (es), actual vapour pressure (ea) and relative humidity (RH), are calculated from the original data using the improved August-Roche-Magnus approximation (Alduchov & Eskridge 1996) and included in the final data set. The resulting metadata include station identification information, country, state, latitude, longitude, elevation, weather observations and associated flags. For information on the GSOD data from NCEI, please see the GSOD readme.txt file available from https://www1.ncdc.noaa.gov/pub/data/gsod/readme.txt.
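The derived humidity elements mentioned above can be illustrated with a short base-R sketch of the August-Roche-Magnus form. The coefficients (6.1094, 17.625, 243.04) are the ones commonly attributed to Alduchov & Eskridge (1996) and give pressure in hPa for temperatures in degrees Celsius; the function names are hypothetical, and the package's exact implementation and output units are assumptions not confirmed here.

```r
# Sketch of the August-Roche-Magnus approximation (assumed coefficients;
# hypothetical function names, not the package's code).

# Vapour pressure in hPa at a temperature in degrees Celsius.
magnus_vp <- function(temp_c) {
  6.1094 * exp(17.625 * temp_c / (temp_c + 243.04))
}

# Saturation vapour pressure (es) uses the air temperature;
# actual vapour pressure (ea) uses the dew point temperature;
# relative humidity (RH, percent) is their ratio.
relative_humidity <- function(temp_c, dewpoint_c) {
  es <- magnus_vp(temp_c)
  ea <- magnus_vp(dewpoint_c)
  100 * ea / es
}

es_0 <- magnus_vp(0)                  # exp(0) = 1, so this equals 6.1094
rh_sat <- relative_humidity(20, 20)   # dew point equals temperature
rh_dry <- relative_humidity(20, 10)   # dew point below temperature
```

When the dew point equals the air temperature the air is saturated and RH is 100%; a lower dew point gives a lower RH.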
API Client for CHIRPS
API Client for the Climate Hazards Group InfraRed Precipitation with Station Data (CHIRPS). The CHIRPS data is a 35+ year quasi-global rainfall data set, which incorporates 0.05 arc-degree resolution satellite imagery and in-situ station data to create gridded rainfall time series for trend analysis and seasonal drought monitoring. For more details on CHIRPS data please visit its official home page https://www.chc.ucsb.edu/data/chirps. Requests for long time series (> 10 years) and large geographic coverage (global scale) may take several minutes.
CRU CL v. 2.0 Climatology Client
Provides functions that automate downloading and importing University of East Anglia Climate Research Unit (CRU) CL v. 2.0 climatology data, facilitates the calculation of minimum temperature and maximum temperature and formats the data into a tidy data frame as a tibble or a list of raster stack objects for use. CRU CL v. 2.0 data are a gridded climatology of 1961-1990 monthly means released in 2002 and cover all land areas (excluding Antarctica) at 10 arcminutes (0.1666667 degree) resolution. For more information see the description of the data provided by the University of East Anglia Climate Research Unit, https://crudata.uea.ac.uk/cru/data/hrg/tmc/readme.txt.
A Test Environment for Database Requests
Testing and documenting code that communicates with remote databases can be painful. Although the interaction with R is usually relatively simple (e.g. data(frames) passed to and from a database), because they rely on a separate service and the data there, testing them can be difficult to set up, unsustainable in a continuous integration environment, or impossible without replicating an entire production cluster. This package addresses that by allowing you to make recordings from your database interactions and then play them back while testing (or in other contexts) all without needing to spin up or have access to the database your code would typically connect to.
An R Client to the PatentsView API
Provides functions to simplify the PatentsView API (http://www.patentsview.org/api/doc.html) query language, send GET and POST requests to the API’s seven endpoints, and parse the data that comes back.
Access and Search MedRxiv and BioRxiv Preprint Data
An increasingly important source of health-related bibliographic content is preprints - preliminary versions of research articles that have yet to undergo peer review. The two preprint repositories most relevant to health-related sciences are medRxiv https://www.medrxiv.org/ and bioRxiv https://www.biorxiv.org/, both of which are operated by the Cold Spring Harbor Laboratory. medrxivr provides programmatic access to the Cold Spring Harbor Laboratory (CSHL) API https://api.biorxiv.org/, allowing users to easily download medRxiv and bioRxiv preprint metadata (e.g. title, abstract, publication date, author list, etc) into R. medrxivr also provides functions to search the downloaded preprint records using regular expressions and Boolean logic, as well as helper functions that allow users to export their search results to a .BIB file for easy import to a reference manager and to download the full-text PDFs of preprints matching their search criteria.
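The combination of regular expressions and Boolean logic described above can be sketched in base R on a small, made-up metadata table. The records and search terms are hypothetical, and this illustrates the search idea only; it does not use or reproduce the package's own search function.

```r
# Hypothetical preprint metadata (made-up titles for illustration).
preprints <- data.frame(
  title = c("Dementia risk and statin use",
            "Vaccine uptake in older adults",
            "Lipid levels and Alzheimer's disease"),
  stringsAsFactors = FALSE
)

# Within a topic, alternatives are combined with OR (regex alternation).
topic1 <- grepl("dementia|alzheimer", preprints$title, ignore.case = TRUE)
topic2 <- grepl("statins?|lipids?", preprints$title, ignore.case = TRUE)

# Across topics, matches are combined with AND: a record must hit both.
hits <- preprints[topic1 & topic2, , drop = FALSE]
```

Here the first and third records match both topics, while the second matches neither.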
Extract and Tidy Canadian Hydrometric Data
Provides functions to access historical and real-time national hydrometric data from Water Survey of Canada data sources (https://dd.weather.gc.ca/hydrometric/csv/ and https://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/) and then applies tidy data principles.
Control BEAST2
BEAST2 (https://www.beast2.org) is a widely used Bayesian phylogenetic tool that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. BEAST2 is commonly accompanied by BEAUti 2, Tracer and DensiTree. babette provides an alternative to the workflow of using all these tools separately. This allows doing complex Bayesian phylogenetics easily and reproducibly from R.
R client package for the Circle CI API
Tools for interacting with the Circle CI API. Besides executing common tasks such as querying build logs and restarting builds, this package also helps set up permissions to deploy from builds.
Interface to the OpenCage API
Tool for accessing the OpenCage API, which provides forward geocoding (from placename to longitude and latitude) and reverse geocoding (from longitude and latitude to placename).
Interface to the Search API for PLoS Journals
A programmatic interface to the SOLR based search API (http://api.plos.org/) provided by the Public Library of Science journals to search their articles. Functions are included for searching for articles, retrieving articles, making plots, doing faceted searches, highlight searches, and viewing results of highlighted searches in a browser.
Interface to the Pleiades Archeological Database
Provides a set of functions for interacting with the Pleiades (https://pleiades.stoa.org/) API, including getting status data, places data, and creating a GeoJSON based map on GitHub gists.
Import OpenStreetMap Data as Simple Features or Spatial Objects
Download and import of OpenStreetMap (OSM) data as sf or sp objects. OSM data are extracted from the Overpass web server (http://overpass-api.de/) and processed with very fast C++ routines for return to R.
Stubbing and Setting Expectations on HTTP Requests
Stubbing and setting expectations on HTTP requests. Includes tools for stubbing HTTP requests, including expected request conditions and response conditions. Match on HTTP method, query parameters, request body, headers and more. Can be used for unit tests or outside of a testing context.
Client for the cranchecks.info API
Client for the cranchecks.info API.
Fingertips Data for Public Health
Fingertips (http://fingertips.phe.org.uk/) contains data for many indicators of public health in England. The underlying data is now more easily accessible by making use of the API.
Make Fake Data
Make fake data, supporting addresses, person names, dates, times, colors, coordinates, currencies, digital object identifiers (DOIs), jobs, phone numbers, DNA sequences, doubles and integers from distributions and within a range.
Downloading Supplementary Data from Published Manuscripts
Downloads supplementary data from manuscripts, using papers’ DOIs as references. Facilitates open, reproducible research workflows: scientists re-analyzing published datasets can work with them as easily as if they were stored on their own computer, and others can track their analysis workflow painlessly. The main function suppdata() returns a (temporary) location on the user’s computer where the file is stored, making it simple to use suppdata() with standard functions like read.csv().
Sustainable Transport Planning
Tools for transport planning with an emphasis on spatial transport data and non-motorized modes. Enables common transport planning tasks including: downloading and cleaning transport datasets; creating geographic “desire lines” from origin-destination (OD) data; route assignment, locally and via interfaces to routing services such as https://cyclestreets.net/; calculation of route segment attributes such as bearing and aggregate flow; and travel watershed analysis. See Lovelace and Ellison (2018) doi:10.32614/RJ-2018-053 and vignettes for details.
Generates Networks from BTS Data
A flexible tool for generating bespoke air transport statistics for urban studies based on publicly available data from the Bureau of Transportation Statistics (BTS) in the United States https://www.transtats.bts.gov/databases.asp?Mode_ID=1&Mode_Desc=Aviation&Subject_ID2=0.
Manipulation of Matched Phylogenies and Data using data.table
An implementation that combines trait data and a phylogenetic tree (or trees) into a single object of class treedata.table. The resulting object can be easily manipulated to simultaneously change the trait- and tree-level sampling. Currently implemented functions allow users to use a data.table syntax when performing operations on the trait dataset within the treedata.table object.
Linguistic Typology and Mapping
Provides R with the Glottolog database (https://glottolog.org/) and some additional capabilities for linguistic mapping. The Glottolog database contains a catalogue of the world's languages. This package helps researchers make linguistic maps, following the philosophy of the Cross-Linguistic Linked Data project (https://clld.org/), which facilitates uniform access to the data across publications. A tutorial for this package is available on GitHub pages (https://docs.ropensci.org/lingtypology/) and in the package vignette. Maps created by this package can be used for both research and linguistic teaching. In addition, the package can download data from typological databases such as WALS, AUTOTYP and some others, and can create your own database website.
A Binary Download Manager
Tools and functions for managing the download of binary files. Binary repositories are defined in YAML format. Defining new pre-download, download and post-download templates allows additional repositories to be added.
A High-Performance Local Taxonomic Database Interface
Creates a local database of many commonly used taxonomic authorities and provides functions that can quickly query this data.
Chemical Information from the Web
Chemical information from around the web. This package interacts with a suite of web services for chemical information. Sources include: Alan Wood’s Compendium of Pesticide Common Names, Chemical Identifier Resolver, ChEBI, Chemical Translation Service, ChemIDplus, ChemSpider, ETOX, Flavornet, NIST Chemistry WebBook, OPSIN, PAN Pesticide Database, PubChem, SRS, Wikidata.
Client for Various CrossRef APIs
Client for various CrossRef APIs, including metadata search with their old and newer search APIs, get citations in various formats (including bibtex, citeproc-json, rdf-xml, etc.), convert DOIs to PMIDs, and vice versa, get citations for DOIs, and get links to full text of articles when available.
Print Maps, Draw on Them, Scan Them Back in
Print maps, draw on them, scan them back in, and convert to spatial objects.
Download and Explore Datasets from UCSC Xena Data Hubs
Download and explore datasets from UCSC Xena data hubs, which are a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others. Databases are normalized so they can be combined, linked, filtered, explored and downloaded.
Construct Reproducible Analytic Data Sets as R Packages
A framework to help construct R data packages in a reproducible manner. Potentially time-consuming processing of raw data sets into analysis-ready data sets is done in a reproducible manner and decoupled from the usual R CMD build process, so that data sets can be processed into R objects in the data package and the data package can then be shared, built, and installed by others without the need to repeat computationally costly data processing. The package maintains data provenance by turning the data processing scripts into package vignettes, as well as enforcing documentation and version checking of included data objects. Data packages can be version controlled on GitHub, and used to share data for manuscripts, collaboration and general reproducibility.
R Interface to FishBase
A programmatic interface to http://www.fishbase.org, re-written based on an accompanying RESTful API. Access tables describing over 30,000 species of fish, their biology, ecology, morphology, and more. This package also supports experimental access to http://www.sealifebase.org data, which contains nearly 200,000 species records for all types of aquatic life not covered by FishBase.
Scientific use casesHTTP Client
A simple HTTP client, with tools for making HTTP requests, and mocking HTTP requests. The package is built on R6, and takes inspiration from Ruby's faraday gem (https://rubygems.org/gems/faraday). The package name is a play on curl, the widely used command line tool for HTTP, and this package is built on top of the R package curl, an interface to libcurl (https://curl.haxx.se/libcurl).
View DocumentationInterface to the MODIS Land Products Subsets Web Services
Programmatic interface to the Oak Ridge National Laboratories MODIS Land Products Subsets web services (https://modis.ornl.gov/data/modis_webservice.html). Allows for easy downloads of MODIS time series directly to your R workspace or your computer.
Scientific use casesWork with Open Road Traffic Casualty Data from Great Britain
Tools to help download, process and analyse the UK road collision data collected using the STATS19 form. The data are provided as CSV files with detailed road safety data about the circumstances of car crashes and other incidents on the roads resulting in casualties in Great Britain from 1979, the types (including make and model) of vehicles involved and the consequential casualties. The statistics relate only to personal casualties on public roads that are reported to the police, and subsequently recorded, using the STATS19 accident reporting form. See the Department for Transport website https://data.gov.uk/dataset/cb7ae6f0-4be6-4935-9277-47e5ce24a11f/road-safety-data for more information on these data.
View DocumentationSetup, Run and Analyze NetLogo Model Simulations from R via XML
Setup, run and analyze NetLogo (https://ccl.northwestern.edu/netlogo/) model simulations in R. nlrx experiments use a structure similar to NetLogo's BehaviorSpace experiments. However, nlrx offers more flexibility and additional tools for running and analyzing complex simulation designs and sensitivity analyses. The user defines all information that is needed in an intuitive framework, using class objects. Experiments are submitted from R to NetLogo via XML files that are dynamically written, based on specifications defined by the user. By nesting model calls in future environments, large simulation designs with many runs can be executed in parallel. This also enables simulating NetLogo experiments on remote high performance computing machines. In order to use this package, Java and NetLogo (>= 5.3.1) need to be available on the executing system.
Scientific use casesCompact and Flexible Summaries of Data
A simple to use summary function that can be used with pipes and displays nicely in the console. The default summary statistics may be modified by the user as can the default formatting. Support for data frames and vectors is included, and users can implement their own skim methods for specific object types as described in a vignette. Default summaries include support for inline spark graphs. Instructions for managing these on specific operating systems are given in the “Using skimr” vignette and the README.
Scientific use casesA High-Performance Database of Shipment-Level CITES Trade Data
Provides convenient access to over 40 years and 20 million records of endangered wildlife trade data from the Convention on International Trade in Endangered Species of Wild Fauna and Flora, stored in a local, on-disk, out-of-memory DuckDB database for bulk analysis.
Scientific use casesWorking with Audio and Video in R
Bindings to FFmpeg http://www.ffmpeg.org/ AV library for working with audio and video in R. Generates high quality video from images or R graphics with custom audio. Also offers high performance tools for reading raw audio, creating spectrograms, and converting between countless audio / video formats. This package interfaces directly to the C API and does not require any command line utilities.
View DocumentationRead Spectrometric Data and Metadata
Parse various reflectance/transmittance/absorbance spectra file formats to extract spectral data and metadata, as described in Gruson, White & Maia (2019) doi:10.21105/joss.01857. Among other formats, it can import files from Avantes https://www.avantes.com/, CRAIC http://www.microspectra.com/, and OceanInsight (formerly OceanOptics) https://www.oceaninsight.com/ brands.
View DocumentationEcological Metadata as Linked Data
This is a utility for transforming Ecological Metadata Language (EML) files into JSON-LD and back into EML. Doing so creates a list-based representation of EML in R, so that EML data can easily be manipulated using standard R tools. This makes this package an effective backend for other R-based tools working with EML. By abstracting away the complexity of XML Schema, developers can build around native R list objects and not have to worry about satisfying many of the additional constraints set by the schema (such as element ordering, which is handled automatically). Additionally, the JSON-LD representation enables the use of developer-friendly JSON parsing and serialization that may facilitate the use of EML in contexts outside of R, as well as informatics-friendly serializations such as RDF and SPARQL queries.
View DocumentationBibtex Parser
Utility to parse a bibtex file.
View DocumentationRecord HTTP Calls to Disk
Record test suite HTTP requests and replay them during future runs. A port of the Ruby gem of the same name (https://github.com/vcr/vcr/). Works by hooking into the webmockr R package for matching HTTP requests by various rules (HTTP method, URL, query parameters, headers, body, etc.), and then caching real HTTP responses on disk in cassettes. Subsequent HTTP requests matching any previous requests in the same cassette use a cached HTTP response.
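The record-and-replay idea described above can be sketched in a few lines. This is a language-neutral illustration written in Python, not the package's R API: the `Cassette` class, the `fake_transport` function, and the JSON file layout are all hypothetical, and matching is done on HTTP method and URL only.

```python
import json
import tempfile
from pathlib import Path


class Cassette:
    """Hypothetical record/replay store keyed by HTTP method and URL."""

    def __init__(self, path, transport):
        self.path = Path(path)
        self.transport = transport  # callable that performs the real request
        # Load earlier recordings if the cassette file already exists.
        self.records = json.loads(self.path.read_text()) if self.path.exists() else {}

    def request(self, method, url):
        key = f"{method.upper()} {url}"
        if key not in self.records:
            # No match yet: perform the real request and persist the response.
            self.records[key] = self.transport(method, url)
            self.path.write_text(json.dumps(self.records, indent=2))
        # A matching earlier request is answered from the cassette.
        return self.records[key]


calls = []


def fake_transport(method, url):
    """Stand-in for a real HTTP call so the sketch runs offline."""
    calls.append((method, url))
    return {"status": 200, "body": f"response for {url}"}


cassette_file = Path(tempfile.mkdtemp()) / "example.json"
first = Cassette(cassette_file, fake_transport).request("GET", "https://example.org/a")
second = Cassette(cassette_file, fake_transport).request("get", "https://example.org/a")
```

The second call builds a fresh `Cassette` from the same file, so it is served from disk and `fake_transport` runs only once. A fuller matcher would also compare query parameters, headers, and body.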
View DocumentationR Interface to the Data Retriever
Provides an R interface to the Data Retriever https://retriever.readthedocs.io/en/latest/ via the Data Retriever’s command line interface. The Data Retriever automates the tasks of finding, downloading, and cleaning public datasets, and then stores them in a local database.
View DocumentationFetch Scholarly Full Text from Crossref
Text mining client for Crossref (https://crossref.org). Includes functions for getting links to full text of articles, fetching full text articles from those links or Digital Object Identifiers (DOIs), and text extraction from PDFs.
View DocumentationBindings to OpenCV Computer Vision Library
Experimenting with computer vision and machine learning in R. This package exposes some of the available OpenCV https://opencv.org/ algorithms, such as edge, body or face detection. These can either be applied to analyze static images, or to filter live video footage from a camera device.
View DocumentationExtract Scientific Names from Text
Extract scientific names from text using the Golang tool gnfinder https://github.com/gnames/gnfinder.
View DocumentationTracer from R
BEAST2 (https://www.beast2.org) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. Tracer (http://tree.bio.ed.ac.uk/software/tracer/) is a GUI tool to parse and analyze the files generated by BEAST2. This package provides a way to parse and analyze BEAST2 input files without active user input, but using R function calls instead.
View DocumentationWorking with Sets the Tidy Way
Implements a class and methods to work with sets, doing intersection, union, complement, power sets, cartesian product and other set operations in a “tidy” way. These set operations are available for both classical sets and fuzzy sets. Load sets from several data structures or import them from several formats.
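To make the distinction between classical and fuzzy sets concrete, here is a small Python sketch, not the package's R interface. It assumes the common min/max (Zadeh) definitions of fuzzy intersection and union; the package may use or allow other definitions. The function names and the dictionary representation (element mapped to membership degree) are hypothetical.

```python
def fuzzy_union(a, b):
    """Membership in the union is the larger of the two memberships."""
    return {k: max(a.get(k, 0.0), b.get(k, 0.0)) for k in a.keys() | b.keys()}


def fuzzy_intersection(a, b):
    """Membership in the intersection is the smaller of the two memberships."""
    # Elements missing from either set have membership 0 and are dropped.
    return {k: min(a[k], b[k]) for k in a.keys() & b.keys()}


def fuzzy_complement(a, universe):
    """Membership in the complement is one minus the original membership."""
    return {k: 1.0 - a.get(k, 0.0) for k in universe}


# A classical set is the special case where every membership is exactly 1.
set_a = {"x": 1.0, "y": 0.5}
set_b = {"y": 0.25, "z": 0.75}

union_ab = fuzzy_union(set_a, set_b)
inter_ab = fuzzy_intersection(set_a, set_b)
comp_a = fuzzy_complement(set_a, {"x", "y", "z"})
```

With these inputs, `y` belongs to the union with degree 0.5 and to the intersection with degree 0.25, while `z` belongs fully to the complement of `set_a`.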
View DocumentationDownload and Process Public Domain Works from Project Gutenberg
Download and process public domain works in the Project Gutenberg collection http://www.gutenberg.org/. Includes metadata for all Project Gutenberg works, so that they can be searched and retrieved.
View DocumentationAccess to the Neotoma Paleoecological Database Through R
Access paleoecological datasets from the Neotoma Paleoecological Database using the published API (http://wnapi.neotomadb.org/). The functions in this package access various pre-built API functions and attempt to return the results from Neotoma in a usable format for researchers and the public.
Scientific use casesGenerate CodeMeta Metadata for R Packages
The Codemeta Project defines a JSON-LD format for describing software metadata, as detailed at https://codemeta.github.io. This package provides utilities to generate, parse, and modify codemeta.json files automatically for R packages, as well as tools and examples for working with codemeta.json JSON-LD more generally.
View DocumentationIntegrated Taxonomic Information System Client
An interface to the Integrated Taxonomic Information System (ITIS) (https://www.itis.gov). Includes functions to work with the ITIS REST API methods (https://www.itis.gov/ws_description.html), as well as the Solr web service (https://www.itis.gov/solr_documentation.html).
Scientific use casesAccess the Global Plant Phenology Data Portal
An R interface to the Global Plant Phenology Data Portal, which is accessible online at https://www.plantphenology.org/.
View DocumentationCache Mocked HTTP Requests
Cache mocked HTTP requests, leveraging webmockr for the HTTP request matching.
View DocumentationGeneral Purpose GraphQL Client
A GraphQL client, with an R6 interface for initializing a connection to a GraphQL instance, and methods for constructing queries, including fragments and parameterized queries. Queries are checked with the libgraphqlparser C++ parser via the graphql package.
View DocumentationScan Secrets in R Scripts, Packages, or Projects
Scan secrets in R scripts, packages, or projects.
View DocumentationTime Travel to Test Time Dependent Code
Time travel to test time dependent code.
View DocumentationTime Classes
Time classes, with hooks for mocking time.
View DocumentationConvert Complex Objects to and from R Data Structures
Convert complex objects to and from R data structures.
View DocumentationTools for Visualizing Data Taxonomically
Tools for visualizing data taxonomically.
View DocumentationClient for Citoid
Client for Citoid (https://www.mediawiki.org/wiki/Citoid), an API for getting citations for various scholarly work identifiers found on Wikipedia.
View DocumentationNoSQL Database Connector
Simplified document database manipulation and analysis, including support for many NoSQL databases, including document databases (Elasticsearch, CouchDB, MongoDB), key-value databases (Redis), and (with limitations) SQLite/json1.
View DocumentationMicrosoft Academic API Client
The Microsoft Academic Knowledge API provides programmatic access to scholarly articles in the Microsoft Academic Graph (https://academic.microsoft.com/). Includes methods matching all ‘Microsoft Academic’ API routes, including search, graph search, text similarity, and interpret natural language query string.
View DocumentationControl How Many Times Conditions are Thrown
Provides the ability to control how many times conditions are thrown (shown to the user) within function calls. Includes control of warnings and messages.
View DocumentationFunctions to Automate Downloading Geospatial Data Available from Several Federated Data Sources
Functions to automate downloading geospatial data available from several federated data sources (mainly sources maintained by the US Federal government). Currently, the package enables extraction from seven datasets: The National Elevation Dataset digital elevation models (1 and 1/3 arc-second; USGS); The National Hydrography Dataset (USGS); The Soil Survey Geographic (SSURGO) database from the National Cooperative Soil Survey (NCSS), which is led by the Natural Resources Conservation Service (NRCS) under the USDA; the Global Historical Climatology Network (GHCN), coordinated by National Climatic Data Center at NOAA; the Daymet gridded estimates of daily weather parameters for North America, version 3, available from the Oak Ridge National Laboratory’s Distributed Active Archive Center (DAAC); the International Tree Ring Data Bank; and the National Land Cover Database (NLCD).
Scientific use casesKeep a Collection of Sparkly Data Resources
Tools to get and maintain a data repository from third-party data providers.
View DocumentationBase Classes and Functions for Phylogenetic Tree Input and Output
treeio is an R package that makes it easier to import and store phylogenetic trees with associated data, and to link external data from different sources to a phylogeny. It also supports exporting a phylogenetic tree with heterogeneous associated data to a single tree file, and can serve as a platform for merging trees with associated data and converting file formats.
Scientific use casesGroup Animal Relocation Data by Spatial and Temporal Relationship
Detects spatial and temporal groups in GPS relocations (Robitaille et al. (2020) doi:10.1111/2041-210X.13215). It can be used to convert GPS relocations to gambit-of-the-group format to build proximity-based social networks. In addition, the randomizations function provides data-stream randomization methods suitable for GPS data.
Scientific use casesOpen Trade Statistics API Wrapper and Utility Program
Access Open Trade Statistics API from R to download international trade data.
View DocumentationZooBank API Client
Client for the ZooBank API (http://zoobank.org/Api). ZooBank (http://zoobank.org/) is the official registry of zoological nomenclature. Methods are provided for using each of the API endpoints, including querying by author, querying for publications, getting statistics on ZooBank activity, and more.
View DocumentationDownload and Prepare C14 Dates from Different Source Databases
Query different C14 date databases and apply basic data cleaning, merging and calibration steps. Currently available databases: 14cpalaeolithic, 14sea, adrac, austarch, calpal, context, emedyd, eubar, euroevol, irdd, jomon, katsianis, kiteeastafrica, medafricarbon, mesorad, pacea, palmisano, radon, radonb.
View DocumentationTools for Working with Taxonomic Databases
Tools for working with taxonomic databases, including utilities for downloading databases, loading them into various SQL databases, cleaning up files, and providing a SQL connection that can be used to do SQL queries directly or used in dplyr.
Scientific use casesHigh-Performance Stemmer, Tokenizer, and Spell Checker
Low level spell checker and morphological analyzer based on the famous hunspell library https://hunspell.github.io. The package can analyze or check individual words as well as parse text, latex, html or xml documents. For a more user-friendly interface use the spelling package which builds on this package to automate checking of files, documentation and vignettes in all common formats.
Scientific use casesSet Up Travis for Testing and Deployment
Tools for interacting with the Travis API for setting up continuous integration for R packages and other R-based projects.
View DocumentationFetch Sections of XML Scholarly Articles
Get chunks of XML scholarly articles without having to know how to work with XML. Custom mappers for each publisher and for each article section pull out the information you want. Works with outputs from package fulltext, xml2 package documents, and file paths to XML documents.
View DocumentationA robots.txt Parser and Webbot/Spider/Crawler Permissions Checker
Provides functions to download and parse robots.txt files. Ultimately the package makes it easy to check if bots (spiders, crawlers, scrapers, …) are allowed to access specific resources on a domain.
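The permission check described above can be illustrated with a short Python sketch that uses the standard library's `urllib.robotparser` rather than the R package itself. The robots.txt content, the bot names, and the `is_allowed` helper are made up for the example, and the rules are parsed from a string so that nothing is downloaded.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real check would download the file
# from the domain's /robots.txt path first.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: badbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent, url):
    """Return True if the given bot may fetch the URL under the parsed rules."""
    return parser.can_fetch(agent, url)


public_ok = is_allowed("mybot", "https://example.org/index.html")
private_ok = is_allowed("mybot", "https://example.org/private/data.csv")
badbot_ok = is_allowed("badbot", "https://example.org/index.html")
```

Here `mybot` falls under the wildcard rules and is blocked only from `/private/`, while `badbot` is blocked from the whole site.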
View DocumentationInterface to the Orcid.org API
Client for the Orcid.org API (https://orcid.org/). Functions included for searching for people, searching by DOI, and searching by Orcid ID.
View DocumentationEmail Address Validation
Email Address Validation.
View DocumentationInterface to Virtuoso using ODBC
Provides users with a simple and convenient mechanism to manage and query a Virtuoso database using the DBI (Data-Base Interface) compatible ODBC (Open Database Connectivity) interface. Virtuoso is a high-performance “universal server,” which can act as a relational database, supporting standard Structured Query Language (SQL) queries, while also supporting data following the Resource Description Framework (RDF) model for Linked Data. RDF data can be queried using SPARQL (SPARQL Protocol and RDF Query Language) queries, a graph-based query language that supports semantic reasoning. This allows users to leverage the performance of local or remote Virtuoso servers using popular R packages such as DBI and dplyr, while also providing a high-performance solution for working with large RDF triplestores from R. The package also provides helper routines to install, launch, and manage a Virtuoso server locally on Mac, Windows and Linux platforms using the standard interactive installers from the R command-line. By automatically handling these setup steps, the package can make Virtuoso considerably faster and easier for most users to deploy in a local environment. Managing the bulk import of triples from common serializations with a single intuitive command is another key feature of this package. Bulk import performance can be tens to hundreds of times faster than comparable imports using existing R tools, including the rdflib and redland packages.
View DocumentationSetup and connect to OpenTripPlanner
Setup and connect to OpenTripPlanner (OTP) http://www.opentripplanner.org/. OTP is an open source platform for multi-modal and multi-agency journey planning written in Java. The package allows you to manage a local version or connect to a remote OTP server. This package has been peer-reviewed by rOpenSci (v. 0.2.0.0).
View DocumentationGet SNP (Single-Nucleotide Polymorphism) Data on the Web
A programmatic interface to various SNP datasets on the web: OpenSNP (https://opensnp.org), and NCBI's dbSNP database (https://www.ncbi.nlm.nih.gov/projects/SNP/). Functions are included for searching for NCBI. For OpenSNP, functions are included for getting SNPs, and data for genotypes, phenotypes, annotations, and bulk downloads of data by user.
Scientific use casesWork with GitHub Gists
Work with GitHub gists from R (e.g., https://en.wikipedia.org/wiki/GitHub#Gist, https://docs.github.com/en/github/writing-on-github/creating-gists/). A gist is simply one or more files with code/text/images/etc. This package allows the user to create new gists, update gists with new files, rename files, delete files, get and delete gists, star and un-star gists, fork gists, open a gist in your default browser, get embed code for a gist, list gist commits, and get rate limit information when authenticated. Some requests require authentication and some do not. Gists website: https://gist.github.com/.
View DocumentationSpecies Trait Data from Around the Web
Species trait data from many different sources, including sequence data from NCBI (https://www.ncbi.nlm.nih.gov/), plant trait data from BETYdb, data from EOL Traitbank, Birdlife International, and more.
Scientific use casesInterface to the CAVD DataSpace
Provides a convenient interface to access immunological data within the CAVD DataSpace (https://dataspace.cavd.org), a data sharing and discovery tool that facilitates exploration of HIV immunological data from pre-clinical and clinical HIV vaccine studies.
View DocumentationInterface to the Open Science Framework (OSF)
An interface for interacting with OSF (https://osf.io). osfr enables you to access open research materials and data, or create and manage your own private or public projects.
Scientific use casesAccess Data from the Oregon State Prism Climate Project
Allows users to access the Oregon State Prism climate data (http://www.prism.oregonstate.edu/). Using the web service API, data can easily be downloaded in bulk and loaded into R for spatial analysis. Some user friendly visualizations are also provided.
View DocumentationClient for CCAFS GCM Data
Client for Climate Change, Agriculture, and Food Security (CCAFS) General Circulation Models (GCM) data. Data is stored in Amazon S3, from which we provide functions to fetch data.
View DocumentationDrugBank Database XML Parser
This tool is for parsing the DrugBank XML database https://www.drugbank.ca/. The parsed data are then returned in a proper R dataframe with the ability to save them in a given database.
View DocumentationExport Data Frames to Excel xlsx Format
Zero-dependency data frame to xlsx exporter based on libxlsxwriter. Fast and no Java or Excel required.
Scientific use casesClient for the Open Citations Corpus
Client for the Open Citations Corpus (http://opencitations.net/). Includes a set of functions for getting one identifier type from another, as well as getting references and citations for a given identifier.
View DocumentationStore and Retrieve Data.frames in a Git Repository
Make versioning of data.frame easy and efficient using git repositories.
View DocumentationManaging Larger Data on a GitHub Repository
Because larger (> 50 MB) data files cannot easily be committed to git, a different approach is required to manage data associated with an analysis in a GitHub repository. This package provides a simple work-around by allowing larger (up to 2 GB) data files to piggyback on a repository as assets attached to individual GitHub releases. These files are not handled by git in any way, but instead are uploaded, downloaded, or edited directly by calls through the GitHub API. These data files can be versioned manually by creating different releases. This approach works equally well with public or private repositories. Data can be uploaded and downloaded programmatically from scripts. No authentication is required to download data from public repositories.
Scientific use casesMarine Regions Data from Marineregions.org
Tools to get marine regions data from http://www.marineregions.org/. Includes tools to get region metadata, as well as data in GeoJSON format, as well as Shape files. Use cases include using data downstream to visualize geospatial data by marine region, mapping variation among different regions, and more.
View DocumentationParse Darwin Core Files
Parse and create Darwin Core (http://rs.tdwg.org/dwc/) Simple and Archives. Functionality includes reading and parsing all the files in a Darwin Core Archive, including the datasets and metadata; read and parse simple Darwin Core files; and validation of Darwin Core Archives.
Scientific use casesSearch Vertnet, a Database of Vertebrate Specimen Records
Retrieve, map and summarize data from the VertNet.org archives (http://vertnet.org/). Functions allow searching by many parameters, including taxonomic names, places, and dates. In addition, there is an interface for conducting spatially delimited searches, and another for requesting large datasets via email.
Scientific use casesGenomic Data Retrieval
Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, RNA, coding sequence (CDS), GFF, and metagenome retrieval from NCBI RefSeq, NCBI Genbank, ENSEMBL, and UniProt databases. Furthermore, an interface to the BioMart database (Smedley et al. (2009) doi:10.1186/1471-2164-10-22) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as NCBI RefSeq (Pruitt et al. (2007) doi:10.1093/nar/gkl842), NCBI nr, NCBI nt, NCBI Genbank (Benson et al. (2013) doi:10.1093/nar/gks1195), etc. with only one command.
Scientific use casesOpen Source OCR Engine
Bindings to Tesseract https://opensource.google.com/projects/tesseract: a powerful optical character recognition (OCR) engine that supports over 100 languages. The engine is highly configurable in order to tune the detection algorithms and obtain the best possible results.
Scientific use casesCollect Metrics on Packages from CRAN, GitHub, and StackOverflow
This package was designed to address two issues, 78 and 69, for the rOpenSci unconf17 concerning avoiding redundant / overlapping packages and a framework for reproducible tables. As this is a complex topic, the smaller task being accomplished is producing a list of metrics that can be used to compare similar packages utilizing information collected from CRAN, GitHub, and StackOverflow.
Scientific use casesConvert Data from and to GeoJSON or TopoJSON
Convert data to GeoJSON or TopoJSON from various R classes, including vectors, lists, data frames, shape files, and spatial classes. geojsonio does not aim to replace packages like sp, rgdal, rgeos, but rather aims to be a high level client to simplify conversions of data from and to GeoJSON and TopoJSON.
Scientific use casesInterface to Species Occurrence Data Sources
A programmatic interface to many species occurrence data sources, including Global Biodiversity Information Facility (GBIF), USGS's Biodiversity Information Serving Our Nation (BISON), iNaturalist, Berkeley Ecoinformatics Engine, eBird, Integrated Digitized Biocollections (iDigBio), VertNet, Ocean Biogeographic Information System (OBIS), and Atlas of Living Australia (ALA). Includes functionality for retrieving species occurrence data, and combining those data.
Scientific use casesClient for the Comprehensive Knowledge Archive Network (CKAN) API
Client for CKAN API (https://ckan.org/). Includes interface to CKAN APIs for search, list, show for packages, organizations, and resources. In addition, provides an interface to the datastore API.
Scientific use casesAccesses Air Quality Data from the Open Data Platform OpenAQ
Allows access to air quality data from the API of the OpenAQ platform https://docs.openaq.org/, with the different services the API offers (getting measurements for a given query, getting latest measurements, getting lists of available countries/cities/locations).
Scientific use casesRead and Write Ecological Metadata Language Files
Work with Ecological Metadata Language (EML) files. EML is a widely used metadata standard in the ecological and environmental sciences, described in Jones et al. (2006), doi:10.1146/annurev.ecolsys.37.091305.110031.
View DocumentationLightweight Qualitative Coding
A free, lightweight, open source option for analyzing text-based qualitative data. Enables analysis of interview transcripts, observation notes, memos, and other sources. Supports the work of social scientists, historians, humanists, and other researchers who use qualitative methods. Addresses the unique challenges faced in qualitative data analysis. Provides opportunities for researchers who otherwise might not develop software to build software development skills.
View DocumentationGeneral Purpose Interface to Elasticsearch
Connect to Elasticsearch, a NoSQL database built on the Java Virtual Machine. Interacts with the Elasticsearch HTTP API (https://www.elastic.co/elasticsearch/), including functions for setting connection details to Elasticsearch instances, loading bulk data, and searching for documents with both HTTP query variables and JSON based body requests. In addition, elastic provides functions for interacting with APIs for indices, documents, nodes, clusters, an interface to the cat API, and more.
View DocumentationDownload and Process Data from the Paleobiology Database
Includes 19 functions to wrap each endpoint of the PaleobioDB API, plus 8 functions to visualize and process the fossil data. The API documentation for the Paleobiology Database can be found in http://paleobiodb.org/data1.1/.
Scientific use casesCreate Interactive Web Graphics via plotly.js
Create interactive web graphics from ggplot2 graphs and/or a custom interface to the (MIT-licensed) JavaScript library plotly.js inspired by the grammar of graphics.
Scientific use casesSecure Shell (SSH) Client for R
Connect to a remote server over SSH to transfer files via SCP, setup a secure tunnel, or run a command or script on the host while streaming stdout and stderr directly to the client.
View DocumentationFetch Phylogenies from Many Sources
Includes methods for fetching phylogenies from a variety of sources, including the Phylomatic web service (http://phylodiversity.net/phylomatic), and Phylocom (https://github.com/phylocom/phylocom/).
Scientific use casesLandscape Utility Toolbox
Provides utility functions for some of the less-glamorous tasks involved in landscape analysis. It includes functions to coerce raster data to the common tibble format and vice versa, helps with flexible reclassification of raster data, and provides a function to merge multiple rasters. Furthermore, landscapetools helps landscape scientists to visualize their data by providing optional themes and utility functions to plot single landscapes, rasterstacks, -bricks and lists of rasters.
Scientific use casesTools for Managing SSH and Git Credentials
Setup and retrieve HTTPS and SSH credentials for use with git and other services. For HTTPS remotes the package interfaces the git-credential utility which git uses to store HTTP usernames and passwords. For SSH remotes we provide convenient functions to find or generate appropriate SSH keys. The package both helps the user to setup a local git installation, and also provides a back-end for git/ssh client libraries to authenticate with existing user credentials.
View DocumentationFind Free Versions of Scholarly Publications via Unpaywall
This web client interfaces Unpaywall https://unpaywall.org/products/api, formerly oaDOI, a service finding free full-texts of academic papers by linking DOIs with open access journals and repositories. It provides unified access to various data sources for open access full-text links including Crossref and the Directory of Open Access Journals (DOAJ). API usage is free and no registration is required.
Scientific use casesInternational Cricket Data
Data on all international cricket matches is provided by ESPNCricinfo. This package provides some scraper functions to download the data into tibbles ready for analysis. Some innings-level data sourced from Howzstat is also included in the package.
View DocumentationClient for Neuroscience Information Framework APIs
Client for Neuroscience Information Framework (NIF) APIs (https://neuinfo.org/; https://neuinfo.org/about/webservices). Package includes functions for each API route, and gives back data in tidy data.frame format.
View DocumentationInterface to the USGS BISON API
Interface to the USGS BISON (https://bison.usgs.gov/) API, a database for species occurrence data. Data comes from species in the United States from participating data providers. You can get data via taxonomic and location based queries. A simple function is provided to help visualize data.
Scientific use casesFunctions to mine endoscopic and associated pathology datasets
This package comprises the functions that are used to clean up endoscopic reports and pathology reports, as well as many of the scripts used for analysis. The scripts assume the endoscopy and histopathology data sets are already merged, but they can also be used with the unmerged data sets.
General Purpose Client for ERDDAP Servers
General purpose R client for ERDDAP servers. Includes functions to search for datasets, get summary information on datasets, and fetch datasets, in either csv or netCDF format. ERDDAP information: https://upwell.pfeg.noaa.gov/erddap/information.html.
Scientific use casesRetrieve Data from the 1000 Plants Initiative (1KP)
The 1000 Plants Initiative (www.onekp.com) has sequenced the transcriptomes of over 1000 plant species. This package allows these sequences and metadata to be retrieved and filtered by code, species or recursively by clade. Scientific names and NCBI taxonomy IDs are both supported.
View Documentation
Interface to the Open Tree of Life API
An interface to the Open Tree of Life API to retrieve phylogenetic trees, information about studies used to assemble the synthetic tree, and utilities to match taxonomic names to Open Tree identifiers. The Open Tree of Life aims at assembling a comprehensive phylogenetic tree for all named species.
Predict Gender from Names Using Historical Data
Infers state-recorded gender categories from first names and dates of birth using historical datasets. By using these datasets instead of lists of male and female names, this package is able to more accurately infer the gender of a name, and it is able to report the probability that a name was male or female. GUIDELINES: This method must be used cautiously and responsibly. Please be sure to see the guidelines and warnings about usage in the README or the package documentation. See Blevins and Mullen (2015) http://www.digitalhumanities.org/dhq/vol/9/3/000223/000223.html.
World Register of Marine Species (WoRMS) Client
Client for World Register of Marine Species (http://www.marinespecies.org/). Includes functions for each of the API methods, including searching for names by name, date and common names, searching using external identifiers, fetching synonyms, as well as fetching taxonomic children and taxonomic classification.
Download Qualtrics Survey Data
Provides functions to access survey results directly into R using the Qualtrics API. Qualtrics https://www.qualtrics.com/about/ is an online survey and data collection software platform. See https://api.qualtrics.com/ for more information about the Qualtrics API. This package is community-maintained and is not officially supported by Qualtrics.
Client for the CORE API
Client for the CORE API (https://core.ac.uk/docs/). CORE (https://core.ac.uk) aggregates open access research outputs from repositories and journals worldwide and makes them available to the public.
Access Publisher Copyright & Self-Archiving Policies via the SHERPA/RoMEO API
Fetches information from the SHERPA/RoMEO API http://www.sherpa.ac.uk/romeo/apimanual.php, which indexes journal policies regarding the archival of scientific manuscripts before and/or after peer review, as well as formatted manuscripts.
Polyhedra Database
A polyhedra database scraped from various sources, provided as R6 objects with rgl visualization capabilities.
Directory of Open Access Journals Client
Client for the Directory of Open Access Journals (DOAJ) (https://doaj.org/). API documentation at https://doaj.org/api/v1/docs. Methods included for working with all DOAJ API routes: fetch article information by identifier, search for articles, fetch journal information by identifier, and search for journals.
Parse a BibTeX File to a Data Frame
Parse a BibTeX file to a data.frame to make it accessible for further analysis and visualization.
R Interface to the Species+ Database
A programmatic interface to the Species+ https://speciesplus.net/ database via the Species+/CITES Checklist API https://api.speciesplus.net/.
Taxonomic Information from Wikipedia
Taxonomic information from Wikipedia, Wikicommons, Wikispecies, and Wikidata. Functions included for getting taxonomic information from each of the sources just listed, as well as performing taxonomic searches.
A Tidy Approach to NetCDF Data Exploration and Extraction
Tidy tools for NetCDF data sources. Explore the contents of a NetCDF source (file or URL) presented as variables organized by grid with a database-like interface. The hyper_filter() interactive function translates the filter value or index expressions to array-slicing form. No data is read until explicitly requested, as a data frame or list of arrays via hyper_tibble() or hyper_array().
Connector to CouchDB
Provides an interface to the NoSQL database CouchDB (http://couchdb.apache.org). Methods are provided for managing databases within CouchDB, including creating/deleting/updating/transferring, and managing documents within databases. One can connect with a local CouchDB instance, or a remote CouchDB database such as Cloudant. Documents can be inserted directly from vectors, lists, data.frames, and JSON. Targeted at CouchDB v2 or greater.
Access for Dryad Web Services
Interface to the Dryad “Solr” API and their “OAI-PMH” service, with functions to fetch datasets. Dryad (https://datadryad.org/) is a curated host of data underlying scientific publications.
Fast, Consistent Tokenization of Natural Language Text
Convert natural language text into tokens. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, shingled characters, lines, tweets, Penn Treebank, regular expressions, as well as functions for counting characters, words, and sentences, and a function for splitting longer texts into separate documents, each with the same number of words. The tokenizers have a consistent interface, and the package is built on the stringi and Rcpp packages for fast yet correct tokenization in UTF-8.
Classes for GeoJSON
Classes for GeoJSON to make working with GeoJSON easier. Includes S3 classes for GeoJSON classes with brief summary output, and a few methods such as extracting and adding bounding boxes, properties, and coordinate reference systems; working with newline delimited GeoJSON; linting through the geojsonlint package; and serializing to/from Geobuf binary GeoJSON format.
Simple Jenkins Client for R
Manage jobs and builds on your Jenkins CI server https://jenkins.io/. Create and edit projects, schedule builds, manage the queue, download build logs, and much more.
Semantically Rich I/O for the NeXML Format
Provides access to phyloinformatic data in NeXML format. The package is intended to add new functionality to R, such as the ability to manipulate NeXML objects in more varied and refined ways, and compatibility with ape objects.
Interface to Bold Systems API
A programmatic interface to the Web Service methods provided by Bold Systems (http://www.boldsystems.org/) for genetic barcode data. Functions include methods for searching for sequences by taxonomic names, ids, collectors, and institutions, as well as functions for searching for specimens and downloading trace files.
Access Nomis UK Labour Market Data
Access UK official statistics from the Nomis database. Nomis includes data from the Census, the Labour Force Survey, DWP benefit statistics and other economic and demographic data from the Office for National Statistics, based around statistical geographies. See https://www.nomisweb.co.uk/api/v01/help for full API documentation.
R Bindings for ZeroMQ
Interface to the ZeroMQ lightweight messaging kernel (see http://www.zeromq.org/ for more information).
Base Package for Outsider
Base package for outsider https://github.com/ropensci/outsider. The outsider package and its sister packages enable the installation and running of external, command-line software within R. This base package is a key dependency of the user-facing outsider package as it provides the utilities for interfacing between Docker https://www.docker.com and R. It is intended that end-users of outsider do not directly work with this base package.
Install and Run Programs, Outside of R, Inside of R
Install and run external command-line programs in R through use of Docker https://www.docker.com/ and online repositories.
Provides Access to Git Repositories
Interface to the libgit2 library, which is a pure C implementation of the Git core methods. Provides access to Git repositories to extract data and run some basic Git commands.
Automated Phylogenetic Sequence Cluster Identification from GenBank
A pipeline for the identification, within taxonomic groups, of orthologous sequence clusters from GenBank https://www.ncbi.nlm.nih.gov/genbank/ as the first step in a phylogenetic analysis. The pipeline depends on a local alignment search tool and is, therefore, not dependent on differences in gene naming conventions and naming errors.
Moving-Window and Direct Data Aggregation
Data aggregation via moving window or direct methods. Aggregate a fine-resolution raster to a grid. The moving window method smooths the surface using a specified function within a moving window of a specified size and shape prior to aggregation. The direct method simply aggregates to the grid using the specified function.
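The difference between the two methods can be sketched in a few lines. This is a conceptual illustration in plain Python, not the package's R interface: the function names, the square window clipped at the raster edges, and the use of the mean as the default function are all assumptions made for the example.

```python
from statistics import mean

def direct_aggregate(raster, factor, fun=mean):
    """Aggregate non-overlapping factor x factor blocks of a 2-D list with fun."""
    rows, cols = len(raster), len(raster[0])
    return [
        [
            fun([raster[r][c]
                 for r in range(r0, min(r0 + factor, rows))
                 for c in range(c0, min(c0 + factor, cols))])
            for c0 in range(0, cols, factor)
        ]
        for r0 in range(0, rows, factor)
    ]

def moving_window_smooth(raster, radius, fun=mean):
    """Replace each cell by fun over a square window (clipped at the edges)."""
    rows, cols = len(raster), len(raster[0])
    return [
        [
            fun([raster[rr][cc]
                 for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                 for cc in range(max(0, c - radius), min(cols, c + radius + 1))])
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def moving_window_aggregate(raster, radius, factor, fun=mean, agg_fun=mean):
    """Smooth with a moving window first, then aggregate to the coarse grid."""
    return direct_aggregate(moving_window_smooth(raster, radius, fun), factor, agg_fun)

fine = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
coarse = direct_aggregate(fine, factor=2)
smoothed_coarse = moving_window_aggregate(fine, radius=1, factor=2)
```

The direct method summarises only the cells inside each coarse cell, while the moving-window method lets neighbouring cells influence the result through the smoothing step.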
Fetch Species Origin Data from the Web
Get species origin data (whether species is native/invasive) from the following sources on the web: Encyclopedia of Life (http://eol.org), Flora Europaea (http://rbg-web2.rbge.org.uk/FE/fe.html), Global Invasive Species Database (http://www.iucngisd.org/gisd), the Native Species Resolver (https://bien.nceas.ucsb.edu/bien/tools/nsr/), Integrated Taxonomic Information System (https://www.itis.gov/), and Global Register of Introduced and Invasive Species (http://www.griis.org/).
R Interface to the Europe PubMed Central RESTful Web Service
An R Client for the Europe PubMed Central RESTful Web Service (see https://europepmc.org/RestfulWebService for more information). It gives access to both metadata on life science literature and open access full texts. Europe PMC indexes all PubMed content and other literature sources including Agricola, a bibliographic database of citations to the agricultural literature, or Biological Patents. In addition to bibliographic metadata, the client allows users to fetch citations and reference lists. Links between life-science literature and other EBI databases, including ENA, PDB or ChEMBL are also accessible. No registration or API key is required. See the vignettes for usage examples.
Read Data from JSTOR/DfR
Functions and helpers to import metadata, ngrams and full-texts delivered by Data for Research by JSTOR.
JSON for Linking Data
JSON-LD is a light-weight syntax for expressing linked data. It is primarily intended for web-based programming environments, interoperable web services and for storing linked data in JSON-based databases. This package provides bindings to the JavaScript library for converting, expanding and compacting JSON-LD documents.
Interface to the Biodiversity Heritage Library
Interface to Biodiversity Heritage Library (BHL) (https://www.biodiversitylibrary.org/) API (https://www.biodiversitylibrary.org/docs/api3.html). BHL is a repository of digitized literature on biodiversity studies, including floras, research papers, and more.
Mutation Testing Framework
Mutation testing framework.
Text Extraction, Rendering and Converting of PDF Documents
Utilities based on libpoppler for extracting text, fonts, attachments and metadata from a PDF file. Also supports high quality rendering of PDF documents into PNG, JPEG, TIFF format, or into raw bitmap vectors for further processing in R.
Hydrological Data Discovery Tools
Tools to discover hydrological data, accessing catalogues and databases from various data providers.
Acquisition and Processing of NASA Soil Moisture Active-Passive (SMAP) Data
Facilitates programmatic access to NASA Soil Moisture Active
Passive (SMAP) data with R. It includes functions to search for, acquire,
and extract SMAP data.
Download Data from the European Social Survey on the Fly
Download data from the European Social Survey directly from their website http://www.europeansocialsurvey.org/. There are two families of functions that allow you to download and interactively check all countries and rounds available.
Interface to USDA Databases
An interface to the web service methods provided by the United States Department of Agriculture (USDA). The Agricultural Research Service (ARS) provides a large set of databases. The current version of the package holds interfaces to the Systematic Mycology and Microbiology Laboratory (SMML), which consists of four databases: Fungus-Host Distributions, Specimens, Literature and the Nomenclature database. It provides functions for querying these databases. The main function is associations, which allows searching for fungus-host combinations.
Extract Text from Rich Text Format (RTF) Documents
Wraps the unrtf utility to extract text from RTF files. Supports document conversion to HTML, LaTeX or plain text. Output in HTML is recommended because unrtf has limited support for converting between character encodings.
An API Client for the Internet Archive
Search the Internet Archive (https://archive.org), retrieve metadata, and download files.
R Interface to the Global Population Dynamics Database
R Interface to the Global Population Dynamics Database (https://ecologicaldata.org/wiki/global-population-dynamics-database).
General Purpose OAI-PMH Services Client
A general purpose client to work with any OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) service. The OAI-PMH protocol is described at http://www.openarchives.org/OAI/openarchivesprotocol.html. Functions are provided to work with the OAI-PMH verbs: GetRecord, Identify, ListIdentifiers, ListMetadataFormats, ListRecords, and ListSets.
NatureServe Interface
Interface to NatureServe (https://www.natureserve.org/). Includes methods to get data, image metadata, search taxonomic names, and make maps.
GeoJSON Topology Calculations and Operations
Tools for doing calculations and manipulations on GeoJSON, a geospatial data interchange format (https://tools.ietf.org/html/rfc7946). GeoJSON is also valid JSON.
AAPOR Survey Outcome Rates
Standardized survey outcome rate functions, including the response rate, contact rate, cooperation rate, and refusal rate. These outcome rates allow survey researchers to measure the quality of survey data using definitions published by the American Association for Public Opinion Research (AAPOR). For details on these standards, see AAPOR (2016) https://www.aapor.org/Standards-Ethics/Standard-Definitions-(1).aspx.
Extensible Style-Sheet Language Transformations
An extension for the xml2 package to transform XML documents by applying an xslt style-sheet.
Interface with the United Nations Comtrade API
Interface with and extract data from the United Nations Comtrade API https://comtrade.un.org/data/. Comtrade provides country level shipping data for a variety of commodities. These functions allow easy API queries and return data as a tidy data frame.
Detect Text Reuse and Document Similarity
Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
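To make the idea of shingled n-grams and document similarity concrete, here is a minimal sketch in plain Python. It is not the package's R interface: the function names, the whitespace tokenization, and the choice of Jaccard similarity as the comparison measure are assumptions made for illustration.

```python
def shingles(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard_similarity(a, b):
    """Share of shingles common to both sets; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

doc_a = shingles("the quick brown fox jumps")
doc_b = shingles("the quick brown fox sleeps")
similarity = jaccard_similarity(doc_a, doc_b)
```

Two documents that share long passages share many shingles, so their similarity approaches 1; minhash and locality sensitive hashing approximate the same comparison without computing every pairwise intersection.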
R Client for the eBird Database of Bird Observations
A programmatic client for the eBird database (https://ebird.org/home), including functions for searching for bird observations by geographic location (latitude, longitude), eBird hotspots, location identifiers, by notable sightings, by region, and by taxonomic name.
API Client for the Open Context Archeological Database
Search, browse, and download data from Open Context (https://opencontext.org).
An R client for HathiTrust API
An R client for HathiTrust API (https://www.hathitrust.org). Only for the bibliographic API for now.
R Interface to Global Biotic Interactions
A programmatic interface to the web service methods provided by Global Biotic Interactions (GloBI) (https://www.globalbioticinteractions.org/). GloBI provides access to spatial-temporal species interaction records from sources all over the world. rglobi provides methods to search species interactions by location, interaction type, and taxonomic name. In addition, it supports Cypher, a graph query language, to allow for executing custom queries on the GloBI aggregate species interaction data set.
Convert Between WKT and GeoJSON
Convert WKT to GeoJSON and GeoJSON to WKT. Functions are included for converting between GeoJSON and WKT, creating both GeoJSON features and non-features, creating WKT from R objects (e.g., lists, data.frames, vectors), and linting WKT.
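As a small illustration of what such a conversion involves, the sketch below handles only the simplest case, a two-dimensional point. It is written in plain Python and is not the package's R interface; the function names and the regular expression are assumptions made for the example.

```python
import re

# Matches a 2-D WKT point such as "POINT (30 10)"; other geometry types are out of scope.
_POINT_PATTERN = re.compile(r"\s*POINT\s*\(\s*(\S+)\s+(\S+)\s*\)\s*", re.IGNORECASE)

def wkt_point_to_geojson(wkt):
    """Convert a WKT point string to a GeoJSON geometry (a non-feature)."""
    match = _POINT_PATTERN.fullmatch(wkt)
    if match is None:
        raise ValueError(f"not a 2-D WKT point: {wkt!r}")
    x, y = float(match.group(1)), float(match.group(2))
    return {"type": "Point", "coordinates": [x, y]}

def geojson_point_to_wkt(geometry):
    """Convert a GeoJSON Point geometry back to a WKT string."""
    if geometry.get("type") != "Point":
        raise ValueError("only Point geometries are handled in this sketch")
    x, y = geometry["coordinates"][:2]
    return f"POINT ({x} {y})"

geometry = wkt_point_to_geojson("POINT (30 10)")
round_trip = geojson_point_to_wkt(geometry)
```

Wrapping the geometry in `{"type": "Feature", "geometry": ..., "properties": {}}` would give the feature form mentioned above.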
API Wrapper for US Energy Information Administration Open Data
Provides API access to data from the US Energy Information Administration (EIA) https://www.eia.gov/. Use of the API requires a free API key obtainable at https://www.eia.gov/opendata/register.php. The package includes functions for searching EIA data categories and importing time series and geoset time series datasets. Datasets returned by these functions are provided in a tidy format or alternatively in more raw form. It also offers helper functions for working with EIA date strings and time formats and for inspecting different summaries of series metadata. The package also provides control over API key storage and caching of API request results.
Taxonomic Classes
Provides taxonomic classes for
groupings of taxonomic names without data, and those
with data. Methods provided are “taxonomically aware”, in
that they know about ordering of ranks, and methods that
filter based on taxonomy also filter associated data.
This package is described in the publication: “Taxa: An R
package implementing data standards and methods for
taxonomic data”, Zachary S.L. Foster, Scott Chamberlain,
Niklaus J. Grünwald (2018) doi:10.12688/f1000research.14013.2.
Parsing GenBank files into semantically useful objects
Reads GenBank files.
Client for Turfjs for Geospatial Analysis
Client for Turfjs (http://turfjs.org) for geospatial analysis. The package revolves around using GeoJSON data. Functions are included for creating GeoJSON data objects, measuring aspects of GeoJSON, and combining, transforming, and creating random GeoJSON data objects.
Client for jq, a JSON Processor
Client for jq, a JSON processor (https://stedolan.github.io/jq/), written in C. jq allows the following with JSON data: index into, parse, do calculations, cut up and filter, change key names and values, perform conditionals and comparisons, and more.
Photo Searcher
Queries the Flickr API (https://www.flickr.com/services/api/) to return photograph metadata and provides the ability to download the images as jpegs.
R Interface to Apache Tika
Extract text or metadata from over a thousand file types, using Apache Tika https://tika.apache.org/. Get either plain text or structured XHTML content.
A package for accessing World Bank climate data
This package downloads model predictions from 15 different global circulation models in 20-year intervals from the World Bank. Users can also access historical data and create maps at 2 different spatial scales.
Label Creation for Tracking and Collecting Data from Biological Samples
Tools to generate unique identifier codes and printable barcoded labels for the management of biological samples. The creation of unique ID codes and printable PDF files can be initiated by standard commands, user prompts, or through a GUI addin for RStudio. Biologically informative codes can be included for hierarchically structured sampling designs.
rOpenSci's blog guidance
It provides templates for roweb2 blogging and help for a GitHub forking workflow.
General Purpose R Interface to Solr
Provides a set of functions for querying and parsing data from Solr (https://lucene.apache.org/solr) endpoints (local and remote), including search, faceting, highlighting, stats, and more like this. In addition, some functionality is included for creating, deleting, and updating documents in a Solr database.
Call Google's Natural Language API, Cloud Translation API, Cloud Speech API and Cloud Text-to-Speech API
Call Google Cloud machine learning APIs for text and speech tasks. Call the Cloud Translation API https://cloud.google.com/translate/ for detection and translation of text, the Natural Language API https://cloud.google.com/natural-language/ to analyse text for sentiment, entities or syntax, the Cloud Speech API https://cloud.google.com/speech/ to transcribe sound files to text and the Cloud Text-to-Speech API https://cloud.google.com/text-to-speech/ to turn text into sound files.
HTTP Error Helpers
HTTP error helpers. Methods included for general purpose HTTP error handling, as well as individual methods for every HTTP status code, both via status code numbers as well as their descriptive names. Supports ability to adjust behavior to stop, message or warning. Includes ability to use custom whisker template to have any configuration of status code, short description, and verbose message. Currently supports integration with crul, curl, and httr.
Provides some helper functions for using the GitHub V4 API
Uses the ghql package and jqr to get some common data from the GitHub V4 API.
Create Useful .gitignore Files for your Project
Simple interface to query gitignore.io to fetch gitignore templates that can be included in the .gitignore file. More than 450 templates are currently available.
Validate JSON Schema
Uses the node library is-my-json-valid or ajv to validate JSON against a JSON schema. Drafts 04, 06 and 07 of JSON schema are supported.
Interact with the UK AIR Pollution Database from DEFRA
Get data from DEFRA’s UK-AIR website https://uk-air.defra.gov.uk/. It basically scrapes the HTML content.
Obtain and Visualize Regulome-Gene Expression Correlations in Cancer
Builds a SQLite database file of pre-calculated transcription factor/microRNA-gene correlations (co-expression) in cancer from the Cistrome Cancer (Liu et al. (2011) doi:10.1186/gb-2011-12-8-r83) and miRCancerdb (in press) databases. Provides custom classes and functions to query, tidy and plot the correlation data.
API Client and Dataset Management for the Demographic and Health Survey (DHS) Data
Provides a client for (1) querying the DHS API for survey indicators and metadata (https://api.dhsprogram.com/#/index.html), (2) identifying surveys and datasets for analysis, (3) downloading survey datasets from the DHS website, (4) loading datasets and associated metadata into R, and (5) extracting variables and combining datasets for pooled analysis.
Interface to the Libraries.io API
Interface to the Libraries.io API (https://libraries.io/api). Libraries.io indexes data from 36 different package managers for programming languages.
Read, Tidy, and Display Data from Microtiter Plates
Tools for interacting with data from experiments done in microtiter plates. Easily read in plate-shaped data and convert it to tidy format, combine plate-shaped data with tidy data, and view tidy data in plate shape.
Manage Cached Files
Suite of tools for managing cached files, targeting use in other R packages. Uses rappdirs for cross-platform paths. Provides utilities to manage cache directories, including targeting files by path or by key; cached directories can be compressed and uncompressed easily to save disk space.
R Package Client for the Netherlands Biodiversity API
Access to the digitised Natural History collection at the Naturalis Biodiversity Center. This is the official client to the Netherlands Biodiversity API (NBA, http://api.biodiversitydata.nl) for the R programming language. More information on the NBA can be found at http://docs.biodiversitydata.nl.
High Level Encryption Wrappers
Encryption wrappers, using low-level support from sodium and openssl. cyphr tries to smooth over some pain points when using encryption within applications and data analysis by wrapping around differences in function names and arguments in different encryption providing packages. It also provides high-level wrappers for input/output functions for seamlessly adding encryption to existing analyses.
Interface to the Greek National Data Bank for Hydrometeorological Information
R interface to the Greek National Data Bank for Hydrological and Meteorological Information http://www.hydroscope.gr/. It covers Hydroscope’s data sources and provides functions to transliterate, translate and download them into tidy dataframes.
API Wrapper Around Postcodes.io
Free UK geocoding using data from the Office for National Statistics. Provides functions to get information about post codes and outward codes, perform reverse geocoding, find nearest post codes/outward codes, validate post codes, or randomly generate a post code. API wrapper around https://postcodes.io.
Client for the DataCite API
Client for the web service methods provided by DataCite (https://www.datacite.org/), including functions to interface with their RESTful search API. The API is backed by Elasticsearch, allowing expressive queries, including faceting.
Author Name Disambiguation, Author Georeferencing, and Mapping of Coauthorship Networks with Web of Science Data
Tools to parse and organize reference records downloaded from the Web of Science citation database into an R-friendly format, disambiguate the names of authors, geocode their locations, and generate/visualize coauthorship networks. This package has been peer-reviewed by rOpenSci (v. 1.0).
Citation Style Language (CSL) Utilities
Tools for working with the Citation Style Language (CSL) (https://citationstyles.org), an XML-based format describing the formatting of citations, notes and bibliographies. Functions are included for downloading and searching for styles and locales, and loading and parsing styles and locales. seasl aims to help users fetch and modify CSL files for work combining code and writing that requires citations.
Bespoke Images of OpenStreetMap Data
Bespoke images of OpenStreetMap (OSM) data and data visualisation using OSM objects.
Programmatic Interface to the Web Service Methods Provided by UC Berkeley's Natural History Data
The ecoengine (https://ecoengine.berkeley.edu/) provides access to more than 5 million georeferenced specimen records from the University of California, Berkeley’s Natural History Museums.
A Tool for Writing Cleaner, More Transparent Code
To create clearer, more concise code, this toolbox helps coders isolate the essential parts of a script that produce a chosen result, such as an object, table, or figure written to disk.
Download and Aggregate Data from Public Hire Bicycle Systems
Download and aggregate data from all public hire bicycle systems which provide open data, currently including Santander Cycles in London, U.K.; from the U.S.A., Ford GoBike in San Francisco CA, citibike in New York City NY, Divvy in Chicago IL, Capital Bikeshare in Washington DC, Hubway in Boston MA, Metro in Los Angeles CA, Indego in Philadelphia PA, and Nice Ride in Minnesota; Bixi from Montreal, Canada; and mibici from Guadalajara, Mexico.
Access data from the NASS Quick Stats API
Interface to access data via the United States Department of Agriculture's National Agricultural Statistics Service (NASS) Quick Stats web API https://quickstats.nass.usda.gov/api. Convenience functions facilitate building queries based on available parameters and valid parameter values. This product uses the NASS API but is not endorsed or certified by NASS.
Tools for Validating GeoJSON
Tools for linting GeoJSON. Includes tools for interacting with the online tool http://geojsonlint.com, the JavaScript library geojsonhint (https://www.npmjs.com/package/geojsonhint), and validating against a GeoJSON schema via the JavaScript library is-my-json-valid (https://www.npmjs.com/package/is-my-json-valid). Some tools work locally while others require an internet connection.
Convert Antipsychotic Doses to Chlorpromazine Equivalents
As different antipsychotic medications have different potencies, the doses of different medications cannot be directly compared. Various strategies are used to convert doses into a common reference so that comparison is meaningful. Chlorpromazine (CPZ) has historically been used as a reference medication into which other antipsychotic doses can be converted, as “chlorpromazine-equivalent doses”. Using conversion keys generated from widely-cited scientific papers (Gardner et al. 2010 doi:10.1176/appi.ajp.2009.09060802, Leucht et al. 2016 doi:10.1093/schbul/sbv167), antipsychotic doses are converted to CPZ (or any specified antipsychotic) equivalents. The use of the package is described in the included vignette. Not for clinical use.
Build outsider Modules
Developer functions and resources for building outsider modules.
Helper for rOpenSci Package Developers
Provides helpers for rOpenSci package developers, mostly helping with metadata management (badges, DESCRIPTION) and GitHub infrastructure (GitHub issue and PR templates).
Download Data from the Catchment Data Explorer Website
Facilitates searching, download and plotting of Water Framework Directive (WFD) reporting data for all waterbodies within the UK Environment Agency area. The types of data that can be downloaded are: WFD status classification data, Reasons for Not Achieving Good (RNAG) status, objectives set for waterbodies, measures put in place to improve water quality and details of associated protected areas. The site accessed is https://environment.data.gov.uk/catchment-planning/. The data are made available under the Open Government Licence v3.0 https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/.
Positron Emission Tomography Time-Activity Curve Analysis
To facilitate the analysis of positron emission tomography (PET) time activity curve (TAC) data, and to encourage open science and replicability, this package supports data loading and analysis of multiple TAC file formats. Functions are available to analyze loaded TAC data for individual participants or in batches. Major functionality includes weighted TAC merging by region of interest (ROI), calculating models including standardized uptake value ratio (SUVR) and distribution volume ratio (DVR, Logan et al. 1996 doi:10.1097/00004647-199609000-00008), basic plotting functions and calculation of cut-off values (Aizenstein et al. 2008 doi:10.1001/archneur.65.11.1509). Please see the walkthrough vignette for a detailed overview of tacmagic functions.
R Bindings for Selenium WebDriver
Provides a set of R bindings for the Selenium 2.0 WebDriver (see https://selenium.dev/documentation/en/ for more information) using the JsonWireProtocol (see https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol for more information). Selenium 2.0 WebDriver allows driving a web browser natively, as a user would, either locally or on a remote machine using the Selenium server; it marks a leap forward in terms of web browser automation. Selenium automates web browsers (commonly referred to as browsers). Using RSelenium you can automate browsers locally or remotely.
Parse NOAA Integrated Surface Data Files
Tools for parsing NOAA Integrated Surface Data (ISD) files, described at https://www.ncdc.noaa.gov/isd. Data includes, for example, wind speed and direction, temperature, cloud data, sea level pressure, and more. Includes data from approximately 35,000 stations worldwide, though best coverage is in North America/Europe/Australia. Data is stored as variable length ASCII character strings, with most fields optional. Included are tools for parsing entire files, or individual lines of data.
View DocumentationWebdriver/Selenium Binary Manager
There are a number of binary files associated with the Webdriver/Selenium project (see http://www.seleniumhq.org/download/, https://sites.google.com/a/chromium.org/chromedriver/, https://github.com/mozilla/geckodriver, http://phantomjs.org/download.html and https://github.com/SeleniumHQ/selenium/wiki/InternetExplorerDriver for more information). This package provides functions to download these binaries and to manage processes involving them.
View DocumentationGoogle's Compact Language Detector 3
Google’s Compact Language Detector 3 is a neural network model for language identification and the successor of cld2 (available from CRAN). The algorithm is still experimental and takes a novel approach to language detection with different properties and outcomes. It can be useful to combine this with the Bayesian classifier results from cld2. See https://github.com/google/cld3#readme for more information.
View DocumentationStraightforward BibTeX and BibLaTeX Bibliography Management
Provides tools for importing and working with bibliographic references. It greatly enhances the bibentry class by providing a class BibEntry which stores BibTeX and BibLaTeX references, supports UTF-8 encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. BibTeX and BibLaTeX .bib files can be read into R and converted to BibEntry objects. Interfaces to NCBI Entrez, CrossRef, and Zotero are provided for importing references and references can be created from locally stored PDF files using Poppler. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with RMarkdown or RHTML.
View DocumentationTools to Manipulate and Query Semantic Data
The Resource Description Framework, or RDF, is a widely used data representation model that forms the cornerstone of the Semantic Web. RDF represents data as a graph rather than the familiar data table or rectangle of relational databases. The rdflib package provides a friendly and concise user interface for performing common tasks on RDF data, such as reading, writing and converting between the various serializations of RDF data, including rdfxml, turtle, nquads, ntriples, and json-ld; creating new RDF graphs; and performing graph queries using SPARQL. This package wraps the low level redland R package which provides direct bindings to the redland C library. Additionally, the package supports the newer and more developer friendly JSON-LD format through the jsonld package. The package interface takes inspiration from the Python rdflib library.
Scientific use casesClient for the Pangaea Database
Tools to interact with the Pangaea Database (https://www.pangaea.de), including functions for searching for data, fetching datasets by dataset ID, and working with the Pangaea OAI-PMH service.
Scientific use casesCreate and Query a Local Copy of GenBank in R
Download large sections of GenBank https://www.ncbi.nlm.nih.gov/genbank/ and generate a local SQL-based database. A user can then query this database using restez functions or through rentrez https://CRAN.R-project.org/package=rentrez wrappers.
Scientific use casesCollecting Twitter Data
An implementation of calls designed to collect and organize Twitter data via Twitter’s REST and stream Application Program Interfaces (API), which can be found at the following URL: https://developer.twitter.com/en/docs. This package has been peer-reviewed by rOpenSci (v. 0.6.9).
Scientific use casesConduct Co-Localization Analysis of Fluorescence Microscopy Images
Automate the co-localization analysis of fluorescence microscopy images. Select regions of interest, extract pixel intensities from the image channels and calculate different co-localization statistics. The methods implemented in this package are based on Dunn et al. (2011) doi:10.1152/ajpcell.00462.2010.
Scientific use casesInterface to Phylocom
Interface to Phylocom (http://phylodiversity.net/phylocom/), a library for analysis of phylogenetic community structure and character evolution. Includes low level methods for interacting with the three executables, as well as higher level interfaces for methods like aot, ecovolve, bladj, phylomatic, and more.
View DocumentationPreliminary Visualisation of Data
Create preliminary exploratory data visualisations of an entire dataset to identify problems or unexpected features using ggplot2.
Scientific use casesAssertive Programming for R Analysis Pipelines
Provides functionality to assert conditions that have to be met so that errors in data used in analysis pipelines can fail quickly. Similar to stopifnot() but more powerful, friendly, and easier for use in pipelines.
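To illustrate the fail-fast idea in base R terms, the sketch below wraps a check in a small helper that returns its input unchanged, so it can sit in the middle of an analysis pipeline. The helper name `check_that` and the sample data are hypothetical and are not part of this package's API.

```r
# Hypothetical helper: run a check on a data frame and return the data
# unchanged, so the call can be chained with later processing steps.
check_that <- function(data, predicate, msg = "assertion failed") {
  if (!isTRUE(predicate(data))) stop(msg, call. = FALSE)
  data
}

# Made-up example data
readings <- data.frame(id = 1:3, temp = c(21.5, 19.0, 22.3))

# The check passes, so the data flows on to the next step
result <- check_that(
  readings,
  function(d) all(d$temp > -50 & d$temp < 60),
  msg = "temperature out of plausible range"
)
mean_temp <- mean(result$temp)
```

Because the helper stops with an error as soon as a condition is violated, bad data never reaches the later steps of the pipeline.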
Scientific use casesGet Australian Flight Data, 1985-2016
A package to obtain Australian aviation data from BITRE. This includes airport traffic data between 1985-2016 covering international freight data, and both international and domestic data on number of passengers and flight movements, for both regional and metropolitan airports. The package also includes distances of flights originating in or ending in Australia, and the location of all relevant airports.
View DocumentationProgrammatic Interface to the openfisheries.org API
A programmatic interface to openfisheries.org. This package is part of the rOpenSci suite (https://ropensci.org).
Scientific use casesDendrograms for Evolutionary Analysis
Contains functions for developing phylogenetic trees as deeply-nested lists (“dendrogram” objects). Enables bi-directional conversion between dendrogram and “phylo” objects (see Paradis et al (2004) doi:10.1093/bioinformatics/btg412), and features several tools for command-line tree manipulation and import/export via Newick parenthetic text.
Scientific use casesAccess London Natural History Museum Host-Helminth Record Database
Access to large host-parasite data is often hampered by the availability of data and difficulty in obtaining it in a programmatic way to encourage analyses. helminthR provides a programmatic interface to the London Natural History Museum’s host-parasite database, one of the largest host-parasite databases existing currently http://www.nhm.ac.uk/research-curation/scientific-resources/taxonomy-systematics/host-parasites/. The package allows the user to query by host species, parasite species, and geographic location.
Scientific use casesA DoOR to the Complete Olfactome
This is a function package providing functions to perform data manipulations and visualizations for DoOR.data. See the URLs for the original and the DoOR 2.0 publication.
View DocumentationA DoOR to the Complete Olfactome
This is a data package providing Drosophila odorant response data for DoOR.functions. See URLs for the original and the DoOR 2.0 publications.
View DocumentationThe Critical Care Clinical Data Processing Tools
An electronic health care record (EHR) data cleaning and processing platform. It focuses on heterogeneous, high-resolution longitudinal data and works with the Critical Care Health Informatics Collaborative (CCHIC) dataset. It was created to address various data reliability and accessibility problems of EHRs.
View DocumentationUtilities to Handle WKT Spatial Data
Utilities to generate bounding boxes from WKT (Well-Known Text) objects and R data types, validate WKT objects and convert object types from the sp package into WKT representations.
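As a rough illustration of what generating a bounding box from WKT involves, the base R sketch below extracts the coordinates from a simple WKT string and takes their extremes. The helper `wkt_bbox` is hypothetical and is not this package's API; it assumes 2D coordinates written as plain decimal "x y" pairs and is not a full WKT parser.

```r
# Hypothetical helper: bounding box of a simple 2D WKT geometry.
# Assumes plain decimal "x y" pairs (no Z/M values, no scientific notation).
wkt_bbox <- function(wkt) {
  # Pull out every number (optional sign, optional decimal part)
  nums <- as.numeric(regmatches(wkt, gregexpr("-?[0-9]+\\.?[0-9]*", wkt))[[1]])
  xs <- nums[seq(1, length(nums), by = 2)]  # odd positions are x values
  ys <- nums[seq(2, length(nums), by = 2)]  # even positions are y values
  c(min_x = min(xs), min_y = min(ys), max_x = max(xs), max_y = max(ys))
}

bbox <- wkt_bbox("POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))")
```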
Scientific use casesDiscovery, Access and Manipulation of TreeBASE Phylogenies
Interface to the API for TreeBASE http://treebase.org from R. TreeBASE is a repository of user-submitted phylogenetic trees (of species, population, or genes) and the data used to create them.
View DocumentationCreate Geographic and Non-Geographic Map Tiles
Creates geographic map tiles from geospatial map files or non-geographic map tiles from simple image files. This package provides a tile generator function for creating map tile sets for use with packages such as leaflet. In addition to generating map tiles based on a common raster layer source, it also handles the non-geographic edge case, producing map tiles from arbitrary images. These map tiles, which have a non-geographic, simple coordinate reference system (CRS), can also be used with leaflet when applying the simple CRS option. Map tiles can be created from an input file with any of the following extensions: tif, grd and nc for spatial maps and png, jpg and bmp for basic images. This package requires Python and the gdal library for Python. Windows users are recommended to install OSGeo4W (https://trac.osgeo.org/osgeo4w/) as an easy way to obtain the required gdal support for Python.
View DocumentationText Interchange Format
Provides validation functions for common interchange formats for representing text data in R. Includes formats for corpus objects, document term matrices, and tokens. Other annotations can be stored by overloading the tokens structure.
View DocumentationParse Full Text XML Documents from PubMed Central
Parse XML documents from the Open Access subset of Europe PubMed Central https://europepmc.org including section paragraphs, tables, captions and references.
View DocumentationData for Atlantic and east Pacific tropical cyclones since 1998
Includes storm discussions, forecast/advisories, public advisories, wind speed probabilities, strike probabilities and more. This package can be used along with rrricanes (>= 0.2.0-6). Data is considered public domain via the National Hurricane Center.
View DocumentationGet Texts from the Perseus Digital Library
The Perseus Digital Library is a collection of classical texts. This package helps you get them. The available works can also be viewed here: http://cts.perseids.org/.
View DocumentationHigh Resolution World Vector Map Data from Natural Earth used in rnaturalearth
Facilitates mapping by making natural earth map data from http://www.naturalearthdata.com/ more easily available to R users. Focuses on vector data.
View DocumentationWorld Vector Map Data from Natural Earth Used in rnaturalearth
Vector map data from http://www.naturalearthdata.com/. Access functions are provided in the accompanying package rnaturalearth.
Scientific use casesAPI Wrapper for the UK REF 2014 Impact Case Studies Database
Provides wrapper functions around the UK Research Excellence Framework 2014 Impact Case Studies Database API http://impact.ref.ac.uk/. The database contains relevant publication and research metadata about each case study as well as several paragraphs of text from the case study submissions. Case studies in the database are licenced under a CC-BY 4.0 licence http://creativecommons.org/licenses/by/4.0/legalcode.
View DocumentationInterface to the Bird-Watching Dataset Proyecto AVIS
Interface to the http://proyectoavis.com database. It provides means to download data filtered by species, order, family, and several other criteria. It also provides basic functionality to plot exploratory maps of the datasets.
View DocumentationGenerate Random WKT or GeoJSON
Generate random positions (latitude/longitude), Well-known text (WKT) points or polygons, or GeoJSON points or polygons.
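A minimal base R sketch of the idea, assuming uniform sampling in degrees (which is not uniform on the sphere). The helper `random_wkt_points` is hypothetical and is not this package's API.

```r
# Hypothetical helper: n random WKT points with longitude in [-180, 180]
# and latitude in [-90, 90], sampled uniformly in degrees.
random_wkt_points <- function(n) {
  lon <- runif(n, -180, 180)
  lat <- runif(n, -90, 90)
  sprintf("POINT (%.6f %.6f)", lon, lat)
}

set.seed(42)  # only to make the example reproducible
pts <- random_wkt_points(3)
```

WKT writes coordinates in x y order, so longitude comes first in each point.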
View DocumentationDownload and Read RAM Legacy Stock Assessment Database
Contains functions to download, cache and read in the Excel version of the RAM Legacy Stock Assessment Database, an online compilation of stock assessment results for commercially exploited marine populations from around the world. The database is named after Dr. Ransom A. Myers, whose original stock-recruitment database is no longer being updated. More information about the database can be found at https://ramlegacy.org/. Ricard, D., Minto, C., Jensen, O.P. and Baum, J.K. (2012) doi:10.1111/j.1467-2979.2011.00435.x.
View DocumentationRetrieves Altmetrics Data for Any Published Paper from Altmetric.com
Provides a programmatic interface to the citation information and alternate metrics provided by Altmetric. Data from Altmetric allows researchers to immediately track the impact of their published work, without having to wait for citations. This allows for faster engagement with the audience interested in your work. For more information, visit https://www.altmetric.com/.
Scientific use casesClient for Various Ocean Time Series Datasets
Interact with various ocean time series datasets, including BATS, HOT, and more. Package focuses on data retrieval only. All functions return a data.frame for easy downstream use in plots, visualization, and analysis.
View DocumentationA Shiny Application for Automatic Measurements of Tree-Ring Widths on Digital Images
Use morphological image processing and edge detection algorithms to automatically measure tree ring widths on digital images. Users can also manually mark tree rings on species with complex anatomical structures. The arcs of inner-rings and angles of successive inclined ring boundaries are used to correct ring-width series. The package provides a Shiny-based application, allowing R beginners to easily analyze tree ring images and export ring-width series in standard file formats.
View DocumentationDatasets for Historians
These sample data sets are intended for historians learning R. They include population, institutional, religious, military, and prosopographical data suitable for mapping, quantitative analysis, and network analysis.
View DocumentationGet Landsat 8 Data from Amazon Public Data Sets
Get Landsat 8 Data from Amazon Web Services (AWS) public data sets (https://registry.opendata.aws/landsat-8/). Includes functions for listing images and fetching them, and handles caching to prevent unnecessary additional requests.
View DocumentationSplit Geospatial Objects into Pieces
Split geospatial objects into pieces. Includes support for some spatial object inputs, Well-Known Text, and GeoJSON.
View DocumentationHistorical Datasets for Predicting Gender from Names
The historical datasets in this package are used in the gender package to predict gender from first names and birth years.
View DocumentationAvoid the Typical Working Directory Pain When Using knitr
An extension of knitr that adds flexibility in several ways. One common source of frustration with knitr is that it assumes the directory where the source file lives should be the working directory, which is often not true. ezknitr addresses this problem by giving you complete control over where all the inputs and outputs are, and adds several other convenient features to make rendering markdown/HTML documents easier.
View DocumentationClient for CAMS Radiation Service
Copernicus Atmosphere Monitoring Service (CAMS) Radiation Service provides time series of global, direct, and diffuse irradiations on horizontal surface, and direct irradiation on normal plane for the actual weather conditions as well as for clear-sky conditions. The geographical coverage is the field-of-view of the Meteosat satellite, roughly speaking Europe, Africa, Atlantic Ocean, Middle East. The time coverage of data is from 2004-02-01 up to 2 days ago. Data are available with a time step ranging from 15 min to 1 month. For license terms and to create an account, please see http://www.soda-pro.com/web-services/radiation/cams-radiation-service.
Scientific use casesClient for the Bittrex Exchange
A client for the Bittrex crypto-currency exchange https://bittrex.com including the ability to query trade data, manage account balances, and place orders.
View DocumentationInterface to the arXiv API
An interface to the API for arXiv (https://arxiv.org), a repository of electronic preprints for computer science, mathematics, physics, quantitative biology, quantitative finance, and statistics.
Scientific use casesProgrammatic Interface to the AntWeb
A complete programmatic interface to the AntWeb database from the California Academy of Sciences.
Scientific use casesAntarctic Geographic Place Names
Antarctic geographic names from the Composite Gazetteer of Antarctica, and functions for working with those place names.
View DocumentationInterface to the National Phenology Network API
Programmatic interface to the Web Service methods provided by the National Phenology Network (https://usanpn.org/), which includes data on various life history events that occur at specific times.
View DocumentationA GraphQL Query Parser
Bindings to the libgraphqlparser C++ library. Parses GraphQL syntax and exports the AST in JSON format.
View DocumentationGoogle's Compact Language Detector 2
Bindings to Google’s C++ library Compact Language Detector 2 (see https://github.com/cld2owners/cld2#readme for more information). Probabilistically detects over 80 languages in plain text or HTML. For mixed-language input it returns the top three detected languages and their approximate proportion of the total classified text bytes (e.g. 80% English and 20% French out of 1000 bytes). There is also a cld3 package on CRAN which uses a neural network model instead.
Scientific use casesSupports the Analysis of RTI MicroPEM Output Files
Supports the input and reproducible analysis of RTI MicroPEM output files.
Scientific use casesAccesses the Monkeylearn API for Text Classifiers and Extractors
Allows using some services of Monkeylearn http://monkeylearn.com/ which is a Machine Learning platform on the cloud for text analysis (classification and extraction).
View DocumentationOpenBIS API Access to the InfectX Data Repository
The Open Source Biology Information System (openBIS) is a general purpose framework for management, annotation and publication of large data sets that arise from biological experiments. By making the JSON-RPC based openBIS API available to R, image-based high throughput screening data as generated by the InfectX/TargetInfectX projects can be browsed, searched and downloaded directly from R. Currently, several kinome-wide RNA interference screens performed on HeLa cells in presence of a selection of bacterial and viral pathogens and using oligo libraries from multiple vendors are available. Further genome-wide screens are forthcoming. The full data obtained from these experiments is accessible, including raw microscopy images, object segmentation masks, single cell feature data generated by CellProfiler and infection scoring data, alongside rich meta data and quality control data.
View DocumentationWorking with GTFS (General Transit Feed Specification) feeds in R
Provides API wrappers for popular public GTFS feed sharing sites, reads feed data into a gtfs data object, validates data quality, provides convenience functions for common tasks.
View DocumentationGenerate Starting Trees For Combined Molecular, Morphological and Stratigraphic Data
Combine a list of taxa with a phylogeny to generate a starting tree for use in total evidence dating analyses.
View DocumentationWorld Map Data from Natural Earth
Facilitates mapping by making natural earth map data from http://www.naturalearthdata.com/ more easily available to R users.
Scientific use casesSimple Text Wrappers
Simple functions for common repeatable tasks in NLP and text mining.
View DocumentationSimulating Neutral Landscape Models
Provides neutral landscape models (doi:10.1007/BF02275262, http://sci-hub.tw/10.1007/bf02275262). Neutral landscape models range from “hard” neutral models (completely randomly distributed), to “soft” neutral models (definable spatial characteristics) and generate landscape patterns that are independent of ecological processes. Thus, these patterns can be used as null models in landscape ecology. nlmr combines a large number of algorithms from other published software for simulating neutral landscapes. The simulation results are obtained in a geospatial data format (raster* objects from the raster package) and can, therefore, be used in any sort of raster data operation that is performed with standard observation data.
Interface to Chromosome Counts Database API
A programmatic interface to the Chromosome Counts Database (http://ccdb.tau.ac.il/). This package is part of the rOpenSci suite (https://ropensci.org).
Scientific use casesRead EPUB File Metadata and Text
Provides functions supporting the reading and parsing of internal e-book content from EPUB files. E-book metadata and text content are parsed separately and joined together in a tidy, nested tibble data frame. E-book formatting is not completely standardized across all literature. It can be challenging to curate parsed e-book content across an arbitrary collection of e-books perfectly and in completely general form, to yield a singular, consistently formatted output. Many EPUB files do not even contain all the same pieces of information in their respective metadata. EPUB file parsing functionality in this package is intended for relatively general application to arbitrary EPUB e-books. However, poorly formatted e-books or e-books with highly uncommon formatting may not work with this package. There may even be cases where an EPUB file has DRM or some other property that makes it impossible to read with epubr. Text is read as is for the most part. The only nominal changes are minor substitutions, for example curly quotes changed to straight quotes. Substantive changes are expected to be performed subsequently by the user as part of their text analysis. Additional text cleaning can be performed at the user's discretion, such as with functions from packages like tm or qdap.
View DocumentationSplit, Combine and Compress PDF Files
Content-preserving transformations of PDF files such as split, combine, and compress. This package interfaces directly to the qpdf C++ API and does not require any command line utilities. Note that qpdf does not read actual content from PDF files: to extract text and data you need the pdftools package.
View DocumentationPopler R Package
Browse and query the popler database.
View DocumentationExtract Text from Microsoft Word Documents
Wraps the AntiWord utility to extract text from Microsoft Word documents. The utility only supports the old doc format, not the new xml based docx format. Use the xml2 package to read the latter.
View DocumentationEntrez in R
Provides an R interface to the NCBI's EUtils API, allowing users to search databases like GenBank https://www.ncbi.nlm.nih.gov/genbank/ and PubMed https://www.ncbi.nlm.nih.gov/pubmed/, process the results of those searches and pull data into their R sessions.
Scientific use casesAnalysis of Work Loops and Other Data from Muscle Physiology Experiments
Functions for the import, transformation, and analysis of data from muscle physiology experiments. The work loop technique is used to evaluate the mechanical work and power output of muscle. Josephson (1985) https://jeb.biologists.org/content/114/1/493 modernized the technique for application in comparative biomechanics. Although our initial motivation was to provide functions to analyze work loop experiment data, as we developed the package we incorporated the ability to analyze data from experiments that are often complementary to work loops. There are currently three supported experiment types: work loops, simple twitches, and tetanus trials. Data can be imported directly from .ddf files or via an object constructor function. Through either method, data can then be cleaned or transformed via methods typically used in studies of muscle physiology. Data can then be analyzed to determine the timing and magnitude of force development and relaxation (for isometric trials) or the magnitude of work, net power, and instantaneous power among other things (for work loops). Although we do not provide plotting functions, all resultant objects are designed to be friendly to visualization via either base-R plotting or tidyverse functions. This package has been peer-reviewed by rOpenSci (v. 1.1.0).
View DocumentationrOpenSci package review project template
Creates files and collects materials necessary to complete an rOpenSci package review. Review files are prepopulated with review package specific metadata. Review package source code is also cloned for local testing and inspection.
View DocumentationEasily Visualize Data from ERDDAP Servers via the rerddap Package
Easily visualize and animate tabledap and griddap objects obtained via the rerddap package in a simple one-line command, using either base graphics or ggplot2 graphics. plotdap handles extracting and reshaping the data, map projections and continental outlines. Optionally the data can be animated through time using the gganimate package.
View DocumentationCheck Package for Potential Security Violations
Check an R package for potential security risks and violations via static code analysis.
View DocumentationPersonal Workstation Safety Check
Sobriety checkpoints are designed to help ensure personal and public safety. Methods are provided to run “sobriety” checks on your system to ensure your computing environment is as safe as possible from the perspectives of confidentiality and integrity.
View DocumentationBindings for Tabula PDF Table Extractor Library
Bindings for the Tabula http://tabula.technology/ Java library, which can extract tables from PDF documents. The tabulizerjars package https://github.com/ropensci/tabulizerjars provides versioned Java .jar files, including all dependencies, aligned to releases of Tabula.
Scientific use casesDBHYDRO Hydrologic and Water Quality Data
Client for programmatic access to the South Florida Water Management District's DBHYDRO database at https://www.sfwmd.gov/science-data/dbhydro, with functions for accessing hydrologic and water quality data.
View DocumentationRecodes Sex/Gender Descriptions Into A Standard Set
gendercodeR allows for simple recoding of freetext gender responses.
View DocumentationHistorical and Contemporary Boundaries of the United States of America
The boundaries for geographical units in the United States of America contained in this package include state, county, congressional district, and zip code tabulation area. Contemporary boundaries are provided by the U.S. Census Bureau (public domain). Historical boundaries for the years from 1629 to 2000 are provided from the Newberry Library's ‘Atlas of Historical County Boundaries’ (licensed CC BY-NC-SA). Additional data is provided in the USAboundariesData package; this package provides an interface to access that data.
View Documentationcheckers
Package to assess analysis + review guide for analysis best practice
View DocumentationSimple Data Versioning
Simple data versioning using GitHub to store data.
Scientific use casesInterface to the "Geonames" Spatial Query Web Service
The web service at https://www.geonames.org/ provides a number of spatial data queries, including administrative area hierarchies, city locations and some country postal code queries. A (free) username is required and rate limits exist.
Scientific use casesSPARQL DSL Client
SPARQL DSL Client.
View DocumentationQuick Ref. Guides For R Functions
An R equivalent for the command line tool “tldr”, which provides quick guides to functions. Contributions from the community are welcome!
View DocumentationCreate requisite files and launch binder with mybinder.org
Computational reproducibility is a critical component of modern open science. Methods such as docker exist to containerise analyses, ensuring that operating systems and package versions are recorded and can be recreated in order to rerun analyses. Setting up dockerfiles, however, is a nontrivial task on top of a growing technical barrier to reproducible research. Binder is an easy interface to produce a virtual machine within which to rerun analyses without requiring installation or understanding of underlying containerisation principles. It does however still require researchers to search through their code to find the packages and package versions used in the project. This package seeks to make the bridge to using binder for analyses in R even simpler, by setting up the install.R file with all packages and versions (both on CRAN and github) in one step. The binder can also be launched right from R, without needing to manually input repository information into the mybinder.org interface.
View DocumentationAustralian Popular Baby Names
Data on the most popular baby names in Australia.
View DocumentationImproving the Track Changes and Reviewing Experience in R Markdown
Provides functionality to compare two versions of an rmarkdown document and display their differences in a nicely-formatted manner, along with an RStudio addin that adds the required JavaScript code to an rmarkdown document, so that when rendered to HTML it can be annotated using the Hypothes.is service.
View DocumentationR client to Joint Research Centre's DOPA REST API
R client for REST web services of DOPA (Digital Observatory for protected Areas) by the European Union Joint Research Centre.
View DocumentationPackage review tools
Provides tools to facilitate an R package review.
View DocumentationIn-Process Version of MonetDB
An in-process version of MonetDB, a SQL database designed for analytical tasks. Similar to SQLite, the database runs entirely inside the R shell.
View DocumentationExtracts Metadata from Directory and File Names
Extract metadata from directory and file names based on a template into a data frame.
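The general idea can be sketched in base R with a regular expression standing in for the template. The file names, the template `<site>_<year>_<sensor>.csv`, and the column names below are all made up for illustration and do not reflect this package's own template syntax.

```r
# Hypothetical template: <site>_<year>_<sensor>.csv
paths <- c("data/alpha_2019_temp.csv", "data/beta_2020_wind.csv")
pattern <- "^([a-z]+)_([0-9]{4})_([a-z]+)\\.csv$"

# regexec() + regmatches() return the full match followed by each
# capture group, one character vector per file name
parts <- regmatches(basename(paths), regexec(pattern, basename(paths)))

meta <- data.frame(
  path   = paths,
  site   = vapply(parts, `[`, character(1), 2),
  year   = as.integer(vapply(parts, `[`, character(1), 3)),
  sensor = vapply(parts, `[`, character(1), 4),
  stringsAsFactors = FALSE
)
```

Each capture group in the pattern becomes one column, so the resulting data frame has one row per file.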
View DocumentationTidy up nested list hairballs
This is a package to transform large, multi-nested lists into a more user-friendly format. The initial focus is on making processing of return values from jsonlite::fromJSON() queries more seamless, but ideally this package should be useful for deeply-nested lists from an array of sources.
Tools to Intercept, Validate and Consume Web/Network Traffic
The mitmproxy https://mitmproxy.org/ project provides tools to intercept, modify and/or introspect network traffic. Methods are provided to download, install, configure and launch mitmproxy plus introspect and validate network captures.
View DocumentationDatasets for the USAboundaries package
Contains datasets, including higher resolution boundary data, for use in the USAboundaries package. These datasets come from the U.S. Census Bureau, the Newberry Library's Historical Atlas of U.S. County Boundaries, and Erik Steiner's ‘United States Historical City Populations, 1790-2010’.
View DocumentationIntuit Package Harm
Intuit package harm.
View DocumentationHeadless Chrome Orchestration
The Chrome browser https://www.google.com/chrome/ has a headless mode which can be instrumented programmatically. Tools are provided to perform headless Chrome instrumentation on the command-line and will eventually provide support for the DevTools instrumentation API or the forthcoming phantomjs-like higher-level API being promised by the development team.
View DocumentationRead and Write Data Packages
Convenience functions for reading and writing datasets following the data packagist format.
View DocumentationArrested Development
Here to help you when your development is, shall we say, arrested.
View DocumentationWhat the Package Does (Title Case)
More about what it does (maybe more than one line) Use four spaces when indenting paragraphs within the Description.
View DocumentationSend Live Status, Progress and Other Information Between Functions and Processes
jobstatus lets you pass live progress, status, and other information between functions and processes in R, so that you can keep an eye on how complex and long-running jobs are progressing. jobstatus uses the future package so you can even get live progress information back from jobs running in parallel.
View DocumentationRStudio Addin for Tracking Document Changes
More about what it does (maybe more than one line) Use four spaces when indenting paragraphs within the Description.
View DocumentationTools to Work with the Keybase API
Keybase <keybase.io> is a directory of people and public keys and provides methods for obtaining public keys, validating users and exchanging files and/or messages in a secure fashion. Tools are provided to search for and retrieve information about Keybase users, retrieve and import user public keys and list and/or download files. There’s also a thin but useful R wrapper around many of the keybase command-line utility functions.
Create and manipulate Schema.org Dataset metadata
What the package does (one paragraph).
Interface to the IEEE Xplore Gateway
An interface to the IEEE Xplore Gateway, for searching IEEE publications.
R Bindings to rlite
R bindings to rlite. rlite is a “self-contained, serverless, zero-configuration, transactional redis-compatible database engine. rlite is to Redis what SQLite is to SQL.”
A simple interface and workflow for version control in R (implemented using git as backend but without the confusion)
This package provides a set of easy-to-use tools for beginners wanting to implement version control via git for their projects.
Australia-Themed Color Palettes
Provide Australia-themed color palettes.
Signing and Verification of R Packages
Signing and verification of R packages.
Testing for R Markdown Chunks
Provides facilities for adding test chunks to R Markdown documents, as well as CSS and JavaScript for nice styling of the output. This enables testing of data without completely stopping the knitting of a document, while seeing possible problems in the final HTML output.
Automate sending email with Gmail
What the package does (one paragraph).
Convert Between Units
Provides conversion functionality between a broad range of scientific, historical, and industrial unit types.
Client for the Index Database of Remote Sensing Indices
A client for the Index Database (http://www.indexdatabase.de/) of remote sensing indices.
Spin up a managed cluster and perform parallel calculations
Spin up a head node, which spins up worker nodes, and perform parallel calculations.
Provides community-driven color palettes
Provides community-driven color palettes.
Get data related to transportation and cultural places from Rio de Janeiro, Brazil.
Get data related to transportation and cultural places from Rio de Janeiro, Brazil.