Skip to contents

Core Principle

Test functionality, not domain content.

EcoExtract tests verify that the package’s core operations work correctly regardless of the specific schema, prompts, or ecological domain. Tests never check whether the LLM produces “correct” extractions – they verify that the pipeline machinery functions properly.

This means:

  • No schema-specific assertions – tests don’t look for field names like bat_species or pathogen_name
  • No prompt effectiveness testing – tests don’t check whether extraction finds specific data in text
  • No LLM quality evaluation – tests verify API calls return valid structure, not accurate content
  • Schema-agnostic fixtures – test schemas deliberately differ from the package defaults

Test Categories

Local Tests (no API keys required)

These tests run entirely offline and execute in under a second. They are always run by devtools::test() and devtools::check().

test-database.R – Database operations:

  • Database initialization creates required tables (documents, records, record_edits)
  • Database schema matches JSON schema definition
  • Records can be saved and retrieved
  • Array fields stored as single-level JSON arrays

test-review.R – Human review and accuracy:

  • save_document() updates reviewed_at timestamp
  • Modified records marked as human_edited
  • Deleted records marked as deleted_by_user
  • Edit tracking populates record_edits table
  • calculate_accuracy() returns correct structure and metrics

test-utils.R – Utility functions:

  • Record ID generation and formatting
  • Special character handling in IDs
  • Token estimation for various inputs

test-deduplication.R – Deduplication logic:

  • Text canonicalization (Unicode normalization, case folding, whitespace trimming)
  • Cosine similarity and Jaccard similarity calculation
  • Jaccard-based deduplication (exact duplicates, typos, partial matches)
  • Schema validation (x-unique-fields required and valid)

test-bibtex.R – BibTeX export:

  • Document metadata exports as valid BibTeX entries
  • Citation extraction from bibliography field
  • Handles incomplete metadata gracefully

Integration Tests (require API keys)

These tests make real API calls and validate the end-to-end pipeline. They are automatically skipped when the required API keys are not set, so contributors without keys can still run the local test suite.

Required API keys:

Key Service Used For
ANTHROPIC_API_KEY Anthropic Claude Data extraction, metadata, refinement, LLM deduplication
MISTRAL_API_KEY Tensorlake OCR processing (via ohseer)
OPENAI_API_KEY OpenAI Embedding-based deduplication

test-integration.R – All API-requiring tests in one file:

  • Full pipeline: PDF to database (OCR, metadata, extraction, refinement)
  • API failures captured in status columns, not thrown as errors
  • Schema-agnostic pipeline with a host-pathogen schema (proves no hard-coded assumptions)
  • Embedding-based deduplication (exact duplicates, near-duplicates, missing fields)
  • Field-by-field deduplication (partial matches, populated field comparison)
  • LLM-based semantic deduplication (common names vs scientific names)

Running Tests

# Run all tests (integration tests auto-skip without keys)
devtools::test()

# Run a specific test file
testthat::test_file("tests/testthat/test-database.R")

# Full package check (includes tests, documentation, examples)
devtools::check()

To include integration tests, set up API keys in a .env file (see the Complete Guide for details). The .env file is automatically loaded when R starts in the project directory.

Design Patterns

Cleanup with withr. All test resources (temp databases, files, environment variables) are cleaned up automatically using withr, ensuring no side effects between tests.

Focused assertions. Each test_that() block tests one specific behavior rather than bundling multiple concerns.

Edge case coverage. Tests cover typical usage, edge cases (empty inputs, NULL values), error conditions, and type validation.

Schema-agnostic design. Test fixtures use schemas that deliberately differ from the package defaults, proving no hard-coded domain assumptions exist in the pipeline.

API key gating. Integration tests use skip_if() guards so the full local test suite passes without any API keys configured.