Core Principle
Test functionality, not domain content.
EcoExtract tests verify that the package’s core operations work correctly regardless of the specific schema, prompts, or ecological domain. Tests never check whether the LLM produces “correct” extractions – they verify that the pipeline machinery functions properly.
This means:
- No schema-specific assertions – tests don’t look for field names like bat_species or pathogen_name
- No prompt effectiveness testing – tests don’t check whether extraction finds specific data in text
- No LLM quality evaluation – tests verify API calls return valid structure, not accurate content
- Schema-agnostic fixtures – test schemas deliberately differ from the package defaults
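As a minimal sketch of what a schema-agnostic assertion can look like, the check below takes the expected field names from whatever schema the test supplies instead of naming them in the assertion. The helper check_structure(), the fixture fields, and the sample records are all hypothetical illustrations and are not part of EcoExtract.

```r
# Hypothetical helper: the structure check is driven by the schema passed in,
# not by field names written into the assertion.
check_structure <- function(records, schema_fields) {
  is.data.frame(records) &&
    nrow(records) > 0 &&
    all(schema_fields %in% names(records))
}

# Hypothetical fixture fields, chosen to differ from any package default.
fixture_fields <- c("host", "pathogen", "detection_method")

# Stand-in for pipeline output; a real test would obtain this from the package.
records <- data.frame(
  host = c("Host A", "Host B"),
  pathogen = c("Pathogen X", "Pathogen Y"),
  detection_method = c("PCR", "serology"),
  stringsAsFactors = FALSE
)

check_structure(records, fixture_fields)
```

Swapping in a different fixture schema changes only fixture_fields; the assertion itself stays the same, which is the property the principle above asks for.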
Test Categories
Local Tests (no API keys required)
These tests run entirely offline and execute in under a second. They
are always run by devtools::test() and
devtools::check().
test-database.R – Database
operations:
- Database initialization creates required tables (documents, records, record_edits)
- Database schema matches JSON schema definition
- Records can be saved and retrieved
- Array fields stored as single-level JSON arrays
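The last item can be illustrated with a small sketch, assuming the jsonlite package is used for serialization (the source does not say which JSON library the package relies on). The helper is_flat_json_array() is hypothetical; it accepts a JSON array of scalars and rejects nested arrays and objects.

```r
library(jsonlite)

# A character vector serializes to a single-level array; wrapping it in a
# list adds a second level of nesting.
flat   <- as.character(toJSON(c("a", "b")))
nested <- as.character(toJSON(list(c("a", "b"))))

# Hypothetical helper: TRUE only for a JSON array whose elements are scalars.
is_flat_json_array <- function(txt) {
  parsed <- fromJSON(txt, simplifyVector = FALSE)
  is.list(parsed) &&
    is.null(names(parsed)) &&
    all(vapply(
      parsed,
      function(el) is.atomic(el) && length(el) == 1,
      logical(1)
    ))
}

is_flat_json_array(flat)
is_flat_json_array(nested)
```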
test-review.R – Human review and
accuracy:
- save_document() updates reviewed_at timestamp
- Modified records marked as human_edited
- Deleted records marked as deleted_by_user
- Edit tracking populates record_edits table
- calculate_accuracy() returns correct structure and metrics
test-utils.R – Utility functions:
- Record ID generation and formatting
- Special character handling in IDs
- Token estimation for various inputs
test-deduplication.R – Deduplication
logic:
- Text canonicalization (Unicode normalization, case folding, whitespace trimming)
- Cosine similarity and Jaccard similarity calculation
- Jaccard-based deduplication (exact duplicates, typos, partial matches)
- Schema validation (x-unique-fields required and valid)
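For readers unfamiliar with the technique, here is a rough base R sketch of canonicalization followed by Jaccard similarity. It is not the package's implementation: the function names are hypothetical, the similarity is computed over character trigrams (an assumption, since the source does not state the unit of comparison), and Unicode normalization is omitted because base R has no built-in normalizer.

```r
# Hypothetical canonicalizer: case folding, trimming, whitespace collapsing.
# Unicode normalization is left out of this sketch.
canonicalize <- function(x) {
  x <- tolower(trimws(x))
  gsub("\\s+", " ", x)
}

# Split a string into its unique character n-grams.
char_ngrams <- function(s, n = 3) {
  len <- nchar(s)
  if (len < n) return(s)
  unique(substring(s, seq_len(len - n + 1), seq(n, len)))
}

# Jaccard similarity: size of the intersection over size of the union.
jaccard_similarity <- function(a, b, n = 3) {
  ga <- char_ngrams(canonicalize(a), n)
  gb <- char_ngrams(canonicalize(b), n)
  length(intersect(ga, gb)) / length(union(ga, gb))
}

jaccard_similarity("Myotis lucifugus", "  myotis   LUCIFUGUS ")  # identical after canonicalization
jaccard_similarity("Myotis lucifugus", "Myotis lucifugis")       # typo: high but below 1
jaccard_similarity("abc", "xyz")                                 # no shared trigrams
```

A deduplication step would then compare the score against a threshold; the threshold value is a tuning choice this sketch does not prescribe.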
test-bibtex.R – BibTeX export:
- Document metadata exports as valid BibTeX entries
- Citation extraction from bibliography field
- Handles incomplete metadata gracefully
Integration Tests (require API keys)
These tests make real API calls and validate the end-to-end pipeline. They are automatically skipped when the required API keys are not set, so contributors without keys can still run the local test suite.
Required API keys:
| Key | Service | Used For |
|---|---|---|
| ANTHROPIC_API_KEY | Anthropic Claude | Data extraction, metadata, refinement, LLM deduplication |
| MISTRAL_API_KEY | Tensorlake | OCR processing (via ohseer) |
| OPENAI_API_KEY | OpenAI | Embedding-based deduplication |
test-integration.R – All API-requiring
tests in one file:
- Full pipeline: PDF to database (OCR, metadata, extraction, refinement)
- API failures captured in status columns, not thrown as errors
- Schema-agnostic pipeline with a host-pathogen schema (proves no hard-coded assumptions)
- Embedding-based deduplication (exact duplicates, near-duplicates, missing fields)
- Field-by-field deduplication (partial matches, populated field comparison)
- LLM-based semantic deduplication (common names vs scientific names)
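The second item, failures recorded as status values instead of thrown errors, follows a common R pattern that can be sketched with tryCatch(). The function run_step() and the status strings "completed" and "failed: ..." are hypothetical; the package's actual status columns and values may differ.

```r
# Hypothetical wrapper: run one pipeline step and return a status string
# instead of letting an error propagate to the caller.
run_step <- function(step_fn, ...) {
  tryCatch(
    {
      step_fn(...)
      "completed"
    },
    error = function(e) paste("failed:", conditionMessage(e))
  )
}

# Stand-ins for an API call that succeeds and one that fails.
ok_call  <- function() list(status_code = 200)
bad_call <- function() stop("request timed out")

status <- data.frame(
  step = c("ocr", "extraction"),
  status = c(run_step(ok_call), run_step(bad_call)),
  stringsAsFactors = FALSE
)
status
```

A test written against this pattern asserts on the status column, so a simulated API failure is an expected outcome and not a crash.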
Running Tests
# Run all tests (integration tests auto-skip without keys)
devtools::test()
# Run a specific test file
testthat::test_file("tests/testthat/test-database.R")
# Full package check (includes tests, documentation, examples)
devtools::check()

To include integration tests, set up API keys in a .env file (see the Complete Guide for details). The .env file is automatically loaded when R starts in the project directory.
Design Patterns
Cleanup with withr. All test resources (temp
databases, files, environment variables) are cleaned up automatically
using withr, ensuring no side effects between tests.
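A short sketch of this cleanup pattern, using two withr functions: local_tempfile() ties a temporary file to the calling function's lifetime, and with_envvar() sets an environment variable only while an expression runs. The helper use_temp_database() and the variable name ECOEXTRACT_DEMO_KEY are hypothetical and exist only for illustration.

```r
library(withr)

# Hypothetical helper standing in for the body of one test.
use_temp_database <- function() {
  # The file is scheduled for deletion when this function exits.
  db_path <- local_tempfile(fileext = ".sqlite")
  file.create(db_path)
  stopifnot(file.exists(db_path))
  db_path
}

db_path <- use_temp_database()
file.exists(db_path)  # the temporary file has been removed by now

# The variable is set only while the expression is evaluated.
inside <- with_envvar(
  c(ECOEXTRACT_DEMO_KEY = "dummy"),
  Sys.getenv("ECOEXTRACT_DEMO_KEY")
)
after <- Sys.getenv("ECOEXTRACT_DEMO_KEY")
```

Inside a test_that() block the same local_*() calls are scoped to the block, so each test starts from a clean state.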
Focused assertions. Each test_that()
block tests one specific behavior rather than bundling multiple
concerns.
Edge case coverage. Tests cover typical usage, edge cases (empty inputs, NULL values), error conditions, and type validation.
Schema-agnostic design. Test fixtures use schemas that deliberately differ from the package defaults, proving no hard-coded domain assumptions exist in the pipeline.
API key gating. Integration tests use
skip_if() guards so the full local test suite passes
without any API keys configured.
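A guard of this kind might look like the sketch below. skip_if() is part of testthat; the helper has_api_key() and the test body are hypothetical placeholders, since the source does not show how the package's own guards are written.

```r
library(testthat)

# Hypothetical helper: TRUE when the named environment variable is non-empty.
has_api_key <- function(name) nzchar(Sys.getenv(name))

test_that("extraction call returns a valid structure", {
  # Without the key the test is reported as skipped, not failed.
  skip_if(!has_api_key("ANTHROPIC_API_KEY"), "ANTHROPIC_API_KEY not set")

  # A real test would call the API here and assert on the response structure.
  succeed()
})
```

Placing the guard on the first line of the block means no network request is attempted when the key is missing.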