Apply Official Taxonomies to Unstructured Data
Paste or upload a document to get ranked classification results from official taxonomies. This is a free online classification tool — no account required for single documents. If you need to classify many documents, use bulk upload instead.
Deterministic Taxonomy Classification Infrastructure
Taxer converts unstructured text into standardized classification codes — reproducibly, cost‑efficiently, and at scale
Classify documents such as:
Job postings and resumes
Product catalogs and listings
Company and supplier descriptions
Compliance evidence and policy documents
News articles and research documents
into standardized taxonomies including SOC, O*NET, NAICS, ISIC, and other classification systems.
Taxer can be used through:
API integration
Bulk processing
Interactive UI
Deterministic when reproducibility matters.
Optional hybrid AI curation when maximum labeling precision is required.
Built for teams processing anywhere from thousands of documents to millions per day and billions per month.
Built for Taxonomy-Heavy Workflows
Taxer is designed for teams that already know classification matters — but do not want to rely on brittle rules, manual review, or per-document LLM costs.
This is especially relevant for products and pipelines such as:
Job and labor-market data platforms tagging postings into occupations, skills, and industries for better matching and analytics
Product feed and marketplace platforms mapping catalogs into channel taxonomies for improved product discovery
Compliance and GRC systems sorting evidence into controls, frameworks, and policy categories for audit readiness
News and market intelligence platforms tagging articles into risk, industry, or client-specific taxonomies
Procurement and supplier intelligence systems mapping supplier data into NAICS, spend, or capability taxonomies
In these workflows, classification is necessary infrastructure — but it is rarely the product’s core moat.
Taxer handles that layer so teams can move faster without building and maintaining large-label classification systems in-house.
Deterministic When Reproducibility Matters
Many classification systems feed analytics, search, reporting, or regulated workflows. In these cases, outputs need to be stable and inspectable. Taxer's core engine is deterministic within a fixed engine and taxonomy version.
This means: Same input + same taxonomy version → identical classification results.
There is:
No generative output
No sampling
No temperature
No stochastic behavior in the core engine
This is especially important for teams that need to:
Backfill large historical datasets without label drift
Keep occupation or industry tagging stable across time
Produce auditable outputs for compliance and reporting
Maintain consistent enrichment in production data pipelines
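As an illustration, this reproducibility guarantee is what lets a backfill guard work at all. The sketch below is a stand-in, not Taxer's engine or API: a pure function keyed on the input text and a pinned taxonomy version, which by construction returns identical output on every run — no sampling, no temperature, no stochastic behavior.

```python
import hashlib

def classify(text: str, taxonomy_version: str) -> str:
    """Stand-in for a deterministic engine: same input + same
    pinned taxonomy version -> identical classification result."""
    digest = hashlib.sha256(f"{taxonomy_version}:{text}".encode()).hexdigest()
    # Map the digest onto a small illustrative set of SOC-style codes.
    codes = ["15-1252", "29-1141", "13-2011"]
    return codes[int(digest, 16) % len(codes)]

first = classify("Senior backend engineer, Python/Go", "soc-2018")
second = classify("Senior backend engineer, Python/Go", "soc-2018")
assert first == second  # reproducible across runs, by construction
```

The practical consequence: a historical backfill run today and a re-run next quarter produce byte-identical labels as long as the taxonomy version stays pinned.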
For example:
A labor-market data provider classifying millions of job postings cannot afford shifting occupation labels between runs
A compliance platform mapping evidence to control frameworks needs outputs that can be reviewed and defended
A supplier data platform enriching company records needs repeatable industry mapping across a large corpus
Accuracy Modes
Different teams need different tradeoffs between throughput, cost, and precision. Taxer supports multiple classification approaches depending on the workflow.
Deterministic Classification
High-speed deterministic mapping to official taxonomy codes or other structured classification systems.
Best for:
Recurring production pipelines
Large-scale enrichment jobs
Analytics datasets
Historical backfills
Standardized classification infrastructure
Typical use cases include:
Tagging job postings into occupation taxonomies
Mapping company descriptions into NAICS or ISIC
Categorizing product listings into marketplace category trees
Assigning news or document streams into topic taxonomies
Normalizing supplier records into industry or capability categories
This mode is especially useful when teams need high throughput, predictable cost, and reproducible outputs.
Designed for Both Scale and Precision
Taxer supports a wide range of classification workloads.
Precision Labeling
For smaller or medium-sized datasets where label quality is the priority.
Examples include:
- Benchmark datasets
- Research corpora
- Premium customer deliverables
Operational Data Pipelines
For recurring production classification jobs that run continuously.
Examples include:
- New job postings flowing into a labor-market platform
- New SKUs entering a product feed or marketplace system
- Incoming supplier records being normalized into spend or industry taxonomies
- Daily news ingestion being mapped into client-specific categories
Massive Document Processing
For bulk workloads ranging from backfills to continuous high-volume classification.
Examples include:
- Monthly job-posting backfills
- Large product catalog normalization jobs
- Continuous article classification streams
- Enterprise document archives
- Large supplier or company-profile enrichment runs
Taxer supports workloads from:
- Thousands of documents
- Tens of thousands of documents
- Millions of documents
- Billions of classifications per month
Classification workflows can be tuned for:
- Cost efficiency
- Throughput
- Maximum labeling precision

The right balance depends on what the application requires.
Why Teams Use Taxer
Teams typically adopt Taxer when they hit one of these problems:
Rule Systems Quickly Stop Scaling
What worked for 50 labels breaks down at 500, 5,000, or 10,000+ labels
Manual Review Becomes the Bottleneck
Analysts, ops teams, or support staff end up triaging documents that should be classified automatically
LLM-Only Approaches Are Too Expensive or Unstable
Per-document generative classification is hard to control at high volume and difficult to audit
In-House Systems Are Costly to Build and Maintain
Large-taxonomy classification infrastructure is not the main thing companies want to spend time on
For early customers, this often looks like:
A product-feed company trying to reduce rule sprawl across channel taxonomies
A jobs platform trying to enrich millions of postings without blowing up unit economics
A compliance vendor trying to reduce evidence triage and review load
A market-intelligence product trying to support customer-specific taxonomies without building a custom model for every account
A supplier intelligence platform trying to improve NAICS or capability coverage at scale
How Taxer Works
Input → Semantic Normalization → Exhaustive Taxonomy Evaluation → Ranked Results → Optional AI Review
Input
Provide unstructured text such as:
- Job postings
- Resumes
- Product descriptions
- Company and supplier profiles
- Evidence documents
- News articles
- Structured records with text fields
Optional metadata such as a title can also be included.
Semantic Normalization
The system analyzes the document to extract relevant contextual signals, language patterns, and structural indicators.
This helps normalize messy real-world text such as:
- Inconsistent job titles
- Noisy product catalog language
- Incomplete supplier descriptions
- Unstructured evidence metadata
- Article text with domain-specific terminology
Exhaustive Taxonomy Evaluation
For each selected taxonomy, Taxer evaluates the document against every category in the taxonomy.
Each category receives an independent score between 0 and 1 representing alignment strength.
This is one of Taxer's key differences: it evaluates across the full taxonomy rather than guessing from a partial label set.
That matters most when working with:
- Large occupation or skill taxonomies
- Marketplace category trees
- Compliance control frameworks
- Custom industry or risk taxonomies
- Supplier capability hierarchies
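To make the exhaustive-evaluation idea concrete, here is a toy sketch. The scoring function (simple token overlap), the token sets, and the SOC-style codes are illustrative stand-ins, not Taxer internals — the point is only that every category in the taxonomy receives an independent score in [0, 1], rather than the engine guessing from a shortlist.

```python
def alignment_score(doc_tokens: set, category_tokens: set) -> float:
    """Toy alignment: Jaccard overlap between token sets, in [0, 1]."""
    if not doc_tokens or not category_tokens:
        return 0.0
    return len(doc_tokens & category_tokens) / len(doc_tokens | category_tokens)

# Illustrative three-category "taxonomy" with keyword signatures.
taxonomy = {
    "15-1252": {"software", "developer", "applications"},
    "29-1141": {"registered", "nurse", "clinical"},
    "13-2011": {"accountant", "auditor", "financial"},
}

doc = {"software", "developer", "python"}
# Score the document against *every* category, not a partial label set.
scores = {code: alignment_score(doc, toks) for code, toks in taxonomy.items()}
assert set(scores) == set(taxonomy)  # exhaustive coverage of the taxonomy
assert all(0.0 <= s <= 1.0 for s in scores.values())
```

With large taxonomies the same property holds: a weak-but-best match still surfaces because no category was pruned away before scoring.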
Ranked Results
Taxer returns the top matching classifications including:
- Taxonomy code
- Category label
- Alignment score
Scores represent alignment strength, not probabilities. This lets teams:
- Use top results directly in production
- Route low-confidence cases into review queues
- Combine deterministic outputs with downstream business logic
- Support explainable review workflows for sensitive applications
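A minimal sketch of the review-queue routing pattern described above (the field names and the 0.6 threshold are assumptions for illustration, not the actual response schema): accept the top match when its alignment score clears a cutoff, otherwise queue the document for human review.

```python
REVIEW_THRESHOLD = 0.6  # assumed cutoff; tune per workflow

def route(ranked_results: list) -> str:
    """Return 'auto' to accept the top label, 'review' to queue it."""
    if not ranked_results:
        return "review"
    top = max(ranked_results, key=lambda r: r["score"])
    return "auto" if top["score"] >= REVIEW_THRESHOLD else "review"

results = [
    {"code": "15-1252", "label": "Software Developers", "score": 0.82},
    {"code": "15-1251", "label": "Computer Programmers", "score": 0.41},
]
assert route(results) == "auto"      # confident top match -> use directly
assert route([]) == "review"         # nothing matched -> human review
```

Because scores are alignment strengths rather than probabilities, the threshold is a business decision: stricter for compliance workflows, looser for high-volume enrichment.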
Optional AI Review
When maximum precision is required, additional AI layers can score and compare candidate classifications before returning a final result.
This is particularly useful for:
- Benchmark and gold-standard labeling
- Premium customer deliverables
- Sensitive compliance workflows
- High-value curation pipelines
Ways to Use Taxer
API
Integrate classification directly into existing products, ingestion systems, and pipelines.
- Tagging new job postings as they are ingested
- Classifying supplier descriptions during enrichment
- Assigning category codes to marketplace listings before export
- Labeling incoming articles into customer-defined topic sets
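For orientation, an ingestion hook might assemble a request like the sketch below. The field names ("text", "taxonomies", "title") are assumptions for illustration — consult the API documentation for the real request schema.

```python
import json

def build_classification_request(text, taxonomies, title=None):
    """Assemble a JSON-serializable classification request."""
    payload = {"text": text, "taxonomies": taxonomies}
    if title is not None:
        payload["title"] = title  # optional metadata, as noted above
    return payload

req = build_classification_request(
    "We manufacture precision fasteners for aerospace suppliers.",
    taxonomies=["naics"],
    title="Acme Fastener Co.",
)
body = json.dumps(req)  # ready to send from an ingestion pipeline
```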
Bulk Processing
Upload CSV, Excel, JSON, or XML files and classify very large datasets in batch.
- Backfilling months of historical job postings
- Reclassifying product catalogs across new sales channels
- Processing evidence archives against internal control taxonomies
- Mapping large company datasets into official industry codes
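A minimal sketch of assembling a CSV batch for bulk upload (the column names are assumptions; the expected layout is defined in the bulk-processing documentation):

```python
import csv
import io

# One document per row, with an id column for joining results back.
rows = [
    {"id": "1", "text": "Hospital seeking ICU registered nurse"},
    {"id": "2", "text": "Wholesale distributor of industrial valves"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "text"])
writer.writeheader()
writer.writerows(rows)
csv_payload = buf.getvalue()  # upload this as the batch file
```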
Interactive UI
Paste text and immediately see ranked taxonomy classification matches.
- Reviewing a new job description draft
- Testing company-to-industry classification accuracy
- Evaluating a policy or compliance evidence document
- Quickly validating taxonomy fit before running a full bulk job
Built for
Teams Like These
Taxer is especially relevant for:
- Labor-market and workforce data teams
- Procurement and supplier intelligence teams
- Compliance and GRC product teams
- News, media, and market intelligence platforms
- Product feed and marketplace infrastructure teams
- Data engineering teams building large-scale pipelines
These teams often share the same challenge: they need classification to work reliably, but they do not want to build a specialized taxonomy-classification stack from scratch.
Explore deeper documentation:
- API documentation
- Deterministic classification
- Security and data handling
- Supported taxonomies
- Accuracy modes
- Custom and extended taxonomy support
Start Classifying Documents with Official Taxonomies
Try the interactive demo, explore supported taxonomies, or integrate the API into your data pipeline