Apply Official Taxonomies to Unstructured Data
Paste or upload a document to get ranked classification results from official taxonomies. This is a free online classification tool — no account required for single documents. If you need to classify many documents, use bulk upload instead.
Deterministic Taxonomy Classification Infrastructure
Taxer converts unstructured text into standardized classification codes — reproducibly, cost‑efficiently, and at scale
Classify documents such as:
Job postings and resumes
Product catalogs and listings
Company and supplier descriptions
Compliance evidence and policy documents
News articles and research documents
into standardized taxonomies including SOC, O*NET, NAICS, ISIC, and other classification systems.
Taxer can be used through:
API integration
Bulk processing
Interactive UI
Deterministic when reproducibility matters.
Optional hybrid AI curation when maximum labeling precision is required.
Built for teams processing anywhere from thousands of documents to millions per day and billions per month.
Built for Taxonomy-Heavy Workflows
Taxer is designed for teams that already know classification matters — but do not want to rely on brittle rules, manual review, or per-document LLM costs.
This is especially relevant for products and pipelines such as:
Job and labor-market data platforms tagging postings into occupations, skills, and industries for better matching and analytics
Product feed and marketplace platforms mapping catalogs into channel taxonomies for improved product discovery
Compliance and GRC systems sorting evidence into controls, frameworks, and policy categories for audit readiness
News and market intelligence platforms tagging articles into risk, industry, or client-specific taxonomies
Procurement and supplier intelligence systems mapping supplier data into NAICS, spend, or capability taxonomies
In these workflows, classification is necessary infrastructure — but it is rarely the product’s core moat.
Taxer handles that layer so teams can move faster without building and maintaining large-label classification systems in-house.
Deterministic When Reproducibility Matters
Many classification systems feed analytics, search, reporting, or regulated workflows. In these cases, outputs need to be stable and inspectable. Taxer's core engine is deterministic within a fixed engine and taxonomy version.
This means: Same input + same taxonomy version → identical classification results.
There is:
No generative output
No sampling
No temperature
No stochastic behavior in the core engine
This is especially important for teams that need to:
Backfill large historical datasets without label drift
Keep occupation or industry tagging stable across time
Produce auditable outputs for compliance and reporting
Maintain consistent enrichment in production data pipelines
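As an illustration, this reproducibility guarantee is what lets a backfill guard work at all. The sketch below is a stand-in, not Taxer's engine or API: a pure function keyed on the input text and a pinned taxonomy version, which by construction returns identical output on every run — no sampling, no temperature, no stochastic behavior.

```python
import hashlib

def classify(text: str, taxonomy_version: str) -> str:
    """Stand-in for a deterministic engine: same input + same
    pinned taxonomy version -> identical classification result."""
    digest = hashlib.sha256(f"{taxonomy_version}:{text}".encode()).hexdigest()
    # Map the digest onto a small illustrative set of SOC-style codes.
    codes = ["15-1252", "29-1141", "13-2011"]
    return codes[int(digest, 16) % len(codes)]

first = classify("Senior backend engineer, Python/Go", "soc-2018")
second = classify("Senior backend engineer, Python/Go", "soc-2018")
assert first == second  # reproducible across runs, by construction
```

The practical consequence: a historical backfill run today and a re-run next quarter produce byte-identical labels as long as the taxonomy version stays pinned.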
For example:
A labor-market data provider classifying millions of job postings cannot afford shifting occupation labels between runs
A compliance platform mapping evidence to control frameworks needs outputs that can be reviewed and defended
A supplier data platform enriching company records needs repeatable industry mapping across a large corpus
Accuracy Modes
Different teams need different tradeoffs between throughput, cost, and precision. Taxer supports multiple classification approaches depending on the workflow.
Deterministic Classification
High-speed deterministic mapping to official taxonomy codes or other structured classification systems.
Best for:
Recurring production pipelines
Large-scale enrichment jobs
Analytics datasets
Historical backfills
Standardized classification infrastructure
Typical use cases include:
Tagging job postings into occupation taxonomies
Mapping company descriptions into NAICS or ISIC
Categorizing product listings into marketplace category trees
Assigning news or document streams into topic taxonomies
Normalizing supplier records into industry or capability categories
This mode is especially useful when teams need high throughput, predictable cost, and reproducible outputs.
Designed for Both Scale and Precision
Taxer supports a wide range of classification workloads.
Precision Labeling
For smaller or medium-sized datasets where label quality is the priority.
Examples include:
- Benchmark datasets
- Research corpora
- Premium customer deliverables
Operational Data Pipelines
For recurring production classification jobs that run continuously.
Examples include:
- New job postings flowing into a labor-market platform
- New SKUs entering a product feed or marketplace system
- Incoming supplier records being normalized into spend or industry taxonomies
- Daily news ingestion being mapped into client-specific categories
Massive Document Processing
For bulk workloads ranging from backfills to continuous high-volume classification.
Examples include:
- Monthly job-posting backfills
- Large product catalog normalization jobs
- Continuous article classification streams
- Enterprise document archives
- Large supplier or company-profile enrichment runs
Taxer supports workloads from:
- Thousands of documents
- Tens of thousands of documents
- Millions of documents
- Billions of classifications per month
Classification workflows can be tuned for:
- Cost efficiency
- Throughput
- Maximum labeling precision

The right balance depends on what the application requires.
Why Teams Use Taxer
Teams typically adopt Taxer when they hit one of these problems:
Rule Systems Quickly Stop Scaling
What worked for 50 labels breaks down at 500, 5,000, or 10,000+ labels
Manual Review Becomes the Bottleneck
Analysts, ops teams, or support staff end up triaging documents that should be classified automatically
LLM-Only Approaches Are Too Expensive or Unstable
Per-document generative classification is hard to control at high volume and difficult to audit
In-House Systems Are Costly to Build and Maintain
Large-taxonomy classification infrastructure is not the main thing companies want to spend time on
For early customers, this often looks like:
A product-feed company trying to reduce rule sprawl across channel taxonomies
A jobs platform trying to enrich millions of postings without blowing up unit economics
A compliance vendor trying to reduce evidence triage and review load
A market-intelligence product trying to support customer-specific taxonomies without building a custom model for every account
A supplier intelligence platform trying to improve NAICS or capability coverage at scale
How Taxer Works
Input → Semantic Normalization → Exhaustive Taxonomy Evaluation → Ranked Results → Optional AI Review
Input
Provide unstructured text such as:
- Job postings
- Resumes
- Product descriptions
- Company and supplier profiles
- Evidence documents
- News articles
- Structured records with text fields
Optional metadata such as a title can also be included.
Semantic Normalization
The system analyzes the document to extract relevant contextual signals, language patterns, and structural indicators.
This helps normalize messy real-world text such as:
- Inconsistent job titles
- Noisy product catalog language
- Incomplete supplier descriptions
- Unstructured evidence metadata
- Article text with domain-specific terminology
Exhaustive Taxonomy Evaluation
For each selected taxonomy, Taxer evaluates the document against every category in the taxonomy.
Each category receives an independent score between 0 and 1 representing alignment strength.
This is one of Taxer's key differences: it evaluates across the full taxonomy rather than guessing from a partial label set.
That matters most when working with:
- Large occupation or skill taxonomies
- Marketplace category trees
- Compliance control frameworks
- Custom industry or risk taxonomies
- Supplier capability hierarchies
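To make the exhaustive-evaluation idea concrete, here is a toy sketch. The scoring function (simple token overlap), the token sets, and the SOC-style codes are illustrative stand-ins, not Taxer internals — the point is only that every category in the taxonomy receives an independent score in [0, 1], rather than the engine guessing from a shortlist.

```python
def alignment_score(doc_tokens: set, category_tokens: set) -> float:
    """Toy alignment: Jaccard overlap between token sets, in [0, 1]."""
    if not doc_tokens or not category_tokens:
        return 0.0
    return len(doc_tokens & category_tokens) / len(doc_tokens | category_tokens)

# Illustrative three-category "taxonomy" with keyword signatures.
taxonomy = {
    "15-1252": {"software", "developer", "applications"},
    "29-1141": {"registered", "nurse", "clinical"},
    "13-2011": {"accountant", "auditor", "financial"},
}

doc = {"software", "developer", "python"}
# Score the document against *every* category, not a partial label set.
scores = {code: alignment_score(doc, toks) for code, toks in taxonomy.items()}
assert set(scores) == set(taxonomy)  # exhaustive coverage of the taxonomy
assert all(0.0 <= s <= 1.0 for s in scores.values())
```

With large taxonomies the same property holds: a weak-but-best match still surfaces because no category was pruned away before scoring.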
Ranked Results
Taxer returns the top matching classifications including:
- Taxonomy code
- Category label
- Alignment score
Scores represent alignment strength, not probabilities. This lets teams:
- Use top results directly in production
- Route low-confidence cases into review queues
- Combine deterministic outputs with downstream business logic
- Support explainable review workflows for sensitive applications
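A minimal sketch of the review-queue routing pattern described above (the field names and the 0.6 threshold are assumptions for illustration, not the actual response schema): accept the top match when its alignment score clears a cutoff, otherwise queue the document for human review.

```python
REVIEW_THRESHOLD = 0.6  # assumed cutoff; tune per workflow

def route(ranked_results: list) -> str:
    """Return 'auto' to accept the top label, 'review' to queue it."""
    if not ranked_results:
        return "review"
    top = max(ranked_results, key=lambda r: r["score"])
    return "auto" if top["score"] >= REVIEW_THRESHOLD else "review"

results = [
    {"code": "15-1252", "label": "Software Developers", "score": 0.82},
    {"code": "15-1251", "label": "Computer Programmers", "score": 0.41},
]
assert route(results) == "auto"      # confident top match -> use directly
assert route([]) == "review"         # nothing matched -> human review
```

Because scores are alignment strengths rather than probabilities, the threshold is a business decision: stricter for compliance workflows, looser for high-volume enrichment.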
Optional AI Review
When maximum precision is required, additional AI layers can score and compare candidate classifications before returning a final result.
This is particularly useful for:
- Benchmark and gold-standard labeling
- Premium customer deliverables
- Sensitive compliance workflows
- High-value curation pipelines
Ways to Use Taxer
API
Integrate classification directly into existing products, ingestion systems, and pipelines.
- Tagging new job postings as they are ingested
- Classifying supplier descriptions during enrichment
- Assigning category codes to marketplace listings before export
- Labeling incoming articles into customer-defined topic sets
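For orientation, an ingestion hook might assemble a request like the sketch below. The field names ("text", "taxonomies", "title") are assumptions for illustration — consult the API documentation for the real request schema.

```python
import json

def build_classification_request(text, taxonomies, title=None):
    """Assemble a JSON-serializable classification request."""
    payload = {"text": text, "taxonomies": taxonomies}
    if title is not None:
        payload["title"] = title  # optional metadata, as noted above
    return payload

req = build_classification_request(
    "We manufacture precision fasteners for aerospace suppliers.",
    taxonomies=["naics"],
    title="Acme Fastener Co.",
)
body = json.dumps(req)  # ready to send from an ingestion pipeline
```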
Bulk Processing
Upload CSV, Excel, JSON, or XML files and classify very large datasets in batch.
- Backfilling months of historical job postings
- Reclassifying product catalogs across new sales channels
- Processing evidence archives against internal control taxonomies
- Mapping large company datasets into official industry codes
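A minimal sketch of assembling a CSV batch for bulk upload (the column names are assumptions; the expected layout is defined in the bulk-processing documentation):

```python
import csv
import io

# One document per row, with an id column for joining results back.
rows = [
    {"id": "1", "text": "Hospital seeking ICU registered nurse"},
    {"id": "2", "text": "Wholesale distributor of industrial valves"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "text"])
writer.writeheader()
writer.writerows(rows)
csv_payload = buf.getvalue()  # upload this as the batch file
```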
Interactive UI
Paste text and immediately see ranked taxonomy classification matches.
- Reviewing a new job description draft
- Testing company-to-industry classification accuracy
- Evaluating a policy or compliance evidence document
- Quickly validating taxonomy fit before running a full bulk job
Built for
Teams Like These
Taxer is especially relevant for:
- Labor-market and workforce data teams
- Procurement and supplier intelligence teams
- Compliance and GRC product teams
- News, media, and market intelligence platforms
- Product feed and marketplace infrastructure teams
- Data engineering teams building large-scale pipelines
These teams often share the same challenge: they need classification to work reliably, but they do not want to build a specialized taxonomy-classification stack from scratch.
Explore deeper documentation:
- API documentation
- Deterministic classification
- Security and data handling
- Supported taxonomies
- Accuracy modes
- Custom and extended taxonomy support
Start Classifying Documents with Official Taxonomies
Try the interactive demo, explore supported taxonomies, or integrate the API into your data pipeline