For those who touch it, product data is often labelled as ‘broken’ when product listings get rejected, search filters show empty results, or customers are unable to find key specs. However, in most businesses, it’s not that the catalogue is damaged. Rather, it was never properly completed in the first place. Product records were populated with just enough information to be launched, and then left in a permanently ‘good enough’ state.
This article explains why this happens, how unfinished catalogues create operational inefficiencies and erode revenue, and what it takes to finish product records so that you can scale them across channels and keep improving them.
‘Broken’ vs ‘unfinished’: Not a question of semantics
First, let’s define the difference:
Broken data is wrong: elements like incorrect values, duplicated SKUs, corrupted variants, or outdated prices presented as ‘current.’
Unfinished data is absent or thin: The tell-tale signs include missing attributes, incomplete digital assets, one generic description (where you really need channel-ready fields), and no coverage for the attributes which power search, filters, compliance, and syndication.
If you treat unfinished data as ‘broken,’ you fall into the trap of constantly running cleaning sprints and then wondering why they don’t seem to move the dial on performance. Ultimately, the real blockers are blanks and gaps in data, not just errors.
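To make the distinction concrete, here is a minimal sketch of an audit that separates the two kinds of finding. The field names and validity checks are hypothetical assumptions for illustration; a real catalogue would derive them from its own attribute model.

```python
from typing import Any, Callable

# Hypothetical validity checks per attribute (assumed for this example only).
VALIDATORS: dict[str, Callable[[Any], bool]] = {
    "price": lambda v: isinstance(v, (int, float)) and v > 0,
    "material": lambda v: v in {"steel", "timber", "pvc"},
    "width_mm": lambda v: isinstance(v, (int, float)) and v > 0,
}


def audit(record: dict[str, Any]) -> dict[str, list[str]]:
    """Split a record's problems into 'unfinished' (absent) and 'broken' (wrong)."""
    findings: dict[str, list[str]] = {"unfinished": [], "broken": []}
    for name, is_valid in VALIDATORS.items():
        value = record.get(name)
        if value is None or value == "":
            # Nothing to clean here: the value has to be sourced and filled.
            findings["unfinished"].append(name)
        elif not is_valid(value):
            # A value exists but is wrong: this is a cleaning task.
            findings["broken"].append(name)
    return findings


record = {"sku": "A-100", "price": -5, "material": ""}
print(audit(record))
# {'unfinished': ['material', 'width_mm'], 'broken': ['price']}
```

Counting the two lists separately across a category gives a quick indication of whether a cleaning sprint or a completion effort is the better investment.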
How and why catalogues end up unfinished
Partial population is the default outcome of how product data management happens in real life:
- Migrations prioritise continuity over completeness. During ERP/PIM migrations, teams load minimum viable fields to avoid disruption to trading and make vague promises to themselves to “enrich later.” ‘Later’ tends not to arrive!
- Supplier data becomes the foundation. Suppliers provide operational specs for their systems, not channel-ready content for yours. When you ingest it as-is, the gaps become permanent.
- Attribute models evolve but legacy products don’t. New marketplaces, new compliance fields, new filter requirements: your schema expands, but you don’t backfill your legacy ranges.
- Enrichment is selective. Heroes and best-sellers get attention, but the long tail stays thin, making quality hit-and-miss across categories.
This is why ‘product data quality’ problems often show up indirectly as inconsistency and underperformance rather than obvious mistakes.
The operational consequences: Everything is down to missing data
Unfinished catalogues create entirely predictable failure modes:
- Search and filters underperform. Missing structured attributes mean products don’t appear when customers filter by size, material, fitment, ratings, or compatibility. That’s lost discovery, not an issue with the UX.
- Channel publishing fails. Marketplaces reject feeds when mandatory fields are empty or values aren’t in the stipulated format.
- Non-compliance blocks timely launches. If regulated fields (especially origin, safety flags, ingredients, certifications, and sustainability data) are absent, products stall in draft form or end up being pulled after they go live.
- Customer service becomes a data lookup function. Service teams waste time answering questions which a finished record should have answered at source.
All these symptoms look like a system failure, but in fact they’re usually a catalogue completion failure.
Why clean-up work doesn’t solve the issue
Of course, cleaning is necessary, but not as an end in itself.
Product data cleaning removes duplicates, standardises formats, and corrects obvious errors. However, it’s completion which will fill what’s missing and upgrade skimpily populated fields into channel-ready artefacts:
- Structured attributes
- Controlled vocabularies
- Variant rules
- Digital assets
- Copy which actually matches channel requirements and search intent
Completion requires a distinct operating model because someone has to be ultimately accountable for “what good always looks like” per category and per channel, not just “let’s concentrate on fixing what’s wrong today.”
What it takes to finish a catalogue
To finish records at scale, you need three things: Definition. Workflow. Enforcement.
1) Define ‘done’ per category
Create category-level completion specifications:
- Required and conditional attributes (with units, formats, allowed values)
- Channel-specific mandatory fields and mappings
- Asset requirements (angles, resolution, naming conventions)
- Variant logic (parent/child, inheritance, compatibility rules)
The above are the practical core elements of PIM data governance.
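A completion specification of this kind can be written down as data rather than as a document. The sketch below is one possible shape, not a feature of any particular PIM: the class names, the category, the attribute names, and the channel "marketplace_x" are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class AttributeRule:
    name: str
    unit: str | None = None                     # documented here, not enforced in this sketch
    allowed: frozenset[str] | None = None       # controlled vocabulary, if any
    required_if: tuple[str, Any] | None = None  # conditional: (other attribute, value)


@dataclass
class CategorySpec:
    category: str
    attributes: list[AttributeRule]
    channel_mandatory: dict[str, list[str]] = field(default_factory=dict)
    min_images: int = 1

    def missing(self, record: dict[str, Any], channel: str) -> list[str]:
        """Return the names of everything that stops this record being 'done'."""
        gaps: list[str] = []
        for rule in self.attributes:
            if rule.required_if is not None:
                other, expected = rule.required_if
                if record.get(other) != expected:
                    continue  # condition not met, so the attribute is not required
            value = record.get(rule.name)
            if value in (None, ""):
                gaps.append(rule.name)
            elif rule.allowed is not None and value not in rule.allowed:
                gaps.append(rule.name)
        for name in self.channel_mandatory.get(channel, []):
            if record.get(name) in (None, "") and name not in gaps:
                gaps.append(name)
        if len(record.get("images", [])) < self.min_images:
            gaps.append("images")
        return gaps


spec = CategorySpec(
    category="power_drills",
    attributes=[
        AttributeRule("voltage", unit="V"),
        AttributeRule("power_source", allowed=frozenset({"corded", "cordless"})),
        AttributeRule("battery_ah", unit="Ah", required_if=("power_source", "cordless")),
    ],
    channel_mandatory={"marketplace_x": ["gtin", "brand"]},
    min_images=3,
)

record = {"voltage": 18, "power_source": "cordless", "brand": "Acme", "images": ["front.jpg"]}
print(spec.missing(record, "marketplace_x"))
# ['battery_ah', 'gtin', 'images']
```

Keeping the specification as data means the same definition of ‘done’ can drive validation, reporting, and supplier briefs without being restated in each place.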
2) Build completion into the workflow
Completion must be a governed workflow in your Product Information Management (PIM) system, and not an optional list of tasks:
- Enrichment steps with clear owners (supplier, buying, marketing, regulatory, data stewardship)
- Approval gates for sensitive fields (compliance, safety, claims)
- Readiness scoring so teams can see what is publishable and what is not
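Readiness scoring with clear owners can be sketched in a few lines. This is an illustrative assumption about how a score might be computed, not the scoring method of any specific PIM; the field names and owner labels are hypothetical.

```python
from typing import Any

# Hypothetical mapping of required field -> owning team.
REQUIRED_FIELDS: dict[str, str] = {
    "title": "marketing",
    "gtin": "supplier",
    "material": "buying",
    "safety_sheet": "regulatory",
}


def readiness(record: dict[str, Any], required: dict[str, str] = REQUIRED_FIELDS) -> dict[str, Any]:
    """Score a record by the share of required fields filled, and list gaps per owner."""
    todo: dict[str, list[str]] = {}
    filled = 0
    for name, owner in required.items():
        if record.get(name) in (None, "", []):
            todo.setdefault(owner, []).append(name)
        else:
            filled += 1
    score = filled / len(required) if required else 1.0
    return {"score": round(score, 2), "publishable": score == 1.0, "todo": todo}


record = {"title": "Claw hammer", "gtin": "", "material": "steel"}
print(readiness(record))
# {'score': 0.5, 'publishable': False, 'todo': {'supplier': ['gtin'], 'regulatory': ['safety_sheet']}}
```

Grouping the gaps by owner turns the score from a dashboard number into a work queue for each team.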
3) Enforce standards so that unfinished records can’t leak
If your system allows ‘skip for now,’ people will skip it! Genuine enforcement means:
- Validation rules that block progression when required fields are empty
- Controlled overwrite rules so supplier/ERP refreshes don’t wipe enriched fields
- Monitored completeness KPIs so unfinished areas don’t quietly expand
This is how you move from ‘episodic’ product data enrichment to continuous and consistent completeness.
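As an illustration of a controlled overwrite rule, the sketch below merges a supplier or ERP refresh into an existing record without wiping enriched values. The list of protected fields is a hypothetical assumption; which fields count as enriched would come from your own governance model.

```python
from typing import Any

# Hypothetical set of fields that enrichment teams own once they are filled.
PROTECTED_FIELDS = frozenset({"description", "material", "images"})


def apply_refresh(
    current: dict[str, Any],
    incoming: dict[str, Any],
    protected: frozenset[str] = PROTECTED_FIELDS,
) -> dict[str, Any]:
    """Merge an upstream refresh, keeping protected fields that already hold a value."""
    merged = dict(current)
    for name, value in incoming.items():
        already_enriched = name in protected and current.get(name) not in (None, "", [])
        if already_enriched:
            continue  # keep the enriched value; the refresh must not overwrite it
        merged[name] = value
    return merged


current = {"sku": "A-100", "price": 12.5, "description": "Enriched copy", "material": ""}
incoming = {"price": 13.0, "description": "", "material": "steel"}
print(apply_refresh(current, incoming))
# {'sku': 'A-100', 'price': 13.0, 'description': 'Enriched copy', 'material': 'steel'}
```

Operational fields such as price still follow the upstream system, while an empty protected field can still be filled by the refresh; only values that have already been enriched are shielded.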
What ‘good’ looks like in practice
Start with Data’s work with MKM Building Supplies followed a pattern that is common for us:
- Redesigning schema/taxonomy
- Sourcing and enriching product information at scale
- Using that information to fill attribute gaps and normalise specifications (so the catalogue supports digital journeys instead of fighting them)
That’s what genuine ‘finishing’ is in operational terms: Making product records complete enough to perform effectively across channels, rather than just ‘existing.’
Book a discovery call
If your teams’ never-ending lament is “the data is broken,” there’s a good chance that you’re dealing with an unfinished catalogue and no shared definition of what ‘done’ really means. Get in touch with us today at Start with Data. A discovery call will map where completeness is failing (by category, channel, and workflow) and which changes to your operating model will get your records finished and, crucially, keep them finished.