It’s so often the case that an AI initiative in product data starts with a tool demo and a tight deadline for go-live. After all, this shiny new innovation is going to turbo-charge the business, right? Then, when things aren’t living up to expectations, it’s the catalogue that gets blamed for outputs which are wrong, inconsistent, or simply unsafe to publish.
This article pinpoints the data failures that undermine AI tools and shows the operational consequences across search, merchandising, and supplier onboarding. Finally, we set out a corrective sequence you can execute:
- Stabilise
- Standardise
- Enforce
Use it as a checklist before you scale AI pilots into production.
1. The failure: product facts aren’t captured as facts
Most catalogues do contain product facts. The problem is that they’re trapped in:
- Free-text descriptions
- Supplier PDFs and spec sheets
- Product titles stuffed with attributes
- Spreadsheets with mixed units and conventions
Typical structural faults include:
- duplicated SKUs with conflicting attributes
- variant families split into separate products
- attributes reused with different meanings (“size” as UK number, S/M/L, and inches)
- unit drift (mm/cm/in; g/kg/lbs) stored as text strings
Your new AI doesn’t resolve these conflicts. On the contrary, it learns them.
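As a concrete illustration, the first fault above, duplicated SKUs with conflicting attributes, can be surfaced before any AI touches the data. A minimal Python sketch; the rows and attribute names are hypothetical:

```python
from collections import defaultdict

# Hypothetical catalogue rows: (sku, attribute, value)
rows = [
    ("A100", "colour", "Navy"),
    ("A100", "colour", "navy blue"),  # duplicate SKU, conflicting value
    ("A100", "weight", "2kg"),
    ("B200", "size", "M"),
]

def find_conflicts(rows):
    """Group values per (sku, attribute) and flag any pair with more than one distinct value."""
    seen = defaultdict(set)
    for sku, attr, value in rows:
        seen[(sku, attr)].add(value)
    return {key: sorted(vals) for key, vals in seen.items() if len(vals) > 1}

conflicts = find_conflicts(rows)
print(conflicts)  # {('A100', 'colour'): ['Navy', 'navy blue']}
```

A report like this, run per supplier feed, tells you exactly which records an AI tool would otherwise learn from.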
2. Why AI makes it worse (and does it faster)
When the foundations of your data are weak, AI amplifies these issues in three predictable ways:
- Repetition: wrong titles or attributes become wrong descriptions at scale
- Misclassification: bad taxonomy drives bad recommendations and poor semantic search
- False confidence: outputs look plausible, so errors reach customers and channels
Some of the most common failure modes you’ll recognise:
- hallucinated materials when “Material” is empty
- “compatible accessories” suggested, but without relationship data
- inconsistent spelling and terminology (colour vs color; cm vs centimetre vs CM)
3. Operational consequences
In day-to-day operations, these inconsistencies turn into constant manual rework and, naturally, delays:
- Merchandising spends time reviewing AI copy instead of enriching ranges
- Search teams chase “relevance” problems that are actually missing attributes
- Syndication feeds fail validation or get suppressed because formats are wrong
- Customer service has to deal with avoidable questions and disputes
And you can measure the commercial and risk impact:
- lower conversion from broken discovery and comparison
- higher returns from incorrect specs or compatibility claims
- margin leakage from incorrect shipping rules and pack sizes
- compliance exposure where safety, warranty, or expiry dates are wrong
4. What does “AI-ready” mean in practice?
AI readiness is not a question of volume. It’s three enforced basics:
A. Completeness (per category)
Don’t audit which fields exist; audit whether the attributes that matter most in buying decisions are actually populated:
- dimensions (packaged and unpackaged)
- weight and capacity
- material and finish
- voltage/phase/IP rating where relevant
- compatibility, certifications, and intended use
The targets to put in place:
- Defined, mandatory attributes per category
- Fill-rate measured by SKU and by supplier
- Anything below a 95% fill-rate treated as operational debt
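Those targets can be checked mechanically. A minimal sketch, assuming a hypothetical per-category attribute spec and product rows:

```python
# Hypothetical per-category spec: attributes that matter in buying decisions
REQUIRED = {"lighting": ["dimensions", "weight", "voltage", "ip_rating"]}

products = [
    {"sku": "L1", "supplier": "Acme", "category": "lighting",
     "dimensions": "120x60x40 mm", "weight": "0.8 kg", "voltage": "230 V", "ip_rating": "IP44"},
    {"sku": "L2", "supplier": "Acme", "category": "lighting",
     "dimensions": "", "weight": "1.1 kg", "voltage": None, "ip_rating": "IP20"},
]

def fill_rate(product):
    """Share of required attributes that are actually populated for this SKU."""
    required = REQUIRED[product["category"]]
    filled = sum(1 for attr in required if product.get(attr))
    return filled / len(required)

for p in products:
    rate = fill_rate(p)
    status = "OK" if rate >= 0.95 else "OPERATIONAL DEBT"
    print(p["sku"], p["supplier"], f"{rate:.0%}", status)  # e.g. "L2 Acme 50% OPERATIONAL DEBT"
```

Aggregating the same rate by supplier shows where the debt is coming from.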
B. Consistency (units, vocabulary, format)
Make sure the same concept looks the same everywhere:
- numeric value stored separately from unit
- controlled value lists (for example, Navy, not navy blue or dark navy)
- standard date format (use ISO where systems allow for it)
- consistent naming rules for titles and feature bullets
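The first two rules above can be sketched in a few lines; the unit table, regex, and colour map here are illustrative assumptions, not a full parser:

```python
import re

# Illustrative assumptions: canonical length unit is millimetres; small controlled colour list
UNIT_TO_MM = {"mm": 1.0, "cm": 10.0, "m": 1000.0, "inch": 25.4, "in": 25.4, '"': 25.4}
COLOUR_MAP = {"navy": "Navy", "navy blue": "Navy", "dark navy": "Navy"}

def parse_length(raw):
    """Split free text like '55 cm' into a numeric value plus a canonical unit (mm)."""
    m = re.fullmatch(r'\s*([\d.]+)\s*(mm|cm|m|inch|in|")\s*', raw.lower())
    if not m:
        raise ValueError(f"unparseable length: {raw!r}")
    value, unit = float(m.group(1)), m.group(2)
    return value * UNIT_TO_MM[unit], "mm"

print(parse_length("10mm"))     # (10.0, 'mm')
print(parse_length("1 cm"))     # (10.0, 'mm')
print(COLOUR_MAP["dark navy"])  # Navy
```

Once value and unit live in separate typed fields, filters, feeds, and AI tools all read the same number.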
C. Structure (relationships and typed fields)
AI needs structure it can reason over. Common examples:
- parent–child variant links and defined variant axes
- accessory/consumable/spare-part relationships
- cross-references for equivalents and replacements
- attribute datatypes and validation rules
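Attribute datatypes and validation rules can be expressed as a small schema. Everything in this sketch (attribute names, ranges, the controlled colour list) is hypothetical:

```python
# Hypothetical attribute dictionary: a datatype plus a validation rule per attribute
SCHEMA = {
    "voltage_v": {"type": (int, float), "check": lambda v: 0 < v <= 1000},
    "ip_rating": {"type": str, "check": lambda v: v.startswith("IP") and v[2:].isdigit()},
    "colour":    {"type": str, "check": lambda v: v in {"Navy", "Black", "White"}},
}

def validate(attr, value):
    """Typed-field check: value must match the declared datatype and pass the rule."""
    rule = SCHEMA[attr]
    return isinstance(value, rule["type"]) and rule["check"](value)

print(validate("voltage_v", 24.0))      # True
print(validate("ip_rating", "IP44"))    # True
print(validate("colour", "navy blue"))  # False: not in the controlled list
```

The same schema doubles as documentation for suppliers and as a gate at ingestion.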
5. B2C and B2B: Examples of how AI can fail
B2C:
- Furniture: depth and width swapped → AI copy repeats the error → returns rate goes up
- Electronics: screen sizes stored as 55”, 55-inch, 139.7 cm → filters and AI search disagree
- DIY: drill bits listed as 10mm, 1 cm, 3/8″ → recommendations miss exact matches
B2B:
- Industrial fittings: “50mm”, “5 cm”, “2 inch” create three filter values → buyers hesitate and go elsewhere
- Electrical components: 24VDC vs 24 V; phase missing → AI suggests incompatible parts
- Chemicals: litres mixed with gallons → pack sizes are inconsistent → reporting on price-per-unit collapses
6. The corrective sequence: Stabilise. Standardise. Enforce.
Stabilise (stopping new contamination)
- quarantine inbound supplier files
- block free-text entry for measurable attributes
- add validation rules (datatype, range, allowed values)
- create an exception queue and approval gate
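The quarantine, validation, and exception-queue steps above can be sketched together. The rules and supplier rows here are hypothetical:

```python
CONTROLLED_COLOURS = {"Navy", "Black", "White"}

def validate(row):
    """Hypothetical rules: weight must be a positive number; colour must come from a controlled list."""
    errors = []
    weight = row.get("weight_kg")
    if not isinstance(weight, (int, float)) or weight <= 0:
        errors.append("weight_kg must be a positive number")
    if row.get("colour") not in CONTROLLED_COLOURS:
        errors.append("colour not in controlled list")
    return errors

def ingest(rows, validate):
    """Valid rows pass through; the rest are quarantined in an exception queue for review."""
    accepted, exceptions = [], []
    for row in rows:
        errors = validate(row)
        if errors:
            exceptions.append({"row": row, "errors": errors})
        else:
            accepted.append(row)
    return accepted, exceptions

rows = [
    {"sku": "A1", "weight_kg": 0.8, "colour": "Navy"},
    {"sku": "A2", "weight_kg": "heavy", "colour": "dark navy"},  # both values fail validation
]
accepted, exceptions = ingest(rows, validate)
print(len(accepted), len(exceptions))  # 1 1
```

Free-text entry for measurable attributes simply never makes it past the gate; humans clear the exception queue instead of cleaning the catalogue after the fact.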
Standardise (defining the model)
- publish an attribute dictionary with definitions and examples
- set ‘golden’ units and conversion rules
- update supplier templates and mapping specs
- clarify packaged vs unpackaged dimensions and naming
Enforce (making it non-optional)
- reject non-conforming values at ingestion
- track exceptions by supplier and category
- set completeness thresholds before syndication
- run regular audits for drift and duplicates
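A completeness threshold before syndication, with exceptions tracked by supplier, might look like this minimal sketch (data and field names are hypothetical):

```python
from collections import Counter

def syndication_gate(products, fill_rate, threshold=0.95):
    """Release only products meeting the completeness threshold; hold the rest."""
    released = [p for p in products if fill_rate(p) >= threshold]
    held = [p for p in products if fill_rate(p) < threshold]
    return released, held

def exceptions_by_supplier(held):
    # Drift tends to cluster: count which suppliers generate sub-threshold SKUs
    return Counter(p["supplier"] for p in held)

# Hypothetical data: fill rate is a pre-computed field for the purposes of this sketch
products = [
    {"sku": "A1", "supplier": "Acme", "fill": 1.0},
    {"sku": "A2", "supplier": "Acme", "fill": 0.6},
    {"sku": "B1", "supplier": "Bolt", "fill": 0.9},
]
released, held = syndication_gate(products, fill_rate=lambda p: p["fill"])
print([p["sku"] for p in released])  # ['A1']
print(exceptions_by_supplier(held))
```

Running the same gate on a schedule is also a cheap drift audit: a supplier whose held count climbs week on week is your next conversation.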
7. The next step: align before you automate
If teams are fixing AI outputs in spreadsheets, you’re essentially paying twice for the same gap in your data governance. Therefore, begin with a data assessment that scores completeness, consistency, and structure in your highest-revenue categories, then turn those findings into a remediation backlog of specific artefacts:
- Attribute sets
- Validation rules
- Supplier templates
- Workflows
- Approval gates
In our experience, it’s better to delay production rollouts of generative descriptions, semantic search re-ranking, and automated attribute extraction until your assessment shows a stable schema, validated values, and governed supplier inputs. If you rush ahead, review cycles tend to expand and the outputs remain brittle at best, untrustworthy at worst.
Everyone’s getting in on the act with AI. But as we’ve seen, there are perils and pitfalls to negotiate first. If you’re not getting what you need from your AI tools, get in touch with us today at Start with Data: we can arrange a product data structure audit for you and get to the root of the issue.