Don’t put the blame on them! In fact, suppliers can usually follow a product structure; they just can’t follow the one some businesses have. What they tend to receive is a template with column headers and “required” fields. So far, so good. But they’re then expected to comply with a web of implicit rules, for instance:
- Category logic that lives in someone’s head
- Attributes whose baseline meaning shifts depending on product type
- Variant patterns which only make sense inside your storefront
- Validation standards that are only visible once their file gets rejected
As a result, the same supplier may be sending “good” data but still failing onboarding. That’s not because they’re careless; it’s because the merchant’s product structure isn’t an operational backbone shared across the organisation. Instead, it’s a set of assumptions that’s impossible to adhere to consistently from the outside (without insider information!).
Below, we’ll dive into the whys and wherefores, and how to deal with this situation.
Inconsistent inbound data is rarely the real problem
When teams complain about inconsistent inbound data, they usually mean one of three things:
- supplier files vary week to week
- internal teams submit product info in different shapes
- mappings keep failing, so onboarding defaults to manual
These symptoms are real, but treating them as a “data quality” issue often sends you into an endless loop of cleaning, chasing exceptions, and building fundamentally unstable (and short-term) import rules.

In essence, inconsistency is usually what happens when the structure is unclear or unusable. It’s a fallacy to point at “messy data” or “bad suppliers” as scapegoats. The more likely culprit is a model that doesn’t give inbound data anywhere stable to land, or conversely, gives it so many possible landing spots that each feed becomes a one-off mini-project.
The big tell: the same product can be ‘right’ in two different ways
If you’ve ever had a dispute about where an attribute belongs (whether it’s category vs specification, product vs variant, or global vs category-specific), you’ve already inadvertently found the structural issue.
A fit-for-use structure should do three things:
- Define what a product is in your world (and where variants begin)
- Define which attributes apply to which items (and under what conditions)
- Define what “valid” looks like, so imports can be enforced, not negotiated
If any of these three pillars is missing, inbound data becomes inconsistent by default, because it’s left to every sender to decide how they fill that gap.
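To make those pillars concrete, here’s a minimal sketch of what they can look like when expressed as data rather than tribal knowledge. Everything named below (the category, the attributes, the units, the levels) is a hypothetical example, not a reference to any particular PIM’s API.

```python
# A minimal sketch of the three pillars expressed as data. Every name here
# (category, attributes, units, allowed values) is an invented example.
from dataclasses import dataclass, field

@dataclass
class AttributeDef:
    name: str
    required: bool = False
    allowed_values: list[str] | None = None  # None means free text
    unit: str | None = None
    level: str = "product"                   # "product" or "variant"

@dataclass
class CategoryDef:
    name: str
    attributes: list[AttributeDef] = field(default_factory=list)

# Pillar 1: what a product is, and where variants begin (the `level` flag).
# Pillar 2: which attributes apply to items in this category.
# Pillar 3: what "valid" looks like (required flags, allowed values, units).
power_tools = CategoryDef(
    name="Power Tools",
    attributes=[
        AttributeDef("brand", required=True),
        AttributeDef("description", required=True),
        AttributeDef("voltage", required=True, unit="V", level="variant"),
        AttributeDef("colour", allowed_values=["Black", "Blue", "Red"], level="variant"),
    ],
)
```

The exact shape matters far less than the fact that these decisions exist somewhere machine-readable, so every importer (and every supplier) is held to the same definitions.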
Why “just standardise the supplier template” doesn’t have sticking power
Supplier onboarding programmes often begin with a template: “Here are the columns we need.” That helps, but it breaks down quickly if your structure can’t support consistency. There are several common failure patterns:
1) Your categories aren’t operational
If your taxonomy is a list of labels rather than a governed hierarchy with rules, suppliers will classify products in ways that make sense to them, not to your navigation, reporting, or channel feeds (especially if those rules are never made explicit).
What happens? The same item appears in multiple places, or in a category that doesn’t have the right attribute set, and once again, you’re back to manual interventions.
2) Attribute meaning isn’t defined
“Colour”, “Finish”, “Shade”, “Material”, “Primary material” — when definitions are vague for suppliers, mapping becomes a case of guesswork and duplicates tend to multiply.
So, while inbound data technically ‘arrives’, it won’t behave: search filters fragment, facets split across near-duplicate attributes, and your teams trust the catalogue less and less.
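One lightweight way to pin attribute meaning down is a small registry that maps the column names suppliers actually send onto a single canonical attribute, with its allowed values spelled out. The attribute names and synonyms below are invented for illustration; treat this as a sketch of the idea, not a prescribed implementation.

```python
# Hypothetical registry: each canonical attribute lists the supplier column
# names it absorbs, so "Colour", "Color" and "Shade" stop becoming three fields.
CANONICAL_ATTRIBUTES = {
    "colour": {
        "synonyms": {"colour", "color", "shade"},
        "allowed_values": {"black", "white", "navy"},
    },
    "primary_material": {
        "synonyms": {"material", "primary material"},
        "allowed_values": {"oak", "steel", "abs plastic"},
    },
}

def resolve_attribute(supplier_column: str) -> str | None:
    """Return the canonical attribute for a supplier column name, if one is defined."""
    column = supplier_column.strip().lower()
    for canonical, spec in CANONICAL_ATTRIBUTES.items():
        if column in spec["synonyms"]:
            return canonical
    return None  # unmapped columns are surfaced for a decision, not silently imported

print(resolve_attribute("Shade"))   # -> colour
print(resolve_attribute("Finish"))  # -> None (needs a decision, not a guess)
```

The point is that an unmapped column becomes an explicit decision for a person, rather than yet another near-duplicate attribute.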
3) Product vs variant is inconsistent
If the model doesn’t clearly separate product-level truths (brand, range, description) from variant-level truths (size, pack quantity, voltage), supplier feeds will force that decision on every import. This is why some imports create duplicate products, while others overwrite variants incorrectly.
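A sketch of how that separation can be made explicit, assuming the model declares which fields identify the product and which identify the variant (the field names here are illustrative assumptions):

```python
from collections import defaultdict

# Hypothetical declaration of where the product ends and the variant begins.
PRODUCT_FIELDS = ("brand", "range", "description")
VARIANT_FIELDS = ("size", "pack_quantity", "voltage")

def group_rows(rows: list[dict]) -> dict[tuple, list[dict]]:
    """Group flat supplier rows into products keyed by product-level fields,
    each holding its own variant-level records."""
    products: dict[tuple, list[dict]] = defaultdict(list)
    for row in rows:
        product_key = tuple(row.get(f) for f in PRODUCT_FIELDS)
        variant = {f: row.get(f) for f in VARIANT_FIELDS}
        products[product_key].append(variant)
    return dict(products)

rows = [
    {"brand": "Acme", "range": "Pro", "description": "Cordless drill", "voltage": "18V"},
    {"brand": "Acme", "range": "Pro", "description": "Cordless drill", "voltage": "24V"},
]
# Two rows, one product, two variants: no duplicate products, no overwritten variants.
print(group_rows(rows))
```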
4) Validation is missing or impossible
Even the most resilient PIM platforms rely on the model being coherent enough to validate against. If requirements change by category, channel, or business unit but aren’t expressed as such within the structure, you can’t enforce rules without hardcoding exceptions.
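Expressed in the structure, those varying requirements become data the importer can evaluate, rather than exceptions someone has to hardcode. A minimal sketch, with invented categories and rules:

```python
# Hypothetical per-category rules expressed as data: required fields and units.
CATEGORY_RULES = {
    "Power Tools": {"required": {"brand", "description", "voltage"}, "units": {"voltage": "V"}},
    "Paint":       {"required": {"brand", "colour", "volume"},       "units": {"volume": "L"}},
}

def validate(record: dict, category: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record can land."""
    rules = CATEGORY_RULES.get(category)
    if rules is None:
        return [f"unknown category: {category!r}"]
    problems = [f"missing required field: {f}" for f in rules["required"] if not record.get(f)]
    for field_name, unit in rules["units"].items():
        value = str(record.get(field_name, ""))
        if value and not value.upper().endswith(unit):
            problems.append(f"{field_name} should be expressed in {unit}: got {value!r}")
    return problems

print(validate({"brand": "Acme", "description": "Drill", "voltage": "18"}, "Power Tools"))
# -> ["voltage should be expressed in V: got '18'"]
```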
The underlying mechanism: inconsistency is structural debt
Inbound data will always look inconsistent when the business has accumulated structural debt, such as:
- legacy categories kept for reporting
- inherited attribute sets that don’t match the current assortment
- individual channel requirements bolted on as extra fields
- governance split among teams who don’t share definitions
At this point, no amount of cleaning can adequately stabilise onboarding. You can ‘polish’ the feed, but only until the next supplier arrives with a different shape, when the same underlying structural ambiguity will cause the same problems.
What a usable structure changes immediately
A fit-for-purpose structure doesn’t eliminate supplier variation. Rather, it makes that variation manageable. Specifically, it allows you to:
- map once, reuse often (because categories and attributes behave predictably; see the sketch after this list)
- reduce the number of ‘special cases’ (because rules are expressed in the model)
- validate at ingestion (so issues are flagged early, not unearthed in the channel itself)
- separate data collection from enrichment (so you’re not trying to fix and clean data at import)
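The “map once, reuse often” point can be as simple as treating each supplier’s mapping as configuration against one canonical model, instead of a bespoke import script per feed. The supplier names and columns below are invented:

```python
# Hypothetical per-supplier mappings onto one canonical model. Adding a new
# supplier means adding an entry, not writing another importer.
SUPPLIER_MAPPINGS = {
    "supplier_a": {"Brand": "brand", "Item description": "description", "Volts": "voltage"},
    "supplier_b": {"Manufacturer": "brand", "Desc": "description", "Voltage (V)": "voltage"},
}

def to_canonical(row: dict, supplier: str) -> dict:
    """Rename a supplier row's columns to canonical attribute names; anything
    unmapped is kept aside so it gets flagged rather than silently dropped."""
    mapping = SUPPLIER_MAPPINGS[supplier]
    canonical = {mapping[col]: value for col, value in row.items() if col in mapping}
    canonical["_unmapped"] = {col: v for col, v in row.items() if col not in mapping}
    return canonical

print(to_canonical({"Manufacturer": "Acme", "Desc": "Drill", "Voltage (V)": "18V"}, "supplier_b"))
```

Whether that mapping lives in code, a database, or your PIM’s own mapping tooling matters less than the fact that it’s written down once and reused.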
This is also the point at which a PIM stops being a mere data repository and becomes an operating system: workflow, governance, and enforcement can only work when the product structure is stable enough to anchor them.
The fastest way to fix it: a structure audit
A structure audit isn’t just a taxonomy ‘tidy-up’. It’s an exhaustive, practical diagnosis of whether inbound data can land reliably.
A quality audit should typically examine:
- taxonomy integrity: where categories are overloaded, duplicated, or used inconsistently (a small check of this kind is sketched after this list)
- schema clarity: attribute definitions, ownership, allowed values, units, and dependencies
- product/variant logic: where the model forces inconsistent grouping
- ingestion reality: how supplier files and internal sources actually arrive (not how you wish they arrived)
- validation readiness: what rules can be enforced today vs what’s currently “tribal knowledge”
- change governance: how structure will evolve without breaking downstream feeds
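Parts of that audit can be mechanised. As an illustrative example (the category data below is invented), a few lines are enough to surface obvious taxonomy-integrity problems such as near-duplicate labels or categories with no attribute set to validate against:

```python
from collections import Counter

# Hypothetical taxonomy export: category label -> number of attributes defined for it.
categories = {
    "Power Tools": 12,
    "Power tools ": 0,      # near-duplicate label, no attribute set
    "Paint & Decorating": 9,
    "Misc": 0,
}

# Near-duplicates: labels that collide once trimmed and lower-cased.
normalised = Counter(label.strip().lower() for label in categories)
duplicates = [label for label, count in normalised.items() if count > 1]

# Categories with nothing to validate against.
empty = [label for label, attr_count in categories.items() if attr_count == 0]

print("possible duplicates:", duplicates)  # -> ['power tools']
print("no attribute set:", empty)          # -> ['Power tools ', 'Misc']
```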
Its output shouldn’t be a theoretical framework. It needs to be a set of structural decisions and a governed model that reduces mapping effort and increases predictability.
If you recognise the pattern, don’t start with more cleaning
If inbound data keeps failing, it’s because your structure can’t absorb it without human interpretation.
Sure, cleaning helps greatly once the structure is clarified. What doesn’t help is cleaning data before that structure is stable: it simply accelerates the production of data that still won’t behave consistently.

If inbound product data keeps arriving in unpredictable shapes, or your onboarding process relies on excessive manual mapping and exception handling, you need a structure audit. Reach out to us today and we can organise one with you. It’ll show exactly where your taxonomy and schema are preventing consistency, and what needs to change so supplier and internal feeds can land cleanly and be validated efficiently.