Sitation Blog

Creating a Good Product Data Model and Common Missteps

August 12, 2025

Catherine Marquand

SVP, Data & Content Services and Chief Customer Officer

A robust, well-structured data model is foundational to effective digital merchandising. It organizes product information in a way that supports a more intuitive customer experience, higher conversion, and business growth. Ever wondered what makes a product data model strong, or how to begin building one? We’re here to offer some advice and highlight common pitfalls.

The Foundation: What Makes a Good Data Model?

A good data model is more than just a collection of attributes and categories. It’s a strategic framework that anticipates user behavior, supports business objectives, and scales with your growth. Here are the key characteristics of a strong data model:

  • Scalability: Can it handle growth in products, categories, and attributes without breaking down or becoming cumbersome? 
  • Flexibility: Can it adapt to changing business needs, market trends, and new product lines?
  • Usability: Is it easy for internal teams to manage and for suppliers and customers to navigate?
  • Consistency: Does it apply a uniform approach across all products and categories?
  • Integrity: Does it accurately represent your product information without redundancy or errors?
  • Performance: Does it allow for quick retrieval of information, ensuring a fast and responsive user experience?

If your hierarchy falls short on any of these criteria, it’s perfectly fine to iterate on what you have. Defining a successful data model typically takes multiple iterations with key stakeholders.

Do I need more than one product data model?

Even in organizations that don’t formally establish and manage multiple taxonomies within their Product Information Management (PIM) system, it’s common to find several data models serving distinct operational purposes. This multi-model approach is often a practical necessity, driven by the diverse needs of different departments and systems.

Typically, we observe the existence of at least three primary types of hierarchies or data models:

  • Financial Hierarchy: This model is predominantly stored and managed within the Enterprise Resource Planning (ERP) system. Its primary function is to support financial reporting, cost accounting, and revenue tracking. It dictates how products, services, and associated revenues/expenses are categorized for ledger entries, budgeting, and financial analysis. This hierarchy is crucial for ensuring regulatory compliance and providing a clear financial overview of the business.
  • Product Hierarchy: This critical hierarchy often resides in both the ERP system and the PIM system, and the two versions do not need to match. In both systems, it defines how products are grouped and classified at a fundamental level. In an ERP, it might influence everything from inventory management to supply chain operations. In a PIM, it drives classification, product content enrichment (attributes, digital assets, and marketing content), and syndication and consistency across retailers and channels.
  • Merchandising Hierarchies: These hierarchies are specifically designed for customer-facing applications, particularly for publishing product information to owned commerce platforms (e-commerce websites, mobile apps, etc.). Unlike the master product hierarchy, merchandising hierarchies are optimized for discoverability, user experience, and sales conversion. They may involve product groupings based on categories, collections, themes, or even seasonal trends, making it easier for customers to browse and find what they’re looking for. These hierarchies are highly flexible and often adapted to A/B testing results and customer behavior analytics to maximize engagement and sales.

The challenge and opportunity lie in effectively managing the relationships and data flow between these different data models. While each serves a unique purpose, maintaining consistency and accuracy across them is paramount for operational efficiency, data integrity, and a seamless customer experience.
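One way to picture the relationship between these models is a single product record that carries a separate placement in each hierarchy. The sketch below is illustrative only: the Product class, its field names, the SKUs, and the sample category paths are all assumptions made for this example, not the schema of any particular ERP or PIM.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Product:
    """Hypothetical record linking one SKU to each hierarchy."""
    sku: str
    financial_node: str                         # ERP: ledger/reporting category
    product_node: str                           # PIM: master classification
    merchandising_nodes: Tuple[str, ...] = ()   # site: zero or more placements


def unmerchandised(products: List[Product]) -> List[str]:
    """SKUs that are classified in the PIM but not yet placed on the site."""
    return [p.sku for p in products if not p.merchandising_nodes]


catalog = [
    Product("SKU-001", "Hardware Revenue", "Printers > 3D Printers",
            ("Technology > 3D Printing",)),
    Product("SKU-002", "Consumables Revenue", "Printer Supplies > Filament"),
]

print(unmerchandised(catalog))  # ['SKU-002']
```

Keeping the three placements as separate fields lets each hierarchy change on its own schedule, while a simple check like unmerchandised() surfaces gaps in the data flow between them.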

Sitation’s Guiding Principles for Taxonomy Creation

Over the years, we’ve developed a set of seven guiding principles, designed to ensure data models are not only effective but also sustainable. These principles are especially crucial when developing product taxonomies for digital merchandising:

  1. Usability: The taxonomy should be intuitive and easy for end-users to understand and navigate. If a customer can’t find it, they can’t buy it.
  2. Specificity: Categories and attributes should be clearly defined and distinct, avoiding ambiguity. For example, instead of a broad “Accessories” category, consider “Phone Accessories,” “Laptop Accessories,” etc.
  3. Consistency: Apply naming conventions, attribute definitions, and categorization rules uniformly across all products and categories. This is vital for maintaining a clean and manageable data set.
  4. Proximity: Group related items together naturally. If a customer is looking for a specific type of product, all relevant variations and accessories should be easily discoverable nearby.
  5. Balance: Aim for a balanced depth and breadth in your taxonomy. Avoid overly shallow taxonomies that lack detail, and excessively deep ones that can make navigation cumbersome.
  6. Uniqueness: Each product or attribute should have a unique place within the taxonomy to avoid confusion and ensure accurate filtering.
  7. Common Sense: Sometimes, the simplest and most logical approach is the best. Don’t overcomplicate the structure; prioritize clarity and user experience.
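Two of these principles, Balance and Uniqueness, lend themselves to simple automated checks. The following is a minimal sketch, assuming category paths are stored as strings joined by " > " and that a depth of two to four levels is the target range. Both the delimiter and the thresholds are assumptions for illustration; the right range depends on your assortment.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

SEPARATOR = " > "  # assumed path delimiter


def depth_outliers(paths: Iterable[str], min_depth: int = 2,
                   max_depth: int = 4) -> List[str]:
    """Leaf paths that are shallower or deeper than the chosen range."""
    return [p for p in paths
            if not min_depth <= len(p.split(SEPARATOR)) <= max_depth]


def multiple_homes(assignments: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """SKUs assigned to more than one category path."""
    homes = defaultdict(set)
    for sku, path in assignments:
        homes[sku].add(path)
    return {sku: sorted(p) for sku, p in homes.items() if len(p) > 1}


leaf_paths = [
    "Accessories",
    "Electronics > Phones > Phone Accessories",
    "Electronics > Computers > Laptops > Gaming > 15 Inch > Intel",
]
print(depth_outliers(leaf_paths))
# ['Accessories', 'Electronics > Computers > Laptops > Gaming > 15 Inch > Intel']

print(multiple_homes([
    ("SKU-9", "Electronics > Phones > Phone Accessories"),
    ("SKU-9", "Accessories"),
    ("SKU-7", "Accessories"),
]))
# {'SKU-9': ['Accessories', 'Electronics > Phones > Phone Accessories']}
```

Checks like these only flag candidates for review. Whether a flagged path is actually a problem remains a judgment call for the people who own the taxonomy.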

Common Missteps in Product Taxonomy Creation

Even with the best intentions, it is easy to make some of the common mistakes we see in client hierarchies. A well-structured product taxonomy is crucial for discoverability, user experience, and internal data management, yet several common pitfalls can derail its development. Here are some of the most frequently observed missteps:

  • Lack of a Clear Strategy and Defined Goals: Many organizations dive into taxonomy creation without first establishing a clear understanding of why they need one and what it should achieve. Is the primary goal to improve customer search, streamline internal inventory, or optimize for SEO? Without defined objectives, the taxonomy can become a muddled and inconsistent structure that fails to serve any purpose effectively.
  • Insufficient Stakeholder Involvement: Building a comprehensive taxonomy requires input from various departments, including sales, marketing, product development, IT, and customer service. Failing to involve all relevant stakeholders can lead to a taxonomy that doesn’t reflect the nuances of different product lines, customer language, or business processes. This often results in a system that is difficult to maintain and unpopular with users.
  • Overly Broad or Granular Categories: Striking the right balance in category depth is critical. Categories that are too broad (e.g., “Electronics”) offer little value for navigation or search. Conversely, categories that are too granular (e.g., “15.6-inch Full HD Anti-Glare Laptop with Intel Core i7-11800H Processor and 16GB RAM”) can overwhelm users and make it impossible to browse effectively. The ideal taxonomy provides enough detail for users to find what they need without creating an unmanageable number of options.
  • Inconsistent Naming Conventions and Terminology: A common misstep is the lack of standardized naming conventions across the taxonomy. Using synonyms for the same concept (e.g., “pants” and “trousers”) or inconsistent capitalization and punctuation can confuse users and undermine search accuracy. Establishing a style guide and adhering to it rigidly is essential for a clean and intuitive taxonomy.
  • Ignoring User Behavior and Language: A taxonomy should be built with the end-user in mind. Businesses often create taxonomies based on internal product classifications or manufacturing specifications, which may not align with how customers actually search for or describe products. Conducting user research, analyzing search queries, and employing analytics data can provide valuable insights into customer language and preferences, helping to shape a more user-centric taxonomy.
  • Neglecting Scalability and Future Growth: A well-designed taxonomy should be flexible enough to accommodate new products, categories, and business expansions without requiring a complete overhaul. Failing to consider future growth can lead to a rigid structure that quickly becomes obsolete as the product catalog evolves.
  • Lack of Ongoing Maintenance and Governance: A product taxonomy is not a “set it and forget it” solution. Products are added, retired, and updated, and customer language can evolve. Without a clear governance process, including regular reviews, updates, and designated ownership, the taxonomy will inevitably become outdated, inaccurate, and less effective over time.
  • Over-reliance on Automated Categorization Without Human Oversight: While AI and machine learning tools can assist in initial categorization, they are rarely a substitute for human oversight. Automated systems can struggle with nuance, context, and the subtle distinctions that human experts can identify. A hybrid approach, combining technology with human review and refinement, typically yields the best results.
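The hybrid approach in the last point can be sketched as a simple routing step: automated suggestions above a confidence threshold are accepted, and everything else goes to a human reviewer. This is a minimal illustration, not a description of any specific tool. The tuple layout, the route() function, and the 0.9 threshold are all assumptions chosen for the example.

```python
from typing import Iterable, List, Tuple

# (sku, suggested category, confidence between 0 and 1) -- assumed layout
Prediction = Tuple[str, str, float]


def route(predictions: Iterable[Prediction], threshold: float = 0.9
          ) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into auto-accepted and needs-human-review lists."""
    accepted: List[Prediction] = []
    review: List[Prediction] = []
    for prediction in predictions:
        confidence = prediction[2]
        (accepted if confidence >= threshold else review).append(prediction)
    return accepted, review


accepted, review = route([
    ("SKU-1", "Office > Paper", 0.97),
    ("SKU-2", "Office > 3D Printing", 0.62),
])
print([p[0] for p in accepted])  # ['SKU-1']
print([p[0] for p in review])    # ['SKU-2']
```

In practice the threshold would be tuned per category, and a sample of auto-accepted items would still be spot-checked so that systematic errors do not go unnoticed.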

Client Misstep Examples:

  1. “Junk Drawer” Category: non-specific structures 
    • Misstep: A large national retailer we work with had a history of pursuing the ‘endless aisle’ e-commerce strategy, where the goal is pure assortment expansion. As a consequence of its ballooning taxonomy, the retailer has a large ‘junk drawer’ top-level (L1) category it affectionately calls “Expanded Assortment.” This is essentially a final destination for items that don’t fit neatly into other categories.
    • Impact: Since suppliers and customers have no idea what they can find here, items classified in this L1 suffer from low engagement and poor product data quality (including misclassification), and the products hidden within it represent missed sales opportunities. It also makes internal management a nightmare, but since this category is a lower priority, it lives on year after year.
  2. Attribute Anarchy: Enumerated List Nightmares
    • Misstep: Another extremely common misstep is duplication and a lack of parallel structure in an enumerated list or ‘picklist’ attribute. One large retailer that we work with has over 200 valid values on its public-facing website for the ‘size’ attribute in the women’s category. These include everything from ‘M’ and ‘medium’ to ‘14W’ and ‘14 W’, along with a multitude of other difficult-to-manage values.
      • We often see this same problem in attributes like ‘brand’ and ‘color’ as well.
    • Impact: When suppliers are allowed to write free text into a field that should be managed with set values, it’s impossible for every supplier to guess the ideal format consistently. Product teams are left in a constant battle to rework content data, and customers are left with a poor experience on the website.
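The ‘size’ example shows why free-text values should be mapped to a controlled picklist at intake. Below is a minimal sketch of that idea, assuming a small set of valid sizes and a hand-maintained alias table. VALID_SIZES, ALIASES, and normalize_size() are hypothetical names for illustration; a real picklist would be far longer and managed in the PIM.

```python
import re
from typing import Optional

# Assumed picklist and aliases, for illustration only.
VALID_SIZES = {"XS", "S", "M", "L", "XL", "14W"}
ALIASES = {"SMALL": "S", "MEDIUM": "M", "LARGE": "L"}


def normalize_size(raw: str) -> Optional[str]:
    """Map a free-text size to a picklist value, or None if unrecognized."""
    key = re.sub(r"\s+", "", raw).upper()  # '14 W' -> '14W', ' m ' -> 'M'
    key = ALIASES.get(key, key)
    return key if key in VALID_SIZES else None


for value in ["M", "medium", "14W", "14 W", "one size"]:
    print(repr(value), "->", normalize_size(value))
```

Returning None for unrecognized input, rather than passing it through, is a deliberate choice here: it forces unknown values into a review queue instead of letting them become the 201st valid value.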

When It’s Okay to Break the Rules (Carefully)

While our guiding principles are essential, there are rare occasions where a strategic deviation might be necessary. Here are a couple of “common sense” examples where you have full authority to ‘break the rules’:

  • Emerging Trends/Niche Products: If a new product category is rapidly emerging but doesn’t fit perfectly into your existing structure, creating a temporary or experimental category might be beneficial to capitalize on the trend quickly, even if it slightly breaks strict consistency. However, it’s critical to have a plan to integrate it properly later.
  • Shopper Behavior-Driven: In certain categories, with specific assortments, we find the need to put two dissimilar items together. This is driven by trends showing that shoppers often buy the items together and would otherwise get lost navigating between multiple categories on the digital shelf. This typically occurs when the items in focus are a tiny part of the overall assortment and product data on the individual items is highly complete.

One real-world example we’ve seen while working with a large office supplies retailer is merchandising 3D printers and 3D printer filaments together. Best practices tell us these items are very different and need mutually exclusive attributes, but we found stronger SEO impact and shopper conversion when putting them into the same category.
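One way to keep attributes mutually exclusive while sharing a category is to key attribute requirements on product type rather than on the merchandising category. The sketch below assumes that design. The product types, attribute names, and REQUIRED_BY_TYPE table are hypothetical and are not taken from the retailer’s actual model.

```python
from typing import Dict, List, Set

# Hypothetical attribute requirements, keyed by product type, not category.
REQUIRED_BY_TYPE: Dict[str, Set[str]] = {
    "3d_printer": {"build_volume", "print_technology"},
    "3d_filament": {"material", "diameter_mm"},
}


def missing_attributes(item: dict) -> List[str]:
    """Required attributes the item lacks, based on its product type."""
    required = REQUIRED_BY_TYPE.get(item["product_type"], set())
    return sorted(required - set(item["attributes"]))


shared_category = "3D Printing"
items = [
    {"sku": "P-1", "category": shared_category, "product_type": "3d_printer",
     "attributes": {"build_volume": "220 x 220 x 250 mm"}},
    {"sku": "F-1", "category": shared_category, "product_type": "3d_filament",
     "attributes": {"material": "PLA", "diameter_mm": 1.75}},
]
for item in items:
    print(item["sku"], missing_attributes(item))
# P-1 ['print_technology']
# F-1 []
```

With this separation, both items appear in the same customer-facing category, yet a filament is never asked for a build volume and a printer is never asked for a diameter.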

The Sitation Approach to Data Model Creation

We advocate an iterative approach to data model creation, recognizing that the model is an evolving artifact rather than a static blueprint. This process typically involves two to three refinement cycles in which data intelligence drives the initial structure, followed by human refinement:

  1. Data-Driven Initial Creation: In the first phase, we use automated tools and analytical techniques to analyze the identified data sources. The goal is to let the inherent patterns, relationships, and structures within the data largely dictate the initial shape of the data model. This objective analysis reveals natural groupings and hierarchies, using your customer data, catalog, and customer search patterns as the initial sources of information.
  2. Human Oversight and Refinement: Once a data-driven preliminary model is established, the critical human element comes into play. Key stakeholders review, scrutinize, and ultimately have the final say on updates and modifications.
  3. Iteration and Optimization: Based on the human feedback and initial usage, the model undergoes further iterations. This might involve refining existing structures, adding new attributes, or optimizing relationships for better performance or analytical clarity. Each iteration brings the model closer to an optimal state, balancing automated efficiency with human intelligence and business relevance.

By embracing this comprehensive, collaborative, and iterative methodology, organizations can create data models that are not only technically sound but also strategically aligned with business goals, adaptable to change, and truly empowering for data-driven decision-making.

Conclusion

A well-architected data model and a thoughtfully designed taxonomy are investments that pay dividends in customer satisfaction, operational efficiency, and increased revenue. By adhering to principles like usability, specificity, and consistency, and by being mindful of common pitfalls, you can build a digital merchandising experience that stands out. At Sitation, we’re here to help you navigate these complexities and build a data model that drives your business forward.