How MCM Uses Catalog Fields for Relevance and Filtering
This guide explains which catalog fields most affect the relevance of ads in MCM (our ML models and retrieval) and which fields are used only for filtering. You can use it when designing your catalog schema or improving feed quality.
Two different roles for catalog fields
In MCM, catalog fields are used in two main ways:
- Relevance & retrieval: fields whose text content is used by our search relevance systems and embedding models to decide which items are a good match for a user’s query or context, and how to rank them.
- Filtering only: fields that act as constraints (e.g., category, color, delivery option) to include or exclude items, but **do not make an item “more relevant”** to a query on their own.
Both are important, but they solve different problems:
- To improve what shows up for a given search or context → focus on relevance fields.
- To narrow down results safely without empty pages → focus on filtering fields.
Fields that impact search relevance (retrieval & ranking)
Today, our search relevance and retrieval systems primarily use the following five fields to understand what an item is and how it matches a user’s query:
- title
- categories (category tree)
- ad_account_title
- brand
- description
These are the fields with the largest impact on semantic matching between a query like “running shoes for men” and the items in your catalog.
Field-by-field guidance
title
- Used very heavily in both search relevance and embedding-based retrieval.
- Keep titles concise but descriptive: include the terms shoppers actually search for, such as brand, product type, and key attributes (gender, age group, primary spec).
Good examples
- Nike Air Zoom Pegasus 40 Men's Running Shoes
- Apple iPhone 15 Pro 256GB – Titanium Blue
Avoid:
- Overly long marketing copy.
- Stuffing irrelevant keywords.
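As a rough illustration of the title guidance above, the sketch below assembles a title from structured attributes. The function name `build_title`, its parameters, and the 80-character cap are assumptions made for this example; MCM does not prescribe a specific length limit in this guide.

```python
def build_title(brand, product_name, attributes, max_length=80):
    """Compose a concise title: brand, product name, then key attributes.

    Hypothetical helper for illustration; max_length is an assumed cap.
    """
    parts = [brand.strip(), product_name.strip()]
    parts += [a.strip() for a in attributes if a and a.strip()]

    # Drop empty and repeated parts (case-insensitive), preserving order.
    seen = set()
    unique = []
    for part in parts:
        key = part.lower()
        if part and key not in seen:
            seen.add(key)
            unique.append(part)

    # If the title is too long, drop trailing attributes first,
    # always keeping the brand and product name.
    title = " ".join(unique)
    while len(title) > max_length and len(unique) > 2:
        unique.pop()
        title = " ".join(unique)
    return title


print(build_title("Nike", "Air Zoom Pegasus 40", ["Men's", "Running Shoes"]))
```

Putting the least important attributes last means they are the first to be dropped when a title runs long.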
categories
Category paths (e.g., Men > Shoes > Running) are used for:
- Search relevance (semantic understanding of product type and context).
- Filtering when users choose categories on your site.
Best practices
- Maintain clear, intuitive hierarchies that mirror how users browse.
- Avoid using categories as a “keyword bucket”; they should represent true taxonomy, not search terms.
- Keep category paths stable where possible; large restructures can temporarily affect relevance until models re-adapt.
ad_account_title
- Alternative or supplementary title at the ad-account level.
- Used to provide extra text signals for relevance (e.g., marketing-focused naming, internal naming tuned per advertiser).
Best practices
- Use when you need a slightly different naming convention per advertiser or campaign.
- Keep it human-readable and descriptive; avoid internal-only codes or IDs as the main content.
brand
- Brand text is treated as a strong relevance signal and is also often used as a filter.
- Users frequently search by brand name, and our systems are tuned to recognize that.
Best practices
- Use the canonical brand name your users know (NIKE, Apple, UNIQLO).
- Avoid mixing brand with other attributes (e.g., Nike Official Mall – Summer is less clean than just Nike).
description
- Provides rich, additional text beyond the title.
- Helpful for nuanced queries (materials, style, detailed attributes) when the query text overlaps with description.
Best practices
- Include genuinely useful product information (material, style, intended use, fit, unique features).
- Avoid pure marketing slogans with no useful keywords (e.g., “Best quality ever!!!”) as the only content.
How to prioritize work on relevance fields
If you’re planning catalog cleanup or improvements:
- Start with title and categories – they usually deliver the largest impact on retrieval quality.
- Then improve brand and description for your top SKUs and key categories.
- Use ad_account_title strategically when you need account-specific naming.
Fields used only for filtering (not for semantic relevance)
You may use up to 5 filtering criteria per decision request. More details available here.
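A client could check this limit before sending a request. The sketch below assumes filters are held in a plain dict keyed by field name; `MAX_FILTERS` and `check_filter_count` are hypothetical names for this example and are not part of the Decision API.

```python
MAX_FILTERS = 5  # limit stated in this guide


def check_filter_count(filters):
    """Return the filters unchanged, or raise if there are too many.

    `filters` is assumed to be a dict of field name -> value; the real
    request format may differ.
    """
    if len(filters) > MAX_FILTERS:
        raise ValueError(
            f"{len(filters)} filtering criteria supplied; "
            f"at most {MAX_FILTERS} are allowed per decision request"
        )
    return filters


filters = {"color": "red", "gender": "male", "condition": "new"}
check_filter_count(filters)  # passes: 3 criteria
```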
Many catalog fields are used as filters during retrieval and decisioning but do not affect how “relevant” an item looks to a query.
One important example:
- color – currently used for filtering only, not for query matching in search relevance.
  - This means a query like “red t-shirt” is matched using the text in title / description / categories (e.g., the word “red”), not the color field itself.
More generally, MCM supports filtering on a set of filterable catalog fields (currently 14+), including those listed in the next section.
Typical filtering-only fields
Some fields (like brand and categories) are used both as relevance signals (text) and filters (IDs/structured values). Below we focus on the filtering role.
Availability & logistics
- location – where the item is available.
- delivery_option – e.g., one_day, same_day, regular, dawn.
- condition – new, used, refurbished.
Price & promotions
- price
- sale_price
Audience & business
- business_type – e.g., B2B, B2C; used to separate business vs consumer items.
- age_group – adult, teen, kids, etc.
- gender – male, female, unisex.
Visual & material attributes
- color – filtering only; not used in query matching today.
- size – supports multi-value in many setups (e.g., multiple size notations).
- material – e.g., leather, cotton, wool.
- pattern – striped, floral, etc.
Identifiers
- brand_id – ID representation of brand, used for deterministic brand filtering; separate from the brand name text, which does affect relevance.
These fields are used in our filtering-aware retrieval and general filtering layers to ensure that:
- Only eligible items (e.g., in-stock, right location, right audience) are considered.
- Filters coming from your UI (e.g., “Color = Red”, “Business type = B2B”) are respected.
They do not increase an item’s likelihood of being retrieved for a free-text query unless that information is also represented in the relevance fields (title, categories, brand, description, ad_account_title).
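One way to act on this is to scan your feed for items whose filter value never appears in the relevance text. The sketch below is a minimal example, assuming each item is a dict and that `categories` may be a string or a list of strings; the helper names are hypothetical, and the whole-word match is a simplification of how real relevance systems treat text.

```python
import re

# The five relevance fields named in this guide.
RELEVANCE_FIELDS = ("title", "categories", "ad_account_title", "brand", "description")


def relevance_text(item):
    """Concatenate the relevance fields of an item into one lowercase string."""
    parts = []
    for field in RELEVANCE_FIELDS:
        value = item.get(field)
        if isinstance(value, (list, tuple)):
            parts.extend(str(v) for v in value)
        elif value:
            parts.append(str(value))
    return " ".join(parts).lower()


def filter_value_missing_from_text(item, filter_field="color"):
    """True if the filter value is set but absent (as a whole word) from the text."""
    value = item.get(filter_field)
    if not value:
        return False
    pattern = r"\b" + re.escape(str(value).lower()) + r"\b"
    return re.search(pattern, relevance_text(item)) is None


items = [
    {"title": "Cotton T-Shirt", "description": "Soft crew neck tee", "color": "red"},
    {"title": "Red Cotton T-Shirt", "color": "red"},
]
flagged = [i["title"] for i in items if filter_value_missing_from_text(i)]
print(flagged)  # the first item: "red" is only in the color field
```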
Practical recommendations for platforms
To improve retrieval relevance
Focus your feed work on:
- High-quality titles that include:
  - Product type
  - Key attributes (brand, core spec, gender/age group, usage context)
  - Common user language (what people actually type in search boxes).
- Clean, meaningful categories that mirror your site’s navigation.
- Accurate brand names and, where useful, account-specific ad titles (ad_account_title) that stay human-readable.
- Rich but focused descriptions that mention important details users might search for (material, fit, style, compatibility, etc.).
To use filters safely (without empty result pages)
When you add filters in your UI or Decision API requests:
- Ensure your filterable fields (color, size, material, location, delivery_option, business_type, age_group, gender, price/sale_price, brand_id, etc.) are:
  - Populated for most relevant items
  - Consistent (e.g., always one_day, not a mix of one-day, One day, etc.).
- Be cautious introducing very restrictive filters (e.g., combining many conditions at once), as they reduce the candidate pool and can lead to empty responses if the catalog data is sparse or inconsistent.
- If you want better behavior for a query like "red t-shirt":
  - Make sure the word “red” appears in the title or description of items where it matters.
  - Use the color filter optionally (e.g., via a facet) rather than relying on it as the only mechanism to capture “redness” in search relevance.
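The "populated" and "consistent" checks above can be automated in a feed audit. The sketch below is one possible approach, assuming items are dicts and that spelling variants differ only in case, spaces, and hyphens; `normalize_value` and `audit_field` are hypothetical helper names.

```python
import re
from collections import defaultdict


def normalize_value(value):
    """Lowercase and turn runs of spaces or hyphens into underscores."""
    return re.sub(r"[\s\-]+", "_", value.strip().lower())


def audit_field(items, field):
    """Return (fill_rate, inconsistent) for one filter field.

    fill_rate is the share of items with a non-empty value;
    inconsistent maps each normalized value to its raw spellings
    when more than one spelling is in use.
    """
    variants = defaultdict(set)
    populated = 0
    for item in items:
        raw = item.get(field)
        if raw is None or not str(raw).strip():
            continue
        populated += 1
        variants[normalize_value(str(raw))].add(str(raw))
    fill_rate = populated / len(items) if items else 0.0
    inconsistent = {k: sorted(v) for k, v in variants.items() if len(v) > 1}
    return fill_rate, inconsistent


feed = [
    {"delivery_option": "one_day"},
    {"delivery_option": "One day"},
    {"delivery_option": "one-day"},
    {},
]
fill_rate, inconsistent = audit_field(feed, "delivery_option")
print(fill_rate, inconsistent)
```

A low fill rate or any entry in `inconsistent` suggests that a filter on that field could silently exclude eligible items.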
FAQ
Q1. “Do you use the color field to match queries like ‘red t-shirt’?”
- Today, no – color is used for filtering only.
- Relevance is based on text fields (title, categories, ad_account_title, brand, description). If the word “red” appears there, it can improve matching; the color field alone will not.
Q2. “Which fields should we optimize first for better search relevance?”
Prioritize, in order:
- title
- categories
- brand
- description
- ad_account_title
These are the fields our relevance systems use most strongly.
Q3. “Can filtering fields ever affect model performance?”
Indirectly, yes:
- Good filtering fields help ensure the model only sees eligible candidates, which improves the quality of what users see and can improve downstream metrics.
- But to make a specific item more likely to be retrieved for a given query, you should improve the relevance text fields, not just filtering fields.
Summary
- Relevance & retrieval: driven primarily by title, categories, ad_account_title, brand, description.
- Filtering only: fields like color, size, material, condition, location, delivery_option, business_type, age_group, gender, price/sale_price, brand_id constrain which items are considered, but do not increase semantic relevance on their own.
- For best results:
  - Invest in high-quality text content in the five relevance fields.
  - Keep filter fields accurate and consistent to avoid empty or overly narrow result sets.
This division lets you design your catalog so that search feels smart and relevant, while filters remain reliable and safe for your users.
