MerchantOps

Product Lakehouse

The Product Lakehouse is your staging area for raw product data. Upload documents, review extracted data, and promote verified products to your main catalog.

A tour of the Product Lakehouse and its key features.

What is the Product Lakehouse?

The Product Lakehouse is a staging area where raw product data lands before being promoted to your main catalog. Think of it as an inbox for your product data that allows you to:

  • Upload documents in various formats (PDF, Excel, CSV)
  • Review AI-extracted data for accuracy
  • Verify column mappings from source files
  • Clean up any parsing issues
  • Promote approved products to the Catalog

This two-stage approach keeps unreviewed or poorly parsed data out of your main product catalog: products reach the Catalog only after you have verified and promoted them.

The Lakehouse Workflow

Getting products from raw files to your enriched catalog follows these steps:

  1. Upload: Add product documents (PDFs, spreadsheets) through the Document Uploads interface
  2. Extract: MerchantOps AI processes your documents and extracts product data automatically
  3. Map: Review the column mappings to ensure source columns map correctly to product fields
  4. Review: Check extracted Lakehouse Products for accuracy before promotion
  5. Promote: Move verified products to your main Catalog for enrichment and export

Workflow diagram: Upload → Map → Review → Catalog
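
As a way to reason about the five steps above, the workflow can be modelled as a strictly ordered sequence of stages. The sketch below is a conceptual illustration only: the `Stage` names, `advance`, and `can_promote` are hypothetical and are not part of any MerchantOps API described on this page.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical stage names mirroring the five workflow steps."""
    UPLOADED = 1
    EXTRACTED = 2
    MAPPED = 3
    REVIEWED = 4
    PROMOTED = 5


def advance(current: Stage) -> Stage:
    """Return the stage that follows `current`; stages cannot be skipped."""
    if current is Stage.PROMOTED:
        raise ValueError("product is already in the Catalog")
    return Stage(current + 1)


def can_promote(current: Stage) -> bool:
    """Only a reviewed product is eligible for promotion to the Catalog."""
    return current is Stage.REVIEWED
```

The point of the model is that promotion is reachable only through review, which is what makes the Lakehouse a staging area rather than a direct import path.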

Lakehouse Features

Brand Technologies

In addition to document processing, the Lakehouse includes a Technologies section where you can manage brand sources. Technologies represent vendors or brand sources that can be used for:

  • Web scraping product information from brand websites
  • Linking products to their brand for enrichment
  • Storing brand-specific metadata

Best Practices

Data Quality

  • Review before promoting: Always check extracted data for accuracy before moving to the Catalog
  • Use consistent file formats: Standardize your vendor file formats when possible
  • Include product identifiers: Ensure files include SKUs or unique product keys
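
If you prepare vendor files yourself, a quick local check before uploading can catch missing identifiers early. The following is a minimal sketch that assumes a CSV file whose identifier column is named `SKU`; the function name and the column name are illustrative assumptions, not MerchantOps requirements.

```python
import csv
import io
from collections import Counter


def find_identifier_problems(csv_text: str, id_column: str = "SKU") -> list[str]:
    """List identifier problems in CSV text: missing column, blanks, duplicates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if id_column not in (reader.fieldnames or []):
        return [f"missing column: {id_column}"]

    problems = []
    seen = Counter()
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        value = (row[id_column] or "").strip()
        if not value:
            problems.append(f"row {row_number}: blank {id_column}")
        else:
            seen[value] += 1

    # Report each identifier that appears more than once.
    problems += [f"duplicate {id_column}: {v}" for v, n in seen.items() if n > 1]
    return problems
```

An empty result means every data row has a non-blank, unique identifier in the chosen column. It says nothing about whether the identifiers themselves are correct, so the review step in the Lakehouse is still needed.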

Workflow Efficiency

  • Batch similar files: Upload files from the same source together for consistent mapping
  • Save column mappings: Reuse mappings for recurring file formats
  • Set up technologies first: Configure brand technologies before uploading their product data
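
To illustrate why saved column mappings help with recurring file formats, a mapping can be thought of as a lookup from source column names to product fields. The sketch below is an assumption-laden illustration: the field names (`sku`, `title`, `price`) and the `apply_mapping` helper are hypothetical and do not describe how MerchantOps stores or applies mappings internally.

```python
def apply_mapping(
    row: dict[str, str], mapping: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Rename a source row's columns to product fields.

    Returns the mapped product and the source columns the mapping did not cover.
    """
    product = {}
    unmapped = []
    for source_column, value in row.items():
        field = mapping.get(source_column)
        if field is None:
            unmapped.append(source_column)
        else:
            product[field] = value
    return product, unmapped


# A mapping saved once for a vendor's format (hypothetical names) ...
vendor_mapping = {"Item No.": "sku", "Description": "title", "List Price": "price"}

# ... can be reused for every later file with the same columns.
row = {"Item No.": "A1", "Description": "Widget", "List Price": "9.99", "Notes": "n/a"}
product, unmapped = apply_mapping(row, vendor_mapping)
```

Columns reported as unmapped are the ones worth checking during the Map step, since they indicate that a vendor's file format has drifted from the saved mapping.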

Next Steps

Start using the Product Lakehouse: