Abstract
The advancement of Artificial Intelligence (AI) models relies heavily on large, high-quality datasets. However, in advanced manufacturing, collecting such data is time-consuming and labor-intensive for a single enterprise. Hence, it is important to establish a context-aware and privacy-preserving data sharing system to share small but high-quality datasets between trusted stakeholders. Existing data sharing approaches have explored privacy-preserving data distillation methods and focused on valuating individual samples tied to a specific AI model, limiting their flexibility across data modalities, AI tasks, and dataset ownership. In this work, we propose a performance-oriented representation learning (PORL) framework in a Directed Graph Neural Network (DiGNN). PORL distills raw datasets into privacy-preserving proxy datasets for sharing and learns compact metadata representations for each stakeholder locally. The metadata are then used in the DiGNN to forecast AI model performance and guide sharing via graph-level supervised learning. The effectiveness of PORL-DiGNN is validated by two case studies: data sharing in a semiconductor manufacturing network between similar processes to create similar quality defect models; and data sharing in the design and manufacturing network of Microbial Fuel Cell anodes between upstream (design) and downstream (Additive Manufacturing) stages to create distinct but related AI models.
Note to Practitioners: To accelerate AI adoption in advanced manufacturing, there is an urgent need for data sharing among participants to prepare high-quality datasets and improve AI model performance. This work proposes a dataset-sharing framework that lays the foundation for future data exchange and trade. Current approaches lack the flexibility to support sharing across diverse contexts and may expose the value of the data prematurely. To address these challenges, we introduce a performance-oriented representation learning framework that generates data for sharing and valuation, securing the value of the data and preserving private information, and then uses graph-based supervised learning to guide sharing decisions for data receivers. The framework's effectiveness and generalizability are demonstrated through two real-world manufacturing dataset-sharing case studies. Industrial participants can use this framework to rank datasets from others based on their predicted utility for specific downstream AI tasks.
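The ranking step mentioned for industrial participants can be illustrated with a small sketch. It is not the paper's DiGNN: the stakeholder names, the metadata vectors, and the use of cosine similarity as a stand-in for the learned performance forecast are all assumptions made for illustration only.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity; a hypothetical stand-in for a learned performance forecast."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_senders(receiver, meta, predict=cosine):
    """Score every directed edge sender -> receiver and sort by predicted utility."""
    scores = {
        sender: predict(vec, meta[receiver])
        for sender, vec in meta.items()
        if sender != receiver
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical compact metadata representations, one per stakeholder.
meta = {
    "fab_A": [1.0, 0.0, 1.0],
    "fab_B": [0.9, 0.1, 1.1],
    "fab_C": [0.0, 1.0, 0.0],
}

ranking = rank_senders("fab_A", meta)
for sender, score in ranking:
    print(f"{sender} -> fab_A: predicted utility {score:.3f}")
```

In a real deployment the `predict` argument would be replaced by whatever trained model forecasts downstream AI performance; only the compact metadata, not the raw data, needs to be visible to the ranking step.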
| Original language | English (US) |
|---|---|
| Pages (from-to) | 15576-15587 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Automation Science and Engineering |
| Volume | 22 |
| DOIs | |
| State | Published - 2025 |
Keywords
- Dataset-sharing
- data trading
- dataset valuation
- manufacturing industrial internet
- representation learning
ASJC Scopus subject areas
- Control and Systems Engineering
- Electrical and Electronic Engineering
Publication title: High-Quality Dataset-Sharing and Trade Based on a Performance-Oriented Directed Graph Neural Network