Data mining in an engineering design environment: OR applications from graph matching

Carol J. Romanowski, Rakesh Nagi, Moises Sudit

Research output: Contribution to journal › Article › peer-review


Data mining has been making inroads into the engineering design environment, an area that generates large amounts of heterogeneous data for which suitable mining methods are not readily available. For instance, an unsupervised data mining task such as clustering requires an accurate measure of distance or similarity. This paper focuses on the development of an accurate similarity measure for bills of materials (BOMs) that can be used to cluster BOMs into product families and subfamilies. The paper presents a new problem, tree bundle matching (TBM), identified as a result of the research; it gives a non-polynomial formulation and a proof that the problem is NP-hard, and suggests possible heuristic approaches.

In the typical life cycle of an engineering project or product, enormous amounts of diverse engineering data are generated. These include BOMs, product design models in CAD, engineering drawings, manufacturing process plans, quality and test data, and warranty records. Such data contain information crucial for the efficient and timely development of new products and variants; however, this information is often not available to designers. Our research employs data mining methods to extract this design information and improve its accessibility to design engineers. This paper focuses on one aspect of the overall research agenda: clustering BOMs into families and subfamilies. It extends previous work on a graph-based similarity measure for BOMs (a class of unordered trees) by presenting the new TBM problem and proving it to be NP-hard. The overall contribution of this work is to demonstrate the application of operations research (OR) techniques, including graph matching, stochastic methods, and optimization, to data mining in the engineering design environment.

Original language: English (US)
Pages (from-to): 3150-3160
Number of pages: 11
Journal: Computers and Operations Research
Issue number: 11
State: Published - Nov 2006
Externally published: Yes


Keywords

  • Bills of material
  • Clustering
  • Matching problems
  • Similarity measure
  • Unordered trees
  • Weighted bipartite matching

ASJC Scopus subject areas

  • Computer Science (all)
  • Modeling and Simulation
  • Management Science and Operations Research

