Data augmentation with Generative Adversarial Networks (GAN)

1 minute read

FakeTables: Using GANs to Generate Functional Dependency Preserving Tables with Bounded Real Data

Citation: Haipeng Chen, Sushil Jajodia, Jing Liu, Noseong Park, Vadim Sokolov, and V. S. Subrahmanian. “Faketables: using GANs to generate functional dependency preserving tables with bounded real data.” In Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 2074-2080. AAAI Press, 2019.

Highlights

  • Developed a GAN architecture to synthesize incomplete tabular data with two constraints that are maintaining column-wise statistical means and semantic properties
  • Improved classification performance with synthesized data and outperformed various state-of-the-art data augmentation approaches

This work mainly tackle situations that organizations or companies can only share a subset of tabular data and column-wise statistical means. The aim of this work is to generate a synthetic table from the released, incomplete sub-table under two constraints that the synthetic table (1) maintains the released column-wise statistical means and (2) preserves the semantic properties of the released sub-table. A novel generative adversarial network, ITS-GAN, is proposed for such purpose. In ITS-GAN, autoencoders are adopted to capture the semantic properties. The two contraints are added as regularization terms to the loss function of the generator. The performance of ITS-GAN is evaluated on two datasets. The results show ITS-GAN improves classification performance and outperforms various state-of-the-art data augmentation approaches.

FakeTable_1