Static tainting extraction approach based on information flow graph for personally identifiable information

Yi Liu, Lejian Liao, Tian Song*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Personally identifiable information (PII) is widely used in applications such as network privacy leak detection, network forensics, and user portraits. Internet service providers (ISPs) and administrators are usually concerned with whether PII has been extracted during network transmission. However, most studies have focused on extraction on the client side and the server side. This study proposes a static tainting extraction approach that automatically extracts PII from large-scale, ISP-level network traffic without requiring manual work or feedback. The proposed approach does not deploy any additional applications on the client side. The information flow graph is built via a tainting process that involves two steps: inter-domain routing and intra-domain infection, the latter of which includes a constraint function (CF) to limit “over-tainting”. Compared with the existing semantic-based approach that uses network traffic from the ISP, the proposed approach performs better, with 92.37% precision and 94.04% recall. Furthermore, three methods that reduce the computing time and the memory overhead are presented. The number of rounds is reduced to 0.0883%, and the execution time overhead to 0.0153%, of those of the original approach.
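The general idea of tainting over a graph with a constraint function can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the graph construction, the two tainting steps, or the form of the CF, so the function `propagate_taint`, the sample key-value nodes, and the length-based constraint below are all hypothetical and chosen only to show how a predicate can stop taint from spreading through uninformative values.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Set

# Hypothetical graph: each node is a "key=value" string observed in traffic,
# and an edge means two values co-occurred (an assumption for illustration).
Graph = Dict[str, Set[str]]


def propagate_taint(
    graph: Graph,
    seeds: Iterable[str],
    constraint: Callable[[str, str], bool],
) -> Set[str]:
    """Breadth-first taint propagation; an edge is followed only if
    constraint(source, target) returns True."""
    seeds = list(seeds)
    tainted: Set[str] = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in tainted and constraint(node, neighbor):
                tainted.add(neighbor)
                queue.append(neighbor)
    return tainted


def long_value_only(source: str, target: str) -> bool:
    """Hypothetical constraint: refuse to taint short values such as "v=1",
    which co-occur with almost everything and would cause over-tainting."""
    value = target.split("=", 1)[-1]
    return len(value) >= 6


graph: Graph = {
    "email=alice@example.com": {"uid=88213377", "v=1"},
    "uid=88213377": {"email=alice@example.com", "imei=356938035643809"},
    "v=1": {"lang=en"},
}

result = propagate_taint(graph, ["email=alice@example.com"], long_value_only)
```

In this toy run the taint reaches the `uid` and `imei` nodes, while `v=1` is rejected by the constraint, so nothing reachable only through it (here `lang=en`) is tainted either.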

Original language: English
Article number: 132104
Journal: Science China Information Sciences
Volume: 63
Issue number: 3
Publication status: Published - 1 Mar 2020

Keywords

  • constraint function
  • information flow graph
  • inter-domain routing
  • intra-domain infection
  • network privacy leak detection
  • network traffic analysis
  • personally identifiable information
  • static tainting
