Complex Data Types (Advanced Data Mining Methods)

  • Sequence data, time series
  • Graph data, social networks
  • Web data, data fusion
  • Research frontiers

Sequence Data

An ordered list or stream of elements, with or without explicit timestamps.

EX: biological sequences, stock prices, transaction histories

  • Sequential pattern mining (frequent sub-sequence)
  • Pairwise sequence alignment (ex: BLAST for biological sequences)
  • Sequence modeling/prediction (ex: Markov chain model)
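
As a minimal sketch of sequence modeling/prediction, a first-order Markov chain can be estimated by counting which symbol follows which. The function names and the toy sequence are hypothetical, chosen only for illustration; real data would usually need smoothing for unseen transitions.

```python
from collections import Counter, defaultdict

def fit_markov_chain(sequence):
    """Estimate first-order transition probabilities P(next | current)."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    # Normalize each row of counts into probabilities.
    return {
        state: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for state, following in counts.items()
    }

def predict_next(model, state):
    """Most probable next symbol, or None if the state has no observed successor."""
    if state not in model:
        return None
    return max(model[state], key=model[state].get)

# Hypothetical toy sequence of daily price moves: U = up, D = down.
moves = list("UUDUUDUUUD")
model = fit_markov_chain(moves)
next_after_up = predict_next(model, "U")
```

Here `model["U"]` holds the estimated probabilities of the move that follows an up day.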

Time Series Data

Sequence data indexed by time: values recorded at successive time points.

A time series is typically analyzed as a combined signal of:

  • T: overall trend
  • C: cyclic patterns
  • R: random noise
  • A: anomalies
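
The components above can be separated with a simple decomposition. This is a sketch only, assuming an additive model (value = T + C + R), a known and odd cycle length, and a 3-sigma rule for anomalies; none of these choices come from the notes, and the function name and toy data are hypothetical.

```python
from statistics import mean, pstdev

def decompose(series, period, threshold=3.0):
    """Additive decomposition into T, C, R, with large residuals flagged as A."""
    n = len(series)
    half = period // 2
    # T: centered moving average over one full cycle (assumes an odd period).
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = mean(series[t - half : t + half + 1])
    # C: average detrended value at each phase of the cycle.
    by_phase = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            by_phase[t % period].append(series[t] - trend[t])
    cycle = [mean(vals) if vals else 0.0 for vals in by_phase]
    # R: what remains after removing trend and cycle.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - cycle[t % period]
        for t in range(n)
    ]
    # A: residuals more than `threshold` standard deviations from zero.
    sigma = pstdev([r for r in residual if r is not None])
    anomalies = [
        t for t in range(n)
        if residual[t] is not None and sigma > 0 and abs(residual[t]) > threshold * sigma
    ]
    return trend, cycle, residual, anomalies

# Hypothetical data: linear trend + 3-step cycle + one injected spike at t = 10.
pattern = [0, 2, -2]
series = [0.5 * t + pattern[t % 3] + (20 if t == 10 else 0) for t in range(30)]
trend, cycle, residual, anomalies = decompose(series, period=3)
```

The first and last `period // 2` points have no trend estimate, so they are left as `None` and never flagged.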

Graph Data

G = (V, E): vertices V are entities, edges E are relationships between them. A very powerful abstraction.

  • find frequent subgraphs
  • anomaly detection
  • graph modeling
  • link prediction
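
Link prediction can be sketched with a neighborhood-overlap heuristic: score each pair of vertices that is not yet connected by the Jaccard similarity of their neighbor sets, then rank. This is one simple baseline among many, not a method prescribed by the notes; the function name and the toy edge list are hypothetical.

```python
from itertools import combinations

def jaccard_link_scores(edges):
    """Rank non-adjacent vertex pairs by Jaccard similarity of their neighbor sets."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked, nothing to predict
        common = neighbors[u] & neighbors[v]
        union = neighbors[u] | neighbors[v]
        scores[(u, v)] = len(common) / len(union)
    # Highest score first: the most likely missing or future links.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical undirected graph G = (V, E), given as an edge list.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
ranked = jaccard_link_scores(edges)
```

In this toy graph the pair ("a", "d") ranks first because the two vertices share neighbors "b" and "c".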

Online Social Network (OSN) Data

Special type of graph with users, groups, content, and interactions.

  • OSN modeling and community detection
  • topic modeling, sentiment analysis
  • information diffusion, recommendation
  • cyber-safety: malicious accounts and behaviors
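
For community detection, a candidate grouping of users is often scored with modularity: the fraction of edges inside communities minus the fraction expected if edges were placed at random with the same degrees. The sketch below only scores a given partition (it does not search for one) and assumes an undirected, unweighted graph; the function name and toy network are hypothetical.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}  # number of edges inside each community
    degree = {}    # total degree of the vertices in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Q = sum over communities of (internal edge share - expected share).
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())

# Hypothetical network: two friend triangles joined by a single bridge edge.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {user: 0 for user in two_groups}
q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

A higher Q indicates a partition with denser within-group ties; here splitting at the bridge scores higher than lumping everyone together.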

Web Data

  • draws on methods for every modality: text, image, audio, video, links

Used to build search engines, recommender systems, and knowledge graphs.

Relates to data fusion, where you combine and cluster multimodal data.

Check out the KDD conference for the state of the art.