Our research is broadly concerned with automated understanding of rich media, especially pictures and language, and with modeling collective online behaviours. We specialize and innovate in a range of methods in machine learning and optimization. Recent favorites include stochastic time series models, sequence and language models with neural networks, matrix and tensor factorisation, active learning, and structured prediction models.

Understanding popularity in social media

Individual preferences are relatively well understood (e.g. the Netflix Prize), but aggregate attention, or popularity, is not. We bridge this gap by explaining the dynamics of attention and connecting models for the user and the network. We measure the longitudinal popularity change of YouTube videos over several years. We propose novel time series descriptions that correlate with popularity. We design new stochastic point process models to describe the ongoing interaction of external promotion and popularity evolution. Our project addresses one of the core problems in computational social science. The results can be used to understand grass-roots movements and organized online campaigns, predict crowd behavior under realistic social settings, and identify potential hits in the digital media landscape.
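
As a rough illustration of how a point process can couple external promotion with popularity evolution, here is a minimal sketch. It assumes a self-exciting (Hawkes-type) intensity with an exponential memory kernel, which is a common textbook choice and not necessarily the form used in our papers. The function name `intensity`, the `promotion` signal, and the parameters `mu`, `alpha`, and `beta` are all hypothetical and chosen for illustration only.

```python
import math
from typing import Callable, Sequence


def intensity(
    t: float,
    events: Sequence[float],
    promotion: Callable[[float], float],
    mu: float = 1.0,
    alpha: float = 0.5,
    beta: float = 1.0,
) -> float:
    """Rate of new attention events (e.g. views) at time t.

    Assumed form (illustrative only):
        lambda(t) = mu * promotion(t) + sum_{t_i < t} alpha * exp(-beta * (t - t_i))
    """
    # Exogenous part: attention driven directly by external promotion.
    exogenous = mu * promotion(t)
    # Endogenous part: each past event raises the current rate,
    # with an influence that decays exponentially over time.
    endogenous = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in events if t_i < t
    )
    return exogenous + endogenous


if __name__ == "__main__":
    # Hypothetical promotion signal: active for the first two time units only.
    promo = lambda t: 1.0 if t < 2.0 else 0.0
    past_events = [0.0, 0.5, 1.0]
    for t in (1.5, 3.0, 6.0):
        print(t, intensity(t, past_events, promo))
```

In this sketch, once promotion stops the rate is sustained only by the decaying influence of earlier events, which is one simple way to picture how external promotion and endogenous popularity interact.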

Papers: ICWSM’17, WWW’17, CIKM’16, ICWSM’15, ACM MM’14


Generating Stylized Descriptions of Images

Recent progress in image recognition and language modelling is making automatic description of image content a reality. Yet these written descriptions lack style and emotion, aspects that help to engage and interest readers. This project aims to introduce style and emotion into automatically generated descriptions. We build upon our SentiCap (AAAI’16) system, which generates image captions with human-level sentiments, and our recent publications on predicting naming choices (WACV’15, MM Workshop’15). The project will address two machine learning challenges. The first is style transfer with a small number of training examples. The second is unsupervised style transfer, where we adapt description generators using unstructured language materials from the target style.

Papers: AAAI’16, WACV’15, MM Workshop’15


Inferring Private Traits over Time from Wikipedia

The cumulative effect of collective online participation has an important and adverse impact on individual privacy. As an online system evolves over time, new digital traces of individual behavior may uncover previously hidden statistical links between an individual’s past actions and her private traits. To quantify this effect, we analyze the evolution of individual privacy loss by studying the edit history of Wikipedia over 13 years, including more than 117,523 different users performing 188,805,088 edits. We trace each Wikipedia contributor using apparently harmless features, such as the number of edits performed on the Mathematics, Culture or Nature sections. We show that even at this unspecific level of behavior description, it is possible to use off-the-shelf machine learning algorithms to uncover usually undisclosed personal traits, such as gender, religion or education.

Paper: WSDM’16


Structured Prediction and Planning for Trajectories

Data modeling over space and time is an important problem for machine learning, statistics and many application areas. In the work on recommending travel trajectories, two desired qualities are still missing from current solutions. The first is a principled method to jointly learn point ranking, a prediction problem, and optimise for route creation, a planning problem. The second is a unified way to incorporate various features such as location, time, distance, user profile and social interactions, which tend to receive specialised and separate treatments. We propose a solution that addresses these two challenges.

Paper: CIKM’16


Visualizing Citation Patterns among Publication Venues

We visualize the citation behavior over time for different subfields in computer science, using data from the Microsoft Academic Graph.


Picturing Everyday Knowledge in Multimedia

Knowledge graphs have become powerful sources for web search, but an equivalent source about things and their relations in pictures and videos does not exist yet. This project develops core techniques to learn image-centric knowledge graphs by connecting large collections of images and videos, together with their descriptions, to existing knowledge bases with encyclopedic, lexical, and commonsense knowledge. One compelling application for multimedia knowledge graphs is in the understanding of ongoing news and social events. We will design methods that construct high-quality knowledge graphs that are specifically relevant and adapted to each event, and propose new methods to automatically generate multimedia event summary documents.

Papers: CIKM’16, ACM MM’13

Selected topics of the recent past

Multimedia-hard problems

MM-hard refers to multimedia problems that require human-level insights and perception that can’t be realized with a single algorithmic approach. The notion of MM-hard has the potential to benefit multimedia research in two fundamental ways. The first is to describe problems in terms of their (machine and human) difficulty; the second is to enable problem reduction, that is, converting one problem to another and comparing problems.

Paper: IEEE Multimedia’14

Tracking visual memes in social media

We propose visual memes, or frequently reposted short video segments, for tracking large-scale video remix in social media. Video remixing is prevalent on social media platforms. It is part of “vernacular creativity” (Burgess 2009), where users create “curated selections based on what they liked or thought was important”. Social influence is often characterized from text-based online interactions such as quoting or retweeting (Leskovec 2009). Our tool allows such metrics to be developed for visual media. We found that:

  • Over 50% of news-related videos contain remixed content, and over 70% of YouTube authors participate in remixing.
  • Remix probability does not correlate well with traditional popularity metrics such as view count.
  • Influence analysis on visual remix over time can reveal content importance and user roles.

Papers: ACM MM’11, TMM’13

Macroscopic patterns in online news discussions

We analyze the ICWSM’11 Spinn3r dataset, which contains over 60 million English documents. We observe surprising connections among the 161 Wikipedia events it covers, and find that over half (55%) of users link only to a small fraction (1%) of prolific users, a notable departure from the balanced traditional bow-tie model of web content.

Paper: ICWSM’12

June 2, 2016


Recent updates

Getting in touch:
-- drop a line if you are interested in knowing more about our work, collaborating, or joining us.
We have two PhD openings in 2018: one on Modeling Online Attention and one on Picturing Everyday Knowledge. We are also looking for research fellow candidates with passion and a compelling track record.