Cross-Device Identity Resolution using Machine Learning: A Scalable Device Graph Approach
Keywords: Machine Learning, Device Graph
This paper outlines how Mediacorp, Singapore's public service broadcaster, addresses its cross-device identity challenges using a scalable device graph approach. Research in this area is relevant to advertising technology because it enables a holistic view of consumers that can be extended to use cases such as improved advertisement targeting, personalized recommendations and demographic prediction. Past research efforts were limited to high-level descriptions of the steps taken to create a one-off, static device graph from data collected over a fixed time frame, which limits their use in larger-scale commercial applications. In this paper, we propose a scalable solution that enables continuous, incremental revisions of our device graph. We leverage behavioral data captured by Mediacorp across its sites and platforms to build a richer device graph that is updated weekly. First, we introduce additional features and explore various classifiers to improve the pairwise probability scores that indicate whether two devices are likely to belong to the same user. Then, we apply community clustering algorithms to these scored pairs to uncover device communities, which together form the final device graph. Extensive experiments show that our additive approach consistently delivers precision and recall above 90% in real-world applications.
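The two-step structure described above (pairwise scoring, then grouping devices into communities) can be illustrated with a minimal sketch. It is not the method used in this paper: the pairwise scores are supplied by hand rather than produced by a trained classifier, the `threshold` parameter and the function name `build_device_communities` are hypothetical, and connected components (via union-find) stand in for the community clustering algorithms, which would normally also split weakly connected groups.

```python
from collections import defaultdict


def build_device_communities(pair_scores, threshold=0.5):
    """Group devices into communities from pairwise same-user scores.

    pair_scores: dict mapping (device_a, device_b) -> probability in [0, 1].
    threshold: hypothetical cut-off; pairs scoring below it are not linked.
    Returns a sorted list of sorted device-ID lists, one per community.
    """
    parent = {}

    def find(x):
        # Register unseen devices as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for (a, b), score in pair_scores.items():
        # Register both devices so unlinked ones still appear in the output.
        find(a)
        find(b)
        if score >= threshold:
            union(a, b)

    groups = defaultdict(set)
    for device in parent:
        groups[find(device)].add(device)
    return sorted(sorted(group) for group in groups.values())


# Illustrative scores only; in practice these would come from a classifier.
example_scores = {
    ("phone_1", "laptop_1"): 0.93,
    ("laptop_1", "tv_1"): 0.81,
    ("phone_1", "tablet_9"): 0.12,
    ("tablet_9", "phone_7"): 0.77,
}
communities = build_device_communities(example_scores, threshold=0.5)
```

Because union-find merges are cheap to apply one edge at a time, newly scored pairs can be folded into existing communities. That only loosely mirrors incremental revision: this simple structure cannot unlink devices if a later score contradicts an earlier one.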