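PaperHub
Average rating: 6.2/10
Decision: Rejected (5 reviewers)
Ratings: 5, 8, 6, 6, 6 (lowest 5, highest 8, standard deviation 1.0)
Confidence: 4.2
Correctness: 2.8
Contribution: 2.6
Presentation: 2.8
ICLR 2025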

DiffMove: Human Trajectory Recovery via Conditional Diffusion Model

Submitted: 2024-09-26 · Updated: 2025-02-05
TL;DR

This paper presents DiffMove, a novel conditional diffusion based method for recovering human trajectories from incomplete data, outperforming existing approaches by an average of 11% in recall.

Keywords
Trajectory recovery, Diffusion model, Self-supervised learning, Human mobility

Reviews and Discussion

Review
Rating: 5

This paper addresses the issue of human trajectory recovery, specifically focusing on reconstructing a user's missing visited locations during specific time intervals. The study highlights the challenges posed by current methods in dealing with the complexities, irregularities, and uncertainties that are inherent in human mobility patterns.

To tackle these challenges, the paper introduces the use of diffusion models and proposes a novel approach called DiffMove. This approach adapts diffusion models for discrete location recovery and integrates both historical and recent trajectory data. DiffMove includes a Trajectory Location Encoder module, which transforms discrete location indices into continuous embeddings, and a Conditional Embedding Denoiser that extracts valuable information from past and recent trajectories to enhance the diffusion model's conditional information.
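The idea of moving between discrete location indices and continuous embeddings can be sketched as follows. This is a minimal illustration, not the authors' implementation: the random table stands in for a learned Trajectory Location Encoder, and `make_embedding_table`, `cosine`, and `decode_location` are hypothetical names. The nearest-neighbor decoding by cosine similarity is an assumption based on the cosine-similarity formula mentioned elsewhere in this discussion.

```python
import math
import random

def make_embedding_table(num_locations, dim, seed=0):
    """Hypothetical stand-in for a learned location embedding table."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_locations)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def decode_location(vector, table):
    """Return the location ID whose embedding is most cosine-similar to `vector`."""
    return max(range(len(table)), key=lambda loc: cosine(vector, table[loc]))

table = make_embedding_table(num_locations=50, dim=16)

# Simulate an imperfectly denoised embedding for location 7 (illustrative noise level).
rng = random.Random(1)
noisy = [x + rng.gauss(0.0, 0.1) for x in table[7]]
recovered = decode_location(noisy, table)
```

Decoding by similarity rather than by exact match is what lets a continuous denoising process end in a discrete location ID, since the denoised vector only has to land closer to the correct embedding than to any other.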

The performance of DiffMove is evaluated using two human mobility datasets, where it is compared against several leading methods in human mobility prediction and recovery.

Strengths

  1. In practical terms, the human trajectory recovery problem addressed in this paper is relevant.

  2. DiffMove, introduced in this paper, is straightforwardly designed to employ diffusion models for recovering missing locations in human trajectories. The method's design and implementation are clearly outlined, ensuring reproducibility.

  3. The experiments clearly demonstrate the superior performance of the proposed method.

Weaknesses

  1. I find the framing of the problem regarding human trajectory recovery in this paper somewhat debatable. As outlined in Definition 3.1, each trajectory point signifies a location visited by a user within a specific time frame (of 30 minutes). A point is considered missing if the location isn't observed during that interval. There are two scenarios where this approach might be inappropriate:

    • When a user visits a location, such as a movie theater, and remains there for an extended period, it is expected that they won't visit other locations during subsequent intervals. These should not be marked as missing locations. If these intervals are instead recorded as the location where the user stayed, two questions arise: 1) Practically speaking, how can we determine how long a user stayed at one location when such data is typically not available in check-in datasets like FourSquare? 2) The trajectory would consist of consecutive repetitive locations, making recovery quite trivial.
    • If a user visits a location (e.g., fast food restaurant) and then moves to another within minutes, the 30-minute interval may not accurately capture their movements.
  2. The paper's motivation could be articulated more effectively. In the introduction, it states that there are "limitations in complex sparse, irregular, and uncertain scenarios inherently in human mobility," which is somewhat vague. This does not clearly illustrate how these limitations are tackled by the proposed method. Furthermore, claims like "traditional methods typically provide a biased deterministic imputed trajectory" suggest an intention to highlight probabilistic generation's advantages over deterministic methods; however, the proposed approach neither focuses on probabilistic generation nor evaluates this aspect experimentally.

  3. The proposed method lacks substantial novelty in its design. The challenge presented—"the imputation targets of conventional diffusion models are continuous numerical values"—and the solution of mapping discrete location indices into continuous embedding space appear standard given existing latent diffusion [1-2] and discrete diffusion [3] solutions. Integrating historical and recent trajectory information via Spatial Conditional Block seems straightforward and primarily involves combining existing components.

  4. The paper's illustrations could be clearer. Specifically, components in Figure 1 lack appropriate labels, leading to vague references such as "the blue and white location icons in trajectories" and "shown in the top right part of Figure 1." This can confuse readers.

  5. I recommend including comparisons or discussions on some missing related works regarding human mobility trajectory recovery methods; certain comparison methods [4] could be addressed here. While recognizing that settings differ slightly on trajectory recovery, discussing GPS trajectory recovery/imputation methods [5-6] might also prove beneficial.

[1] Li, Xiang, et al. "Diffusion-lm improves controllable text generation." Advances in Neural Information Processing Systems 35 (2022): 4328-4343.

[2] Lovelace, Justin, et al. "Latent diffusion for language generation." Advances in Neural Information Processing Systems 36 (2024).

[3] Lou, Aaron, Chenlin Meng, and Stefano Ermon. "Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution." Forty-first International Conference on Machine Learning.

[4] Xi, Dongbo, et al. "Modelling of bi-directional spatio-temporal dependence and users’ dynamic preferences for missing poi check-in identification." Proceedings of the AAAI conference on artificial intelligence. Vol. 33. No. 01. 2019.

[5] Ren, Huimin, et al. "Mtrajrec: Map-constrained trajectory recovery via seq2seq multi-task learning." Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 2021.

[6] Chen, Yuqi, et al. "Rntrajrec: Road network enhanced trajectory recovery with spatial-temporal transformer." 2023 IEEE 39th International Conference on Data Engineering (ICDE). IEEE, 2023.

Questions

  1. (Related to W1) Could the authors explain the reasoning behind how they have framed the problem of human trajectory recovery?

  2. (Related to W2) Could the authors elaborate on the challenges associated with human trajectory recovery, discuss the limitations of current methods, and describe how their proposed approach addresses these challenges and limitations?
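The fixed-interval framing questioned in W1 can be made concrete with a small sketch. It assumes one trajectory per user per day and a 30-minute slot length, as described in the review; the function name `to_slots`, the sample data, and the "last record wins" tie-break are illustrative assumptions, not details taken from the paper.

```python
from datetime import datetime

SLOT_MINUTES = 30  # interval length as described for Definition 3.1 in the review

def to_slots(checkins, slot_minutes=SLOT_MINUTES):
    """Map one day's (timestamp, location_id) records onto fixed-length slots.

    Returns one entry per slot: a location ID, or None when no record falls
    in that slot (a "missing" point under this framing). If several records
    share a slot, the latest one wins (an assumption made for this sketch).
    """
    slots_per_day = (24 * 60 + slot_minutes - 1) // slot_minutes
    slots = [None] * slots_per_day
    for ts, loc in sorted(checkins):
        index = (ts.hour * 60 + ts.minute) // slot_minutes
        slots[index] = loc
    return slots

day = [
    (datetime(2024, 1, 1, 12, 5), "fast_food"),
    (datetime(2024, 1, 1, 12, 20), "cafe"),    # same slot as above, so it overwrites it
    (datetime(2024, 1, 1, 19, 0), "cinema"),   # a long stay leaves the following slots empty
]
slots = to_slots(day)
```

Both scenarios raised in W1 show up here: the two quick visits collapse into one slot, and the slots after the cinema visit are indistinguishable from genuinely missing observations. Passing a smaller `slot_minutes` separates the quick visits but leaves the long-stay ambiguity unchanged.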

Review
Rating: 8

In this paper, the authors modify the widely studied diffusion model for trajectory recovery, an acknowledged problem in the spatio-temporal community. The authors design DiffMove, adapting diffusion models to address two challenges in the recovery problem. On two real-world datasets, DiffMove outperformed SOTA methods on multiple metrics.

Strengths

(1) This paper is well-written. The introduction is very clear and easy to follow; thanks to the authors for making our lives easier.

(2) The experiments are solid with ablation studies and robustness analysis.

(3) The trajectory recovery is an interesting problem and probably enables multiple downstream applications.

Weaknesses

(1) In Section 3.1, the authors might need to specify how many locations are missing in the trajectory. Whether the number of missing locations is fixed or random will make a difference to the final solutions.

(2) Some writing issues. In line 060, do the authors mean “cannot be fully exploited”? In line 160, Eqn. (2) is not easy to identify in the manuscript.

(3) It would be great if the authors could discuss the fundamental differences and similarities between trajectory recovery and generation. From the reviewer’s perspective, they both generate some trajectory points.

(4) In the abstract, the 11% improvement in recall might mislead readers. It might be more appropriate to report the improvement over the best baseline models, as shown in Table 1.

(5) There are multiple losses in the training process. How much does each loss matter to the final performance? Also, how are the coefficients for each loss determined?

(6) One major concern is the claim about predictive and generative models for trajectory recovery in the second paragraph of the Introduction section. From the reviewer’s understanding, the recovery task is to infer one of the multiple points that happened in the past. Thus, predictive or generative seems similar to me. Also, could you explain whether these papers are predictive recovery or generative recovery? [1][2][3]

(7) I have some concerns about Fig. 1. The first “Raw Current Trajectory” has 9 points, but “Masked Locations” has 2 points and “Current Trajectory” has 4 points after the random masking. So, where are the other points?

(8) The format should be consistent, e.g., the legends of Fig. 1 and Fig. 2; the period is missing in Fig. 2.

(9) It seems most baselines are not from recent papers. Recently, there are many papers about trajectory learning, e.g., trajectory generation. Could you explain the rationale of choosing baseline models?

[1] TrajWeaver: Trajectory Recovery with State Propagation Diffusion Model

[2] RNTrajRec: Road Network Enhanced Trajectory Recovery with Spatial-Temporal Transformer

[3] MTrajRec: Map-constrained trajectory recovery via seq2seq multi-task learning
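The distinction raised in weakness (1), a fixed versus a random number of masked locations, can be sketched as follows. This is an illustrative self-supervised masking routine under stated assumptions: `mask_observed` and its parameters are hypothetical, `None` marks an unobserved slot, and the toy trajectory is not meant to reproduce Fig. 1.

```python
import random

def mask_observed(slots, num_masked=None, ratio=None, seed=None):
    """Hide some observed slots so they can serve as recovery targets.

    Exactly one of `num_masked` (fixed count) or `ratio` (fraction of the
    observed slots) must be given. Returns (masked_slots, targets), where
    targets maps slot index -> hidden location ID.
    """
    if (num_masked is None) == (ratio is None):
        raise ValueError("give exactly one of num_masked or ratio")
    observed = [i for i, loc in enumerate(slots) if loc is not None]
    if ratio is not None:
        num_masked = round(ratio * len(observed))
    num_masked = min(num_masked, len(observed))
    rng = random.Random(seed)
    chosen = rng.sample(observed, num_masked)
    masked = list(slots)
    targets = {}
    for i in chosen:
        targets[i] = masked[i]   # remember the ground truth
        masked[i] = None         # hide it from the model input
    return masked, targets

trajectory = [3, None, 3, 8, None, None, 5, 5, 2]   # 9 slots, 6 of them observed
masked_fixed, targets_fixed = mask_observed(trajectory, num_masked=2, seed=0)
masked_ratio, targets_ratio = mask_observed(trajectory, ratio=0.5, seed=0)
```

With a fixed count, short and long trajectories lose the same number of points; with a ratio, the number of targets scales with how much was observed, which changes the difficulty of the task per trajectory.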

Questions

No further questions.

Review
Rating: 6

The task of this manuscript is the imputation of human mobility trajectory data. Two challenges are proposed: (1) Traditional diffusion models aim for imputation of continuous numerical values, which can be directly obtained through denoising. In contrast, the goal of trajectory recovery is to represent locations as discrete IDs, allowing for a fuller utilization of the transitional and periodic patterns of human trajectories. (2) Besides the temporal dependencies of the current sample, the denoising process should also consider the transitional and periodic dependencies between the current sample and historical samples. To address these challenges, this manuscript introduces DiffMove. To tackle the first challenge, DiffMove converts discrete trajectories into an embedding space, inferring the embedding representations of missing locations through conditional diffusion, and then passing them to a Missing Location Decoder to obtain location IDs. To address the second challenge, DiffMove designs a Spatial Conditional Block based on graph neural network and attention mechanism to learn the spatial transitional and periodic patterns of both current and historical trajectories. Additionally, a Target Conditional Block is introduced to extract knowledge of the missing locations of the current trajectory from historical trajectories. Experiments conducted on two mobility datasets (Geolife and Foursquare) show that DiffMove outperforms the baseline by an average of 11% in recall rate.
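The conditional diffusion step summarized above can be sketched with the standard DDPM forward process, x_t = sqrt(ᾱ_t) x_0 + sqrt(1 − ᾱ_t) ε, applied only to the embeddings of target slots while observed slots stay clean as conditioning. This follows the general recipe of conditional imputation models such as CSDI and is an assumption about the setup; the linear noise schedule, the function names, and the toy embeddings are illustrative and not taken from the paper.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def noise_targets(embeddings, target_mask, alpha_bar_t, rng):
    """Apply x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps to target positions only.

    Observed positions are returned unchanged so they can act as the condition.
    Also returns the sampled noise, which a denoiser would be trained to predict.
    """
    noisy, noise = [], []
    for vec, is_target in zip(embeddings, target_mask):
        if is_target:
            eps = [rng.gauss(0.0, 1.0) for _ in vec]
            noisy.append([math.sqrt(alpha_bar_t) * x + math.sqrt(1.0 - alpha_bar_t) * e
                          for x, e in zip(vec, eps)])
        else:
            eps = [0.0] * len(vec)
            noisy.append(list(vec))
        noise.append(eps)
    return noisy, noise

rng = random.Random(0)
alpha_bars = alpha_bar_schedule(num_steps=50)
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # toy location embeddings
target_mask = [False, True, False]                   # only the middle slot is a target
x_t, eps = noise_targets(embeddings, target_mask, alpha_bars[-1], rng)
```

A denoiser trained to predict `eps` from `x_t`, the step index, and the clean observed embeddings can then be run in reverse to produce an embedding for the missing slot, which a decoder maps back to a location ID.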

Strengths

This manuscript focuses on the interpolation task using diffusion models on human mobility trajectory data and presents two practical challenges: converting discrete trajectories into continuous values that can be processed by diffusion models, and utilizing information from historical trajectories to guide the recovery of missing locations in the current trajectory. Additionally, extensive experiments are conducted to demonstrate the effectiveness of the proposed model.

Weaknesses

  1. Some sentences in this manuscript are not fluent.

  2. The model lacks innovation, as its overall framework is similar to CSDI [1].

  3. The selected datasets for the experiments are insufficient, and the baseline models chosen for comparison do not include advanced models from recent years.

[1] Tashiro, Yusuke, et al. "CSDI: Conditional score-based diffusion models for probabilistic time series imputation." Advances in Neural Information Processing Systems 34 (2021): 24804-24816.

Questions

  1. How are the GPS trajectory points in the Geolife dataset converted into discrete location representations (since GPS coordinates are continuous and locations are discrete)? Are they converted to discrete regions?

  2. How do the baseline models selected for the comparative experiments incorporate historical information? The baselines in this manuscript mainly focus on the prediction task; how do they perform the imputation task?

  3. The DDPM has significant computational overhead. If there are many location points, using an adjacency matrix will further increase the time cost and memory usage. How can computational efficiency be ensured?

  4. The denoising network simultaneously employs structures like GCN, cross-attention, and transformers. Could a complex network design affect the prediction performance? Based on experience, the inclusion of GCN may not yield good results.
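On the memory concern in question 3, one common way to avoid a dense adjacency matrix is to store only the transitions that actually occur. The sketch below is a generic illustration under that assumption and does not describe how the paper builds its graph; `transition_counts`, `normalize_rows`, and the toy history are hypothetical.

```python
from collections import defaultdict

def transition_counts(trajectories):
    """Count directed transitions between consecutive observed locations.

    Only pairs that actually occur are stored, so memory grows with the number
    of distinct transitions rather than with num_locations ** 2.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        observed = [loc for loc in traj if loc is not None]
        for src, dst in zip(observed, observed[1:]):
            if src != dst:   # ignore self-loops caused by repeated stays
                counts[src][dst] += 1
    return counts

def normalize_rows(counts):
    """Turn outgoing counts into transition probabilities per source node."""
    return {src: {dst: c / sum(nbrs.values()) for dst, c in nbrs.items()}
            for src, nbrs in counts.items()}

history = [
    [3, None, 3, 8, None, 5],
    [3, 8, 8, None, 2],
    [5, None, 3, 8],
]
graph = normalize_rows(transition_counts(history))
# graph == {3: {8: 1.0}, 8: {5: 0.5, 2: 0.5}, 5: {3: 1.0}}
```

Because human mobility graphs are typically sparse, a structure like this keeps the cost of graph-based conditioning tied to observed transitions, although it does not reduce the number of diffusion steps.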

Review
Rating: 6

This paper introduces DiffMove, a conditional diffusion-based trajectory recovery method designed to recover missing locations from sparse human mobility data. DiffMove improves accuracy by transforming trajectory locations into an embedding space for denoising and recovering missing locations through an embedding decoder. The model effectively captures the spatial and temporal patterns in human mobility, and integrates multiple conditional feature extraction modules to systematically address spatial-temporal dependencies.

Strengths

  1. This paper employs a conditional diffusion-based trajectory recovery method to achieve high-quality generation and accurate trajectory recovery.
  2. This paper innovatively designs and integrates multiple conditional feature extraction modules to address the complexity of spatial-temporal dependencies.
  3. This paper conducts extensive experiments on two representative real-world mobility datasets, with results showing significant improvements over the baseline.

Weaknesses

  1. The limitation noted in line 44, “The previous approaches have limitations in complex, sparse, irregular, and uncertain scenarios inherently in human mobility,” requires further elaboration. Specifically, does the proposed method address these limitations, and if so, how?
  2. Additionally, it would be beneficial for the authors to compare their approach with more recent trajectory recovery models such as SimiDTR [a] and RNTrajRec [b]. A recent work called Diff-RNTraj [c] also shares a similar approach; a discussion of the distinctions between this work and the proposed model would further strengthen the paper.
  3. Some minor errors include: ① "to obtain incoming/outgoing aggregated node embedding e_{I,s}/e_{o,s}"; ② the formula representation of the cosine similarity between the imputed target embeddings and location embeddings.

[a] Zhang, Yupu, et al. "SimiDTR: Deep Trajectory Recovery with Enhanced Trajectory Similarity." International Conference on Database Systems for Advanced Applications (2023).

[b] Y. Chen, H. Zhang, W. Sun and B. Zheng, "RNTrajRec: Road Network Enhanced Trajectory Recovery with Spatial-Temporal Transformer," 2023 IEEE 39th International Conference on Data Engineering (ICDE), Anaheim, CA, USA, 2023, pp. 829-842, doi: 10.1109/ICDE55515.2023.00069.

[c] T. Wei et al., "Diff-RNTraj: A Structure-aware Diffusion Model for Road Network-constrained Trajectory Generation," in IEEE Transactions on Knowledge and Data Engineering, doi: 10.1109/TKDE.2024.3460051.

Questions

  1. How are the visited locations specifically represented, and how are the masked locations embedded?
  2. Please provide a better comparison with existing work and better motivate the use of the diffusion model.

Review
Rating: 6

This paper provides an innovative self-supervised model DiffMove to recover trajectories using a conditional diffusion model. The model uses multiple losses to learn: 1) the trajectory location encoder that projects raw points to effective trajectory embeddings, 2) the conditional embedding denoiser that generates the target trajectory location embedding with the help of historical and current trajectory embeddings, and 3) the final embedding decoder to map the denoised embeddings back to their locations.

Extensive experiments have demonstrated the SOTA performances of DiffMove on both GPS and check-in trajectories, strong robustness against various historical trajectory location missing rates, and efficacy of main model components by ablation studies. It would be better if there were visualizations and efficiency analysis.

Strengths

  1. This paper delivers a novel method of human mobility modeling using a conditional diffusion model. The noising and denoising happen on trajectory embeddings, which benefits the effective denoising operations for trajectory representation learning.

  2. Modeling trajectories per user per day helps capture user-wise periodic movement patterns. The embedding denoiser, conditioned on current and historical trajectories, ensures effective feature extraction from data in this format, which contributes to capturing spatiotemporal dependencies in low-sampling-rate trajectories.

  3. The model DiffMove achieves SOTA performance across extensive experimental settings with high robustness. Combined with its innovation, it can be impactful in trajectory recovery and human mobility pattern modeling.

Weaknesses

  1. The paper makes substantial improvements over TRILL (Deng et al., 2023), but it would be stronger if it included more datasets to show that DiffMove generalizes across diverse data.

  2. Efficiency analysis is missing. If efficiency is not improved, does DiffMove run longer than the baselines?

  3. Visualizations of the effect of the denoising process on the embeddings would improve the paper.

Questions

  1. In Section A.7, can you clarify how you “gather data on road network” for location representation?

  2. What’s the running time of DiffMove compared to baselines? (just a rough comparison)

  3. Can you specify what kind of information is learned by the denoising process based on the trajectory embeddings?

AC Meta-Review

This paper focuses on conditional diffusion generation for trajectory recovery. The claimed innovation lies in the denoiser operating on the trajectory embedding space. In the experiments, the authors applied random masking to demonstrate the robustness of the method given the real-world nature of such data, e.g., missing data and sparsity.

While the denoising idea is somewhat new for the trajectory recovery problem, the idea itself is certainly not new. The authors introduced a Spatial Conditional Block and a Target Conditional Block. The experiments followed the benchmark setting of similar papers on the same trajectory recovery problem. Some reviewers still question the validity of the experimental settings (e.g., fixed-length time slots), and it seems the authors have not been able to convince all the reviewers to raise their scores during the discussion. Reviewers also raised the concern that only two datasets are used. The authors replied that they followed existing papers that used only those two datasets; however, those papers were not published at ICLR or equally highly regarded machine learning conferences (ICML, NeurIPS), hence the response is not satisfactory.

After considering all the reviewers' comments and concerns, I believe this paper is not yet ready for publication. The concerns about novelty and the limited number of datasets should be addressed, and more recent SOTA baselines should be included. I have no doubt that, after further revisions, this paper will be ready for publication at another conference.

Additional Comments on Reviewer Discussion

During the discussion, reviewer Ktnx raised concerns that the novelty is limited and about the limitations of adopting fixed-length time slots in the method. This issue has not been addressed well by the authors. I took this into account when considering the final recommendation.

Public Comment

Response to Meta-Review

We appreciate the thoughtful meta-review and reviewer comments. We would like to clarify several important points for the broader research community:

1. Regarding Novelty

  • Our work represents the first attempt to apply diffusion models in embedding space to discrete location trajectory recovery, addressing unique challenges not present in continuous-space applications.

  • The embedding-based conditional diffusion framework is a novel technical contribution that enables handling discrete locations while preserving spatial relationships. These components enable us to robustly recover discrete trajectory points by modeling uncertainty through a generative process, which is not addressed by traditional deterministic methods.

  • Our work introduces novel modules—namely, the Spatial Conditional Block, Target Conditional Block, and Denoising Network Block—that are specifically tailored to capture the unique spatial-temporal dependencies in human mobility.

2. On Dataset Usage

  • While we followed established benchmarks using Foursquare and Geolife datasets, these datasets are widely recognized in the mobility research community for their comprehensive real-world coverage and diverse urban contexts. We note that several papers using similar datasets have been published in top venues including KDD, AAAI, and WWW.

  • Our experimental results demonstrate significant improvements (11% average gain in recall) over strong baselines across both datasets.
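For readers unfamiliar with the metric behind the recall figures, a minimal sketch of Recall@K for location recovery follows. It assumes each missing point has one true location and the model outputs a ranked candidate list; the function names and toy data are illustrative. Whether the reported 11% is a relative or an absolute gain is not stated in this discussion, so `relative_gain` only shows the relative reading.

```python
def recall_at_k(ranked_predictions, targets, k):
    """Fraction of missing points whose true location is in the top-k candidates.

    ranked_predictions: one list of candidate location IDs per missing point,
                        ordered from most to least likely.
    targets:            the true location ID for each missing point.
    """
    if not targets:
        raise ValueError("targets must not be empty")
    hits = sum(1 for ranked, true in zip(ranked_predictions, targets)
               if true in ranked[:k])
    return hits / len(targets)

def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline` (0.11 would mean 11%)."""
    return (new - baseline) / baseline

ranked = [[4, 7, 1], [2, 9, 4], [5, 3, 8], [6, 1, 0]]
truth = [7, 2, 8, 9]
r1 = recall_at_k(ranked, truth, k=1)   # 0.25: only the second point is a top-1 hit
r3 = recall_at_k(ranked, truth, k=3)   # 0.75: the last point is missed even at k=3
```

An average gain computed across all baselines can differ substantially from the gain over the single best baseline, which is the distinction one reviewer asked the authors to make explicit.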

3. Regarding Fixed Time Slots

(We already explained and addressed this point clearly in our rebuttal during the discussion period, in the same terms as below. Reviewer Ktnx did not raise any further concerns after reviewing our rebuttal and ultimately raised their score. There may have been a misunderstanding during the AC discussion period of which the authors were unaware and which they had no chance to clarify.)

  • The fixed time slot approach was adopted to enable fair comparison with existing methods.

  • It is important to emphasize that our framework is not inherently tied to the 30-minute interval and can easily adapt to other interval lengths, such as 10 minutes or even finer resolutions, provided the data supports such granularity. We can always treat the time interval as a parameter to tune on the data side. This flexibility makes our method broadly applicable, as it caters to human mobility patterns by modeling discrete location transitions effectively. Human mobility typically involves meaningful transitions at specific timeframes (e.g., work, shopping, dining), which align naturally with discrete intervals.

  • The strong performance improvements suggest the effectiveness of our approach even within this constraint.

4. On Recent SOTA Comparisons

  • Our baseline comparisons include the most recent and relevant methods in free-space human trajectory recovery. The other papers that the reviewers mentioned focus on road-network-constrained trajectories, which differ in nature from the problem we study. We have already included them in our related work for clarification.

  • We demonstrate consistent improvements over these methods across multiple metrics.


We believe these clarifications help contextualize our work's contributions and impact on the research community. Our significant performance improvements and novel technical approach advance the state of the art in human trajectory recovery, addressing real-world challenges in mobility applications. We are grateful for the positive feedback and high scores from several reviewers, and for the opportunity to refine our work for potential impact on the research community.

Final Decision

Reject