Motif-oriented influence maximization for viral marketing in large-scale social networks
摘要
评审与讨论
The paper introduces a motif-oriented influence maximization (MOIM) problem under the linear threshold model, where motifs, rather than individual nodes, are the primary targets for viral marketing. The authors propose an algorithm to optimize the influence function by establishing submodular upper and lower bounds. The proposed approach aims to maximize the number of activated motifs, with experiments conducted on various datasets to validate its effectiveness.
优点
1.Novel Problem Definition: The paper addresses a less explored area in influence maximization by focusing on motifs instead of individual nodes, adding a new dimension to the problem. 2.Algorithmic Contribution: The proposed algorithm leverages submodular properties to provide an approximation solution with a guaranteed ratio, which is a significant contribution to the field.
缺点
1.Unclear Motif Definition Rationale: The primary issue with the paper is the rationale behind defining a motif as a strongly connected group. The authors do not provide sufficient justification for this choice, and it is unclear if other works have adopted a similar definition.
- Minor Writing Issue: There is a minor writing issue on line 189 where the text appears to be out of sequence.
问题
1.Why was there no comparison of the time required for different algorithms such as TIM+, OPIM, and DeepIM?
局限性
Clarify Motif Definition: Provide a clearer rationale for defining motifs as strongly connected groups. This could include theoretical advantages or references to other studies that have used a similar definition.
Q1: Unclear Motif Definition Rationale: The primary issue with the paper is the rationale behind defining a motif as a strongly connected group. The authors do not provide sufficient justification for this choice, and it is unclear if other works have adopted a similar definition.
Limitations: Clarify Motif Definition: Provide a clearer rationale for defining motifs as strongly connected groups. This could include theoretical advantages or references to other studies that have used a similar definition.
Reply: Thank you for your comments. Motifs represent the basic functional units of graphs and have been well-investigated in the field of network science, for example,
[1] Ribeiro P, Paredes P, Silva M E P, et al. A survey on subgraph counting: concepts, algorithms, and applications to network motifs and graphlets. ACM Computing Surveys (CSUR), 2021.
[2] Chia X W, Tan J K, Ang L F, et al. Emergence of cortical network motifs for short-term memory during learning. Nature Communications, 2023.
[3] Losapio G, Schöb C, Staniczenko P P A, et al. Network motifs involving both competition and facilitation predict biodiversity in alpine plant communities. Proceedings of the National Academy of Sciences, 2021.
[4] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, "Network motifs: simple building blocks of complex networks," Science, 2002.
In the field of network science, motifs are typically strongly connected. We are the first to investigate the motif-oriented influence maximization problem. In the final version, we will provide a detailed justification of the motif definition as suggested.
Q2: Minor Writing Issue: There is a minor writing issue on line 189 where the text appears to be out of sequence.
Reply: The issue is attributed to the insufficient margin space surrounding Figure 1. We plan to address this template issue in the final version.
Q3: Why was there no comparison of the time required for different algorithms such as TIM+, OPIM, and DeepIM?
Reply: Thank you for your feedback. It is important to note that TIM+, OPIM, and DeepIM are specifically designed for node-level influence maximization and their time complexity is not directly related to the number of targeted motifs. In contrast, the time complexity of the other methods considered is directly dependent on the number of targeted motifs, and we could tune the number of targeted motifs to tune the time consumption of these methods. Therefore, a direct comparison of our method with TIM+, OPIM, and DeepIM in terms of time complexity is not appropriate.
The author addressed my question. I do not want to change my decision
I do not want to change my decision and agree with the rebuttal
The paper proposes a novel approach to influence maximization (IM) by targeting motifs—interconnected subgraphs of nodes—instead of individual nodes. The authors propose the motif-oriented influence maximization (MOIM) model under the linear threshold (LT) and independent cascade (IC) models. They demonstrate the NP-hardness and non-submodular nature of the problem and develop submodular upper and lower bounds for the influence function. The proposed greedy algorithm achieves a (1 - 1/e - ε) approximation ratio and near-linear time complexity. Extensive experiments validate the algorithm's effectiveness and efficiency across various datasets, showing superior performance compared to existing methods.
优点
- The paper introduces a novel approach by shifting the focus from individual nodes to motifs in the context of influence maximization. This new perspective addresses real-world scenarios where group decisions are crucial, adding significant value to the field of viral marketing.
- The paper is well-written, with a clear problem definition, detailed methodology, and rigorous theoretical derivations. The logical flow and thorough explanations make the contributions easy to understand and appreciate.
- The method's support for multiple classical diffusion models, including the linear threshold (LT) and independent cascade (IC) models, enhances its versatility and applicability in various real-world scenarios.
缺点
- The paper lacks an approximation hardness result, which is important for understanding the limits of the proposed algorithm's performance. Additionally, the NP-hardness of the problem is derived straightforwardly from traditional influence maximization, offering little new insight into the complexity specific to motif-oriented influence maximization.
- The approximation rate is not graph-independent and can deteriorate exponentially with the size of the motif. Additionally, it would be beneficial to include experiments demonstrating how performance changes with increasing motif size, providing deeper insights into the algorithm's scalability and effectiveness in varied settings.
问题
For the 𝑟=1 case, the problem is actually submodular and can be solved using classic influence maximization algorithms by simply adding additional nodes. Why is there still a significant performance gap, given that the proposed algorithm is supposed to solve this problem exactly? Can you provide more insights or explanations for this discrepancy?
局限性
N/A
Q1: The paper lacks an approximation hardness result, which is important for understanding the limits of the proposed algorithm's performance. Additionally, the NP-hardness of the problem is derived straightforwardly from traditional influence maximization, offering little new insight into the complexity specific to motif-oriented influence maximization.
Reply: Thanks for the comment. We will relocate certain less critical sections, such as the discussion on NP-hardness, to the Appendix part. Additionally, we have expanded our discussion on the approximation hardness result. Our proposed algorithm achieves an approximation ratio of , as detailed in Lemma 4.
Q2: The approximation rate is not graph-independent and can deteriorate exponentially with the size of the motif. Additionally, it would be beneficial to include experiments demonstrating how performance changes with increasing motif size, providing deeper insights into the algorithm's scalability and effectiveness in varied settings.
Reply: We appreciate the suggestion regarding the influence of motif size on our results. To address this point, we have conducted additional experiments to explore the effect of motif size on our findings (see Figure 1 in the global Author Rebuttal and Table R1). Table R1 presents the expected motifs as a function of motif size under the LT model in the Amazon graph. The result indicates that as the motif size increases, the expected number of activated motifs also increases. However, theoretical analysis reveals that the approximation rate degrades exponentially with the motif size. On one hand, given that real-world graphs are typically heterogeneous, with central nodes having a higher probability of occurrence in larger motifs. On the other hand, the central nodes have a higher likelihood of being activated in information cascades. Consequently, we observe a trend where larger motif sizes correspond to higher expected numbers of activated motifs. Therefore, our approach remains effective for analyzing large motif sizes in real-world scenarios. We will provide further clarification on this matter in the final version of our manuscript.
Table R1: The expected motif influence vs. motif size under LT model in Amazon graph. We set and the initial seeds , meaning that a motif is activated if more than three nodes in the motif are activated. Larger is better.
| 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | |
|---|---|---|---|---|---|---|---|---|
| 0.344 | 0.268 | 0.242 | 0.248 | 0.266 | 0.287 | 0.308 | 0.616 |
Q3: For the 𝑟=1 case, the problem is actually submodular and can be solved using classic influence maximization algorithms by simply adding additional nodes. Why is there still a significant performance gap, given that the proposed algorithm is supposed to solve this problem exactly? Can you provide more insights or explanations for this discrepancy?
Reply: Thank you for your insightful comment. When , the problem is degenerated into the classical influence maximization problem. However, the focus here is on activating motifs, which are not uniformly distributed in the graphs. It is possible that certain central nodes may serve as members of multiple motifs. Given that the traditional influence maximization approach primarily seeks to activate the maximum number of nodes rather than motifs, there persists a significant performance gap.
I have read the rebuttal where the authors partially addressed my concerns. I will remain my evaluation and score. Als for 2, my concern is more on the side that the number of REQUIRED nodes in a motif instead of the total size. That's the part only really breaks the submodularity.
The paper on motif-oriented influence maximization for viral marketing in large-scale social networks introduces a novel approach that focuses on targeting motifs - strongly connected subgraphs of users - to maximize influence (i.e., targeting motifs rather than individual users for viral marketing campaigns). It formulates the motif-oriented influence maximization problem under the linear threshold model in social networks. The proposed algorithm provides a systematic approach to maximize influence within strongly connected subgraphs by simultaneously optimizing the lower and upper bounds to select the best seed nodes for maximizing motif-oriented influence. The influence function is neither submodular nor supermodular, making direct optimization challenging. The paper establishes lower and upper bounds for the influence function based on thresholds (where the threshold signifies the activated vertices in the motif).
优点
-
It is interesting that the upper and lower bounds have similar formalism ( I haven't seen this in influence maximization approaches).
-
A thorough analysis of Motif based influence maximization is provided in the paper.
缺点
-
The term "Motif" in Network Analysis is traditionally defined as recurrent sub-patterns in the graph. It is better to stay consistent with the literature to avoid confusion for some readers.
-
Group-level influence maximization was performed earlier for independent cascade models (Motifs can be considered as specialized groups), so the problem setting is not entirely novel.
Here are a few other related references apart from the ones already cited in the paper: 1) Zhu, J., Ghosh, S., Wu, W. and Gao, C., 2019, November. Profit maximization under group influence model in social networks. In International conference on computational data and social networks (pp. 108-119).
Minor comments: It is better to have the abstract and parts of the text self-contained where the relevant terms in approximation ratio and time complexity are well defined.
This paper can be strengthened by adapting it to dynamic networks.
问题
-
See the Weakness section
-
Can the term motif defined in the paper be interpreted as a maximal strongly connected component induced by the vertices in (as no other path exists between the vertices in )?
局限性
yes
Q1: The term "Motif" in Network Analysis is traditionally defined as recurrent sub-patterns in the graph. It is better to stay consistent with the literature to avoid confusion for some readers.
Reply: Thank you for your valuable feedback. The term "motif" is commonly employed in the analysis of graph functions, for example,
[1] Ribeiro P, Paredes P, Silva M E P, et al. A survey on subgraph counting: concepts, algorithms, and applications to network motifs and graphlets[J]. ACM Computing Surveys (CSUR), 2021.
[2] Chia X W, Tan J K, Ang L F, et al. Emergence of cortical network motifs for short-term memory during learning. Nature Communications, 2023.
[3] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, "Network motifs: simple building blocks of complex networks," Science, 2002.
While the term "motif" shares some similarities with recurrent sub-patterns within a graph, it possesses specific characteristics that distinguish it from traditional sub-patterns, i.e., strong connectivity. We will provide further clarification on this matter in the final version.
Q2: Group-level influence maximization was performed earlier for independent cascade models (Motifs can be considered as specialized groups), so the problem setting is not entirely novel. Here are a few other related references apart from the ones already cited in the paper: 1) Zhu, J., Ghosh, S., Wu, W. and Gao, C., 2019, November. Profit maximization under group influence model in social networks. In International conference on computational data and social networks (pp. 108-119).
Reply: We acknowledge that the exploration of group-level influence maximization has been a subject of previous research, and we have indeed addressed the distinctions in our discussion of related work. The referenced paper introduced a heuristic greedy approach, which, in its generalization as presented in reference [7], diverges from our methodology in several critical respects:
(a) The definition of motifs as strongly connected subgraphs contrasts with the concept of groups that do not necessitate such connectivity.
(b) In group-level influence maximization, the emphasis is typically on optimizing a surrogate objective function and providing an approximation ratio for the surrogate objective, rather than guaranteeing an approximation ratio for the original objective.
(c) Our algorithm offers a strict approximation ratio for the original objective function, distinguishing it from the referenced approach.
(d) Our method allows for the simultaneous optimization of upper and lower bounds that share a common formalism.
Q2: Can the term motif defined in the paper be interpreted as a maximal strongly connected component induced by the vertices in (as no other path exists between the vertices in )?
Reply: A motif refers to a small, strongly connected set of nodes, distinct from the strongly connected giant component within . Motifs are commonly regarded as the fundamental functional units within graphs, a concept widely studied in the field of network science, as exemplified by:
[1] Chia X W, Tan J K, Ang L F, et al. Emergence of cortical network motifs for short-term memory during learning. Nature Communications, 2023.
[2] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, "Network motifs: simple building blocks of complex networks," Science, 2002.
We will provide more background and discussion about the definition of motif as suggested in the final version.
I acknowledge the author’s response and keep my original rating.
This paper proposes the motif-oriented influence maximization problem based on the traditional influence maximization problem. This paper proves that the problem is NP-hard and proposes a new method, LBMOIM, based on upper and lower bounds to solve it. The experimental results demonstrate that LBMOIM can achieve effective expected motif influence across multiple datasets.
优点
-
The paper investigates motif-oriented influence maximization, a new perspective in the field of influence maximization, which typically focuses on individual nodes rather than motifs.
-
This paper provides a rigorous analysis of the motif-oriented IM problem.
-
According to the experiments, the proposed LBMOIM method is both effective and computationally efficient in solving the motif influence maximization problem compared to existing methods.
缺点
- The paper is filled with numerous symbols and proofs. Adding intuitive explanations for the proofs can help readers better understand the content.
- One major concern is the motivation of this paper. In the Section of Introduction, the example of restaurant is not convincing, as the group decision may influenced by many factors.The assumptions made for motif activation may not always align with real-world dynamics.
- There are no data observations or analyses to illustrate the motivation of this work. The rationality of the motif-oritented IM is not justified. It seems that this paper highly relies on some assumptions.
- The investigated problem still seems limited to a few fixed propagation models, the results and the effectiveness of the method are tightly coupled with the assumptions of these models. However, real-world propagation patterns are much more complex than these fixed models.
- How to set the hyperparameter is not introduced. There is also no ablation study.
问题
- Considering the dynamism of real-world networks, can you provide more justifications that motifs are defined reasonably in actual scenarios?
- Can the authors explain why, in Figure 6, even though there is a significant difference in the size of the Flickr and Amazon datasets, the runtime appears to be on the same order of magnitude?
局限性
The authors' discussion of the limitations is not particularly clear. It seems that the authors should consider more how to ensure the expected motifs actually exist in real-world social networks.
Q1: The paper is filled with numerous symbols… Adding intuitive explanations …
Reply: Thank you for your feedback. To improve the readability and flow of the main text, we will move the detailed proofs of the upper and lower bounds (currently on Page 4) to the supplementary material in the final version. Additionally, we will provide more detailed explanatory examples for Figure 1 to help readers clarify the concepts. Furthermore, we will introduce a case study related to Figure 1 before the method section to better understand the problem.
Q2: One major concern is the motivation of this paper...
Reply: We agree that group decisions can be significantly influenced by various factors. We note that most previous works about influence maximization also had no real data. To further illustrate the motivation, we provide two additional examples:
(a) The advertisement of the ‘KFC Family Bucket’ in the online social network: When determining whether to purchase a KFC Family Bucket, every family member must consent to consume KFC.
(b) Farmigo platform (https://www.farmigo.com/): It offers subscription services for group customers when a specific number of members in the group purchase the foods, which can reduce the average delivery cost.
We recognize that the assumptions made for motif activation may not always align perfectly with real-world dynamics, and obtaining real-world data is challenging due to privacy concerns and commercial interests. We acknowledge this as a limitation of our research. However, our work represents one of the first systematic explorations of this problem. We will explicitly clarify this limitation in the final version.
Q3: There are no data observations or analyses to illustrate the motivation of this work. The rationality of the motif-oriented IM is not justified…
Reply: The motif-oriented influence maximization is prevalent in real-world scenarios, such as the advertisement for "KFC Family Bucket" and group tours. However, due to commercial interests and privacy security concerns, we cannot obtain the real data associated with group decision-making processes. We note that most previous works about influence maximization problem also have no real data (e.g., refs. [1-3]). We plan to clarify the limitation as suggested.
[1] C. Ling, J. Jiang, J. Wang, M. et al., Deep graph representation learning and optimization for influence maximization, ICML, 2023.
[2] H. Liao, S. Bi, J. Wu, et al., Popularity Ratio Maximization: Surpassing Competitors through Influence Propagation, SIGMOD, 2023.
[3] Q. Guo, S. Wang, Z. Wei, et al., Influence Maximization Revisited: Efficient Reverse Reachable Set Generation with Bound Tightened, SIGMOD, 2020.
Q4: The investigated problem still seems limited to a few fixed propagation models…
Reply: Our propagation models are extensively utilized in the simulation of cascade dynamics and viral marketing within social networks. For more complex scenarios, we could modify eqs. 2-3 to handle more complex propagation models. We have already generalized our model to accommodate more intricate triggering mechanisms (see Appendix D). In the final version, we will also include a discussion on the generalization and limitations of applying more complex spreading models.
Q5: How to set the hyperparameter is not introduced. There is also no ablation study.
Reply: Our method has parameter that has been investigated in Fig. 5. There are no other hyperparameters that need to be trained. Hence, we omit the ablation study.
Q6: … can you provide more justifications that motifs are defined reasonably in actual scenarios?
Reply: We introduce a novel case study of the "KFC Family Bucket" advertisement on online social networks. In this scenario, all members of the family (or a motif) must agree to purchase the KFC Family Bucket. See the response of Q2 for other cases.
In actual scenarios, motifs represent the function units of complex graphs, which have been well investigated in the field of network science. For example, in the paper " Network motifs: simple building blocks of complex networks, Science, 2002”, the authors examined motifs with various shapes, including loops, chains, bi-fans, and triangles, among others. To illustrate, a chain motif signifies a path from node to its 2-length neighbor . A triangle motif denotes a group of fully connected three nodes, such as three close friends. Our definition of motif agrees with previous research. Hence, we could ensure the expected motifs actually exist in real social networks.
We will discuss more justifications of motifs as suggested in the final version.
Q7: Can the authors explain why, in Figure 6, even though there is a significant difference in the size of the Flickr and Amazon datasets, the runtime appears to be on the same order of magnitude?
Reply: The runtime of our algorithm is mainly influenced by the number of targeted motifs rather than the number of nodes in the network. Given that we have set the number of targeted motifs to 2000 for all graphs, the runtime remains within the same order of magnitude, regardless of the differences in dataset size, which is consistent with our theoretical analysis. We will provide further clarification on this matter in the revised version.
Q8: The authors' discussion of the limitations is not particularly clear. It seems that the authors should consider more how to ensure the expected motifs actually exist in real-world social networks.
Reply: Motifs serve as functional units within complex graphs, a concept extensively explored in the domain of network science, which exists in various real-world social networks. The time complexity of our method depends much on the number of motifs. If a small graph has a large number of motifs, the time consumption is also high. We will discuss the limitations as suggested in the final version.
Thanks for the authors' rebuttal. I have checked the responses and other reviewers' reviews. Some of my concerns have been addressed. Hence, I would increase my scores.
Dear Reviewer,
I hope this message finds you well.
As today marks the final day of the discussion period, we wanted to respectfully follow up on our rebuttal to see if we have satisfactorily addressed your concerns regarding our paper.
If there are any remaining issues that require our attention, please let us know. We would be more than glad to address them. If the revisions meet your expectations, we would greatly appreciate it if you could reconsider your ratings.
Thank you again for your time and invaluable feedback.
Best regards,
Authors
The appendix file includes a figure that shows the influence of motif size on the performance.
The paper introduces an approach to influence maximization that targets interconnected subgraph of nodes (aka motifs) instead of individual nodes as existing algorithms. All the reviewers were positive or mildly positive about the paper and I recommend accepting the paper. I ask the authors to address the reviewers' comments in the final version of their paper.