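Average rating: 5.8/10

Poster · 4 reviewers

Lowest 4 · Highest 7 · Standard deviation 1.3

Average confidence: 3.8

Correctness: 3.0 · Contribution: 2.8 · Presentation: 3.0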

NeurIPS 2024

GraphMorph: Tubular Structure Extraction by Morphing Predicted Graphs

Zhao Zhang,Ziwei Zhao,Dong Wang,Liwei Wang

Submitted: 2024-05-15 · Updated: 2024-12-22

TL;DR

We propose GraphMorph, which enhances tubular structure extraction by focusing on branch-level features, using a Graph Decoder and Morph Module to achieve topologically accurate predictions, and demonstrating effectiveness across multiple datasets.

Keywords

Image Segmentation, Tubular Structure Extraction, Branch-level Features, Graph Representation

Reviews and Discussion

Review

Rating: 7 · Confidence: 4 · 2024-07-08

The paper presents a technique (called GraphMorph) for extracting tubular patterns as found for instance in retinal images (veins) and aerial images (roads). The proposed pipeline is made of several modules: a segmentation network providing centerline probability and features, a graph decoder for detecting nodes and connectivity and a morph module to get centerline masks. Experimental results demonstrate the efficiency of the proposed pipeline against state of the art methods on several benchmarks.

Strengths

The paper is well written and clear. Experimental results against SOTA are convincing.

Weaknesses

Some hyperparameters (e.g., thresholds) have to be set, and the methodology for setting them is not always clear. For example, were the experiments for setting $p_{thresh}$ (lines 560-563) run on the test sets?

minor remarks:

Fig 2 "Modified Deformbale DETR" -> Deformable
Figure 3 caption: "Inferece"

Questions

Were the experiments for setting $p_{thresh}$ (lines 560-563) run on the test sets?

Limitations

Limitations are addressed in the paper (e.g., in Appendix I, line 586, and in Appendix H, lines 576-578).

Author Response

2024-08-07

Thank you for your constructive suggestions! We will correct the typos you pointed out in the revised version. Please find our replies to your questions below.

Q1: Some hyperparameters (e.g., thresholds) have to be set. The methodology for setting these parameters is not always clear; e.g., are the experiments for setting $p_{thresh}$ (lines 560-563) done on test sets?

A1: Thanks for the suggestions. Parameters such as $\lambda_{\text{class}}=0.2$, $\lambda_{\text{coord}}=0.5$, $\alpha=0.6$, and $\gamma=2$ in the loss function (line 462) are adopted from established settings in previous works, such as Deformable DETR [1]. We changed $\alpha$ to 0.75 specifically for the MassRoad dataset to better handle the sparse nature of road networks. Parameters such as learning rate and weight decay were set in line with those used in Pointscatter [2] to ensure a fair comparison. For $p_{thresh}$, we empirically set it to 0.5 for all experiments on the four datasets. In Table 9 of the Appendix, we compared it with other thresholds on the test set of the DRIVE dataset and found that values close to 0.5 yielded better results.

References:

[1] Xizhou Zhu, et al. "Deformable detr: Deformable transformers for end-to-end object detection." arXiv preprint arXiv:2010.04159, 2020.

[2] Dong Wang, et al. "Pointscatter: Point set representation for tubular structure extraction." European Conference on Computer Vision, pages 366–383. Springer, 2022.

Comment: Thank you for your response.

2024-08-10

Thanks for clarifying this. I am happy with the rebuttal answers.

Review

Rating: 5 · Confidence: 3 · 2024-07-10

In this work, the authors tackle curvilinear image segmentation. In particular, they move away from pixel-level prediction, which has limitations when predicting thin structures. They propose GraphMorph, which predicts the locations of the endpoints of each branch and finds the optimal path between them. In this way, they are able to reduce both False Positive and False Negative errors. Their method can be applied to any segmentation backbone.

Strengths

Their method is able to handle both FP and FN errors, whereas most existing methods in the literature tend to tackle only FN errors.
The authors compare against adequate baselines and show improvements over them.

Weaknesses

For the performance reported in the tables, the authors should provide standard deviations so that readers can judge whether the improvements are statistically significant. The authors should conduct a t-test [1] to confirm this.
The authors should provide the computational complexity of their method relative to the baselines, covering both training and inference time. I suspect the algorithm is slow due to iterating over each edge.

References

[1] Student, 1908. The probable error of a mean. Biometrika, pp.1–25.

Questions

See the weaknesses section.

Limitations

The authors have adequately mentioned the limitations of their method, specifically the small ROI which can miss spatial context necessary for good performance.

Author Response

2024-08-06

Thank you for your constructive feedback! We will answer your questions below.

Q1: For the performance mentioned in the tables, the authors should provide standard deviation to understand whether their method is statistically significant or not. The authors should conduct t-test to confirm this.

A1: Thanks for your suggestion. We have provided error bars in our rebuttal pdf (detailed in Table 1). As can be seen from the results, most improvements of our method over baselines are significant. We will definitely include error bars for all experimental results in the revised version of the paper.

Regarding the t-test, we compare segmentation performance on the PARSE and ISBI12 datasets with a UNet backbone; the results are shown in the following table. PARSE is a 3D pulmonary artery dataset, and the results of GraphMorph on it are reported in the rebuttal pdf.

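| Dataset | Metric | t-Statistic | p-Value |
|---|---|---|---|
| PARSE | Dice | 2.0749 | 0.0386 |
| PARSE | clDice | 3.9496 | <0.0001 |
| PARSE | $\beta_0$ error | -12.3858 | <0.0001 |
| PARSE | $\beta_1$ error | -4.1148 | <0.0001 |
| PARSE | $\chi$ error | -11.1030 | <0.0001 |
| ISBI12 | Dice | 2.6297 | 0.0137 |
| ISBI12 | clDice | 2.9060 | 0.0071 |
| ISBI12 | $\beta_0$ error | -7.1229 | <0.0001 |
| ISBI12 | $\beta_1$ error | -2.9271 | 0.0035 |
| ISBI12 | $\chi$ error | -7.4095 | <0.0001 |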

The statistical results from our t-tests clearly demonstrate that GraphMorph significantly improves segmentation performance on both 2D and 3D datasets, as evidenced by the significant p-values across all metrics for the PARSE and ISBI12 datasets. Notably, while improvements in volumetric scores such as Dice and clDice are significant, the most striking results are observed in the topological metrics ( $\beta_0$ error, $\beta_1$ error, and $\chi$ error), where our method consistently demonstrates substantial enhancements. These results underline the capability of GraphMorph to effectively leverage branch-level features to optimize the extraction of tubular structures' topology.
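For readers who want to reproduce this kind of comparison, the following is a minimal sketch of a paired t-statistic computed over per-case scores. The reply does not say whether a paired or an independent test was used, so the paired form is an assumption, as are the function name `paired_t_statistic` and the sample scores, which are invented for illustration. Differences are taken as "ours minus baseline", which is consistent with the positive statistics for Dice and clDice and the negative ones for the error metrics in the table above.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(ours, baseline):
    """Return (t, degrees_of_freedom) for paired per-case scores.

    Hypothetical helper: differences are ours - baseline, so t > 0 means
    our scores are higher on average.
    """
    if len(ours) != len(baseline) or len(ours) < 2:
        raise ValueError("need two equally long samples with at least 2 cases")
    diffs = [a - b for a, b in zip(ours, baseline)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Invented per-case Dice scores, for illustration only.
dice_ours = [0.82, 0.79, 0.85, 0.81, 0.84]
dice_base = [0.80, 0.78, 0.83, 0.80, 0.81]
t, dof = paired_t_statistic(dice_ours, dice_base)
print(f"t = {t:.4f}, degrees of freedom = {dof}")
```

The p-value is then read from Student's t distribution with `dof` degrees of freedom; `scipy.stats.ttest_rel(ours, baseline)` returns both the statistic and the two-sided p-value in one call.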

Q2: The authors should provide the computational complexity of their method with respect to the baselines. The complexity should be about the training and inference time of their method. I suspect the algorithm is slow due to iterating over each edge.

A2: We appreciate the request for detailed computational complexity comparisons. Below is a table that outlines the resource utilization during the training process on the DRIVE dataset with a UNet backbone.

| Method | Params | FLOPs | Time per iteration (s) | GPU Memory |
|---|---|---|---|---|
| SoftDice | 39M | 187G | 0.203 | 5.4 GB |
| SoftDice+Ours | 48M | 268G | 0.589 | 11.8 GB |

The increase in parameters and FLOPs in our approach primarily stems from the integration of the Graph Decoder featuring a DETR module. This component is crucial for predicting accurate topological structures of the graphs. Advancements in transformer architectures that reduce computational overhead could potentially enhance the efficiency of our model during training. Implementations such as Lite-DETR [1], noted for their efficiency in handling transformers, could serve as alternatives to the current DETR module, potentially reducing training time and computational resource usage.

As detailed in Appendix F, the inference time analysis is reported in Table 5. The results show that the Morph Module is the most time-consuming part of the inference process. However, the iterative execution over each edge using the SkeletonDijkstra algorithm is not the primary contributor to this time consumption, because the algorithm operates on much smaller patches. For example, a $384 \times 384$ image is divided into numerous $32 \times 32$ patches. Due to the limited size, running SkeletonDijkstra on one patch is highly efficient (about 0.01s per patch).

The primary time expenditure currently arises from processing these patches sequentially. However, as each patch is processed independently, there is significant potential to improve efficiency by parallelizing the computation across all patches. We therefore plan to focus future work on optimizing the Morph Module with parallel processing techniques to accelerate inference. Additionally, your question has inspired us to explore parallelizing the edge operations within each patch, which may further reduce the time cost. We intend to detail these improvements in the revised version of our paper and believe this will substantially improve the efficiency of our model.
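To make the patch-wise cost argument concrete, here is a minimal sketch of running a shortest-path search inside a single small patch. It is plain Dijkstra on an 8-connected pixel grid, not the authors' SkeletonDijkstra: the skeleton-specific constraints and the $p_{thresh}$ filtering are omitted, and the step cost of `1 - probability`, the non-overlapping tiling, and the names `split_into_patches` and `dijkstra_on_patch` are all assumptions made for illustration.

```python
import heapq


def split_into_patches(height, width, patch=32):
    """Yield (row, col) top-left corners of non-overlapping patches."""
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            yield r, c


def dijkstra_on_patch(prob, start, goal):
    """Cheapest 8-connected path between two pixels of one patch.

    Stepping onto a pixel costs 1 - probability, so the path prefers
    pixels that the network considers likely to be centerline.
    """
    h, w = len(prob), len(prob[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            break
        if d > dist[(r, c)]:
            continue  # stale heap entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < h and 0 <= nc < w:
                    nd = d + 1.0 - prob[nr][nc]
                    if nd < dist.get((nr, nc), float("inf")):
                        dist[(nr, nc)] = nd
                        prev[(nr, nc)] = (r, c)
                        heapq.heappush(heap, (nd, (nr, nc)))
    # Walk back from the goal to recover the path.
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]


# Under the non-overlapping assumption, a 384 x 384 image gives 12 x 12 patches.
corners = list(split_into_patches(384, 384))

# Toy 5 x 5 "patch" with a bright horizontal line in row 2.
toy = [[0.1] * 5 for _ in range(5)]
for col in range(5):
    toy[2][col] = 0.9
path = dijkstra_on_patch(toy, (2, 0), (2, 4))
print(len(corners), path)
```

Because each call touches only one patch, the calls share no state and could be handed to a process pool, which is the kind of parallelism the reply describes as future work.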

References:

[1] Li, Feng, et al. "Lite detr: An interleaved multi-scale encoder for efficient detr." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2023.

2024-08-10

Thank you for the rebuttal. As it answers my questions, and the performance on 3D PARSE dataset is adequate, I will increase my final rating.

Comment: Thank you for your response!

2024-08-10

Thank you very much for your feedback and for the decision to increase the rating. We are pleased that our rebuttal addressed your concerns, and the performance on the 3D PARSE dataset received your recognition. We will ensure that your suggestions are incorporated in our revised version.

Comment: Looking forward to your further feedback!

2024-08-12

Dear Reviewer 4DfY,

Thank you once again for your efforts in reviewing our paper. We sincerely appreciate your re-evaluation and the decision to increase the final rating. However, we noticed that the final score remains at 5, suggesting there might still be unresolved concerns regarding our work. As the deadline for revisions is quickly approaching, we would be grateful if you could share any remaining issues or feedback that could help us enhance the quality of our paper and align it more closely with your expectations. We look forward to your valuable feedback!

Paper 13902 Authors

Review

Rating: 7 · Confidence: 4 · 2024-07-12

This paper proposes a method to extract tubular structures from images. Specifically, the authors have integrated graph extraction into the segmentation architecture. They have proposed a link prediction part to predict graph connectivity in tandem with the segmentation network. They combined the predicted graph into a segmentation branch via a morphing module during inference.

Strengths

The method is well-motivated, and the paper is well-written.
The link prediction idea is technically sound for computationally efficient connectivity prediction.
The morph module, which integrates the predicted graph in the segmentation, is novel. It seems effective, as demonstrated in Table 1 and Figure 6.
The proposed method seems to improve segmentation results across different metrics over multiple datasets.

Weaknesses

Is the threshold in the SkeletonDijkstra algorithm dataset-dependent?
Is the segmentation network trained simultaneously with the graph decoder or one before the other?
Are there any metrics computed on the extracted centerlines?

Questions

See weaknesses
I was wondering whether the morph module could also be used during training to correct some segmentation errors.

Limitations

Limitations have been discussed well.

Author Response

2024-08-06

Thank you for your detailed feedback! Please see our answers for your questions below.

Q1: Is the threshold in the SkeletonDijkstra algorithm dataset-dependent?

A1: $p_{thresh}$ is a hyper-parameter which is dataset-independent. It is set to 0.5 as the default setting across all experiments. We will add this setting to the "Implementation Details" in the revised version.

Q2: Is the segmentation network trained simultaneously with the graph decoder or one before the other?

A2: The segmentation network and the graph decoder are trained simultaneously to optimize both components effectively and enhance overall training efficiency. This joint training approach allows the segmentation network to leverage the branch-level features provided by the graph decoder, leading to more coherent and accurate predictions of tubular structures.

Q3: Are there any metrics computed on the extracted centerlines?

A3: Table 1 in our main paper presents a detailed breakdown of the metrics used to evaluate the quality of the extracted centerlines. These metrics are calculated by comparing the centerline masks predicted by our model with the ground truth masks, as detailed below.

Volumetric Scores: The first three columns of Table 1 report volumetric scores, which assess different aspects of the quality of the extracted centerlines:
- Dice evaluates the overlap between the predicted centerline and the ground truth. A higher Dice score indicates better alignment between the predicted centerline and its corresponding label.
- AUC and ACC provide a comprehensive evaluation of the model’s ability to classify pixels correctly, indicating the model's performance in distinguishing between the centerline of tubular structures and the background.

Topological Metrics: The last three columns of Table 1 report topological metrics, which indicate how accurately the topology is represented:
- $\beta_0$ error measures the error in the number of connected components of the extracted centerline. Predicted broken or redundant vessels will cause this metric to rise.
- $\beta_1$ error assesses the accuracy in the number of loops or cycles within the centerline, which is effective for evaluating complex vascular topology.
- Euler characteristic error ( $\chi$ error) combines the first two topological aspects, providing a holistic view of the topological accuracy.

Metrics such as Dice, ACC, $\beta_0$ error, $\beta_1$ error, and $\chi$ error are also used in Table 2 and Table 3 of our main paper, which evaluate the performance of our model on the segmentation task. By using shared metrics across different contexts, we establish a consistent framework for evaluating our approach on both tasks.
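The metrics listed above can be sketched for 2D binary masks as follows. This is an illustrative implementation and not the evaluation code used in the paper. It assumes that each error is the absolute difference between the predicted and ground-truth values, that foreground uses 8-connectivity and background 4-connectivity, and that $\chi = \beta_0 - \beta_1$ for a 2D mask; any patch-wise averaging the paper may apply is not shown. All function names are hypothetical.

```python
def dice(pred, gt):
    """Dice overlap of two binary masks given as lists of 0/1 rows."""
    inter = sum(p & g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    total = sum(map(sum, pred)) + sum(map(sum, gt))
    return 2.0 * inter / total if total else 1.0


N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(mask, value, neighbours):
    """Count connected regions of `value` under the given neighbourhood."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] != value or seen[r][c]:
                continue
            count += 1
            stack = [(r, c)]
            seen[r][c] = True
            while stack:  # iterative flood fill
                y, x = stack.pop()
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and not seen[ny][nx] and mask[ny][nx] == value:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


def betti_numbers(mask):
    """Return (beta0, beta1) of a 2D binary mask.

    Holes are background regions that do not reach the padded border.
    """
    w = len(mask[0])
    padded = [[0] * (w + 2)]
    padded += [[0] + list(row) + [0] for row in mask]
    padded += [[0] * (w + 2)]
    beta0 = count_components(padded, 1, N8)
    beta1 = count_components(padded, 0, N4) - 1  # drop the outer background
    return beta0, beta1


def topology_errors(pred, gt):
    """Absolute differences in beta0, beta1, and Euler characteristic."""
    p0, p1 = betti_numbers(pred)
    g0, g1 = betti_numbers(gt)
    return abs(p0 - g0), abs(p1 - g1), abs((p0 - p1) - (g0 - g1))


# Ground truth: a closed ring. Prediction: the same ring broken in two places.
gt = [[0, 0, 0, 0, 0],
      [0, 1, 1, 1, 0],
      [0, 1, 0, 1, 0],
      [0, 1, 1, 1, 0],
      [0, 0, 0, 0, 0]]
pred = [[0, 0, 0, 0, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 0, 0, 0, 0]]
print(dice(pred, gt), topology_errors(pred, gt))
```

In this toy case only two pixels are missing, so Dice stays fairly high, yet the prediction splits one component into two and loses the loop. This illustrates why the replies report topological errors alongside the volumetric scores.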

Q4: I was wondering whether the morph module could also be used during training to correct some segmentation errors.

A4: Currently, the morph module is designed to operate during the inference phase and does not propagate gradients, which precludes its direct use in training the segmentation network. However, the connectivity patterns it identifies have significant potential to refine the training process.

Given this potential, modifying the morph module to support error correction during training is a promising direction. One feasible approach is to develop a differentiable version of the morph module that can be integrated into backpropagation. Another is to assign increased loss weights to the path points that the morph module identifies as crucial for connectivity. Such developments could improve the robustness and accuracy of segmentation networks, and we plan to pursue this direction in future work.

Review

Rating: 4 · Confidence: 4 · 2024-07-14

This paper presents a method called GraphMorph for tubular structure extraction that aims at more topologically accurate predictions. GraphMorph consists of a Graph Decoder and a Morph Module: 1) the Graph Decoder generates/decodes/learns a graph structure from the output of the segmentation network and the segmentation image features; 2) the Morph Module then processes the graph provided by the Graph Decoder and the centerline probability map from the segmentation network, employing a novel SkeletonDijkstra algorithm to output the final centerline mask. Various popular datasets are used for empirical validation.

Strengths

1. The paper is well written with good clarity.
2. The proposed method is overall unsurprising but logical (though definitely not a breakthrough).

Weaknesses

My main questions concern the overall technical contribution to this line of work: a) The datasets are pretty small and have been extensively tested in the past. There is an over-fitting concern on these small and widely used datasets, analogous to the "peek too much" bias. b) All examples are primarily 2D patches. The qualitative examples shown in the paper look more like "toy examples". Does the method work on full 3D CT scans, e.g., for extracting lung airway trees? c) The numerical improvements reported in the paper are, overall, marginal.

The authors make the following claims: "Broader Impacts: The GraphMorph framework significantly enhances medical diagnostics by improving the accuracy of tubular structure extraction, such as blood vessels, which is crucial in AI-assisted medical image analysis. These advancements can lead to more precise diagnostics, potentially reducing unnecessary medical procedures and costs. Furthermore, if the capability for real-time analysis is further enhanced, it could significantly impact emergency medical responses by accelerating decision-making and improving patient outcomes. Ultimately, our research not only advances technical knowledge in medical image analysis, but also holds potential for significant positive impacts in healthcare efficiency and patient care."

These claims are significantly overstated. The examples shown here are very much toy examples on small image patches. In real-world healthcare diagnosis, robust 3D tubular structures such as heart and lung vessels or airway trees need to be precisely extracted to a certain depth from 3D CT scans. Topology-constrained tubular structure parsing is well studied in the medical image analysis literature (there is much previous work on vessel extraction and lung airway extraction). The method proposed here does not seem to work directly in such real applications, so the authors need to be very careful about making such claims.

Questions

1. The SkeletonDijkstra algorithm is proposed by the authors alone; is there any reference?

2. My main questions concern the overall technical contribution to this line of work: a) The datasets are pretty small and have been extensively tested in the past. There is an over-fitting concern on these small and widely used datasets, analogous to the "peek too much" bias. b) All examples are primarily 2D patches. The qualitative examples shown in the paper look more like "toy examples". Does the method work on full 3D CT scans, e.g., for extracting lung airway trees? c) The numerical improvements reported in the paper are, overall, marginal.

3. The authors make the following claims: "Broader Impacts: The GraphMorph framework significantly enhances medical diagnostics by improving the accuracy of tubular structure extraction, such as blood vessels, which is crucial in AI-assisted medical image analysis. These advancements can lead to more precise diagnostics, potentially reducing unnecessary medical procedures and costs. Furthermore, if the capability for real-time analysis is further enhanced, it could significantly impact emergency medical responses by accelerating decision-making and improving patient outcomes. Ultimately, our research not only advances technical knowledge in medical image analysis, but also holds potential for significant positive impacts in healthcare efficiency and patient care."

4. The results here are far from clinically useful. For evaluation on smaller datasets, error bars or standard deviations should also be reported.

Limitations

See weaknesses.

Author Response

2024-08-06

Thank you for your constructive feedback! We will respond to your concerns below.

Q1: The SkeletonDijkstra Algorithm is proposed by authors alone, any reference?

A1: The SkeletonDijkstra algorithm proposed by our paper represents an innovative adaptation of the classic Dijkstra's algorithm, specifically tailored to the unique challenges of tracing centerlines in tubular structures. This adaptation was necessary to handle the specific topology and morphology of tubular structures that are not addressed by traditional shortest path algorithms. While we have no external references for this particular adaptation, our extensive experiments in the paper confirm its efficacy and accuracy.

Q2: The datasets are pretty small and have been extensively tested in the past.

A2: Thanks for the comments. In medical imaging, especially for tubular structure analysis, large-scale public datasets are scarce due to privacy concerns and the complexity of data collection. The datasets we selected, while smaller than those used in broader computer vision tasks, are established benchmarks within the medical imaging community. Despite their frequent use in previous studies, these datasets still present unresolved challenges [1] [2], especially the topological errors that occur regularly and that stem from the under-utilisation of branch-level features of tubular structures. Our approach mitigates this problem through the efficient exploitation of branch-level features.

To demonstrate the broader applicability of our approach, we have conducted experiments on the Massachusetts Roads dataset with 1171 aerial images in the paper. Moreover, we provided further validations with 3D CT scans in our rebuttal pdf. These results not only demonstrate the method's effectiveness but also its potential in more clinically relevant, real-world settings.

Q3: All examples are primarily 2D patches. The effectiveness of the method on 3D data needs to be validated.

A3: We recognize the need to validate our method, GraphMorph, on 3D datasets for its clinical relevance. Thus, we have extended GraphMorph to use the 3D UNet architecture and tested it on the pulmonary arterial vascular segmentation dataset from the PARSE challenge [3], which includes 100 annotated 3D CT scans. These cases were divided in a 7:1:2 ratio for training, validation, and testing.

The preliminary results, as detailed in Table 2 of our rebuttal pdf, show that our method consistently outperforms existing baselines across all metrics, mirroring the success we observed with 2D data. This agreement between the 2D and 3D results underlines not only the effectiveness of our method but also its adaptability to the 3D vessel segmentation task. We plan to apply our method to more diverse 3D medical datasets to further validate its effectiveness.

Q4: The numerical improvements reported in the paper, overall are marginal.

A4: Thanks for the comments. Our method, GraphMorph, is specifically designed to enhance the accuracy of topological features in tubular structure extraction, which is a critical aspect often overlooked in most traditional pixel-level frameworks. The focus of our method on topological accuracy is reflected in the significant improvements in topological metrics, such as the reduction of $\beta_0$ error by 25% to 45% as reported in Table 2 of our paper.

While the improvements in volumetric metrics like Dice and clDice are relatively modest (approximately 1%), the advancements in topological integrity are significant. Accurately restoring topology is both challenging and crucial in tubular structure extraction tasks. The greatly reduced topology errors are derived from our direct exploitation of branch-level features. This can also be illustrated by our experiments on the 3D pulmonary artery dataset PARSE.

Q5: On smaller datasets for evaluation, error bars or std should also be reported.

A5: Thanks for the suggestion. We have provided error bars in our rebuttal pdf (detailed in Table 1). As can be seen from the results, most improvements of our method over baselines are significant. We will definitely include error bars for all experimental results in the revised version of the paper.

Q6: "Ultimately, our research not only advances technical knowledge in medical image analysis, but also holds potential for significant positive impacts in healthcare efficiency and patient care." These claims are significantly over-claimed.

A6: We appreciate your feedback on our claims regarding the broader impacts of GraphMorph. Our results on 2D datasets demonstrate the clinical potential of our method for 2D tubular structure extraction. However, we acknowledge that its performance in complex 3D vascular tasks was not fully explored in the initial submission. In our rebuttal pdf, we have included preliminary results from applying GraphMorph to 3D pulmonary artery segmentation, where we observed obvious improvements.

Moving forward, we plan to extend our evaluations to more diverse 3D datasets to further validate and demonstrate the clinical applicability of our method. In the revised version, we will also ensure that our claims are carefully measured so that they accurately reflect the current state and future potential of our research.

References:

[1] Hu, Xiaoling, et al. "Topology-preserving deep image segmentation." Advances in neural information processing systems 32 (2019).

[2] Dong Wang, et al. "Pointscatter: Point set representation for tubular structure extraction." European Conference on Computer Vision, pages 366–383. Springer, 2022.

[3] Luo, et al. "Efficient automatic segmentation for multi-level pulmonary arteries: The PARSE challenge." arXiv preprint arXiv:2304.03708, 2023.

Comment: Looking forward to your re-evaluation.

2024-08-12

Dear Reviewer Ysem,

Thank you for the time and effort you've invested in reviewing our paper. We have carefully responded to each of your comments and questions. As the deadline for the author-reviewer discussion is nearing, we would greatly appreciate it if you could kindly take a look at our responses and provide your valuable feedback. We are fully prepared to engage in further discussions should you have any additional concerns or suggestions.

Thank you once again and we greatly appreciate your contributions to refining our work and eagerly await your re-evaluation.

Paper 13902 Authors

Comment: Thanks for the rebuttal

2024-08-12

This paper provides some clever technical novelty to improve topology-constrained tubular structure extraction. From a paper perspective, this is a reasonably well-rounded paper with an interesting idea and consistent empirical performance improvements. For a NeurIPS paper, this is probably OK.

However, the authors should not overclaim on the clinical impact side. From what I can see, this paper has very limited utility in real clinical applications. In the submitted version, the method was not rigorously tested. With the rebuttal, it is tested on some quite limited public datasets, which say very little about the complexity of real clinical applications. You would need to at least participate in some well-known public medical imaging challenges and achieve some encouraging results to start with.

https://atm22.grand-challenge.org/

Comment: Thanks for your feedback

2024-08-14

Thank you for your feedback and the points you've raised. We appreciate the opportunity to address your concerns and clarify aspects of our work.

Dataset and Experimental Validation: Our paper uses several 2D medical datasets that have been commonly employed in previous leading studies [1-3], validating the effectiveness and generalizability of our method. Additionally, the supplementary experiments conducted on the PARSE dataset [4], which is sourced from the public PARSE challenge (https://parse2022.grand-challenge.org/) and comprises authentic CT scans, further demonstrate the applicability and robustness of GraphMorph on 3D data.
Concerns Regarding Claims: We appreciate your concerns regarding potential overstatements about the clinical impact of our work. We are in the process of revising our manuscript to better align our claims with the experimental conclusions presented, as detailed in A6 of our rebuttal. Additionally, we are open to participating in further public challenges to provide additional empirical validation of our approach in more settings.

We hope that our clarifications and revisions address your concerns satisfactorily.

[1] Shit, Suprosanna, et al. "clDice-a novel topology-preserving loss function for tubular structure segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.

[2] Dong Wang, et al. "Pointscatter: Point set representation for tubular structure extraction." European Conference on Computer Vision, pages 366–383. Springer, 2022.

[3] Qi, Yaolei, et al. "Dynamic snake convolution based on topological geometric constraints for tubular structure segmentation." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.

[4] Luo, et al. "Efficient automatic segmentation for multi-level pulmonary arteries: The PARSE challenge." arXiv preprint arXiv:2304.03708, 2023.

Author Response

2024-08-06

Dear Reviewers and ACs,

We were encouraged to receive the following positive comments from reviewers: "The method is well-motivated"(CJeh), "The paper is well written with good clarity"(Ysem), "Their method is able to handle both FP and FN errors."(4DfY), "Experimental results against SOTA are convincing."(S6G1). Your comments are very valuable in improving our paper.

Reviewers Ysem and 4DfY both suggested adding error bars to the experimental results. We have updated Table 2 of the main paper and put it in the rebuttal pdf (detailed in Table 1). As can be seen from the results, most improvements of our method over the baselines are significant. We will ensure that error bars for all experiments are provided in the revised version of the paper.

Below, we address specific concerns raised by individual reviewers in detail.

Comment: Reviewers, please update reviews if you did not already

2024-08-12

Dear reviewers,

Thank you again for your efforts in creating the best possible NeurIPS 2024!

If you did not already, please check the rebuttal and other reviews to check whether there are points that you missed that change your assessment. If you have further questions for the authors, the reviewer-author discussion period ends tomorrow, so now is the time.

Your AC

Final Decision: Accept (poster)

2024-09-25

The reviewers appreciate the authors' contribution to tubular segmentation, which is a challenging problem -- here, the authors combine the task of pixel classification with the tasks of identifying underlying graph structure and centerlines, with the goal of producing more topologically faithful segmentations. This problem is difficult yet important, and the authors' contributions are interesting; I therefore recommend acceptance.

In their final version, the authors should take good care to incorporate the reviewers' comments into their paper. Particularly important are the comments from reviewer Ysem, who points out that the authors are drawing far too strong conclusions about their paper. I agree with the reviewer, and the authors should modify their statements for the camera-ready version.