DFR3 – Analysis of the results
After the conclusion of DFR3 we analyzed the results, as we did in similar articles on DFR1 and DFR2, to gain deeper insight into the events that shaped this round’s outcomes. We hope these insights will help us all improve from round to round by adjusting the round’s processes and parameters.
First, let’s start with an overview of the statistics comparing this round to previous rounds:
| | DFR1 | DFR2 | DFR3 |
|---|---|---|---|
| Total amount available | $1.0 M in AGIX | $0.5 M in AGIX | $1.53 M in AGIX |
| Total amount awarded | $741,000 | $449,212 | $1,454,396 |
| Nr. of pools | 2 | 5 | 5 |
| Nr. of submissions | 47 | 89 | 139 |
| Nr. of eligible projects | 28 | 57 | 117 |
| Nr. of awarded projects | 12 | 17 | 43 |
| Nr. of wallets that voted | 158 | 187 | 220 |
| Nr. of collections | n/a | 144 | 182 |
Gaming the system for reputation rewards
What clearly stood out, and has been discussed at length with the community, are comments and feedback that seem to have the primary goal of harvesting reputation rewards rather than adding value to the program. In itself, this is wasteful: apart from the effort of creating this feedback, there is the waste of everyone’s time reading these comments and the waste of rewards paid to those who try to game the system. Even more damaging, these activities draw attention away from well-meant and thoughtful feedback, and potentially demotivate both the community members who put a lot of care into their comments and the readers who spend too much time on low-value contributions.
It is important to note that we did not see any significant activity from these ‘bad actors’ in the voting process. Apparently, their goal was to gain immediate rewards, not to influence the outcomes of the voting.
We have addressed this situation and taken some measures, as explained in more detail in this Deep Funding Focus Group Advice.
How will we address this situation in the next rounds?
This is still open to debate. But with our current knowledge, there are several measures to consider:
- Not giving any (direct) rewards anymore for contributions, thereby removing the main incentive.
- Requiring some kind of identification that will at least prohibit one actor from assuming multiple identities.
- Improving the algorithm, so that it becomes harder to game.
- Balancing the algorithmic approach with a more manual (group-based) approach, as we are already experimenting with in DFR3 as an outcome of our conversations on the topic.
- A combination of the above.
We’ll have to see what is possible to accomplish in DFR4. The last option, a combination of measures, may be the most resilient approach and the hardest to game, but it will also take the longest to create and configure effectively. Therefore, this will likely be a step-by-step process. It is possible that we will temporarily reduce or remove rewards until we are more confident of the reliability of the outcomes. Or we may rely more on manual group reviews than on algorithms until we have improved the algorithmic approach. At the time of writing, there are ongoing discussions on the topic. At the same time, we see multiple teams engaged in creating and improving the current “Community Contribution (or reputation) Score” tooling. We hope all these efforts will coalesce into a solution that offers fair rewards, drives constructive community engagement and a valuable feedback process, and is sufficiently scalable.
The voting process
We stuck to our previous process with ratings from 1-10 (or unrated) and a threshold of 6.5. We kept the concept of requiring a minimum number of voters for each project, but adapted the way we implemented the metrics a bit to accommodate the surge of projects. As in previous rounds, no project was negatively impacted by this minimum percentage of votes required, but we like to keep it in place to avoid a situation where some projects are voted on by only a very small subset of voters, potentially leading to an extreme outcome.
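To make these mechanics concrete, below is a minimal sketch (in Python) of how such an eligibility rule could be applied. The 1-10 grades and the 6.5 threshold come from this round; the field names, the simple averaging, and the 10% minimum-voter fraction are assumptions for illustration, not the exact production logic.

```python
# Illustrative sketch only: how a 6.5 average-grade threshold plus a
# minimum-voters rule could be applied. The field names and the 10%
# minimum are assumptions, not the actual Deep Funding implementation.

MIN_AVERAGE_GRADE = 6.5    # threshold used in this round (from the article)
MIN_VOTER_FRACTION = 0.10  # hypothetical minimum share of voters per project

def eligible_projects(grades_by_project, total_voters):
    """Return projects whose average grade and voter coverage pass both rules.

    grades_by_project: dict mapping project id -> list of 1-10 grades,
    with 'skip' votes assumed to be excluded beforehand.
    """
    passing = {}
    for project, grades in grades_by_project.items():
        if not grades:
            continue
        coverage = len(grades) / total_voters
        average = sum(grades) / len(grades)
        if average >= MIN_AVERAGE_GRADE and coverage >= MIN_VOTER_FRACTION:
            passing[project] = round(average, 2)
    return passing

# Example with the 220 wallets that voted in DFR3 (see the table above)
votes = {"project_a": [8, 7, 9, 10] * 10, "project_b": [5, 6, 4] * 10}
print(eligible_projects(votes, total_voters=220))  # {'project_a': 8.5}
```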
Strategic voting
A number of voters seemed preoccupied with advancing particular projects at the expense of others. As stated before, we see this kind of strategic voting as undesirable. Each project should be assessed on its own merits, and not graded lower than deserved just to improve the chances of another project. We have mentioned that we may implement a ‘wallet reputation system’ that assesses both voting behavior and trading behavior deemed undesirable. We have not yet prioritized this, and were therefore not able to counter this kind of behavior in this round.
Let’s take a look at the grading statistics:
All grades, including ‘skip’ (meaning that no grade was given):
With ‘skip’ votes filtered out:
What is especially noticeable is the large number of ‘1’ grades given. When each project is assessed on its true merits, it is highly unlikely for any project to receive this score unless there is something genuinely wrong with the project. The same argument may hold for ’10’ ratings, although to a lesser extent. The conclusion from this graphic is that some wallets did engage in strategic voting behavior.
For a better view of voting behavior, we can examine the visualizations created by Robert Haas with his open-source graph visualization library gravis.
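For readers who want to experiment with this kind of view themselves, here is a minimal sketch of how a wallet-to-project voting graph could be built with networkx and rendered through gravis. The vote records, the node styling, and the output filename are hypothetical, and this is not the script that produced the screenshots below; consult the gravis documentation for the full set of rendering options.

```python
# Illustrative sketch: build a bipartite wallet -> project graph from vote
# records and render it interactively with gravis. The vote records,
# styling, and filename are hypothetical.
import networkx as nx
import gravis as gv

# (wallet, project, grade) records; hypothetical sample data
votes = [
    ("wallet_1", "project_a", 10),
    ("wallet_1", "project_b", 1),
    ("wallet_2", "project_a", 10),
    ("wallet_3", "project_c", 7),
]

g = nx.Graph()
for wallet, project, grade in votes:
    g.add_node(wallet, color="gray", size=5)      # voter wallets
    g.add_node(project, color="pink", size=15)    # projects
    g.add_edge(wallet, project, size=grade / 2)   # edge thickness ~ grade

# Render an interactive HTML view with the d3-based backend
fig = gv.d3(g)
fig.export_html("dfr3_votes.html")
```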
View the screenshots below and click on a graphic to play around with the interactive version.
New Projects:
Marketing:
Tooling:
SNET RFPs:
What stands out is that many wallets only voted to the benefit of one specific project.
In the interactive version, with the project highlighted you will see this picture:
It received a lot of votes from many smaller wallets. Moreover, most of these wallets were only interested in giving this project a ‘10’ and didn’t vote for anything else, except for a few that went to the extreme of giving all other projects a ‘1’. It is clear that this project benefited from the kind of strategic voting behavior that we view as ‘undesirable’, to put it mildly. Since we do not yet have the means to counter this, we decided to leave it as is. The project almost received a grant, but fell just short. We have not analyzed what this behavior meant for other projects, which would have been a larger concern had the project been awarded.
This example does, however, show two things:
- Without the reputation-based weights on votes, the project would have been awarded. So, in this case, the reputation approach accomplished what it was designed for.
- The voting process would be even more resilient with a wallet reputation algorithm that reduces the weight of wallets that give only or mostly extreme ratings, and of wallets showing indications that they were created just for the sake of voting (a sketch of such a rule follows below).
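As an illustration of the first idea, here is a minimal sketch of a down-weighting rule for wallets that vote almost exclusively with extreme grades. The 80% cutoff and the weight values are hypothetical choices made for clarity, not an agreed-upon Deep Funding rule.

```python
# Illustrative sketch: reduce the voting weight of wallets that give only
# (or almost only) extreme grades. The 80% cutoff and the weight values
# are hypothetical, not an agreed-upon Deep Funding rule.

def wallet_vote_weight(grades, extreme_share_cutoff=0.8):
    """Return a voting weight for a wallet, based on the grades it gave.

    grades: list of 1-10 grades from this wallet, skips excluded.
    """
    if not grades:
        return 1.0
    extremes = sum(1 for grade in grades if grade in (1, 10))
    if extremes / len(grades) >= extreme_share_cutoff:
        return 0.25  # heavily down-weight "all 1s and 10s" voting patterns
    return 1.0

print(wallet_vote_weight([10, 10, 1, 1, 1]))  # 0.25 (only extreme grades)
print(wallet_vote_weight([7, 8, 6, 10, 4]))   # 1.0 (nuanced voting)
```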
What also stands out from the screenshots alone is that most low grades were given in the ‘Ideation’ and ‘New Projects’ pools. This makes sense for the ‘New Projects’ pool, since it was the largest and the awarded amounts were not bound to a maximum. It is less clear why this pattern is also visible in ‘Ideation’.
Impact of reputation weights on the results
The graphics also show which wallets have added reputation weights (the pink dots). Apart from countering wallet splitting, these weights had other effects on the voting results. Without reputation weights, 9 projects that were not awarded would have been, while 6 projects that were awarded would not have been. Out of 43 awarded projects this is not a landslide, but it is still quite substantial. Some analysis shows that wallets with added reputation weights do not necessarily engage in more nuanced voting behavior than other wallets. One would hope that contributors who have reviewed many projects and given valuable input would also cast nuanced, well-thought-through votes. This is often the case, but not always. This raises the question: while reputation is a countermeasure against wallet splitting, it might not necessarily lead to better, more nuanced, or less biased voting results.
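To illustrate how such a with/without comparison can be made, here is a minimal sketch that recomputes the set of projects passing the threshold with and without per-wallet reputation weights. The vote records and the reputation values are hypothetical; only the 6.5 threshold comes from this round.

```python
# Illustrative sketch: compare which projects pass the threshold with and
# without per-wallet reputation weights. Vote records and reputation
# values are hypothetical; only the 6.5 threshold comes from this round.

THRESHOLD = 6.5

def awarded(votes, weights=None):
    """votes: list of (wallet, project, grade); weights: wallet -> weight."""
    totals, masses = {}, {}
    for wallet, project, grade in votes:
        w = 1.0 if weights is None else weights.get(wallet, 1.0)
        totals[project] = totals.get(project, 0.0) + w * grade
        masses[project] = masses.get(project, 0.0) + w
    return {p for p in totals if totals[p] / masses[p] >= THRESHOLD}

votes = [
    ("w1", "proj_x", 10), ("w2", "proj_x", 5),
    ("w1", "proj_y", 8),  ("w2", "proj_y", 8),
]
reputation = {"w2": 3.0}  # hypothetical reputation-based weight for one wallet

plain, weighted = awarded(votes), awarded(votes, reputation)
print("awarded only without weights:", plain - weighted)  # {'proj_x'}
print("awarded only with weights:", weighted - plain)      # set()
```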
In other words: if we had another way to mitigate the practice of wallet splitting (such as a form of identification), the question is whether reputation should still be a factor in voting weights, and to what extent.
Conclusion
All in all, we had a great round and, as always, learned a lot. There are still many things we would like to improve, but we are aware that they will take time. We are extremely lucky to have such an engaged community of contributors, both in the voting and feedback processes and in the operational tasks. We are especially proud of the speed with which the newly created focus group rose to the challenge of handling the ‘bot/generated content’ issues and created a prototype of a process, and similarly of the review group that created a new way of manually assessing all feedback. Therefore, big thanks once again to all contributors, reviewers, and participants, and we hope to see you all again in Deep Funding Round 4!