Towards web3-native intelligence: tools for protocol comprehension and stewardship

I’m working on web3/DeFi-native modelling and am seeking funding and collaboration opportunities for the next stage. I wanted to gauge the community’s mood regarding the match of interests before submitting formal applications.

TL;DR A web3 protocol is an evolving shared narrative: it lives simultaneously in two domains open for exploration, onchain transactions and community interactions. They inform and impact each other in both directions in a myriad of ways, creating a bipartite, heterogeneous living organism. To comprehend and predict this organism means to be able to model and predict the evolution of its two integral parts in a coherent, unified way. Currently available intelligence tools don’t leverage the nature of web3 but merely translate approaches from web2/TradFi, and hence fall short of providing deep insight. This proposal strives to make the first crucial step towards solving this problem.

web3/DeFi intelligence is at the moment at a skeuomorphic stage — web2/TradFi tools like SQL data queries, basic accounting, node ranking, outdated VaR risk modelling and agent-based modelling backed by rational choice theory are translated as if there were no fundamental difference between TradFi, where part of the data is simply not digitised and the bulk of digital data is private, and web3, where everything is open. This leaves the opportunities for insight provided by web3 unexplored.

Expanding on a16z’s Chris Dixon, we can postulate that a web3 protocol with its products is an evolving shared narrative: be it a token, NFT, web3 game, decentralized social media, or anything yet to be invented, its utility and value are a function of a changing, commonly shared narrative, where the essence of the narrative, the nature of its changes and its commonality play equally important parts.

A web3 protocol lives simultaneously in two domains open for exploration: onchain transactions and community interactions. They inform and impact each other in both directions in a myriad of ways, creating a bipartite, heterogeneous living organism. The loop: people engage in discussions, track discussions, track onchain data, make decisions and execute transactions; other people track these decisions and rationalisations and make their own decisions. To know and comprehend this organism means to be able to model and predict the evolution of its two integral parts in a coherent, unified way. Luckily, in DeFi both financial and contextual data (transactions and the public sentiment about them on Twitter/Discord/Discourse) are fully open and available for modelling and insight.

Hence, in DeFi/web3 we can build native decision making around native, non-skeuomorphic intelligence. Specifically, the two major opportunities here are to learn models from real-time data on public sentiment towards specific protocols/products, and to learn models from onchain data with vault-level granularity. And since they impact each other, supermodels uniting both can be built as well.

UST sentiment => crash case

A few words about the opportunities around sentiment. Why does sentiment matter? The recent UST crash was the result of a gradual crowd sentiment deterioration => loss of confidence => panic => bank run => death spiral, supported by the mechanics of the protocol. We just witnessed public sentiment dynamics wipe out $40 bln of value over several days.

Twitter scraping code || Raw tweet data

Language model: BERT_base model (12-layer, 768-hidden, 12-heads, 110M parameters), RoBERTa pretraining procedure. Pretrained on a corpus of 850M English tweets (16B word tokens, ~80GB) streamed from 01/2012 to 08/2019. Fine-tuned for sentiment analysis on the SemEval-2017 Task 4A dataset.

Sentiment analysis code || Sentiment data: result of sentiment analysis || Visualise the result
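For illustration, here is a minimal sketch of the scoring step, assuming the Hugging Face transformers pipeline and a public BERTweet-based sentiment checkpoint; the model name below is an assumed example, not necessarily the exact checkpoint used in the linked analysis:

```python
# Minimal sketch: score tweet sentiment with a BERTweet-style model.
# The checkpoint name is an assumption; substitute the model actually
# fine-tuned on the SemEval-2017 Task 4A dataset.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="finiteautomata/bertweet-base-sentiment-analysis",  # assumed example checkpoint
)

tweets = [
    "UST is depegging, get out while you can",
    "The peg mechanism is working as designed, nothing to worry about",
]

for tweet, result in zip(tweets, classifier(tweets)):
    # result looks like {"label": "NEG", "score": 0.97}
    print(f"{result['label']:>4}  {result['score']:.2f}  {tweet}")
```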

Rest assured it’s not an isolated case. As early as 2010 it was shown that the accuracy of a model predicting daily closing values of the Dow Jones Industrial Average is significantly improved by the inclusion of specific public mood dimensions measured from the Twitter feed.

Moreover, realising the importance of public mood, TradFi public institutions, including the Federal Reserve and the World Bank, are tracking public sentiment.

Back in 2010 sentiment was measured with a fixed lexicon; now we can use large language models fine-tuned on domain-specific annotated datasets, which provide precision insight into events like the UST crash.

Models learned from social networks and onchain time series can also be used in asset management and broader DAO decision making: mapping public mood and onchain dynamics to (future) TVL, usage or token price, discovering patterns of challenges and opportunities, running simulations, optimising vault design and parameter sets, among many other things.

For instance, the UST crash provides an excellent dataset of matched sentiment and onchain dynamics for a real-time, DeFi-native panic pattern, which could be learned into a model and used to discover similar cases in the future.
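As a minimal sketch of what learning such a panic pattern could look like, assuming hourly sentiment scores and onchain features have already been extracted (the file name and column names below are hypothetical):

```python
# Minimal sketch: learn a "panic pattern" classifier from combined sentiment
# and onchain features. Assumptions: an hourly dataframe with hypothetical
# columns "mean_sentiment", "net_outflow", "unique_withdrawers" and a binary
# "panic_next_24h" label derived from a crash episode such as UST.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

df = pd.read_parquet("ust_sentiment_onchain_hourly.parquet")  # hypothetical file
X = df[["mean_sentiment", "net_outflow", "unique_withdrawers"]]
y = df["panic_next_24h"]

# Time-ordered cross-validation: always train on the past, test on the future.
cv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(GradientBoostingClassifier(), X, y, cv=cv, scoring="roc_auc")
print("ROC-AUC per fold:", scores)
```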


The big opportunity here is to design, build and train a model that intakes the community and onchain data relevant to a protocol/product and outputs the respective evolving shared narrative, and then to build decision support tools around this web3-native intelligence.

I demonstrated a proof of concept above. To move it towards production, work should proceed along a number of parallel tracks.

The first stage of this work — three to four months — will encompass the following:

I. Build a custom DeFi-focused annotated community interactions dataset to fine-tune a large language model for multidimensional affect analysis

To gauge public sentiment I use a general-purpose language model pretrained on a large corpus of texts and then fine-tuned for sentiment analysis on the SemEval-2017 Task 4A dataset.

The result is impressive, but we can get far deeper and richer insight.

Currently, state-of-the-art natural language processing results are achieved with Transformer-based large language models (OpenAI’s GPT-3 — 175B parameters, DeepMind’s Gopher — 280B parameters). Until very recently such models were accessible only to Big Tech, but the BigScience project is changing the game: an international non-profit consortium is working on the state-of-the-art multilingual (46 languages) BLOOM LLM, 176B parameters, pretrained on ~350B tokens, which will be fully open from day one. Pretraining started on May 11 and will finish within 3-4 months. It’s our chance to ride the wave.

To leverage the full potential of this opportunity we need to use the coming months to get prepared: build a custom DeFi-native labelled community interactions dataset to fine-tune the BLOOM LLM for our purposes. DeFi conversation on Twitter/Discord/Discourse is full of domain-specific slang, nuance and deep context, and a general-purpose model must understand these details natively to produce adequate results. Moreover, the SemEval-2017 Task 4A dataset, although popular, contains 50k tweets and only 3 classes of sentiment (positive/neutral/negative). The bigger the fine-tuning dataset, the better the results our model will produce.

So we’ll use the more recent DynaSent sentiment analysis benchmark dataset as a size reference (130k items). Also, the affect expressed by people is clearly far more nuanced than what can be captured by a three-class scale, which will be reflected in a bigger, more nuanced labelling scale.

Although there are multiple commercial companies offering data labelling services, it’s been shown that off-the-shelf annotated datasets contain systematic labelling errors. This is especially pressing for us given that we’re looking to identify the affect of a tweet/post. Hence, we need to assemble a group of handpicked, amply qualified annotators, train them to properly understand the DeFi landscape, and instruct them to annotate a dataset assembled from Twitter/Discord/Discourse discussions with labels reflecting the mood of each tweet or post. We’ll be using Heartex to run the bespoke labelling.

Nobody has ever assembled such a dataset for DeFi; once built, this asset will give us a huge edge, and the investment will bring immense returns down the road.
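Once the labelled dataset exists, fine-tuning follows the standard sequence-classification recipe. A minimal sketch, assuming the labels are exported as a CSV with hypothetical text and label columns, and using a small public tweet model as a stand-in until BLOOM is released:

```python
# Minimal sketch: fine-tune a transformer for multi-class affect classification.
# Assumptions: "defi_affect.csv" is a hypothetical export from the labelling
# tool with "text" and "label" columns; a small public tweet model is used as
# a stand-in until BLOOM is released.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

raw = load_dataset("csv", data_files="defi_affect.csv")["train"]
labels = sorted(set(raw["label"]))
label2id = {name: i for i, name in enumerate(labels)}

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")

dataset = raw.map(lambda ex: {"labels": label2id[ex["label"]]})
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=128))
dataset = dataset.remove_columns(["text", "label"]).train_test_split(test_size=0.1)

model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/bertweet-base", num_labels=len(labels)
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="defi-affect-model", num_train_epochs=3),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables padding via the default data collator
)
trainer.train()
```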

Future versions of the community mood model could be enriched with community social graph data. Pure language models presume that all social interactions influence the state of community mood equally, which is clearly not the case: a message from a protocol founder and one from an average troll should have different weights in the resulting picture.

One way to approach this is to use a Temporal Graph Network (TGN) to represent an evolving community graph. The TGN generates a temporal embedding (a real-valued vector) for each user (graph node) i. This embedding is a learnable function of their history of interactions (messages) and that of their n-hop neighbours (social ties, akin to PageRank).
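A conceptual sketch of the core idea (each user carries a memory vector that is updated whenever they interact), assuming contiguous integer user ids and per-message feature vectors; this is a simplification of the full TGN architecture, not a faithful reimplementation:

```python
# Conceptual sketch of a TGN-style memory update for a community graph.
# Assumptions: users are numbered 0..num_users-1 and every interaction
# (e.g. a Discord message) carries a feature vector of size msg_dim.
import torch
import torch.nn as nn


class CommunityMemory(nn.Module):
    def __init__(self, num_users: int, msg_dim: int, mem_dim: int = 128):
        super().__init__()
        # One memory vector per user, summarising their interaction history.
        self.register_buffer("memory", torch.zeros(num_users, mem_dim))
        # GRU cell consumes [own memory, counterpart memory, message features].
        self.updater = nn.GRUCell(input_size=2 * mem_dim + msg_dim, hidden_size=mem_dim)

    def interact(self, src: int, dst: int, msg_feat: torch.Tensor) -> torch.Tensor:
        """Update the source user's memory after they message the destination user."""
        message = torch.cat([self.memory[src], self.memory[dst], msg_feat])
        new_mem = self.updater(message.unsqueeze(0), self.memory[src].unsqueeze(0))
        self.memory[src] = new_mem.squeeze(0).detach()
        return new_mem  # temporal embedding of user `src` after this interaction
```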

II. DL experiments on the second component of the Big model, the vault-level model of onchain transactions.

A Spatio-Temporal Graph Network (STGN) can be used to represent an evolving graph of onchain transactions. Here, say for a given vault, the STGN generates a temporal embedding (a real-valued vector) for each wallet (graph node) i. This embedding is a learnable function of their history of transactions (messages) and that of their n-hop neighbours.
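Before any such model can be trained, the raw onchain activity has to be turned into a timestamped edge stream. A minimal sketch, assuming the vault’s transfer events have already been decoded and exported by an indexer (the file name and column names below are hypothetical):

```python
# Minimal sketch: turn decoded vault transfer events into the timestamped edge
# stream a temporal graph model consumes. Assumptions: "vault_transfers.csv"
# (hypothetical export from an indexer) has columns "from", "to", "value",
# "timestamp"; wallet addresses are mapped to integer node ids.
import pandas as pd

events = pd.read_csv("vault_transfers.csv").sort_values("timestamp")

wallets = pd.unique(events[["from", "to"]].values.ravel())
node_id = {addr: i for i, addr in enumerate(wallets)}

edge_stream = pd.DataFrame({
    "src": events["from"].map(node_id),  # sending wallet (graph node)
    "dst": events["to"].map(node_id),    # receiving wallet / vault contract
    "t": events["timestamp"],            # event time, keeps the graph temporal
    "msg": events["value"],              # edge feature: transferred amount
})
print(edge_stream.head())
```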

An (Ethereum) network-wide representation graph can also be built.

A vault model can be used for prediction (running test scenarios for different assets, platforms and DAO decision parameter sets), for classification (risk/value profiling of assets and platforms), for transaction pattern forecasting, for fraudulent activity detection, and for detecting other patterns/clusters on the trie, providing insight and serving as a decision support tool for all stakeholders: DAOs, investors, community members and users. I invite everybody to brainstorm further possible applications.

Again, to the best of my knowledge nobody has ever done this before; this is an investment with a huge potential return.

The most interesting and complicated part further down the road is to merge the two components of the bipartite web3 model.

One approach: both models can be regarded as functions, each in its respective function space. The temporal evolution of each model can then be modelled as an operator on its function space, and a learnable mapping between the two function spaces would be a representation of the mutual impact of community life and onchain transactions, bringing the unity we’re seeking here.
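A rough formalisation of this idea (all notation below is introduced purely for illustration):

```latex
% Community-mood model u_t and onchain model v_t as points in function spaces:
\[
  u_t \in \mathcal{U}, \qquad v_t \in \mathcal{V}
\]
% Temporal evolution of each model as an operator on its own space:
\[
  u_{t+1} = \mathcal{G}_{\mathcal{U}}(u_t), \qquad v_{t+1} = \mathcal{G}_{\mathcal{V}}(v_t)
\]
% Learnable cross-mappings capture the mutual impact, coupling the two evolutions:
\[
  u_{t+1} = \mathcal{G}_{\mathcal{U}}\bigl(u_t, \Phi_{\mathcal{V}\to\mathcal{U}}(v_t)\bigr),
  \qquad
  v_{t+1} = \mathcal{G}_{\mathcal{V}}\bigl(v_t, \Phi_{\mathcal{U}\to\mathcal{V}}(u_t)\bigr)
\]
```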

Another approach: use Bayesian Networks to identify mutual causation.

I’ve studied under Prof. Michael Bronstein, a specialist in geometric DL and Head of Graph Learning Research at Twitter, and I plan to reach out to him regarding this project; his expertise would be particularly helpful here.

III. Finally, build a UX/UI design concept to get adequate insight into the data.

All data and models in the world are worthless in the absence of adequate HCI tools. We must be able to dive into the evolving narrative about community mood and its bijective mapping to the onchain dynamics in an effortless, yet profound way.

For this part I plan to bring in an art director I know personally. He is responsible, among many other projects, for the Offshore pipeline management system interface for South Stream (iF Design Award ’22; Red Dot Design Award ’21) and the Moscow Metro map 3.0 (iF Design Award ’17).

The purpose of this part of the proposal is to create a concept of a design system/design language, which will later be leveraged to build an interface into the shared evolving narrative. The concept will be built in Unreal Engine: it will be a live simulation, an infinite-game-inspired tool for diving into the evolving shared narrative, i.e. a web3 product in itself.

It will also hopefully bring some good publicity once built.

DeepMind’s recently published Gato model represents an early step towards what I see as the horizon for this proposal: a multi-modal, multi-task model that intakes heterogeneous data, like social network discussions and onchain time series, and outputs an integrative, domain-specific prediction, such as the shared narrative regarding a certain topic over a certain period.


Hi @pavel, thanks for your effort in submitting this proposal! I can tell that you put some serious thought into it. I think it would be helpful if you present your submission to the DAO in the upcoming community call this Friday (soon TBA), so more members will learn about your proposal and have the opportunity to raise questions. What do you think?

In the meantime, I’ll post some questions here for short answers:

  • What is your estimate of the time required for development at level two and up? At what point will the model be functional enough to produce coherent outputs?
  • What use-cases do you see in which we can apply the model besides treasury management, since we are not an active asset manager but a software layer?
  • How can we feed this data into the governance structure?
  • What are other skills in addition to yours needed to get this live (i.e., starting at stage 3)? What are the areas you will need support with?
  • What is the funding needed to build this?
  • How would the solution differ from the offerings by Gauntlet?

What’s Gauntlet’s approach?

In a nutshell:

At a high level, ABS allows a set of ‘agents’ (pieces of code meant to mimic actual user behavior) to make rational actions against DeFi protocols according to some ‘what-if’ market scenario.

How close is this mimicked user behaviour to the actual user behaviour?

Each agent acts according to a set of rules to carry out rational (profit-maximizing) actions.

That’s basically 19th-century economics. The UST crash case, as I demonstrate above, is an example of emotions (a gradual crowd sentiment deterioration => loss of confidence => panic => bank run => death spiral) having an impact on financial and economic activity. Nobel Prizes have already been given out for debunking pure rational choice theory:

Richard Thaler: for his pioneering work in establishing that people are predictably irrational — that they consistently behave in ways that defy economic theory.

Daniel Kahneman: for having integrated insights from psychological research into economic science, especially concerning human judgment and decision-making under uncertainty

See also
Robert Shiller

Behavioral economics

The approach employed by Gauntlet was developed for TradFi, where the bulk of real-life data is either not digitised at all (much of the b2c interaction happens offline) or isn’t available (much of the market data is private). Hence, agent-based modelling with theoretical assumptions about the incomplete data is justifiable. For DeFi, however, agent-based-only modelling is a suboptimal legacy framework. Since all the data about transactions and user interactions with the protocol is open and available for modelling, we can learn a model of the living protocol, or parts of it, and models of user interactions with it, from real-life data. Moreover, this model can be continuously fine-tuned as new data emerges.

Gauntlet is using VaR as a risk measure.

In fact, VaR is outdated and is being phased out of financial stress-test simulations in favour of Expected Shortfall, due to its inability to capture tail risk.

Just to give some perspective, here is a quote from “A revised market risk framework”, published by the Basel Committee on Banking Supervision (roughly, the regulatory/overseeing body for the world’s central banks):

Move from Value-at-Risk (VaR) to Expected Shortfall (ES): A number of weaknesses have been identified with using VaR for determining regulatory capital requirements, including its inability to capture “tail risk”. For this reason, the Committee proposed in May 2012 to replace VaR with ES. ES measures the riskiness of a position by considering both the size and the likelihood of losses above a certain confidence level. The Committee has agreed to use a 97.5% ES for the internal models-based approach and has also used that approach to calibrate capital requirements under the revised market risk standardised approach.

That was almost 10 years ago.
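To make the difference concrete, here is a minimal sketch of historical VaR versus Expected Shortfall at the 97.5% level mentioned in the quote, computed on simulated, purely illustrative returns:

```python
# Minimal sketch: historical 97.5% VaR vs Expected Shortfall on a return series.
# The returns below are simulated and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
# Heavy-tailed daily returns (Student-t) to make the tail visible.
returns = 0.02 * rng.standard_t(df=3, size=10_000)

alpha = 0.975
var = -np.quantile(returns, 1 - alpha)   # loss threshold exceeded on 2.5% of days
es = -returns[returns <= -var].mean()    # average loss on those worst days

print(f"97.5% VaR: {var:.4f}")
print(f"97.5% ES:  {es:.4f}")  # ES >= VaR: it captures how bad the tail actually is
```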

To sum up, in Gauntlet’s own words:

$38 billion in assets now depend on Gauntlet’s financial modeling framework, signaling that the financial systems of the future are those that bridge the gap between the transparency of crypto with the risk mitigation tactics of traditional finance (TradFi).

Couldn’t agree less. DeFi needs its own native risk mitigation tactics, rooted in its open and decentralised nature and leveraging the opportunities it provides. The fact that Gauntlet still dominates DeFi risk management is worrisome, but it could also be an edge for a forward-looking DAO.

Here’s Gauntlet’s proposal for the Tribe DAO:

  • Minimal contract value of $1M (denominated in USD) for 1-year engagement.
  • Minimum commitment of 12 months
  • 50% of each quarterly payment to be transferred with a 6-month linear vest
  • 50% of each quarterly payment to be transferred with no vesting period
  • Service fee of 12.125% of revenue generated from Gauntlet parameterized strategies

Everybody can make up their own opinion of this proposal’s merits.

Risk modelling for the DAO’s products, with recommendations for the protocol’s parameters, as done by Maker’s Risk-Core-Unit, Gauntlet or Block Analitica.

Early warning/discovery of risks and opportunities, similar to what I demonstrated for the UST sentiment => crash case. I certainly expect more such cases, because DeFi is currently still basically the Wild West, with almost no understanding of the internal onchain dynamics among its operators and users.

The 3 parts I described in the post should be tackled in parallel over the first 3-4 month stage. By the end of the first stage we’ll have an annotated DeFi-native mood-gauging dataset, the initial version of the onchain activity model and a UX design concept. By that time the BLOOM LLM will be ready, which we’ll be able to fine-tune, and after that we can start gauging DeFi community mood with a precision impossible with off-the-shelf tools. Building an MVP integrating the onchain activity model and the community mood model under a coherent UX will probably take at least another 3-4 months.

For now, it’s a decision support tool with the potential for future direct integration of the model data output feed into the protocol logic. As a decision support tool it provides insight and recommendations for the DAO and/or subDAO governing bodies in the vein of work done by Maker’s Risk-Core-Unit, Gauntlet, Block Analitica, GFX Labs, Nansen, Llama etc.

I already have an excellent community organiser (a vice dean at a big university) who will help with all the organisational issues around data annotation (I plan to hire students to do the job); I also have the award-winning art director I mentioned in the post, whom I plan to get on board for the UX part; and I plan to approach Prof. Bronstein, whom I mentioned above, for help with the DL on onchain data modelling. So far I think the team is strong enough for the first 3-4 month stage.

I suggest reserving 200k over the next 4 months, with the understanding that the actual reported spending will most likely be smaller, but we need some leeway given that some of the project’s expenditure items are variable.

Estimates for the expenses:

I. Build a custom DeFi-focused annotated community interactions dataset to fine-tune a large language model for multidimensional affect analysis.

Using some of the commercial services as a price reference, acknowledging that our labelling task (identifying the affect of a post on a relatively complex scale) is non-trivial and that standard practice is to have each item labelled by several (5) annotators to avoid subjectivity, and taking the DynaSent sentiment analysis benchmark dataset as a size reference (130k items), we estimate the expenses as:

= $0.10 per label × 5 annotators × 130k items = $65k

II. Cloud computing services to run the first DL experiments on the second component of the Big model, the vault-level model of onchain transactions: approximately USDC 15k. You can check the rates here: Pricing | Vertex AI | Google Cloud

III. Build a UX/UI concept to get adequate insight into the data: USDC 15k

IV. Compensation for @pavel = USDC 12k × 3 months = USDC 36k (MakerDAO reference dev budget)

V. Contingency + expenses (travel, etc) buffer = USDC 15k

Thanks @pavel. The differentiation from Gauntlet seems very reasonable to me. How would the DAO / DAO members have/own the rights to use this model?

Thanks for adding some more info Pavel!

I’m trying to consider this from Gro Protocol’s perspective, and there are a couple of aspects I am wondering about (in addition to Marvin’s recent question). If you could answer briefly:

  • How is this better than Gro’s current methodology?
  • Why is your solution and your team better for solving this instead of Gro hiring a new risk management employee in London?
  • Why should this be a top priority for the Gro treasury to fund in Gro’s current stage of development? (in comparison, projects like Fei/Tribe, Maker, and Compound have already had substantial growth before funding research on a similar scale)
  • Why should this modeling project be funded and administered by the Gro DAO? For example, why not submit this project to Gitcoin Grants or other public goods funding/grant options?
  • How will this bring revenue to Gro Protocol?
  • Can you give us a 2 sentence, ELI5 (explain like I’m five years old) summary of what this project is and how it will have specifically benefitted Gro Protocol after 4 months?

Thanks again Pavel. I can see that you’ve put a lot of thought into this and it’s great that the community has such varied expertise to draw from.
