
Liquidity provider yield & risk intelligence system: building accurate LP analytics across chains
Summary
A cross-chain data system to decompose LP returns into fees, incentives, and impermanent loss for accurate capital allocation.
Article
Liquidity providing in DeFi looks simple on the surface. Deposit assets into a pool, earn fees, collect rewards. But anyone who has actually deployed capital across AMMs knows the reality is very different.
APY is not a single number. It’s a combination of moving parts, many of which are not directly observable.
This project focused on building a data and research system to break down LP returns into their real components and make them comparable across protocols and chains.
Why LP yield is harder than it looks
Most dashboards show a single APY figure. But LP returns actually come from multiple sources: trading fees, liquidity incentives (token rewards), price divergence between assets (impermanent loss), and gas plus rebalancing overhead.
The problem is that these components are distributed across contracts, calculated differently across protocols, and often not exposed via clean APIs.
What looks like a “20% APY” could be a mix of 5% fees, 25% incentives, and -10% impermanent loss.
Without breaking this down, capital allocation becomes guesswork.
Key challenges in building LP intelligence
1. Fragmented yield data
Each protocol exposes data differently. Uniswap-style AMMs rely heavily on event logs (swaps, mints, burns), while other designs distribute rewards through separate contracts. There is no unified schema.
2. Impermanent loss is path-dependent
Impermanent loss cannot be derived from a snapshot. It depends on:
- entry price
- exit price
- price path in between
- liquidity distribution over time
This requires historical reconstruction at block-level granularity.
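For a vanilla 50/50 constant-product pool there is at least a closed-form snapshot formula, which is a useful reference point; a minimal Python sketch (concentrated-liquidity positions are the path-dependent case above and still need block-level reconstruction):

```python
import math

def cp_impermanent_loss(price_ratio: float) -> float:
    """IL of a 50/50 constant-product LP versus holding,
    where price_ratio = exit_price / entry_price of one asset.
    Negative values mean the LP underperforms HODL."""
    return 2 * math.sqrt(price_ratio) / (1 + price_ratio) - 1

# A 2x price move costs the LP roughly 5.7% versus holding.
print(f"{cp_impermanent_loss(2.0):.2%}")
```

The well-known reference points fall out directly: a 1.25x move costs about 0.6%, a 2x move about 5.7%, and a 4x move about 20% versus holding.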
3. Lack of reliable historical indexing
Most RPC endpoints and APIs are optimized for current state, not historical analytics. Reconstructing LP positions requires:
- parsing raw logs
- tracking position changes
- aligning with price oracles
At scale, this becomes an indexing problem, not just a querying problem.
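The reconstruction step can be sketched as a log replay; the event shape here is heavily simplified (real logs carry pool address, tick ranges, and transaction metadata), and all names are illustrative:

```python
from collections import defaultdict

def reconstruct_positions(events):
    """Replay (block, owner, kind, liquidity) log tuples in block order
    into net liquidity per owner; kind is "mint" or "burn"."""
    positions = defaultdict(int)
    for block, owner, kind, liquidity in sorted(events):
        positions[owner] += liquidity if kind == "mint" else -liquidity
    return dict(positions)

logs = [
    (100, "0xabc", "mint", 1_000),
    (150, "0xabc", "burn", 400),
    (160, "0xdef", "mint", 2_500),
]
print(reconstruct_positions(logs))  # {'0xabc': 600, '0xdef': 2500}
```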
4. Infrastructure cost vs accuracy
There are two extremes:
- full indexing (accurate, but expensive)
- API aggregation (cheap, but lossy)
The challenge was finding a middle ground: high enough fidelity to model real returns, without building a full archival node stack for every chain.
What we built
The system was designed as a modular data pipeline with protocol-aware logic, rather than a generic aggregator.
1. Unified LP yield framework
At the core is a standardized model that decomposes LP returns into fee income, incentive rewards, and impermanent loss. Each component is computed independently and then combined into a normalized APY.
This enables apples-to-apples comparison across pools and helps isolate the true drivers of yield.
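A minimal sketch of what such a normalized schema might look like; the figures and field names are illustrative, not the production model:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class YieldBreakdown:
    pool: str
    fee_apy: float        # annualized fee income, in %
    incentive_apy: float  # token-reward emissions, in %
    il_apy: float         # impermanent-loss drag, in % (usually negative)

    @property
    def net_apy(self) -> float:
        return self.fee_apy + self.incentive_apy + self.il_apy

pools = [
    YieldBreakdown("ETH/USDC", fee_apy=5.0, incentive_apy=25.0, il_apy=-10.0),
    YieldBreakdown("stable/stable", fee_apy=3.0, incentive_apy=2.0, il_apy=-0.1),
]
best = max(pools, key=lambda p: p.net_apy)
print(best.pool, best.net_apy)  # ETH/USDC 20.0
```

The first entry mirrors the "20% APY" example from earlier: a headline figure driven mostly by incentives, with IL quietly clawing back half of them.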
2. Protocol-aware aggregation layer
Different AMMs behave differently, so instead of forcing a generic model, we implemented protocol-specific logic:
- Uniswap v2/v3-style pools → event-driven fee tracking, liquidity range handling
- Curve-style pools → invariant-based pricing, reward emission tracking
This avoids inaccuracies that come from treating all AMMs the same.
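The protocol-aware routing can be sketched as a simple dispatch table; the fee models below are deliberately crude (the 50% Curve admin-fee split is an assumption for illustration, not universal), and all names are hypothetical:

```python
def uniswap_v2_fees(volume: float, fee_tier: float, share: float) -> float:
    """Constant-product pools: fees accrue pro-rata on swap volume."""
    return volume * fee_tier * share

def curve_fees(volume: float, fee_tier: float, share: float) -> float:
    """Curve-style pools: pro-rata accrual, minus an assumed 50% admin fee."""
    return volume * fee_tier * share * 0.5

FEE_MODELS = {"uniswap_v2": uniswap_v2_fees, "curve": curve_fees}

def pool_fees(protocol: str, **params) -> float:
    """Route the calculation to the protocol's own model."""
    return FEE_MODELS[protocol](**params)

# $1M daily volume at a 0.3% fee tier, holding 1% of the pool:
print(pool_fees("uniswap_v2", volume=1e6, fee_tier=0.003, share=0.01))
```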
3. Hybrid data sourcing architecture
To balance cost and accuracy, the system uses a hybrid approach: managed indexers for common datasets, direct on-chain queries for critical state, and custom indexing pipelines for missing or high-resolution data.
This reduces infrastructure overhead while preserving analytical precision where it matters.
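The sourcing order can be expressed as a simple fallback chain; the in-memory dicts below stand in for the real indexer and RPC clients, and all names are made up:

```python
def fetch(dataset, indexer, rpc):
    """Try sources cheapest-first; return (source_used, data)."""
    data = indexer.get(dataset)
    if data is not None:
        return "indexer", data
    data = rpc.get(dataset)
    if data is not None:
        return "rpc", data
    # Neither source has it: flag for a custom indexing backfill.
    return "custom_index", None

indexer = {"swaps": ["swap#1", "swap#2"]}   # managed indexer cache
rpc = {"reserves": [1_200, 3_400]}          # direct on-chain state

print(fetch("swaps", indexer, rpc))      # served by the managed indexer
print(fetch("reserves", indexer, rpc))   # falls back to direct RPC
print(fetch("tick_data", indexer, rpc))  # needs a custom pipeline
```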
4. Scalable research pipeline
The pipeline is designed as a set of independent modules covering ingestion (events, prices, rewards), transformation (position reconstruction, yield decomposition), and analytics (APY, volatility, correlations).
New chains or protocols can be added by extending only the relevant modules, making the system highly adaptable.
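A toy illustration of that modularity: each stage is just a registry of modules, and onboarding a new protocol means registering only the pieces it needs (module names are made up):

```python
PIPELINE = {
    "ingestion": ["events", "prices", "rewards"],
    "transformation": ["position_reconstruction", "yield_decomposition"],
    "analytics": ["apy", "volatility", "correlations"],
}

def register(stage: str, module: str) -> None:
    """Extend a single stage without touching the others."""
    PIPELINE.setdefault(stage, []).append(module)

# Adding a Curve-style protocol only touches ingestion here:
register("ingestion", "curve_gauge_rewards")
print(PIPELINE["ingestion"])
# ['events', 'prices', 'rewards', 'curve_gauge_rewards']
```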
Architecture overview
A typical flow can be represented as:
- Data sources
- on-chain logs (swaps, liquidity events)
- price oracles
- reward distribution contracts
- Ingestion layer
- indexers + RPC pipelines
- Processing layer
- position reconstruction
- fee calculation
- impermanent loss modeling
- Analytics layer
- APY breakdown
- cross-pool comparison
- strategy metrics
Modeling impermanent loss correctly
A key part of the system is accurate IL modeling. Instead of using simplified formulas, the system reconstructs LP positions over time, aligns them with price movements, and calculates divergence versus a HODL baseline. This makes it possible to answer practical questions such as:
At what point do fees actually compensate for IL?
That question is critical for real-world LP strategies.
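As a first-order approximation (the full system answers this from reconstructed positions, but the closed-form constant-product IL gives a quick sanity check), the breakeven question can be sketched as:

```python
import math

def il(price_ratio: float) -> float:
    """Snapshot IL of a 50/50 constant-product LP versus HODL."""
    return 2 * math.sqrt(price_ratio) / (1 + price_ratio) - 1

def fees_needed_to_break_even(price_ratio: float) -> float:
    """Cumulative fee yield (fraction of position value) that offsets IL."""
    return -il(price_ratio)

for move in (1.25, 2.0, 4.0):
    print(f"{move}x price move -> {fees_needed_to_break_even(move):.2%} in fees")
```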
What this enables
With the system in place, LP analysis becomes significantly more actionable.
A. Accurate yield attribution
Instead of a single APY number, users can see how much yield comes from fees, how much from incentives, and how much is effectively lost to impermanent loss.
B. Cross-chain comparability
By normalizing metrics, LP positions across Ethereum, BSC, and Avalanche can be evaluated within the same analytical framework.
C. Better capital allocation
Strategies can now be built based on stable fee-generating pools, high-incentive short-term opportunities, and low-volatility LP positions, rather than relying on headline APY figures.
D. Reusable research foundation
The architecture extends naturally to staking analytics, correlation modeling, and strategy backtesting, making it a broader research foundation rather than a single-use system.
Final thoughts
DeFi has no shortage of yield. What it lacks is clarity on where that yield actually comes from. LP returns are not just rewards; they are a combination of market structure, price movement, and incentive design.
Without breaking these down, optimization is impossible.
This system turns LP from a black box into something measurable, comparable, and ultimately optimizable.
You can read the complete case study here: https://www.zobyt.com/work/liquidity-provider-yield-and-risk-intelligence-system
---
At Zobyt, we have built several systems like this to enable transparency and efficiency through technology. If you're interested in something similar, reach out at discuss@zobyt.com.
Related Posts
- Metaguard: Building a preventive transaction security layer using simulation and trace analysis (Blog Post)
- GUI based python trader MVP: building an end-to-end trading system with a visual interface (Blog Post)
- Token design, tokenomics & exchange readiness: Building a token that survives beyond launch (Blog Post)
- Treasury-backed DeFi asset with automated yield generation: designing a low-volatility on-chain financial system (Blog Post)