Confidential and Proprietary. © 2024 Moesif, Inc. All Rights Reserved
Optimizing consumption and cost of AI
APIs in the enterprise
Who am I?
● CEO of Moesif, the leading API
observability and analytics
platform
● Helping companies grow their
API businesses and improve API
experience
● San Francisco, CA
derric@moesif.com
Outline
1. How AI Changes Cost Planning
2. AI Usage Observability
3. Departmental Chargeback
4. Optimizing Usage and Cost
5. Conclusion
The AI wave is changing how software is consumed
▪ Before: Users log in, Upgrade Subscriptions, Flat Fees
▪ With AI: Automation / AI Agents, Scales with Usage, Pay Per Use/Outcome
Upfront Costs Shifted to Runtime Risks
▪ 2005: On-Prem
▪ 2015: SaaS
▪ 2025: AI
[Chart comparing "Cost to procure software" and "Risk of cost overruns" across the three eras]
Uncontrolled AI Usage Causes Unpredictable Business
▪ Real costs become unplanned and unpredictable
Actual costs are hard to predict or deviate dramatically from estimates
▪ Hard to define a strong pricing and monetization strategy
For revenue-generating apps, it is hard to determine appropriate pricing levels
▪ Susceptible to economic abuse and arbitrage
Without observability and governance, AI APIs can be subject to abuse
Understanding Cloud Cost Was Simple and Predictable
            Engineering    Marketing    Sales
Compute     100 VMs        10 VMs       1 VM
Storage     100 TB         1 TB         100 GB
Understanding AI Usage is Complex
N-dimensional data to meter:
▪ Input Tokens Count
▪ Output Tokens Count
▪ Model Version
▪ Input Caching
▪ Context Window Storage
▪ Batch
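To illustrate why metering AI usage is multi-dimensional, here is a minimal Python sketch that prices a single request from several of the dimensions listed above. The model names, prices, batch discount, and the `UsageEvent` and `event_cost` names are all hypothetical and chosen only for illustration; real provider pricing differs.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1M tokens (not real provider pricing).
PRICES = {
    "large-v1": {"input": 10.00, "cached_input": 5.00, "output": 30.00},
    "small-v1": {"input": 0.50, "cached_input": 0.25, "output": 1.50},
}
BATCH_DISCOUNT = 0.5  # assumed cost multiplier for batch requests


@dataclass
class UsageEvent:
    model: str
    input_tokens: int         # total input tokens, including cached ones
    cached_input_tokens: int  # subset of input_tokens served from cache
    output_tokens: int
    batch: bool = False


def event_cost(e: UsageEvent) -> float:
    """Return the USD cost of one metered request."""
    p = PRICES[e.model]
    fresh = e.input_tokens - e.cached_input_tokens
    cost = (
        fresh * p["input"]
        + e.cached_input_tokens * p["cached_input"]
        + e.output_tokens * p["output"]
    ) / 1_000_000
    return cost * BATCH_DISCOUNT if e.batch else cost
```

The same token counts produce very different costs depending on model, caching, and batch mode, which is why a single "requests" counter is not enough to meter AI usage.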
Key Cost Drivers
▪ Unexpected usage spikes
▪ Data transfer costs
▪ Suboptimal pay-per-use pricing
▪ Incorrect models and usage patterns
Centralized Usage Observability
▪ Measure usage in real-time
Monitor traffic to different APIs and models
▪ Tied to revenue/cost data
Different usage patterns have different value
▪ Tagged and attributed to different consumers/use cases
Usage attributed to specific users, tenants, departments
▪ Detect usage anomalies
Notify relevant parties of anomalous usage patterns
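One simple way to detect usage anomalies is to compare each new measurement against a trailing window. The sketch below is only an illustration under assumptions: the `find_spikes` name, the hourly granularity, the 24-sample window, and the 3-standard-deviation threshold are all hypothetical choices, not a recommendation from the talk.

```python
from statistics import mean, stdev


def find_spikes(hourly_tokens, window=24, threshold=3.0):
    """Return the indexes whose value exceeds the trailing-window mean
    by more than `threshold` standard deviations."""
    spikes = []
    for i in range(window, len(hourly_tokens)):
        history = hourly_tokens[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # A perfectly flat history (sigma == 0) is skipped in this sketch.
        if sigma > 0 and (hourly_tokens[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes
```

Running the same check per user, tenant, or department (rather than on total traffic) keeps one large consumer from hiding a spike in a smaller one.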
Today’s observability tooling focuses on infrastructure
1. Uptime and SLA
2. Memory Usage
3. Latency
4. Requests Per Minute
5. Errors Per Minute
Metrics should focus on business costs / outcomes
▪ Model Accuracy
▪ Usage by User
▪ Tokens Used
▪ Relevant Outcome
Normalize cost metrics and find outliers
▪ Cost per Result
Can smaller models be used without sacrificing accuracy?
▪ Cost per User
Are any outlier users unintentionally abusing the API?
▪ Cost per Token
Are requests using tokens efficiently, or using too many?
▪ Cost per Outcome
Was the output garbage or not relevant?
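The four normalized metrics can be derived from the same set of usage records. The following Python sketch assumes a hypothetical record shape (`user`, `cost_usd`, `tokens`, `relevant`) and treats every response as a "result" and every relevant response as an "outcome"; both definitions are assumptions made for illustration.

```python
def normalized_costs(records):
    """Compute normalized cost metrics from a non-empty list of dicts
    with keys: user, cost_usd, tokens, relevant (bool)."""
    total_cost = sum(r["cost_usd"] for r in records)
    total_tokens = sum(r["tokens"] for r in records)
    relevant = sum(1 for r in records if r["relevant"])
    users = {r["user"] for r in records}
    return {
        "cost_per_result": total_cost / len(records),
        "cost_per_user": total_cost / len(users),
        "cost_per_token": total_cost / total_tokens,
        # Undefined when no response was relevant.
        "cost_per_outcome": total_cost / relevant if relevant else None,
    }
```

A large gap between cost per result and cost per outcome suggests that much of the spend is going to responses that were not useful.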
Can a smaller model be used?
▪ Not all models are ideal
By measuring the accuracy and relevancy
of requests, you can determine
which model is best for your use case
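As a sketch of how measured accuracy can drive model choice, the function below picks the cheapest model that still meets an accuracy floor. The `cheapest_adequate_model` name, the input shape, and the idea of a single accuracy score per model are assumptions for illustration; in practice accuracy would come from your own evaluation set.

```python
def cheapest_adequate_model(evals, min_accuracy):
    """evals: {model: {"accuracy": float, "cost_per_request": float}}.
    Return the cheapest model meeting min_accuracy, or None if none does."""
    adequate = {m: e for m, e in evals.items() if e["accuracy"] >= min_accuracy}
    if not adequate:
        return None
    return min(adequate, key=lambda m: adequate[m]["cost_per_request"])


# Hypothetical evaluation results.
evals = {
    "large": {"accuracy": 0.95, "cost_per_request": 0.03},
    "small": {"accuracy": 0.91, "cost_per_request": 0.002},
}
choice = cheapest_adequate_model(evals, min_accuracy=0.90)
```

With these made-up numbers the smaller model is selected, because it clears the accuracy floor at a fraction of the per-request cost.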
Is a specific consumer driving unnecessary cost?
▪ Usage attributed to each user
By attributing usage to different
users/apps, you can understand
which consumers are driving the most
cost.
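Attribution can be as simple as tagging each metered event with a consumer identifier and aggregating. This minimal sketch assumes events arrive as hypothetical `(consumer_id, cost_usd)` pairs; the function name and shape are illustrative only.

```python
from collections import defaultdict


def cost_by_consumer(events):
    """events: iterable of (consumer_id, cost_usd) pairs.
    Return (consumer_id, cost_usd, share_of_total) tuples, highest cost first."""
    totals = defaultdict(float)
    for consumer, cost in events:
        totals[consumer] += cost
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(c, cost, cost / grand_total) for c, cost in ranked]
```

Sorting by cost puts the consumers worth investigating first, and the share column shows how concentrated the spend is.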
Departmental Chargeback Models
▪ Direct Allocations
▪ Usage-based allocation
▪ Outcome-based allocation
▪ Even-Split Allocation
Goals of a Robust Chargeback Model
▪ Promote fairness
▪ Encourage efficient usage
▪ Enable better budgeting and planning
Strategies: Pros and Cons
▪ Direct
Pros: Easiest for experimentation and low scale.
Cons: Not suitable for shared resources and ignores discounts. Can have high overhead.
▪ Usage-based
Pros: Promotes fairness and efficiency.
Cons: Requires robust metering and tracking.
▪ Outcome-based
Pros: Aligned to results and impact driving value.
Cons: Even more complex metering around outcomes and results. Can limit transparency.
▪ Even-Split
Pros: Lowest administrative overhead.
Cons: Discourages larger users from optimizing usage while hindering adoption by new users.
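To make the difference between the strategies concrete, the sketch below splits one shared bill three ways. All figures, department names, and function names are hypothetical. Direct allocation is not shown because it involves no split: each department simply pays for the resources it owns (for example, its own API keys).

```python
def split_proportionally(total_bill, weights):
    """Split total_bill across departments in proportion to weights."""
    weight_sum = sum(weights.values())
    return {dept: total_bill * w / weight_sum for dept, w in weights.items()}


def even_split(total_bill, departments):
    """Split total_bill equally, regardless of usage."""
    return {dept: total_bill / len(departments) for dept in departments}


shared_bill = 1000.0  # hypothetical monthly AI API bill in USD
tokens = {"engineering": 6_000_000, "marketing": 3_000_000, "sales": 1_000_000}
outcomes = {"engineering": 100, "marketing": 300, "sales": 100}

usage_based = split_proportionally(shared_bill, tokens)      # by tokens used
outcome_based = split_proportionally(shared_bill, outcomes)  # by results delivered
even = even_split(shared_bill, list(tokens))                 # equal shares
```

In this made-up example engineering pays 600 under usage-based allocation but only 200 under outcome-based allocation, which shows how strongly the choice of model changes each department's incentive.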
AI Governance Best Practices
▪ Enforce strict quotas and governance
Prevent usage spikes from causing massive cost overruns.
▪ Monitor for usage spikes and exceeded budgets
Notify relevant parties of anomalous usage patterns
▪ Conduct periodic audits of AI usage and costs
Costs can creep up slowly without anyone noticing
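A quota can be enforced by checking each request against a per-consumer limit before it is sent to the AI API. This is a minimal in-memory sketch; the `TokenQuota` and `QuotaExceeded` names are hypothetical, and a production system would also need persistent storage and period resets.

```python
class QuotaExceeded(Exception):
    """Raised when a request would push a consumer over its quota."""


class TokenQuota:
    """Tracks tokens used per consumer against a fixed limit per period."""

    def __init__(self, limits):
        self.limits = dict(limits)  # consumer -> max tokens per period
        self.used = {}              # consumer -> tokens used so far

    def charge(self, consumer, tokens):
        """Record usage and return tokens remaining, or raise QuotaExceeded.
        Consumers without a configured limit get a limit of 0."""
        limit = self.limits.get(consumer, 0)
        new_total = self.used.get(consumer, 0) + tokens
        if new_total > limit:
            raise QuotaExceeded(f"{consumer} would exceed its quota")
        self.used[consumer] = new_total
        return limit - new_total
```

Rejecting the request before recording it means a blocked call does not count against the consumer's usage.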
Keep Teams Informed
▪ Keep Stakeholders Informed
Implement a system of alerts that
notify departments when they are
approaching or exceeding their
allocated budget for AI API usage.
▪ Communicate Pricing
Clearly communicate the pricing
structure for the AI APIs, whether
external or internally developed.
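The budget alerts described above can be sketched as a simple threshold check. The `budget_alerts` name, the 80% warning threshold, and the dictionary shapes are assumptions for illustration; budgets are assumed to be positive.

```python
def budget_alerts(spend, budgets, warn_at=0.8):
    """Return {department: "approaching" | "exceeded"} for every
    department at or above `warn_at` of its allocated budget."""
    alerts = {}
    for dept, budget in budgets.items():
        ratio = spend.get(dept, 0.0) / budget
        if ratio >= 1.0:
            alerts[dept] = "exceeded"     # at or over budget
        elif ratio >= warn_at:
            alerts[dept] = "approaching"  # within the warning band
    return alerts


# Hypothetical month-to-date spend and budgets in USD.
spend = {"engineering": 900.0, "marketing": 1200.0, "sales": 100.0}
budgets = {"engineering": 1000.0, "marketing": 1000.0, "sales": 1000.0}
alerts = budget_alerts(spend, budgets)
```

Departments below the warning threshold are omitted from the result, so the output can be passed directly to whatever notification channel the team uses.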
API Analytics and Monetization
Derric Gilling
derric@moesif.com
https://www.moesif.com/casestudies
