
Highlights

Who the Organization Is

Verra is a nonprofit organization that operates the world’s leading carbon crediting program, the Verified Carbon Standard (VCS), alongside standards programs for sustainable development and plastic. With more than 18 years of experience, Verra has set the benchmark for quality and integrity in environmental and social markets.

Verra’s methodologies are built on rigorous science. They raise the bar on transparency and credibility, all while embedding stronger safeguards and benefit-sharing.

Today, Verra standards programs enable companies, countries, and communities to turn goals into action, and its digitalization efforts play a vital role in this context. From digital MRV tools to establishing jurisdictional deforestation data, from transparent documentation to streamlined project registration processes, Verra is powering the transformation we need. To date, Verra has registered more than 3,400 projects in 125+ countries and has issued more than 1.3 billion carbon credits.

While Verra is working to digitize its project workflow, most historical project information still exists in PDF format. These documents contain complex templates, forms, and agreements, and many cover technical details such as emission reduction calculations, as well as structured and unstructured data. Digitizing them enables greater interoperability, automation, transparency, and scalability in managing the many projects Verra certifies.

To illustrate this potential, an AI-driven initiative demonstrated that data from these PDFs can be extracted accurately and reliably, confirming the value of modernizing legacy documents.

The Problem

Verra faced three major challenges.

First, they were dealing with a massive and diverse document backlog. The organization had roughly 8,000 historical PDFs that needed to be digitized, spanning a large number of unique document types. There was no fixed layout, no consistent placement of fields, and no reliable structure to target.

Second, Verra’s extraction approach was not scalable. The engineering team previously relied on extensive regex logic to pull out fields such as greenhouse gas figures, project identifiers, and monitoring periods. Each field needed its own pattern. A single regex could take 2-4 days to design and refine because every document had edge cases. Scaling this approach to hundreds or thousands of fields would have taken months of development time.
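To make the scaling problem concrete, here is a minimal sketch of what one hand-written pattern for a single field might look like, and how small formatting differences defeat it. The field chosen (monitoring period), the pattern, and the sample strings are all hypothetical illustrations, not Verra's actual regexes or documents.

```python
import re

# Hypothetical pattern for ONE field, assuming the layout
# "Monitoring period: 01-January-2020 to 31-December-2020".
MONITORING_PERIOD = re.compile(
    r"Monitoring\s+period:\s*"
    r"(?P<start>\d{2}-[A-Za-z]+-\d{4})\s+to\s+(?P<end>\d{2}-[A-Za-z]+-\d{4})"
)


def extract_monitoring_period(text: str):
    """Return (start, end) if the text matches the expected layout, else None."""
    match = MONITORING_PERIOD.search(text)
    if match is None:
        return None
    return match.group("start"), match.group("end")


# Invented sample lines: the same information written three different ways.
samples = [
    "Monitoring period: 01-January-2020 to 31-December-2020",
    "Monitoring Period (this report): 01/01/2020 - 31/12/2020",
    "Period of monitoring from 1 Jan 2020 until 31 Dec 2020",
]

results = [extract_monitoring_period(s) for s in samples]
for sample, result in zip(samples, results):
    print(f"{sample!r} -> {result}")
```

Only the first sample matches. Handling the other two means widening the pattern or adding new ones, and that work repeats for every field and every document type.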

Third, Verra needed a solution that their internal developers could maintain independently. They wanted to onboard new document types, update schemas, and integrate with internal systems without relying on an external vendor or recurring custom engineering work.

The Solution

Through the AWS BOX program, Verra partnered with Upstage and the systems integrator Pariveda to create an automated document extraction pipeline that would progressively replace the entire regex-based workflow.

The new system uses Upstage’s Information Extract technology, an agentic solution that combines OCR with a language model. Instead of relying on templates or static rules, the system interprets documents the way an expert would. It understands page layout, reads tables accurately, follows the flow of multi-column structures, and recognizes the semantic relationships between fields.

A key part of the solution is its schema-driven design. Verra’s developers now provide a simple JSON schema that describes the fields they want. The AI interprets the schema, locates the correct information, validates the extraction, and produces structured output. No regex. No hand-coded rules. No brittle code that breaks when formatting changes.
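As a rough illustration of the schema-driven idea, the sketch below defines a small JSON schema and checks a structured record against it. The field names, the sample records, and the `check_against_schema` helper are hypothetical. They are not Verra's schema or Upstage's API, and the check is a simplified stand-in for whatever validation the real pipeline performs.

```python
import json

# Hypothetical schema: field names and types are illustrative assumptions.
SCHEMA = json.loads("""
{
  "type": "object",
  "properties": {
    "project_id": {"type": "string"},
    "monitoring_period_start": {"type": "string"},
    "monitoring_period_end": {"type": "string"},
    "emission_reductions_tco2e": {"type": "number"}
  },
  "required": ["project_id", "emission_reductions_tco2e"]
}
""")

# Map the JSON type names used above to Python types.
JSON_TYPES = {"string": str, "number": (int, float)}


def check_against_schema(record: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, value in record.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, JSON_TYPES[spec["type"]]):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems


# Invented records standing in for extraction output.
good_record = {"project_id": "EXAMPLE-001", "emission_reductions_tco2e": 12500.0}
bad_record = {"project_id": 42, "monitoring_period_start": "2020-01-01"}

good_problems = check_against_schema(good_record, SCHEMA)
bad_problems = check_against_schema(bad_record, SCHEMA)
print(good_problems)
print(bad_problems)
```

In this sketch, adding a field means adding one entry under `properties` rather than writing a new pattern.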


To support long-term autonomy, the project was delivered with:

Infrastructure as code for predictable deployment
DynamoDB for structured storage
Cognito for secure authentication
Amplify for front-end hosting
CloudWatch for monitoring and auditing
API endpoints that make downstream integration straightforward


The BOX-funded engagement moved quickly. After the matchmaking event in April, the project received funding approval in June, began development in July, and completed delivery and knowledge transfer by late August. Verra was able to onboard a new document type entirely on their own shortly afterward, confirming that the system was truly self-extensible.

The Impact

The results were immediate and meaningful. In the initial phase, the system extracted data from more than 7,000 pages across roughly 50 documents. The MVP focused mainly on the project description template and monitoring report document types. Accuracy was about 90-100% for critical fields and 80-90% for secondary fields.
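The article does not say how these accuracy figures were calculated. One common approach, sketched here as an assumption, is field-level exact-match accuracy against a manually verified reference set, reported separately for each group of fields. The field names and records below are invented toy data and do not reproduce the reported results.

```python
def field_accuracy(extracted, expected, fields):
    """Fraction of (document, field) pairs whose extracted value equals the reference."""
    total = 0
    correct = 0
    for doc_extracted, doc_expected in zip(extracted, expected):
        for field in fields:
            total += 1
            if doc_extracted.get(field) == doc_expected.get(field):
                correct += 1
    return correct / total if total else 0.0


# Hypothetical grouping of fields into critical and secondary.
CRITICAL_FIELDS = ["project_id", "emission_reductions_tco2e"]
SECONDARY_FIELDS = ["proponent", "methodology"]

# Invented reference values (what a human reviewer confirmed).
expected = [
    {"project_id": "EX-1", "emission_reductions_tco2e": 100.0,
     "proponent": "Org A", "methodology": "M-1"},
    {"project_id": "EX-2", "emission_reductions_tco2e": 250.0,
     "proponent": "Org B", "methodology": "M-2"},
]

# Invented extraction output with one mismatch in a secondary field.
extracted = [
    {"project_id": "EX-1", "emission_reductions_tco2e": 100.0,
     "proponent": "Org A", "methodology": "M-1"},
    {"project_id": "EX-2", "emission_reductions_tco2e": 250.0,
     "proponent": "Org B Ltd", "methodology": "M-2"},
]

critical_acc = field_accuracy(extracted, expected, CRITICAL_FIELDS)
secondary_acc = field_accuracy(extracted, expected, SECONDARY_FIELDS)
print(f"critical: {critical_acc:.0%}, secondary: {secondary_acc:.0%}")
```

Exact match is a strict criterion. A real evaluation might normalize whitespace, units, or date formats before comparing.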

More importantly, Verra’s developers can now process up to a thousand fields in days or weeks. The regex approach would have required months of engineering effort and constant maintenance. The new workflow significantly reduces that burden.

Verra’s CTO described the Upstage contribution as a combination of robust architecture and strategic guidance that is unlocking insights trapped in legacy PDFs. Data can now be accessed more quickly, which should help shorten review times. Digitized documents can be indexed and analyzed for trends, compliance, and performance metrics, and can help enable AI-driven risk assessment and predictive modeling for carbon projects.

Before and After Upstage

The following comparison highlights the shift in Verra’s document processing capabilities following the implementation of the Upstage AI solution.

Before

Dozens of regex patterns for each document type
2-4 days of work per field
Constant breakage when document formats changed
Slow, fragile, and difficult to maintain
Potentially thousands of fields across an 8,000-document backlog

After

Simple schema describing the information needed for the document type
AI that analyzes layout and content automatically
Clean, validated structured output ready for internal systems
New document types onboarded in hours or days
A backlog that is now fully manageable

How Verra is Streamlining Data Management with Upstage AI

Joe Dell'Orfano • Industry • December 11, 2025

