Countdown: How statistical benchmarking for default emission factors supports an effective NZF
Big picture
Effective functioning of the IMO Net Zero Framework requires stakeholder trust. To help build this trust, this edition of Countdown introduces a transparent calculation approach that can support consistent and reliable default emission factor (DEF) submissions.
Under the IMO’s LCA framework, DEFs serve as data inputs that inform investment decisions and compliance frameworks. DEFs built on a robust statistical approach create the clarity and confidence to enable an effective NZF.
The Center’s new approach helps turn scattered values into a trustworthy common language, ensuring consistency across industry stakeholders.
Why this matters
Decarbonizing shipping is a long-term commitment. The default emission factor (DEF) values we set today will anchor the International Maritime Organization Net Zero Framework (IMO NZF) and help determine the framework’s effectiveness as a driver of global climate action.
However, today’s maritime ecosystem is clouded by fragmented data and inconsistent methodologies. To tackle this challenge, this edition of Countdown introduces ‘statistical benchmarking’: a simple, transparent, and comparable approach developed at the Center and designed to make DEF calculations more reliable and consistent.
The role of default emission factors (DEFs) in the IMO NZF
A previous Countdown newsletter explained the regulatory role of the IMO’s 2024 LCA Guidelines. In this edition, we dive deeper into the Center’s statistical benchmarking approach to calculating DEFs for marine fuels.
DEFs are pre-defined GHG intensity values assigned to specific fuel pathways. These values serve as the baseline for compliance with the IMO NZF when primary, batch-specific data is unavailable.
The IMO has specified that DEFs should be calculated using both “representative” and “conservative” assumptions (see the 2024 LCA Guidelines). The Center has operationalized these concepts into a clear technical framework (Figure 1). In practice, “representativeness” means reflecting the actual emissions typically produced by a fuel or technology, using credible and up-to-date data. “Conservativeness” means erring on the side of caution when uncertainty exists, ensuring emissions are not underestimated. This approach safeguards environmental integrity and maintains stakeholder trust.
Figure 1: Approach to operationalizing “representative” and “conservative” principles when calculating default emission factors (DEFs).
It’s a bit like baking a cake for a group: you decide the size of the cake based on the number of people who are likely to attend (representative), but you make it a bit bigger just in case more people show up, or someone wants seconds (conservative). This way, you’re prepared for both the typical situation and any surprises, just as DEFs should be.
That’s why clear rules for defining DEFs are essential. Establishing common ground for boundaries, sources, and methods helps ensure numbers stand up to scrutiny and are meaningful everywhere they’re used. Reaching consensus here means everyone, from industry to government, knows what to expect.
How DEF submission works at the IMO
Member States can submit proposed default emission factors (DEFs) using the IMO’s standardized template and Excel tool, along with supporting data such as LCA models, primary data, or peer-reviewed studies. Each proposed value must be submitted separately and follow the structure of the 2024 LCA Guidelines.
Submissions are sent to the LCA Working Group of the Group of Experts on the Scientific Aspects of Marine Environmental Protection (GESAMP-LCA WG) at least 28 weeks before the relevant MEPC session, although submissions may also be made on a rolling basis.
The GESAMP-LCA WG reviews proposals for completeness, data quality, transparency, and whether at least three independent studies support the pathway. It also checks calculations and assesses alignment with the criteria set out in the 2024 LCA Guidelines. When multiple credible values exist, the highest is recommended as the default.
After peer review by GESAMP, recommended DEFs are sent to the MEPC for approval and later incorporated into updates of the LCA Guidelines. Approved default values become public, while confidential data included in submissions remains protected. See MEPC.1/Circ.916 for more details.
Why we need a standardized method to calculate DEFs
The current approach to calculating proposed DEFs for GESAMP submission requires that values be ‘representative’ and ‘conservative’. However, further clarity is needed on how to implement these principles; without it, the industry risks significant variation in submitted values, unclear system boundaries, inconsistent assumptions, and data gaps.
This lack of consistency is more than just a technical problem; it poses a real risk to effective policy. Consistency inspires trust. Without this common ground, DEFs cannot reliably support regulations, market mechanisms, or investment decisions, and instead contribute to confusion and uncertainty.
A clear, shared approach to developing DEFs solves this problem. Adopting a standardized method would provide structure and transparency in how DEFs are set. It would:
Define clear system boundaries and consistent assumptions for all fuels and technologies (following the 2024 LCA Guidelines)
Set minimum standards for data sources and validation
Create an understanding of uncertainty and enable fair, like-for-like comparisons
Help align regulatory frameworks internationally
Build trust among stakeholders, including industry, governments, and civil society
Put simply, a shared method for calculating DEFs is critical for credible, science-based maritime policy. It ensures that decisions rest on consistent, comparable, and robust information. Statistical benchmarking offers a way to put such a shared method into practice.
Introducing statistical benchmarking
Statistical benchmarking is a method for calculating representative and conservative emissions values (Figure 2). Combining real-world industry data with rigorous statistical comparisons ensures that the calculated DEF values reflect both typical operations and uncertainties. The method reduces uncertainty, improves comparability, and provides an evidence-based, scalable approach to developing robust DEFs. Let’s run through the key steps in the method.
Figure 2: Overview of the statistical benchmarking method.
Gathering real-world industry data
In this method, calculating DEFs begins with data straight from the source: industry stakeholders. Collecting detailed information on the actual resources used and emissions produced at each step ensures that findings are firmly grounded in current industry practice. Model outputs, when built on transparent and validated datasets, can also serve as a credible starting point – provided they represent real-world industry conditions and can be benchmarked against independent evidence. Where primary data is unavailable, a reliable secondary data source can serve the same role.
The benchmarking process: ensuring data is representative
Once we have the data, how can we be sure that it reflects the industry as a whole? This is where statistical benchmarking takes center stage. Here, the method brings in numbers from respected scientific literature and reputable industry reports for comparison. Only studies that use the same technology and mimic real-world, industrial-scale operations are included, and results from lab experiments stay out. For each part of the process, at least three (preferably more) reliable reference values are gathered to build a solid benchmark.
Once all the numbers are in hand, the method calculates the average (mean) and spread (standard deviation) of the literature values. Each piece of industry data is then placed under the microscope using a Z-score, which shows how many standard deviations it sits from the literature average. If a value falls within the normal range (a Z-score inside the 95% interval, roughly between −1.96 and +1.96), it’s considered a fair reflection of typical practice and used in the calculations. If a value lands outside this range, the process switches to the literature average instead, always leaning toward a careful, conservative estimate.
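The benchmarking check described above can be sketched in a few lines of Python. All numbers below are hypothetical, and the 1.96 cutoff assumes the 95% interval of a normal distribution; the Center's actual implementation and thresholds may differ.

```python
import statistics

def benchmark_value(industry_value, literature_values, z_threshold=1.96):
    """Benchmark one industry data point against literature references.

    Keeps the industry value when its Z-score is within the threshold;
    otherwise falls back to the (conservative) literature mean.
    """
    if len(literature_values) < 3:
        raise ValueError("At least three reference values are required")
    mean = statistics.mean(literature_values)
    spread = statistics.stdev(literature_values)  # sample standard deviation
    z_score = (industry_value - mean) / spread
    return industry_value if abs(z_score) <= z_threshold else mean

# Hypothetical literature values (gCO2e/MJ) for one process step
refs = [10.0, 12.0, 11.0, 13.0]  # mean 11.5

print(benchmark_value(12.5, refs))  # within range -> keeps 12.5
print(benchmark_value(25.0, refs))  # outlier -> literature mean, 11.5
```

Note that the fallback to the literature mean is what makes the rule conservative: an implausibly low industry value is replaced rather than allowed to drag the DEF down.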
The next stage of the method is sensitivity analysis. This means testing how different choices, like material types, energy sources, or technology performance, could affect the final emission factor. By poking and prodding at the inputs, this step highlights which elements have the biggest effect on emissions, ensuring that no single assumption quietly tips the scale.
To truly capture real-world complexity, the method employs Monte Carlo simulation to propagate input uncertainties through the calculation. This approach draws on thousands of possible input combinations, revealing a spectrum of potential emissions outcomes. This simulated “cloud” of results shows not just the most likely DEF, but also the range of possibilities. If results are tightly clustered, confidence is high; a wider range signals more uncertainty.
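A bare-bones Monte Carlo sketch, continuing the same hypothetical inputs: each run samples every input from an assumed distribution, and the spread of the resulting "cloud" quantifies the uncertainty. The distributions and parameters here are illustrative only.

```python
import random
import statistics

random.seed(42)  # reproducible illustration

def sample_emission_factor():
    """Draw one plausible DEF by sampling each input from its
    assumed distribution (all values hypothetical, gCO2e/MJ)."""
    feedstock = random.gauss(20.0, 2.0)          # normal
    electricity = random.gauss(30.0, 4.0)        # normal
    transport = random.triangular(8.0, 12.0, 10.0)  # min, max, mode
    return 0.8 * feedstock + 1.5 * electricity + 0.2 * transport

runs = [sample_emission_factor() for _ in range(10_000)]
mean = statistics.mean(runs)     # centre of the cloud
spread = statistics.stdev(runs)  # width of the cloud

print(f"mean = {mean:.1f}, std dev = {spread:.1f}")
```

A tight spread (small standard deviation relative to the mean) signals high confidence in the DEF; a wide one flags the need for better input data.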
Representative and conservative values
In the end, the method reports two values: a representative DEF and a conservative DEF (Figure 3). The representative value marks the best estimate of typical industry performance, while the conservative value adds an extra buffer to account for uncertainty – specifically, the standard deviation from the Monte Carlo results.
Transparent, dual-value reporting makes it clear where both the middle ground and the upper boundary lie. This approach helps regulators, policymakers, and industry alike make confident, evidence-based decisions.
Figure 3: How representative and conservative default emission factors (DEFs) are calculated in the statistical benchmarking method.
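The dual-value calculation itself is a short final step: the representative DEF is the mean of the Monte Carlo results, and the conservative DEF adds one standard deviation as a buffer. The sample of simulated outcomes below is hypothetical.

```python
import statistics

# Hypothetical Monte Carlo outcomes (gCO2e/MJ) for one fuel pathway
mc_results = [61.0, 63.5, 62.0, 64.5, 60.5, 65.0, 62.5, 63.0]

representative_def = statistics.mean(mc_results)
conservative_def = representative_def + statistics.stdev(mc_results)

print(round(representative_def, 2), round(conservative_def, 2))
# prints 62.75 64.33
```

Reporting both numbers side by side gives regulators the "middle ground" and the "upper boundary" in one glance, with the gap between them acting as a visible measure of uncertainty.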
Limitations of the method
While the statistical benchmarking method is rigorous, a few limitations remain – together with opportunities for improvement (Table 1).
From theory to action
The Center has applied this statistical benchmarking method in two DEF submissions (login required) to GESAMP, covering e-ammonia and liquefied bio-methane. The method’s replicable nature means that others can confidently use it for their own DEF submissions, supporting consistent and credible results across the industry.
Robust and transparent DEFs are essential for guiding the maritime sector’s transition to cleaner fuels. By combining representative data with a conservative approach, and using methods like statistical benchmarking, we can ensure that DEFs earn trust through transparency – continuously tested, openly refined over time, and built to support a fuel transition grounded in the best available evidence.
Want to learn more?
Additional technical details on the statistical benchmarking method can be found in a submission to the GESAMP-LCA working group, available on IMODOCS (login required).
Authors: Harshil Desai, Annabell Johanna John, Ann O’Connor, Megan Roux and Thor Sodha