Project Estimation with T-Shirt Sizing & Evidence-Based Scheduling Models for Scrum Teams

Introduction

“How long will this project take?” and “How much will it cost?” — two questions that can strike fear into the hearts of even the most seasoned project managers. In today’s fast-paced software development environment, accurate estimation isn’t just helpful—it’s essential for setting expectations, allocating resources, and delivering successful products.

In Agile software development, T-shirt sizing is a popular high-level estimation technique for gauging project scope and resource needs. In this blog post, I will discuss the T-shirt sizing model for Scrum teams, its limitations, and how to use it to estimate project costs, then show how evidence-based scheduling can make the resulting forecasts more realistic.

Scrum Teams & Sprints

Scrum teams are cross-functional groups of engineers, designers, and product staff, and the model is widely adopted across industries for software development. They are supported by roles such as QA testers, DevOps engineers, project managers, and user researchers, who can either sit within a single Scrum team or work across multiple teams, depending on workload.

These teams typically consist of 7 ± 2 members, in line with Jeff Bezos’ “two-pizza team rule,” which states that a team should be no larger than what two pizzas can feed (around 8 to 9 people). This size constraint keeps communication efficient and minimizes coordination overhead, allowing teams to move quickly and decisively.

In this model, sprints (or iterations) last two weeks, allowing teams to deliver software increments regularly. This cadence provides several advantages:

Frequent delivery of working software
Regular opportunities for stakeholder feedback
Ability to adapt quickly to changing requirements
Consistent checkpoints for monitoring progress
Manageable planning horizons that reduce uncertainty

Each Scrum team is assigned to a workstream that runs in parallel with the others. With fewer Scrum teams, workstreams that would have run in parallel must be combined and worked in sequence, so the same project scope takes longer to deliver. This relationship between team count and overall project duration creates an important tradeoff between time-to-market and resource allocation.
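As a rough illustration of that tradeoff, the sketch below assumes equally sized workstreams and that each team finishes one workstream before starting the next. Both are simplifications, and the function name and numbers are hypothetical.

```python
import math

def elapsed_sprints(workstreams, sprints_per_workstream, teams):
    """Elapsed sprints when teams work through equally sized workstreams."""
    if teams < 1:
        raise ValueError("at least one team is required")
    # Each team takes one workstream at a time, so the work happens in "rounds".
    rounds = math.ceil(workstreams / teams)
    return rounds * sprints_per_workstream

# Hypothetical project: 3 workstreams of 4 sprints each.
for teams in (3, 2, 1):
    print(teams, "team(s):", elapsed_sprints(3, 4, teams), "sprints")
```

In this toy model the total labor stays the same in every case; only the calendar time changes.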

The T-Shirt Sizing Model

The T-shirt sizing model categorizes projects into five sizes based on scope, complexity, duration, and resource requirements. Similar to how clothing sizes provide an intuitive understanding of relative measurements, T-shirt sizes offer a clear, accessible way to communicate project scale without getting lost in the minutiae of hourly estimates.

Extra Small (XS): Projects or tasks with minimal features and complexity, requiring 2 sprints or fewer of work from 1 to 3 team members.
- Example: Adding a simple form validation feature to an existing page
- Typical characteristics: Well-understood requirements, limited dependencies, existing design patterns to follow
Small (S): Simple projects with few features and minimal complexity, requiring 1 to 2 sprints of a single Scrum team.
- Example: Implementing basic search functionality with pre-defined filters
- Typical characteristics: Clear requirements, limited external dependencies, well-defined acceptance criteria
Medium (M): Moderate projects with more features and moderate complexity, requiring 3 to 4 sprints of a single Scrum team.
- Example: Building a user profile management system with customization options
- Typical characteristics: Some requirements flexibility, moderate integration needs, potential for scope refinement during development
Large (L): Complex projects with a larger scope and significant complexity, requiring 5 to 8 sprints of a single Scrum team.
- Example: Developing a complete e-commerce checkout flow with payment processing
- Typical characteristics: Multiple stakeholders, significant integration requirements, potential technical challenges, higher uncertainty
Extra Large (XL): Highly complex projects with multiple components, requiring 8 or more sprints of a single Scrum team.
- Example: Building a comprehensive analytics dashboard with real-time data processing
- Typical characteristics: Extensive requirements, multiple external dependencies, innovative features without clear precedents, high technical uncertainty

This sizing approach provides several benefits over more granular estimation techniques:

Creates a common language for stakeholders across technical and business domains
Reduces the false precision that can occur with hour-based estimates
Acknowledges the inherent uncertainty in software development
Simplifies initial planning conversations
Facilitates quick comparison between different initiatives

Estimating Project Costs

To estimate project costs using the T-shirt sizing model, first determine the cost per team member per sprint. This varies with factors such as salary, location, and overhead. For example, the median annual wage for a software developer in New York City is around $132,000. Assuming roughly 50 working weeks per year, or 25 two-week sprints, the cost per software developer per sprint is approximately $5,280 ($132,000 ÷ 25).
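The $5,280 figure corresponds to treating the year as 25 two-week sprints, or about 50 working weeks. The sketch below makes that divisor an explicit parameter, so it is easy to see how the figure shifts under a different assumption. The function and parameter names are hypothetical.

```python
def cost_per_sprint(annual_wage, working_weeks_per_year=50, sprint_weeks=2):
    """Direct labor cost of one team member for one sprint."""
    sprints_per_year = working_weeks_per_year / sprint_weeks
    return annual_wage / sprints_per_year

# 50 working weeks -> 25 sprints per year, as in the example above.
print(cost_per_sprint(132_000))

# Counting all 52 calendar weeks lowers the per-sprint figure slightly.
print(round(cost_per_sprint(132_000, working_weeks_per_year=52), 2))
```

Wage alone understates the true cost of a team member; a loaded rate that includes benefits and overhead could be passed in as annual_wage instead.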

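Multiplying this per-sprint cost by the sprint count and team size for each T-shirt size (1 to 3 people for XS, and a Scrum team of 5 to 9 people for the other sizes) gives ballpark dollar costs:

XS: $5,280 to $31,680
S: $26,400 to $95,040
M: $79,200 to $190,080
L: $132,000 to $380,160
XL: $211,200 and above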

These ranges reflect variation in team size and sprint count. They represent direct labor costs only and may not include additional expenses such as infrastructure, licenses, or specialized services.
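The ranges above can be approximated with a few lines of arithmetic. The sketch below assumes the $5,280 per-sprint figure, 1 to 3 people and at least 1 sprint for XS, and a 5-to-9-person Scrum team (7 ± 2) for the other sizes. The team-size bounds and all names are my assumptions for illustration, not a fixed part of the sizing model.

```python
COST_PER_MEMBER_PER_SPRINT = 5_280  # USD, from the NYC example

# size -> ((min_sprints, max_sprints), (min_members, max_members))
# max_sprints is None for XL because that size is open-ended.
SIZES = {
    "XS": ((1, 2), (1, 3)),
    "S": ((1, 2), (5, 9)),
    "M": ((3, 4), (5, 9)),
    "L": ((5, 8), (5, 9)),
    "XL": ((8, None), (5, 9)),
}

def cost_range(size):
    """Return (low, high) direct labor cost in USD; high is None if open-ended."""
    (min_sprints, max_sprints), (min_members, max_members) = SIZES[size]
    low = min_sprints * min_members * COST_PER_MEMBER_PER_SPRINT
    if max_sprints is None:
        return low, None
    high = max_sprints * max_members * COST_PER_MEMBER_PER_SPRINT
    return low, high

for size in SIZES:
    low, high = cost_range(size)
    upper = "and above" if high is None else f"to ${high:,}"
    print(f"{size}: ${low:,} {upper}")
```

Swapping in a different per-sprint rate or team-size range regenerates the whole table, which is handy when the same sizing scale is reused across locations.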

Case Study: Marketing Website Redesign

To illustrate how this works in practice, consider a marketing website redesign project:

Initially, the project might be sized as a “Medium (M)” initiative, estimated to require 3 to 4 sprints of a full Scrum team. With a team of 7 members, each costed at the NYC developer rate above for simplicity, the estimated cost range would be:

Low estimate: 3 sprints × 7 members × $5,280 = $110,880
High estimate: 4 sprints × 7 members × $5,280 = $147,840

This T-shirt sizing provides stakeholders with a useful ballpark figure without creating the illusion of precision that can occur with more detailed estimates. As the project progresses and more information becomes available, these estimates can be refined through evidence-based scheduling techniques.

The Limitations of Estimations and the Planning Fallacy

It’s essential to remember that estimates are just that—estimates. They are subject to numerous factors that can affect a project’s actual duration and cost. One such factor is the Planning Fallacy, a cognitive bias that causes people to underestimate the time and resources needed to complete a task (Kahneman & Tversky, 1979).

This bias can lead to overly optimistic estimates that may not reflect the project’s true complexity and scope. The Planning Fallacy manifests in several common ways:

Focusing on the most optimistic scenario (“best case”)
Underestimating the impact of integration challenges
Failing to account for routine interruptions and context-switching
Not considering the full range of potential obstacles
Overestimating productivity and efficiency

The Planning Fallacy is a common human bias that fascinates me. I often encounter it in my own personal and professional work, despite being keenly aware of it.

Research has shown that even experienced professionals fall victim to this bias, and software development is particularly susceptible due to its inherent complexity and unpredictability. One effective countermeasure is to use historical data to inform and calibrate estimates—which is where evidence-based scheduling comes in.

Evidence-Based Scheduling

Joel Spolsky, co-founder of Stack Overflow and Trello, introduced the concept of “Evidence-Based Scheduling” (Spolsky, 2007) to address some of the issues with traditional estimation techniques. By drawing on historical data from completed work, evidence-based scheduling allows teams to create more accurate and realistic estimates. This approach considers the actual time individual team members took to complete tasks, rather than relying solely on expert judgment or high-level models like T-shirt sizing.

Spolsky’s method involves breaking down tasks into smaller units, tracking each team member’s performance, and using statistical techniques to generate a probability distribution of the project’s completion time. This helps teams to better understand the range of possible outcomes, rather than focusing on a single deadline. As a result, evidence-based scheduling can lead to better risk management and more informed decision-making throughout the project lifecycle.

The key components of evidence-based scheduling include:

Task decomposition: Breaking down work into smaller, more manageable pieces (typically 1-3 days of effort)
Velocity tracking: Measuring how quickly team members complete tasks relative to their estimates
Historical calibration: Using past performance to adjust future estimates
Monte Carlo simulation: Running thousands of simulated project completions based on historical performance data
Probability curves: Generating completion date probabilities (e.g., “90% chance of completion by July 15”)
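The components above can be sketched in a few lines. This is a minimal illustration rather than Spolsky's actual implementation: it assumes a single developer, estimates in hours, velocity defined as estimate divided by actual time, and one historical velocity sampled per task in each simulated run. All names and numbers are hypothetical.

```python
import math
import random

def simulate_totals(estimates, velocities, runs=10_000, seed=42):
    """Monte Carlo: return sorted simulated totals (hours) for the remaining tasks."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    totals = []
    for _ in range(runs):
        # Scale each estimate by a velocity drawn at random from history.
        total = sum(estimate / rng.choice(velocities) for estimate in estimates)
        totals.append(total)
    return sorted(totals)

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]

# Hypothetical data: remaining task estimates (hours) and past
# estimate/actual ratios (below 1.0 means the task ran over its estimate).
estimates = [4, 8, 6, 12, 3, 16]
velocities = [1.0, 0.8, 0.5, 1.2, 0.9, 0.6]

totals = simulate_totals(estimates, velocities)
for pct in (50, 75, 90):
    print(f"{pct}% chance of finishing within {percentile(totals, pct):.0f} hours")
```

Converting the hour totals into calendar dates would additionally need each person's working hours and time off, which this sketch leaves out.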

Bridging T-Shirt Sizing and Evidence-Based Scheduling

While T-shirt sizing and evidence-based scheduling may seem like different approaches, they can be effectively combined to create a more robust estimation process:

Initial assessment: Use T-shirt sizing for early project evaluation and high-level resource allocation.
Progressive refinement: As the project moves forward, break down T-shirt sized components into more granular tasks.
Ongoing calibration: Apply evidence-based scheduling techniques to these granular tasks, using historical performance data.
Forecast updates: Regularly update project forecasts based on actual team velocity and completed work.
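The last step can be as simple as re-dividing the remaining work by observed throughput after every sprint. A minimal sketch, assuming progress is counted in completed tasks per sprint (a hypothetical unit; story points or hours would work the same way):

```python
import math

def forecast_remaining_sprints(remaining_tasks, completed_per_sprint):
    """Return (optimistic, expected, pessimistic) sprints left."""
    if not completed_per_sprint or min(completed_per_sprint) <= 0:
        raise ValueError("need a history of positive per-sprint throughput")
    fastest = max(completed_per_sprint)
    slowest = min(completed_per_sprint)
    average = sum(completed_per_sprint) / len(completed_per_sprint)
    return (
        math.ceil(remaining_tasks / fastest),
        math.ceil(remaining_tasks / average),
        math.ceil(remaining_tasks / slowest),
    )

# Hypothetical history: tasks completed in each of the first three sprints.
history = [8, 10, 12]
print(forecast_remaining_sprints(40, history))

# A slow fourth sprint (6 tasks done, 34 left) widens the forecast range.
history.append(6)
print(forecast_remaining_sprints(34, history))
```

Using the slowest and fastest observed sprints as bounds is a crude stand-in for a full probability distribution, but it already shows stakeholders a range instead of a single date.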

This combined approach leverages the simplicity and accessibility of T-shirt sizing for initial planning while incorporating the statistical rigor of evidence-based scheduling as more information becomes available.

Benefits of Evidence-Based Scheduling

There are several key benefits to using evidence-based scheduling in combination with the T-shirt sizing model:

Improved accuracy: By basing estimates on historical data, evidence-based scheduling helps to account for the variability in team member performance and other factors that can influence project timelines.
Reduced uncertainty: With a probability distribution of completion times, teams can identify and plan for potential risks more effectively, leading to more robust project planning.
Continuous improvement: By regularly updating the historical data used in evidence-based scheduling, teams can identify trends and areas for improvement, leading to better overall performance over time.
Better stakeholder communication: Probability-based forecasts provide a more honest and transparent way to discuss timelines with stakeholders, moving from “We’ll be done by May 1st” to “We have an 85% chance of completion by May 1st.”
Early warning signals: Deviations from historical patterns can be detected early, allowing teams to address issues before they become critical problems.
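To make the “85% chance” style of statement concrete, the sketch below turns a set of simulated finish dates into a probability for a given deadline. The dates here are hand-written stand-ins for simulation output, and the function name is hypothetical.

```python
from datetime import date, timedelta

def chance_of_completion_by(deadline, simulated_finish_dates):
    """Fraction of simulated runs that finish on or before the deadline."""
    on_time = sum(1 for finish in simulated_finish_dates if finish <= deadline)
    return on_time / len(simulated_finish_dates)

# Stand-in for simulation output: 20 finish dates from April 15 to May 4.
simulated = [date(2025, 4, 15) + timedelta(days=offset) for offset in range(20)]

deadline = date(2025, 5, 1)
chance = chance_of_completion_by(deadline, simulated)
print(f"{chance:.0%} chance of completion by {deadline:%B} {deadline.day}")
```

Recomputing this number after each sprint also serves as an early warning signal: a falling probability for the same deadline shows up well before the deadline itself is missed.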

Implementation Considerations

While both T-shirt sizing and evidence-based scheduling offer significant benefits, implementing them effectively requires attention to several factors:

Team buy-in: Both approaches require honest input and participation from team members.
Cultural readiness: Organizations must be willing to acknowledge uncertainty and embrace probabilistic thinking.
Data collection infrastructure: Tools and processes must be in place to gather and analyze historical performance data.
Continuous calibration: Estimates should be regularly reviewed and adjusted based on actual performance.
Transparent communication: Stakeholders need to understand the nature of the estimates and their inherent uncertainty.

Conclusion

The T-shirt sizing model is a valuable high-level estimation tool for Agile Scrum teams, helping predict project scope and resource allocation. However, it’s essential to be aware of the limitations of estimates and the influence of cognitive biases like the Planning Fallacy. Combining the T-shirt sizing model with more detailed planning and estimation techniques, such as evidence-based scheduling, can lead to better decision-making and more realistic expectations for project outcomes.

By starting with T-shirt sizing for initial planning and resource allocation, then transitioning to evidence-based scheduling as projects progress, teams can benefit from both the simplicity of categorical estimates and the statistical rigor of data-driven forecasting. This combined approach acknowledges the inherent uncertainty in software development while providing stakeholders with increasingly refined projections as more information becomes available.

In the ever-evolving landscape of software development, the most valuable estimation approaches aren’t those that promise perfect accuracy, but rather those that help teams and organizations make better decisions in the face of uncertainty.

Acknowledgments

Thanks to my colleagues April Lane and Robert Gash for the discussion of these topics that inspired me to publish this blog post.

References

Kahneman, D., & Tversky, A. (1979). Intuitive prediction: Biases and corrective procedures. In J. S. Carroll & J. W. Payne (Eds.), Cognition and social behavior. Lawrence Erlbaum Associates.
Spolsky, J. (2007, October 26). Evidence-Based Scheduling. Joel on Software.