Skip to main content

16 posts tagged with "typechecking"

View All Tags

Talk: Type Checking in Agentic Workflows

· 8 min read

Does adding type checking to an agentic workflow really help agents?

We ran an experiment recently to determine whether there are improvements in the success rate for completing different kinds of tasks. In theory, having a type checker present should help the agent catch type errors earlier, validate fixes incrementally as it works, and reduce the need for slow, test-driven iterative feedback loops.

This talk was originally presented at the PyCon US 2026 Typing Conference. The slides and edited transcript are provided below for your convenience.

📎 Slides (PDF)

Right Types, Wrong Code: Surprising Bugs A Type Checker Catches

· 4 min read

A type checker, as its name suggests, catches type mismatches: things like passing a str to a function that expects an int. But to understand your code's types, a type checker also has to understand its structure: control flow, scoping, class hierarchies, and more. This lets it detect a surprisingly wide range of issues that have nothing to do with int vs. str.

Here are five real categories of bugs that Pyrefly catches, none of which are straightforward type mismatches.

Adding Pyrefly Type Checking to Your Agentic Loop

· 5 min read

Coding agents are writing more Python than ever. Tools like Claude, Copilot, Cursor, and Codex generate entire features with little-to-no user interaction. But in large projects, this generated code is prone to type errors, mismatched signatures, and subtle API misuse. Incorporating static analysis directly into the agentic loop can mean the difference between returning from your break with a production-ready feature or needing several more correction cycles.

Type checking sits right in the sweet spot for agents. It's fast enough for iterating small fixes, robust enough to catch issues of varying complexity, and actionable enough for an agent to make changes. In this post, we walk through how to integrate Pyrefly into your agentic workflow so that every piece of generated code can get type checked automatically.

TL;DR: We recommend:

  • Adding a skill file for your agent with an AGENTS.md directive to ensure the project checks clean before finishing a feature.
  • If your model doesn't trigger this reliably, set up a hook on the Stop event.

Python Type Checker Comparison: Speed and Memory Usage

· 10 min read

Python Type Checker Comparison: Speed and Memory Usage

We frequently hear from developers who are excited about the new generation of checkers (Ty and Pyrefly) and want to know how they stack up against each other and the existing, established tools (Mypy and Pyright). In this comparison, we'll focus purely on performance (time to run a full check) and talk a little bit about how design choices, architecture, and features impact that latency.

Evaluating a type checker's performance presents a challenge due to many variables, including diverse evaluation metrics and varying results across different operating systems and hardware configurations. Furthermore, unlike the official test suite for typing specification conformance, there is no universally adopted benchmark for performance used by all type checker maintainers.

Nonetheless, in this blog post, we will attempt to compare speed and memory usage when checking several dozen packages from the command line. We use this performance data to catch regressions in Pyrefly changes that impact OSS packages — we previously only measured type checking performance on internal projects with Pyre1.

Before we start, we'd like to caution that these numbers are only a snapshot at the time of publication and will be out of date quickly. Performance numbers can swing wildly from release to release, because the type checkers are under active development.

Reaching 100% Type Coverage by Deleting Unannotated Code

· 4 min read

Hero image

At Pyrefly, we've always believed that type coverage is one of the most important indicators of code quality. Over the past year, we've worked closely with teams across large Python codebases here at Meta - improving performance, tightening soundness, and making type checking a seamless part of everyday development.

But one question kept coming up: What would it take to reach 100% type coverage?

Today, we're excited to share a breakthrough.

Lessons from Pyre that Shaped Pyrefly

· 10 min read

Pyrefly is a next-generation Python type checker and language server, designed to be extremely fast and featuring advanced refactoring and type inference capabilities. This isn’t the Pyrefly team’s first time building a type checker for Python: Pyrefly is a successor to Pyre, the previous type checker our team developed.

A lot of Pyrefly’s design comes directly from our experience with Pyre. Some things worked well at scale, while other things were harder to live with day-to-day. After running a type checker on massive Python codebases for a long time, we got a clearer sense of which trade-offs actually mattered to users.

This post is a write-up of a few lessons from Pyre that influenced how we approached Pyrefly.

pandas' Public API Is Now Type-Complete!

· 5 min read

At time of writing, pandas is one of the most widely used Python libraries. It is downloaded about half-a-billion times per month from PyPI, is supported by nearly all Python data science packages, and is generally required learning in data science curriculums. Despite modern alternatives existing, pandas' impact cannot be minimised or understated.

In order to improve the developer experience for pandas' users across the ecosystem, we at Quansight Labs (with support from the Pyrefly team at Meta) decided to focus on improving pandas' typing. Why? Because better type hints mean:

  • More accurate and useful auto-completions from VSCode / PyCharm / NeoVIM / Positron / other IDEs.
  • More robust pipelines, as some categories of bugs can be caught without even needing to execute your code.

By supporting the pandas community, pandas' public API is now type-complete (as measured by Pyright), up from 47% when we started the effort last year. We'll tell the story of how it happened - but first, we need to talk more about type completeness, and how we measure it.

Python Type Checker Comparison: Empty Container Inference

· 9 min read

Empty containers like [] and {} are everywhere in Python. It's super common to see functions start by creating an empty container, filling it up, and then returning the result.

Take this, for example:

def my_func(ys: dict[str, int]):
x = {}
for k, v in ys.items():
if some_condition(k):
x.setdefault("group0", []).append((k, v))
else:
x.setdefault("group1", []).append((k, v))
return x

This seemingly innocent coding pattern poses an interesting challenge for Python type checkers. Normally, when a type checker sees x = y without a type hint, it can just look at y to figure out x's type. The problem is, when y is an empty container (like x = {} above), the checker knows it's a list or a dict, but has no clue what's going inside.

The big question is: How is the type checker supposed to analyze the rest of the function without knowing x's type?