
A small build flag with a large question behind it

On 2 July 2026, Joey Hess, the long-time free software developer behind git-annex and a familiar name in Debian circles, published a note that sounds narrow at first and becomes larger the longer one looks at it. He had spent about 100 hours over the previous month making it possible for git-annex to build without dependencies that contain LLM-generated code. The work produced a build option called NoLLMDependencies, but the more important output was a map of a new kind of supply-chain pressure.

git-annex is a particularly revealing project for this problem. It is not a throwaway web experiment, and it is not a showcase for AI tooling. It is a long-lived tool for managing large files with Git, often used by people who care about reproducibility, offline workflows, archival habits and the ability to understand what their tools are doing years later. A dependency question that reaches git-annex is therefore not just about style. It touches the assumptions that make durable free software feel durable.

Hess says git-annex itself does not contain LLM-generated code and states that it will not. That is the easy boundary. The hard boundary is the dependency tree. A project can decide what enters its own repository, but it does not fully control the compilers, build tools, libraries and version ranges that make a release possible. The git-annex wiki page on “no llm code” lists known dependencies where LLM-generated code or commits have appeared, including GHC, ram, persistent, yesod, Cabal and Git. That list matters because it spans infrastructure, not merely optional application packages.

The immediate mechanism is pragmatic rather than theatrical. NoLLMDependencies lets git-annex be built with versions of dependencies that predate the code Hess identified. It is not the default path, and the project warns that it may become impossible to maintain. The warning is not cosmetic. If a security flaw is fixed only in a newer dependency, building with an older dependency may preserve a provenance preference while weakening security. That is the kind of trade-off serious maintainers recognize: one risk is reduced, another risk may grow.

The story spread quickly because it turned an abstract argument into a concrete engineering bill. The Hacker News discussion around Hess’s post was active and polarized. Some participants treated the stance as a reasonable hygiene measure for software freedom; others saw it as self-sabotage in a world where AI-assisted programming is becoming ordinary. The useful part of that disagreement is not the temperature. It is the fact that both sides are pointing at real constraints.

Why this is not just another quality complaint

Open source has always absorbed bad code, rushed patches, abandoned libraries and overconfident contributors. Skeptics of anti-LLM policies often begin there: if humans can write poor code, why single out machines? That question is fair, but it is incomplete. The git-annex case is not only about whether generated code passes tests today. It is about whether downstream users can understand the code’s origin, legal status, modification path and maintenance burden tomorrow.

The copyright question must be stated carefully. It is not accurate to say that AI-generated code definitely infringes copyright. The legal status and provenance of particular outputs are uncertain and disputed, and they vary by jurisdiction, model, prompt, training data, human editing and factual circumstances. For open-source distribution, uncertainty is itself operationally meaningful. A maintainer may be unwilling to accept code when they cannot confidently explain who authored it, what license chain supports it, and whether the contributor has the rights they claim to grant.

The software freedom question is related but not identical. Free software culture has long cared about the “preferred form of modification”, especially in GPL contexts. If the editable source is a human-maintained tree with meaningful review history, ordinary open-source practice fits that model. If a large patch is mainly the artifact of a hidden prompt, a proprietary model and unstored intermediate outputs, some maintainers will ask whether the repository still contains the real source of the change or merely the final artifact that happened to compile.

Hess’s examples point to the same unease. The git-annex page describes large LLM-generated changes that were reverted in a following release without clear explanation, an incoherent 1,489-line commit message accompanying about 10,000 lines of changes in a 26,000-line codebase, and a prompt that asked an LLM to copy code from another project but apparently avoided copyright trouble by luck rather than by a robust process. These are not proofs that all generated code is unusable. They are signs that normal review rituals can be overwhelmed by scale and opacity.

Even when code is technically correct, maintainability can be damaged by missing intent. A human contributor can answer why a design was chosen, can learn from review, can accept responsibility for a regression and can explain what was deliberately not implemented. An LLM output cannot be mentored, cannot remember the project’s philosophy, and cannot be held accountable. The human who submits the patch may provide that accountability, but only if they actually understand and own the result. That distinction is becoming central.

The dependency tree is where policy meets reality

A project-level ban is comparatively simple. A maintainer can tell contributors not to submit LLM-generated code, can reject suspicious patches, and can document the rule in contribution guidelines. A dependency-level ban is much harder because modern software is assembled through layered ecosystems. git-annex depends on the Haskell toolchain and libraries, on Git, on package metadata and on distribution practices. The moment an upstream project accepts generated code, every downstream project inherits a decision it did not make.

That is why the named dependencies in the git-annex wiki are important. GHC is not an obscure helper; it is the Glasgow Haskell Compiler. Cabal is part of Haskell’s build and package story. persistent and yesod sit in a Haskell web ecosystem. ram and its reverse dependencies represent a narrower but still real package path. Git is the foundational version-control system that git-annex extends. When provenance concerns reach components at this level, “just choose another library” stops being a serious answer.

The NoLLMDependencies flag is therefore best read as an experiment in measuring friction. It asks: what does it cost, in hours and upgrade constraints, to keep one serious project on a dependency path that avoids known LLM-generated code? Hess’s estimate of about 100 hours in one month is the first answer. The second answer is that the result is conditional. It depends on older versions staying buildable, on distributions packaging compatible versions, on vulnerabilities not forcing upgrades, and on upstreams not making the old path untenable.

There is also a detection problem. LLM use may be disclosed in a commit message, a pull request, a maintainer note or a generated-code marker. It may also be hidden, ambiguous or impossible to infer reliably. OpenSSF’s AI-SLOP discussion is partly about this asymmetry: generating low-quality reports or code is cheap, while reviewing and classifying the output is expensive. There is no dependable detector that turns provenance into a simple pass-fail test across a full dependency graph.

Package managers were not designed for this class of metadata. They can encode versions, checksums, maintainers, dependency bounds and sometimes license fields. The broader supply-chain world has SLSA, Sigstore, SBOMs, in-toto attestations and GitHub-style provenance work, but most ecosystems do not yet have a standard way to say whether a release contains AI-generated code, AI-assisted edits, AI-written documentation, AI-generated tests or AI-produced vulnerability reports. Without shared vocabulary, policy becomes local and brittle.

Institutions are moving, but not in unison

The Software Freedom Conservancy’s 2026 recommendations for LLM-backed generative AI in FOSS contributions are notable because they do not pretend one rule fits every project. SFC recognizes that some project leaders take a zero-tolerance approach to LLM-generated contributions and says the community should support projects that reject such systems. At the same time, the recommendations are framed as guidance for contributors who decide to use LLM-backed tools, not as a universal prohibition.

That dual posture reflects the current state of open source. Many maintainers want disclosure, review discipline and a way to refuse output they cannot trust. Many contributors already use Copilot, ChatGPT, Claude or similar systems for exploration, scaffolding, tests, refactoring suggestions or documentation. A policy that treats every autocomplete suggestion the same as a wholesale generated subsystem may be too blunt. A policy that treats undisclosed generated patches as ordinary human work may be too naive.

Debian’s debate shows the same difficulty at distribution scale. LWN’s coverage of Debian deciding not to decide on AI-generated contributions emphasized that Debian developers were not of one mind and had not converged on a shared definition of what counts as an AI-generated contribution. That is not bureaucratic failure so much as a sign that the category is messy. Is code AI-generated if an LLM wrote the first draft and a human rewrote half of it? What about a function suggested by autocomplete? What about AI-produced tests that shape the implementation?

OpenSSF’s AI-SLOP issue approaches the topic from a different direction: vulnerability disclosures and maintainer burden. The issue describes a high volume of low-quality AI-generated vulnerability reports and related contributions, with a DDoS-like effect on maintainers. It also references the curl experience: halfway through 2025, only about 5% of bug bounty submissions were genuine vulnerabilities, while around 20% appeared to be AI-generated slop; curl ended its bug bounty program in January 2026. The problem is not only code entering repositories. It is also attention being consumed before real review can begin.

Hacker News supplied the rough public argument in miniature. One camp argued that LLM output can be treated like output from a weak developer: review it, test it, reject it if it is bad. Another camp answered that weak human developers can be mentored, can explain themselves, and can become stronger maintainers; machine output changes the economics by making it easy to produce large, plausible diffs without comparable responsibility. Both claims can be true in different contexts, which is exactly why maintainers need policy rather than slogans.

A taxonomy maintainers can actually use

The first practical step is to separate categories that are often collapsed. AI-assisted code is not necessarily the same as AI-generated code. A developer may use an assistant to search documentation, ask for an explanation of an API, draft a test name or suggest a small refactor. AI-generated code usually implies that a material portion of the submitted code was produced by the system and then accepted by a human. AI-reviewed code means a model commented on or checked human-written code. AI-generated vulnerability reports are another category entirely.

Those distinctions should appear in contributor policy. A project can require disclosure when an LLM produced a material part of a patch, while allowing ordinary editor completion or documentation lookup. It can require the submitter to certify that they understand the change, have reviewed it line by line, and have the right to license it under the project’s terms. It can require prompts or generated outputs to be retained for large changes when provenance is relevant. It can also decide that certain areas, such as cryptography, parsers, sandboxing or release infrastructure, are stricter zones.
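The distinctions above can be sketched as a small data model. This is a minimal illustration under stated assumptions, not any project's actual policy: the `AIInvolvement` categories, the `Patch` fields, the `STRICT_AREAS` set, the 500-line threshold and the `review_requirements` function are all hypothetical names chosen to mirror the paragraph. In particular, the sketch simply declines generated code in strict areas; a real project might pick a different rule.

```python
from dataclasses import dataclass
from enum import Enum


class AIInvolvement(Enum):
    NONE = "none"
    ASSISTED = "assisted"    # documentation lookup, explanations, editor completion
    GENERATED = "generated"  # a material part of the patch came from an LLM


# Hypothetical stricter zones, borrowed from the examples in the text.
STRICT_AREAS = {"cryptography", "parsers", "sandboxing", "release-infrastructure"}


@dataclass
class Patch:
    area: str
    involvement: AIInvolvement
    disclosed: bool = False
    lines_changed: int = 0
    prompts_retained: bool = False


def review_requirements(patch: Patch, large_change: int = 500) -> list[str]:
    """List the extra obligations a patch triggers under this sketch policy."""
    if patch.involvement is not AIInvolvement.GENERATED:
        return []  # ordinary completion or lookup needs no disclosure here
    if patch.area in STRICT_AREAS:
        return ["decline: generated code is not accepted in this area"]
    requirements = ["certify understanding, line-by-line review and licensing rights"]
    if not patch.disclosed:
        requirements.append("disclose LLM use")
    if patch.lines_changed >= large_change and not patch.prompts_retained:
        requirements.append("retain prompts and generated outputs")
    return requirements


if __name__ == "__main__":
    patch = Patch(area="importer", involvement=AIInvolvement.GENERATED, lines_changed=1200)
    for item in review_requirements(patch):
        print("-", item)
```

The point of writing the rule down this way is less the code than the forced precision: every branch corresponds to a question the written policy has to answer anyway.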

The policy should define what happens when disclosure is missing. Immediate rejection may be appropriate for some projects; a request for clarification may fit others. What matters is consistency. Maintainers already spend emotional and technical energy explaining decisions. If AI-related rules are improvised in every pull request, the policy itself becomes another source of conflict. A short, explicit rule is often better than a perfect taxonomy that nobody can apply.

Large generated patches deserve special handling. The review burden grows faster than the value of the contribution when a patch arrives with thousands of lines, weak commit history and little design explanation. Maintainers can ask for smaller changes, human-written rationale, tests tied to requirements and evidence that the contributor can maintain the code after merge. That is not anti-tool. It is ordinary maintainership applied to a world where producing bulk code has become cheap.

Projects should also decide how to treat dependencies. A full audit of every upstream commit is unrealistic for most teams. But critical projects can track explicit upstream AI policies, record known provenance concerns, and avoid pretending the issue is invisible. Downstream packagers can add notes when a package intentionally pins an older dependency for provenance reasons, and they should document the security consequences of doing so. Users deserve to know whether a choice was made for license, security, reproducibility or AI-provenance reasons.
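One way a packager might record such a note is sketched below. No ecosystem mentioned here defines this format; the `PinNote` record, its field names, the list of reasons and the `examplelib` package are hypothetical, and the only rule the sketch enforces is the one from the paragraph above: a provenance pin must state its security consequences.

```python
from dataclasses import dataclass, field

# Reasons named in the text; the identifiers themselves are an assumption.
ALLOWED_REASONS = {"license", "security", "reproducibility", "ai-provenance"}


@dataclass
class PinNote:
    package: str
    pinned_version: str
    reason: str
    security_consequences: list[str] = field(default_factory=list)


def render_note(note: PinNote) -> str:
    """Format a pin note for users, refusing undocumented provenance pins."""
    if note.reason not in ALLOWED_REASONS:
        raise ValueError(f"unknown reason: {note.reason}")
    if note.reason == "ai-provenance" and not note.security_consequences:
        raise ValueError("a provenance pin must document its security consequences")
    lines = [f"{note.package} pinned at {note.pinned_version} (reason: {note.reason})"]
    lines += [f"  consequence: {item}" for item in note.security_consequences]
    return "\n".join(lines)


if __name__ == "__main__":
    note = PinNote(
        package="examplelib",  # hypothetical package
        pinned_version="1.4.2",
        reason="ai-provenance",
        security_consequences=["fixes released after 1.4.2 are not included"],
    )
    print(render_note(note))
```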

What users and downstreams should take from git-annex

For users, the wrong lesson is panic pinning. Freezing dependencies indefinitely because a newer version may include generated code can create serious security and compatibility problems. The git-annex page itself acknowledges that the NoLLMDependencies path builds against older versions and may therefore miss security fixes that exist only in newer upstream releases. A provenance policy that ignores vulnerabilities is not a complete trust strategy.
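The trade-off can be made concrete with a small check that compares pinned versions against the versions in which known flaws were fixed. This is a sketch only: the package names, advisory identifiers and dictionary layout are invented for illustration, it is not how git-annex or any advisory database works, and it assumes purely numeric dotted version strings.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.4.2' into (1, 4, 2); assumes numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def pins_missing_fixes(pins: dict[str, str], advisories: list[dict]) -> list[str]:
    """Report advisories whose fix is newer than the pinned version."""
    missed = []
    for advisory in advisories:
        pinned = pins.get(advisory["package"])
        if pinned is None:
            continue  # package is not pinned, so nothing to report
        if parse_version(pinned) < parse_version(advisory["fixed_in"]):
            missed.append(
                f'{advisory["id"]}: {advisory["package"]} {pinned} < {advisory["fixed_in"]}'
            )
    return missed


if __name__ == "__main__":
    # Hypothetical pins and advisories, for illustration only.
    pins = {"examplelib": "1.4.2", "otherlib": "2.0.0"}
    advisories = [
        {"id": "EXAMPLE-0001", "package": "examplelib", "fixed_in": "1.5.0"},
        {"id": "EXAMPLE-0002", "package": "otherlib", "fixed_in": "1.9.3"},
    ]
    for line in pins_missing_fixes(pins, advisories):
        print(line)
```

A non-empty result does not settle the question by itself. It only shows where a provenance preference and a security fix point in opposite directions, which is the decision a maintainer then has to make explicitly.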

The better lesson is to ask more precise questions. Does the project have a contribution policy for LLM-generated code? Does it require disclosure? Does it distinguish generated code from AI-assisted editing? Does it have maintainers who understand the merged code? Does it keep enough design history for future modification? Does it participate in normal supply-chain practices such as signed releases, reproducible builds where feasible, SBOMs or attestations? AI provenance becomes one risk dimension among license clarity, security response, maintainer health and release discipline.

Distributions and package repositories face a more systemic version of the same problem. Debian, NixOS, Fedora, Homebrew, Hackage, npm and PyPI cannot reasonably become courts of origin for every line of code. They can, however, encourage disclosure norms, preserve upstream metadata, make policy debates visible and avoid collapsing legal uncertainty, quality concerns and maintainability concerns into a single emotional label. The strongest ecosystems will likely treat provenance as metadata and governance, not as rumor.

For companies using open source, the git-annex case is a reminder that software composition analysis is expanding. Traditional scans look for known vulnerabilities, declared licenses and sometimes abandoned packages. AI-era review may need to track contribution policies, attestations, generated-code declarations and the risk of massive machine-written patches in critical components. That does not mean every AI-touched package is unsafe. It means procurement and security teams need better questions than “was AI used?”

For contributors, the practical advice is simple: do not make maintainers guess. If an LLM produced a meaningful part of a patch, say so where the project asks you to say so. Review the code as if you will maintain it. Avoid submitting giant generated diffs. Do not ask a model to copy from another project unless the license and attribution path are clear. Treat generated output as a draft that requires ownership, not as a shortcut around responsibility.

The early warning

git-annex is an early warning case because it makes the cost visible. One maintainer tried to keep a serious project buildable without known LLM-generated dependencies and spent about 100 hours in a month doing it. The outcome was not a clean separation from the rest of the ecosystem. It was a conditional flag, a list of concerning dependencies, and a warning that security and compatibility may eventually force hard choices.

That is why the story should not be reduced to “AI bad” or “maintainer overreacts”. The valuable signal is that open source now has to reason about authorship and preferred modification form at dependency-graph scale. The old questions remain: is the code licensed, secure, maintained and compatible? A new question has joined them: can we explain where this code came from and who is accountable for changing it later?

Open source does not need a single global answer by next week. It does need clearer project policies, better metadata, contributor honesty and institutional patience. The git-annex experiment shows that provenance cannot be solved by vibes, and it cannot be ignored just because auditing is hard. In the AI-code era, trust will depend less on whether a project claims purity and more on whether it can make its trade-offs explicit.