# 🌞 Google Summer of Code 2026

## About Google Summer of Code

The [Google Summer of Code](https://summerofcode.withgoogle.com) is a global,
online program focused on bringing new contributors into open source software
development.  Contributors work with an open source organization on a 12+
week-long program under the guidance of mentors.

## Project Proposal Guidelines

Contributors/applicants are responsible for writing a proposal and submitting
it to Google before the application deadline.

The Project Ideas section below offers starting points.  We welcome
proposals that are variations of these ideas as well as entirely new
project ideas. Please reach out to us on the development mailing list or
Matrix room to discuss project proposals.

To help the projects run smoothly, your proposal must contain a clear
statement on whether you used LLM-based AI tools (ChatGPT, Gemini, etc.)
to help you write it. Include exactly one of the following statements in
your proposal:

`This proposal was written with the assistance of [ChatGPT/Gemini/etc] to [check spelling / check accuracy / format text].`

or:

`This proposal was written without the use of any AI tools.`

If your proposal does not contain either of these statements, your proposal
will not be considered.

Using AI tools to refine a project proposal is allowed, but please keep
the proposed work within what you are able to develop yourself.

Please **do not submit a proposal completely generated by AI** since it's
just a waste of our time.

## Project Ideas

(multi_threaded_decompression)=

### Multi-threaded Decompression Support in fsck.erofs

Proposed mentors: Yifan Zhao ([@SToPire](https://github.com/SToPire)), Chunhai Guo ([@speedan1](https://github.com/speedan1)), Gao Xiang ([@hsiangkao](https://github.com/hsiangkao))  
Languages: C  
Estimated project length: 350 hours  
Difficulty: hard  
Skills:  
 - Proficiency in C programming;
 - Experience with multi-threaded programming;
 - Experience with file system concepts and operations.

**Description**

EROFS is designed for modern high-performance immutable use cases and is
widely used in OS images such as Android system images and container images.
Its userspace tool, fsck.erofs, is critical for filesystem integrity checking,
image unpacking, and regression testing.

Currently fsck.erofs is strictly single-threaded, so it unpacks files
one by one, even though:

 - EROFS compression units (e.g., pclusters) are largely independent;
 - Modern computer systems provide abundant CPU parallelism;
 - Unpacking workloads for compressed files are often CPU-bound rather than
   I/O-bound.

For large images, extraction time scales poorly, which negatively impacts
image installation, CI pipelines and developer workflows.  Introducing
multi-threaded decompression would therefore unlock significant
performance improvements for these use cases.

The primary goal is to design and implement a new multi-threaded
decompression pipeline for fsck.erofs that is efficient, safe and
maintainable.

Specific goals:

- Parallelize decompression at different levels (compressed clusters,
  inodes, etc.);
- Outperform the single-threaded approach across all supported
  algorithms (LZ4, LZMA, DEFLATE and Zstandard);
- Maintain correctness and stability;
- Minimize memory overhead and contention;
- Greatly improve the unpacking performance of metadata-compressed
  filesystems as well;
- Add tests that cover this new feature (e.g. stress tests).
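
As a rough illustration of cluster-level parallelism, the sketch below lets a few POSIX threads claim independent work units from a shared atomic counter. It is not based on the actual fsck.erofs code: `struct work_unit`, `decode_all()` and the toy run-length decoder `toy_rle_decode()` are all hypothetical stand-ins, and a real pipeline would also have to deal with I/O, ordering of writes and memory limits.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define MAX_THREADS 16

/* Hypothetical work unit: one independently decodable chunk. */
struct work_unit {
	const unsigned char *in;	/* encoded input: (count, byte) pairs */
	size_t in_len;
	unsigned char *out;		/* private output buffer */
	size_t out_cap;
	size_t out_len;			/* set by the worker on success */
	int status;			/* 0 on success, -1 on failure */
};

struct work_queue {
	struct work_unit *units;
	size_t nr_units;
	atomic_size_t next;		/* index of the next unclaimed unit */
};

/* Toy stand-in for a real decompressor (assumption, not an EROFS format). */
static int toy_rle_decode(struct work_unit *u)
{
	size_t o = 0;

	for (size_t i = 0; i + 1 < u->in_len; i += 2) {
		size_t run = u->in[i];

		if (run > u->out_cap - o)
			return -1;	/* output buffer too small */
		memset(u->out + o, u->in[i + 1], run);
		o += run;
	}
	u->out_len = o;
	return 0;
}

/* Each worker repeatedly claims one unit until none are left. */
static void *worker(void *arg)
{
	struct work_queue *q = arg;

	for (;;) {
		size_t i = atomic_fetch_add(&q->next, 1);

		if (i >= q->nr_units)
			break;
		q->units[i].status = toy_rle_decode(&q->units[i]);
	}
	return NULL;
}

/* Decode all units with up to nr_threads threads; returns 0 if all succeed. */
static int decode_all(struct work_unit *units, size_t nr_units,
		      unsigned int nr_threads)
{
	struct work_queue q = { .units = units, .nr_units = nr_units };
	pthread_t tids[MAX_THREADS];
	unsigned int started = 0;
	int ret = 0;

	assert(nr_threads >= 1);
	if (nr_threads > MAX_THREADS)
		nr_threads = MAX_THREADS;
	atomic_init(&q.next, 0);

	for (; started < nr_threads; started++)
		if (pthread_create(&tids[started], NULL, worker, &q))
			break;
	if (!started)
		worker(&q);		/* no thread available: run inline */
	for (unsigned int t = 0; t < started; t++)
		pthread_join(tids[t], NULL);

	for (size_t i = 0; i < nr_units; i++)
		if (units[i].status)
			ret = -1;
	return ret;
}
```

Because every unit owns its output buffer, the only shared state is the atomic counter, which keeps contention low. Inode-level parallelism or a bounded producer/consumer queue would be alternative designs worth comparing in a proposal.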

### Support generating filesystems from manifests with mkfs.erofs

Proposed mentors: Chengyu Zhu ([@ChengyuZhu6](https://github.com/ChengyuZhu6)), Gao Xiang  
Languages: C  
Estimated project length: 175 hours  
Difficulty: medium  
Skills:  
 - Proficiency in C programming;
 - Experience with file system concepts and operations.

**Description**

For local builds, mkfs.erofs currently supports generating images only from
source directories or tarballs. However, building from source directories has
the following disadvantages:

 - Limited inode metadata control;
 - Requires privileges for special inodes;
 - No explicit data dump ordering control;
 - Slower for very large filesystem trees.

The primary goal is to implement the functionality to generate EROFS images
from manifest files.  Note that manifest formats are highly fragmented, so you
should implement support for at least two common manifest formats, for example:

 - [composefs-dump(5)](https://manpages.debian.org/testing/composefs/composefs-dump.5.en.html);
 - Modified [Unix proto files](https://www.ibm.com/docs/en/aix/7.3.0?topic=m-mkproto-command);
 - BSD [mtree(5)](https://man.freebsd.org/cgi/man.cgi?query=mtree&sektion=5&format=html).

Dedicated test cases should be added to ensure its correctness.
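
To give a feel for the parsing side, here is a minimal sketch that reads one line of a made-up, simplified manifest format: `<path> <octal mode> <uid> <gid> <size>`. This format is an assumption for illustration only and is not composefs-dump, proto or mtree syntax; `struct manifest_entry` and `parse_manifest_line()` are hypothetical names, and paths containing whitespace are deliberately not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory form of one manifest line. */
struct manifest_entry {
	char path[256];
	unsigned int mode;		/* file type and permission bits */
	unsigned int uid, gid;
	unsigned long long size;
};

/*
 * Parse "<path> <octal mode> <uid> <gid> <size>".
 * Returns 0 on success, -1 if the line is malformed.
 */
static int parse_manifest_line(const char *line, struct manifest_entry *e)
{
	int consumed = 0;

	assert(line && e);
	if (sscanf(line, "%255s %o %u %u %llu %n", e->path, &e->mode,
		   &e->uid, &e->gid, &e->size, &consumed) != 5)
		return -1;
	if (line[consumed] != '\0')
		return -1;		/* unexpected trailing fields */
	if (e->path[0] != '/')
		return -1;		/* only absolute paths are accepted */
	return 0;
}
```

For example, `parse_manifest_line("/etc/passwd 100644 0 0 1234\n", &e)` succeeds and fills in `e.mode` with octal `100644`, while a line with missing fields is rejected. A real implementation would additionally need escaping rules, xattrs, device numbers and timestamps as defined by each supported format.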

### Complete Filesystem Feature Support for erofs-rs

Proposed mentors: [@Dreamacro](https://github.com/Dreamacro), Gao Xiang  
Languages: Rust  
Estimated project length: 350 hours  
Difficulty: hard  
Skills:  
 - Proficiency in Rust programming;
 - Experience with file system concepts and operations.

**Description**

EROFS is a high-performance, space-efficient read-only filesystem widely used
in the Linux ecosystem.

Currently, there is no mature, full-featured EROFS implementation in pure Rust.
However, [erofs-rs](https://github.com/Dreamacro/erofs-rs) has the following
features that make it particularly appealing:

 - `no_std` support for embedded systems;

 - Zero-copy parsing via mmap (std) or byte slices (no_std).

That said, it currently includes only a reader, and even this reader lacks
the following basic features:

 - Support for extended attributes (xattrs);
 - Support for multiple devices;
 - Support for two compression layouts;
 - Support for 48-bit addressing layouts;
 - Support for metadata compression.

The primary goal is to complete the erofs-rs feature set, including the missing
reader features listed above.  Implementing a writer with an efficient block
allocator is a plus.

(porting-erofs-to-freebsd)=

### Porting EROFS to BSD Kernels (FreeBSD Focus)

Proposed mentors: Gao Xiang, Hongbo Li ([@hb-lee](https://github.com/hb-lee))  
Languages: C  
Estimated project length: 350 hours  
Difficulty: hard  
Skills:  
 - Proficiency in C programming;
 - Experience with file system concepts and operations.

**Description**

EROFS is a high-performance, space-efficient read-only filesystem widely
used in the Linux ecosystem, particularly for immutable infrastructure,
containers, and embedded systems.

Today, EROFS is heavily used in:

 - Android system images;
 - Container images and snapshotters (containerd, gVisor, Kata containers);
 - Immutable OS (AWS Bottlerocket) and cloud infrastructure.

However, BSD kernels currently lack a native EROFS implementation, forcing
interested users to rely on alternatives. Porting EROFS to BSDs would:

 - Enable direct mounting of native EROFS images;
 - Improve interoperability between Linux and BSD systems;
 - Provide FreeBSD with a modern high-performance immutable filesystem
   optimized for container and immutable OS workloads.

The primary goal is to implement EROFS filesystem support in the FreeBSD
kernel, following FreeBSD VFS conventions while remaining compatible with
the standard EROFS on-disk format.

Key objectives:

 - Implement a FreeBSD kernel EROFS filesystem driver;
 - Support mounting and reading standard EROFS images;
 - Integrate EROFS with FreeBSD’s VFS, buffer cache, and VM systems;
 - Validate correctness and performance using real-world workloads;
 - Lay groundwork for future BSD ports (OpenBSD, NetBSD).
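
One of the first things any port has to do at mount time is recognize an EROFS image. The sketch below checks the little-endian superblock magic at byte offset 1024, using the values of `EROFS_SUPER_OFFSET` and `EROFS_SUPER_MAGIC_V1` as defined in the Linux on-disk format header. The helper names `read_le32()` and `looks_like_erofs()` are hypothetical, and the code is plain userspace C rather than FreeBSD kernel code; a real driver would go on to validate the checksum, block size, feature flags and more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EROFS_SUPER_OFFSET	1024		/* superblock byte offset */
#define EROFS_SUPER_MAGIC_V1	0xE0F5E1E2U	/* first superblock field */

/* Read a little-endian 32-bit value regardless of host endianness. */
static uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 if the buffer carries the EROFS superblock magic, else 0. */
static int looks_like_erofs(const unsigned char *img, size_t len)
{
	assert(img);
	if (len < EROFS_SUPER_OFFSET + 4)
		return 0;	/* too short to contain the magic */
	return read_le32(img + EROFS_SUPER_OFFSET) == EROFS_SUPER_MAGIC_V1;
}
```

Reading the field byte by byte avoids both alignment and endianness assumptions, which matters when the same on-disk format has to be read on every architecture the target kernel supports.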

### Advanced Fuzzing and Image Injection for the Kernel and erofs-utils

Proposed mentors: Yifan Zhao, Hongbo Li, Gao Xiang  
Languages: C, Go and/or Rust  
Estimated project length: 175 hours  
Difficulty: medium  
Skills:  
 - Proficiency in C programming;
 - Experience with file system concepts and operations;
 - Familiarity with fuzzing frameworks (e.g., AFL++, libFuzzer) is a plus.

**Description**

EROFS aims to be a secure, immutable image-based kernel filesystem by design.
Because its on-disk format contains less redundant metadata and is designed
to tolerate bogus or corrupted values, EROFS behaves differently from generic
writable filesystems. In addition, its immutable design means that all writable
data is copied up (aka copy-on-write) into another local trusted filesystem.
This makes it safer than writing directly to an untrusted and potentially
inconsistent generic writable filesystem.

We pay particular attention to the EROFS core on-disk format. Although the
format design is simple and the implementation (especially for the core format)
is straightforward, it is highly beneficial to develop more advanced tools
alongside the current syzkaller coverage and the existing erofs-utils fuzzer.
These tools will help keep the codebase robust and let us catch and fix
human-introduced bugs promptly.

The main goal is to implement an advanced fuzzing tool and an image injection
tool. These tools may be easier to implement using go-erofs (Go) or erofs-rs
(Rust), for example. We also intend to enable a new GitHub Actions CI workflow
to perform periodic fuzzing.

This will help us keep the kernel and erofs-utils implementations in
better shape.
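
The two ideas can be sketched together in a few lines of C. `LLVMFuzzerTestOneInput()` is the standard libFuzzer entry point; everything else here is hypothetical: `inject_le32()` stands in for an image injection primitive that overwrites one field, and `toy_parse()` is a toy parser for an invented format (a 32-bit length followed by that many bytes), not EROFS parsing code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical injection helper: overwrite a little-endian u32 at `off`. */
static int inject_le32(unsigned char *img, size_t len, size_t off,
		       uint32_t val)
{
	assert(img);
	if (off > len || len - off < 4)
		return -1;	/* field would fall outside the image */
	img[off] = val & 0xff;
	img[off + 1] = (val >> 8) & 0xff;
	img[off + 2] = (val >> 16) & 0xff;
	img[off + 3] = (val >> 24) & 0xff;
	return 0;
}

/*
 * Toy stand-in for the code under test: a little-endian u32 payload
 * length followed by the payload. Returns 0 if consistent, -1 otherwise,
 * and must never read out of bounds whatever the input is.
 */
static int toy_parse(const unsigned char *data, size_t size)
{
	uint32_t count;

	if (size < 4)
		return -1;
	count = (uint32_t)data[0] | ((uint32_t)data[1] << 8) |
		((uint32_t)data[2] << 16) | ((uint32_t)data[3] << 24);
	if (count > size - 4)
		return -1;	/* length field points past the image */
	return 0;
}

/* libFuzzer entry point: feed arbitrary bytes to the parser. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
	toy_parse(data, size);
	return 0;
}
```

Injecting `0xFFFFFFFF` into the length field of a valid toy image makes `toy_parse()` reject it, which is the kind of targeted corruption an injection tool could apply to specific on-disk fields, complementing the random mutations a fuzzer produces.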
