
[Source Code Analysis] MappedByteBuffer in the Java NIO Package

news 2025/7/1 11:02:31

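Contents

1. Introduction
2. MappedByteBuffer
3. Example
4. Fields
5. Constructors
6. mappingOffset, mappingAddress, mappingLength
7. isLoaded: checking whether the buffer's content is resident in physical memory
8. load: loading the ByteBuffer into the page cache
9. force: flushing to disk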

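1. Introduction

The previous article walked through the source code of HeapByteBuffer. This article covers MappedByteBuffer, which is the parent class of DirectByteBuffer.

Earlier articles in this series:
[Source Code Analysis] Buffer in the Java NIO Package
[Source Code Analysis] ByteBuffer in the Java NIO Package
[Source Code Analysis] HeapByteBuffer in the Java NIO Package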

2. MappedByteBuffer

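MappedByteBuffer is a subclass of ByteBuffer. It represents a direct byte buffer whose content is a memory-mapped region of a file. Once the mapping exists, a program reads and writes the file's content simply by reading and writing the buffer, without issuing a read or write call on a traditional I/O stream or channel for each access.

Compared with traditional file I/O such as FileInputStream and FileOutputStream, the program operates directly on data in memory, and the operating system takes care of writing the in-memory changes back to the file on disk.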

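A MappedByteBuffer is created with the map method of FileChannel. One of three modes can be chosen at creation time:

MapMode.READ_ONLY: read-only mode. The mapped buffer is read-only.
MapMode.READ_WRITE: read/write mode. The mapped buffer can be read and written, and changes made to the buffer are propagated to the file.
MapMode.PRIVATE: private mode. The mapped buffer is writable, but changes are not propagated to the file; a private copy of the modified data is created instead.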
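The difference between the three modes is easiest to see by writing through each kind of mapping and then checking what the file contains. The following is a minimal sketch, not taken from the article: the class name MapModeDemo, the helper firstFileByteAfterPut, and the temporary file are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.ReadOnlyBufferException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the JDK or of the article.
public class MapModeDemo {

    /**
     * Maps a small temporary file in the given mode, tries to overwrite its
     * first byte with 'X', and returns the first byte stored in the file afterwards.
     */
    static char firstFileByteAfterPut(MapMode mode) {
        try {
            Path tmp = Files.createTempFile("map-mode-demo", ".txt");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, "hello".getBytes(StandardCharsets.US_ASCII));

            // READ_WRITE and PRIVATE both require a channel opened for reading and writing.
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer buf = ch.map(mode, 0, ch.size());
                try {
                    buf.put(0, (byte) 'X');
                } catch (ReadOnlyBufferException e) {
                    // Expected for READ_ONLY: the buffer rejects every write.
                }
                if (mode == MapMode.READ_WRITE) {
                    buf.force(); // ask the OS to write the modified page back now
                }
            }
            return (char) Files.readAllBytes(tmp)[0];
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("READ_ONLY  -> " + firstFileByteAfterPut(MapMode.READ_ONLY));
        System.out.println("READ_WRITE -> " + firstFileByteAfterPut(MapMode.READ_WRITE));
        System.out.println("PRIVATE    -> " + firstFileByteAfterPut(MapMode.PRIVATE));
    }
}
```

Only the READ_WRITE run changes the file, so it reports X while the other two report the original h. In the PRIVATE run the put succeeds, but the change stays in the process's private copy of the page.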

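Because a MappedByteBuffer uses memory outside the Java heap, trying to map a very large file can fail for lack of memory (OutOfMemoryError): how much can be mapped is bounded by the operating system and by the address space and memory available to the process. In addition, a single call to FileChannel.map cannot map more than Integer.MAX_VALUE bytes, so a file larger than about 2 GB has to be mapped in several pieces.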
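One common way to stay within such limits is to map a file window by window instead of all at once. The sketch below illustrates the idea on a tiny file; the class ChunkedMapping, its methods sumBytes and sumOfSample, and the window sizes are hypothetical names and values chosen for this example, not something the article prescribes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class for illustration only.
public class ChunkedMapping {

    /** Sums all bytes of a file, mapping at most windowSize bytes at a time. */
    static long sumBytes(Path file, int windowSize) {
        long sum = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += windowSize) {
                // The last window may be shorter than windowSize.
                long len = Math.min(windowSize, size - pos);
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (buf.hasRemaining()) {
                    sum += buf.get();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }

    /** Writes the bytes 1..10 to a temporary file and sums them window by window. */
    static long sumOfSample(int windowSize) {
        try {
            Path tmp = Files.createTempFile("chunked-demo", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
            return sumBytes(tmp, windowSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSample(4)); // three windows of 4, 4 and 2 bytes; prints 55
    }
}
```

For a real multi-gigabyte file the window would be far larger, for example a few hundred megabytes. Each mapping stays valid until its buffer is garbage-collected, so the windows should not be kept reachable longer than needed.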

To sum up: MappedByteBuffer is a good fit when a large file has to be read or written frequently, or when a file needs to be accessed randomly.

3. Example

First, we generate a 1 GB file.

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class FileTest {

    public static void main(String[] args) {
        // Path of the file to generate
        String filePath = "D:\\学习资料\\计算机编程语言java学习\\后台\\JDK源码\\jdk1.8Source\\src\\test\\file\\hello.txt";
        long fileSizeInBytes = 1024L * 1024 * 1024; // 1 GB
        try {
            generateFile(filePath, fileSizeInBytes);
            System.out.println("File generated. Path: " + filePath);
        } catch (IOException e) {
            System.err.println("File generation failed: " + e.getMessage());
        }
    }

    /**
     * Generates a file of the given size, filled with repetitions of "helloWorld".
     *
     * @param filePath        path of the file
     * @param fileSizeInBytes size of the file in bytes
     * @throws IOException if writing fails
     */
    public static void generateFile(String filePath, long fileSizeInBytes) throws IOException {
        // Bytes of "helloWorld"
        byte[] content = "helloWorld".getBytes();
        int contentLength = content.length;
        try (FileOutputStream fos = new FileOutputStream(filePath);
             BufferedOutputStream bos = new BufferedOutputStream(fos)) {
            // Number of complete "helloWorld" writes
            long len = fileSizeInBytes / contentLength;
            // Each iteration writes the "helloWorld" bytes once
            for (long i = 0; i < len; i++) {
                bos.write(content);
            }
            // Remaining bytes that do not fill a complete "helloWorld"
            long remainingBytes = fileSizeInBytes % contentLength;
            if (remainingBytes > 0) {
                bos.write(content, 0, (int) remainingBytes);
            }
        }
    }
}

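After the program has run, hello.txt exists at the path above.
[Figure: the generated hello.txt file]
With the 1 GB file in place, we can compare how long traditional I/O and MappedByteBuffer take to read it.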

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedByteBufferPerformance {

    public static void main(String[] args) throws Exception {
        // Path of the generated file
        String filePath = "D:\\学习资料\\计算机编程语言java学习\\后台\\JDK源码\\jdk1.8Source\\src\\test\\file\\hello.txt";
        long fileSize = 1024L * 1024 * 1024; // 1 GB

        long startTime = System.currentTimeMillis();
        try (RandomAccessFile file = new RandomAccessFile(filePath, "r");
             FileChannel channel = file.getChannel()) {
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, fileSize);
            // Read the file content
            while (buffer.hasRemaining()) {
                buffer.get(); // read one byte
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("MappedByteBuffer read time: " + (endTime - startTime) + " ms");
    }
}

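The code above reads the file through MappedByteBuffer and took 317 ms in total.
[Figure: console output showing the MappedByteBuffer read time]

Next, we measure how long traditional I/O takes to read the same file.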

import java.io.BufferedInputStream;
import java.io.FileInputStream;

public class BufferedIOPerformance {

    public static void main(String[] args) throws Exception {
        // Path of the generated file
        String filePath = "D:\\学习资料\\计算机编程语言java学习\\后台\\JDK源码\\jdk1.8Source\\src\\test\\file\\hello.txt";

        long startTime = System.currentTimeMillis();
        try (FileInputStream fis = new FileInputStream(filePath);
             BufferedInputStream bis = new BufferedInputStream(fis)) {
            // Read the file content
            while (bis.read() != -1) {
                // read one byte
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Ordinary I/O read time: " + (endTime - startTime) + " ms");
    }
}

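The read time with ordinary I/O:
[Figure: console output showing the ordinary I/O read time]
The results of the two runs, both of which read the file one byte at a time:

Method	File size	Read time	Notes
MappedByteBuffer	1 GB	317 ms	Direct memory mapping, very efficient
BufferedInputStream	1 GB	19552 ms	Ordinary I/O with a buffer, much slower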

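4. Fields

MappedByteBuffer declares only one instance field, fd (plus a private static byte named unused, which is touched only by the load method). Everything else is inherited from its superclasses ByteBuffer and Buffer.

private final FileDescriptor fd; FileDescriptor is the Java object that represents an operating-system file descriptor, and it lets a Java program interact with the underlying file system. Put simply, here it identifies the file that is mapped into memory: it is set for a buffer that maps a file and null otherwise, which is exactly what checkMapped tests.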

5. Constructors

MappedByteBuffer(int mark, int pos, int lim, int cap, // package-private
                 FileDescriptor fd)
{
    super(mark, pos, lim, cap);
    this.fd = fd;
}

MappedByteBuffer(int mark, int pos, int lim, int cap) {
    super(mark, pos, lim, cap);
    this.fd = null;
}

The only difference between the two constructors is that one is given a file descriptor and the other is not.
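The constructor without a file descriptor exists because, as noted in the introduction, DirectByteBuffer extends MappedByteBuffer, so a direct buffer that has nothing to do with a file is still a MappedByteBuffer instance. The small sketch below checks this at run time. It assumes an OpenJDK-based runtime, where the class hierarchy is as the article describes; the class name DirectBufferIsMapped and its methods are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;

// Hypothetical demo class; relies on the OpenJDK class hierarchy described in the article.
public class DirectBufferIsMapped {

    /** A buffer from allocateDirect is a DirectByteBuffer, hence also a MappedByteBuffer. */
    static boolean directBufferIsMappedByteBuffer() {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        return direct instanceof MappedByteBuffer;
    }

    /** A heap buffer is a HeapByteBuffer, which is unrelated to MappedByteBuffer. */
    static boolean heapBufferIsMappedByteBuffer() {
        ByteBuffer heap = ByteBuffer.allocate(16);
        return heap instanceof MappedByteBuffer;
    }

    public static void main(String[] args) {
        System.out.println("allocateDirect: " + directBufferIsMappedByteBuffer());
        System.out.println("allocate:       " + heapBufferIsMappedByteBuffer());
    }
}
```

Such a direct buffer is built through the constructor that sets fd to null, which is how checkMapped can tell it apart from a buffer that really maps a file.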

6. mappingOffset, mappingAddress, mappingLength

private long mappingOffset() {
    // Page size
    int ps = Bits.pageSize();
    // Offset of the buffer's address within its page
    long offset = address % ps;
    // Make sure the result is non-negative
    return (offset >= 0) ? offset : (ps + offset);
}

private long mappingAddress(long mappingOffset) {
    // address is the start address of the buffer
    // mappingOffset is the offset computed above
    return address - mappingOffset;
}

private long mappingLength(long mappingOffset) {
    return (long)capacity() + mappingOffset;
}

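The first method, mappingOffset, returns the offset of the MappedByteBuffer's memory address from the start of the memory page that contains it. Bits.pageSize() returns the operating system's memory page size, typically 4 KB or 8 KB depending on the operating system. This offset is what allows the address to be aligned to a page boundary when working with the mapping.

The second method, mappingAddress, computes the start address of that memory page. Its mappingOffset argument is normally the value returned by the mappingOffset method above, so address - mappingOffset is the buffer's start address minus the offset.

The third method, mappingLength, returns the total length of the mapped region, that is, the region that mmap actually mapped.

Together these methods expose the address information of a MappedByteBuffer. Why is an offset needed at all? The operating system manages memory in units of one page, so memory obtained through mmap always starts at a page boundary.
[Figure: memory layout with mappingAddress at the page boundary, address inside the page, and mappingOffset as the distance between them]
The MappedByteBuffer we get back, however, does not necessarily start at a page boundary, for example when the position passed to FileChannel.map is not a multiple of the page size. In the figure, mappingAddress is the start of the page, while the start address stored in the MappedByteBuffer is address. Since the operating system hands out memory page by page, the buffer's start address can differ from the address that was actually mapped, and the difference is mappingOffset. In other words, mappingOffset is the distance from mappingAddress to address, mappingAddress is the start of the pages that the kernel actually mapped through mmap, and mappingLength is the amount of memory that mmap actually mapped.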
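A worked example with concrete numbers makes the relationship between the three values easier to follow. The sketch below re-implements the same arithmetic as static methods so that it can run outside the JDK. The class MappingMath, the explicit pageSize parameter, and the sample address are assumptions for illustration, and pageCount is a plain round-up division standing in for Bits.pageCount.

```java
// Hypothetical stand-alone re-implementation of the arithmetic, for illustration only.
public class MappingMath {

    /** Distance from the start of the containing page to the given address. */
    static long mappingOffset(long address, int pageSize) {
        long offset = address % pageSize;
        // The remainder is negative when address is negative as a signed long.
        return (offset >= 0) ? offset : (pageSize + offset);
    }

    /** Start of the page that contains the given address. */
    static long mappingAddress(long address, long mappingOffset) {
        return address - mappingOffset;
    }

    /** Length of the region from the page start to the end of the buffer. */
    static long mappingLength(long capacity, long mappingOffset) {
        return capacity + mappingOffset;
    }

    /** Number of pages needed to cover the given length (rounded up). */
    static int pageCount(long length, int pageSize) {
        return (int) ((length + pageSize - 1L) / pageSize);
    }

    public static void main(String[] args) {
        int pageSize = 4096;            // assumed page size
        long address = 0x10000L + 100;  // buffer starts 100 bytes into a page
        long capacity = 5000;           // buffer capacity in bytes

        long offset = mappingOffset(address, pageSize);
        System.out.println("mappingOffset  = " + offset);                              // 100
        System.out.println("mappingAddress = " + mappingAddress(address, offset));     // 65536
        System.out.println("mappingLength  = " + mappingLength(capacity, offset));     // 5100
        System.out.println("pageCount      = "
                + pageCount(mappingLength(capacity, offset), pageSize));               // 2
    }
}
```

The 5000-byte buffer alone would almost fit into one and a quarter pages, but because it starts 100 bytes into a page, the region that has to be considered is 5100 bytes long and spans two pages.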

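7. isLoaded: checking whether the buffer's content is resident in physical memory

public final boolean isLoaded() {
    // Check that this MappedByteBuffer is actually mapped to a file
    checkMapped();
    // If the start address is 0 or the capacity is 0
    if ((address == 0) || (capacity() == 0))
        // there is nothing to load, so the buffer is reported as loaded
        return true;
    // Distance from the mmap start address to address, i.e. mappingOffset in the figure
    long offset = mappingOffset();
    // Size of the region actually mapped, i.e. the amount of memory mmap mapped
    long length = mappingLength(offset);
    // mappingAddress(offset) is the actual start of the mapping
    // Bits.pageCount(length) is the number of pages in the mapping
    // Call the native method
    return isLoaded0(mappingAddress(offset), length, Bits.pageCount(length));
}

private native boolean isLoaded0(long address, long length, int pageCount);

A result of true means that the buffer's content is very likely resident in physical memory, so accessing the data will not trigger virtual-memory page faults or I/O operations.
A result of false does not necessarily mean that none of the content is resident in physical memory; it may be that only part of the data is not resident.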

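8. load: loading the ByteBuffer into the page cache

public final MappedByteBuffer load() {
    // Check whether the file descriptor is null
    checkMapped();
    if ((address == 0) || (capacity() == 0))
        return this;
    // Distance from the mmap start address to the MappedByteBuffer's address
    long offset = mappingOffset();
    // Length of the memory mapped by mmap
    long length = mappingLength(offset);
    // Native call that reads the MappedByteBuffer's content ahead into the page cache
    load0(mappingAddress(offset), length);

    // Read a byte from each page to bring it into memory. A checksum
    // is computed as we go along to prevent the compiler from otherwise
    // considering the loop as dead code.
    Unsafe unsafe = Unsafe.getUnsafe();
    // Size of one page
    int ps = Bits.pageSize();
    // Number of pages covered by this ByteBuffer
    int count = Bits.pageCount(length);
    // Start address of the mmap mapping
    long a = mappingAddress(offset);
    byte x = 0;
    // Walk over all pages from the mmap start address and touch one byte in each.
    // Touching a page that has no page-table entry yet triggers a page fault,
    // which maps the page-cache page into the MappedByteBuffer's address range.
    for (int i=0; i<count; i++) {
        x ^= unsafe.getByte(a);
        a += ps;
    }
    if (unused != 0)
        unused = x;
    return this;
}

This method loads the ByteBuffer's content into the page cache and then walks over every page, reading one byte from each. The walk is needed because loading data into the page cache does not by itself create the mapping from virtual memory to physical memory. Only when a page is accessed for the first time does the resulting page fault create the page-table entry that points at the physical page in the page cache.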
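From the caller's side, load and isLoaded are used as shown in the sketch below, which maps a small temporary file, prints the residency hint before and after load, and confirms that load returns the same buffer so that calls can be chained. The class LoadDemo, its method name, and the 64 KB file size are assumptions for this example. No expected output is given for the two isLoaded lines because the value is only a hint and depends on the platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class for illustration only.
public class LoadDemo {

    /** Maps a 64 KB temporary file, calls load(), and reports whether load() returned the same buffer. */
    static boolean loadReturnsSameBuffer() {
        try {
            Path tmp = Files.createTempFile("load-demo", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[64 * 1024]);

            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());

                // Only a hint: the answer may already be stale when it is returned.
                System.out.println("before load: isLoaded = " + buf.isLoaded());

                // Best-effort request to bring every page of the mapping into memory.
                MappedByteBuffer same = buf.load();

                System.out.println("after load:  isLoaded = " + buf.isLoaded());
                return same == buf;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("load() returned this: " + loadReturnsSameBuffer());
    }
}
```

Calling load up front moves the cost of the page faults to a known point in time, which can be useful before a latency-sensitive pass over the data. It also makes the call itself take longer for a large mapping.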

9. force: flushing to disk

public final MappedByteBuffer force() {
    checkMapped();
    if ((address != 0) && (capacity() != 0)) {
        // Core logic: flush the mapped content to disk, starting at the start address of the mmap mapping
        long offset = mappingOffset();
        force0(fd, mappingAddress(offset), mappingLength(offset));
    }
    return this;
}

private native void force0(FileDescriptor fd, long address, long length);
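A typical use of force is to write through a READ_WRITE mapping and then flush before relying on the data being in the file. The following is a minimal sketch under assumed names (the class ForceDemo, the method writeAndForce, and the temporary file are hypothetical): it pre-sizes a file, writes a string through the mapping, calls force, and reads the file back through the ordinary file API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class for illustration only.
public class ForceDemo {

    /** Writes text through a READ_WRITE mapping, forces it to disk, and returns the file content. */
    static String writeAndForce(String text) {
        try {
            byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
            Path tmp = Files.createTempFile("force-demo", ".txt");
            tmp.toFile().deleteOnExit();
            // Pre-size the file so that the mapping covers existing bytes only.
            Files.write(tmp, new byte[bytes.length]);

            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);
                buf.put(bytes); // modifies the mapped pages in memory
                buf.force();    // asks the OS to write those pages to the storage device
            }
            return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndForce("hello mmap")); // prints "hello mmap"
    }
}
```

Without the call to force, the operating system still writes the modified pages back eventually, but at a time of its own choosing. The Javadoc of force also notes that the guarantee applies only when the file resides on a local storage device.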