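I recently had to estimate how much memory a Java list of 100,000 entries occupies once it is loaded into memory, and what that means for server memory usage. Until now I had only read about the Java memory layout in books, so I took this as a chance to dig into how much memory Java objects actually occupy, measure it, and deepen my understanding of the layout.
A quick recap of the Java object memory layout: an object header (Header), instance data (Instance Data), and alignment padding (Padding). Object sizes can differ between environments. The experiments here run on a 64-bit HotSpot VM with compressed pointers enabled by default (-XX:+UseCompressedOops). As Figure 1 shows, the header of an ordinary object instance is therefore 12 bytes (8-byte markOop + 4-byte klassOop), and the header of an array instance is 16 bytes (8-byte markOop + 4-byte klassOop + 4-byte length). The system is 64-bit Linux and HotSpot aligns objects on 8-byte boundaries by default, so every object size is rounded up to a multiple of 8. The environment: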
xxx:~$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
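The sizes in this article depend on the compressed-pointer and alignment settings, so it can be worth confirming them on your own JVM. The following is a minimal sketch, assuming a HotSpot JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name VmLayoutFlags and the helper flag are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class VmLayoutFlags {
    // Returns the current value of a HotSpot flag as a string,
    // or "unavailable" if this JVM does not expose the flag.
    static String flag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Unknown flag, or a JVM without the HotSpot diagnostic bean.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + flag("UseCompressedOops"));
        System.out.println("ObjectAlignmentInBytes = " + flag("ObjectAlignmentInBytes"));
    }
}
```

If the first line prints true and the second prints 8, the 12-byte and 16-byte header figures used in this article should apply.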

Object sizes in this article are measured with org.apache.lucene.util.RamUsageEstimator#sizeOf(java.lang.Object), which counts the object itself plus every object it references. The corresponding dependency:
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>4.2.0</version>
</dependency>
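Primitive types
Articles that introduce the primitive types usually list their storage sizes in a table like the one below. Does a long held as an object on the heap also take just 8 bytes? Given the object layout in Figure 1, it must take more than 8 bytes.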
| Primitive Type | Memory Required(bytes) |
|---|---|
| byte, boolean | 1 |
| short, char | 2 |
| int, float | 4 |
| long, double | 8 |
The example below measures primitive values as objects. RamUsageEstimator.sizeOf takes an Object, so each primitive argument is autoboxed into its wrapper before it is measured.
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;

public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        boolean bool = true;
        byte b = (byte)0xFF;
        short s = (short)1;
        char c = 'c';
        int i = 1;
        float f = 1.0f;
        long l = 1L;
        double d = 1.0;
        System.out.printf("sizeOf(byte) = %s bytes\n", RamUsageEstimator.sizeOf(b));
        System.out.printf("sizeOf(boolean) = %s bytes\n", RamUsageEstimator.sizeOf(bool));
        System.out.printf("sizeOf(short) = %s bytes\n", RamUsageEstimator.sizeOf(s));
        System.out.printf("sizeOf(char) = %s bytes\n", RamUsageEstimator.sizeOf(c));
        System.out.printf("sizeOf(int) = %s bytes\n", RamUsageEstimator.sizeOf(i));
        System.out.printf("sizeOf(float) = %s bytes\n", RamUsageEstimator.sizeOf(f));
        System.out.printf("sizeOf(long) = %s bytes\n", RamUsageEstimator.sizeOf(l));
        System.out.printf("sizeOf(double) = %s bytes\n", RamUsageEstimator.sizeOf(d));
    }
}
Output:
sizeOf(byte) = 16 bytes
sizeOf(boolean) = 16 bytes
sizeOf(short) = 16 bytes
sizeOf(char) = 16 bytes
sizeOf(int) = 16 bytes
sizeOf(float) = 16 bytes
sizeOf(long) = 24 bytes
sizeOf(double) = 24 bytes
Breaking down each result into header, instance data, and padding:
sizeOf(byte)=12(Header) + 1(Instance Data) + 3(Padding)=16 bytes
sizeOf(boolean)=12(Header) + 1(Instance Data) + 3(Padding)=16 bytes
sizeOf(short)=12(Header) + 2(Instance Data) + 2(Padding)=16 bytes
sizeOf(char)=12(Header) + 2(Instance Data) + 2(Padding)=16 bytes
sizeOf(int)=12(Header) + 4(Instance Data)=16 bytes
sizeOf(float)=12(Header) + 4(Instance Data)=16 bytes
sizeOf(long)=12(Header) + 8(Instance Data) + 4(Padding)=24 bytes
sizeOf(double)=12(Header) + 8(Instance Data) + 4(Padding)=24 bytes
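The breakdown above is the same rule applied eight times: add the 12-byte header to the field width, then round up to a multiple of 8. A minimal sketch of that rule, assuming the header size and alignment stated for this environment; InstanceSizeModel and its method names are illustrative and not part of Lucene or the JDK.

```java
class InstanceSizeModel {
    static final int OBJECT_HEADER = 12; // 8-byte mark word + 4-byte compressed class pointer
    static final int ALIGNMENT = 8;      // object alignment assumed in this article

    // Rounds size up to the next multiple of ALIGNMENT.
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Estimated size of an object that holds one field of the given width in bytes.
    static long singleFieldObject(int fieldBytes) {
        return alignUp(OBJECT_HEADER + fieldBytes);
    }

    public static void main(String[] args) {
        int[] widths = {1, 2, 4, 8}; // byte/boolean, short/char, int/float, long/double
        for (int w : widths) {
            System.out.printf("%d-byte field -> %d bytes%n", w, singleFieldObject(w));
        }
    }
}
```

The model yields 16 bytes for every width up to 4 and 24 bytes for 8-byte fields, which agrees with the measurements.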
The next example measures the wrapper classes of the primitive types directly.
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;

public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        Boolean bool = true;
        Byte b = (byte)0xFF;
        Short s = (short)1;
        Character c = 'c';
        Integer i = 1;
        Float f = 1.0f;
        Long l = 1L;
        Double d = 1.0;
        // Each label now matches the variable that is actually measured.
        System.out.printf("sizeOf(Boolean) = %s bytes\n", RamUsageEstimator.sizeOf(bool));
        System.out.printf("sizeOf(Byte) = %s bytes\n", RamUsageEstimator.sizeOf(b));
        System.out.printf("sizeOf(Short) = %s bytes\n", RamUsageEstimator.sizeOf(s));
        System.out.printf("sizeOf(Character) = %s bytes\n", RamUsageEstimator.sizeOf(c));
        System.out.printf("sizeOf(Integer) = %s bytes\n", RamUsageEstimator.sizeOf(i));
        System.out.printf("sizeOf(Float) = %s bytes\n", RamUsageEstimator.sizeOf(f));
        System.out.printf("sizeOf(Long) = %s bytes\n", RamUsageEstimator.sizeOf(l));
        System.out.printf("sizeOf(Double) = %s bytes\n", RamUsageEstimator.sizeOf(d));
    }
}
The results match the layout analysis of the previous example:
sizeOf(Boolean) = 16 bytes
sizeOf(Byte) = 16 bytes
sizeOf(Short) = 16 bytes
sizeOf(Character) = 16 bytes
sizeOf(Integer) = 16 bytes
sizeOf(Float) = 16 bytes
sizeOf(Long) = 24 bytes
sizeOf(Double) = 24 bytes
Special objects
The example below measures null and a plain Object.
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;

public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        System.out.printf("sizeOf(null) = %s bytes\n", RamUsageEstimator.sizeOf((Object)null));
        System.out.printf("sizeOf(new Object()) = %s bytes\n", RamUsageEstimator.sizeOf(new Object()));
    }
}
The output is shown below. null occupies no memory at all, and a bare Object is just a header plus padding:
sizeOf(new Object())=12(Header) + 4(Padding)=16 bytes
sizeOf(null) = 0 bytes
sizeOf(new Object()) = 16 bytes
Arrays
The example below measures Java array objects.
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;

public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        int[] array0 = new int[0];
        int[] array1 = new int[1];
        int[] array2 = new int[2];
        int[] array3 = new int[3];
        int[] array8 = new int[8];
        int[] array9 = new int[9];
        System.out.printf("sizeOf(array0) = %s bytes\n", RamUsageEstimator.sizeOf(array0));
        // length is an element count, not a size in bytes
        System.out.printf("length(array0) = %s\n", array0.length);
        System.out.printf("sizeOf(array1) = %s bytes\n", RamUsageEstimator.sizeOf(array1));
        System.out.printf("sizeOf(array2) = %s bytes\n", RamUsageEstimator.sizeOf(array2));
        System.out.printf("sizeOf(array3) = %s bytes\n", RamUsageEstimator.sizeOf(array3));
        System.out.printf("sizeOf(array8) = %s bytes\n", RamUsageEstimator.sizeOf(array8));
        System.out.printf("sizeOf(array9) = %s bytes\n", RamUsageEstimator.sizeOf(array9));
    }
}
Output:
sizeOf(array0) = 16 bytes
length(array0) = 0
sizeOf(array1) = 24 bytes
sizeOf(array2) = 24 bytes
sizeOf(array3) = 32 bytes
sizeOf(array8) = 48 bytes
sizeOf(array9) = 56 bytes
As Figure 1 shows, an array instance has a 16-byte header, unlike an ordinary object instance. The array results break down as follows:
sizeOf(array0)=16(Header)=16 bytes
length(array0)=0
sizeOf(array1)=16(Header) + 4(int) + 4(Padding)=24 bytes
sizeOf(array2)=16(Header) + 4(int)*2=24 bytes
sizeOf(array3)=16(Header) + 4(int)*3 + 4(Padding)=32 bytes
sizeOf(array8)=16(Header) + 4(int)*8=48 bytes
sizeOf(array9)=16(Header) + 4(int)*9 + 4(Padding)=56 bytes
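The array breakdown follows one formula: 16 bytes of header plus 4 bytes per int, rounded up to a multiple of 8. A minimal sketch of that formula, assuming the header size and alignment of this environment; IntArraySizeModel is an illustrative name.

```java
class IntArraySizeModel {
    static final int ARRAY_HEADER = 16; // mark word + compressed class pointer + length
    static final int INT_BYTES = 4;

    // Rounds size up to the next multiple of 8.
    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    // Estimated size in bytes of an int[] with the given length.
    static long intArraySize(int length) {
        return alignUp(ARRAY_HEADER + (long) INT_BYTES * length);
    }

    public static void main(String[] args) {
        int[] lengths = {0, 1, 2, 3, 8, 9};
        for (int n : lengths) {
            System.out.printf("int[%d] -> %d bytes%n", n, intArraySize(n));
        }
    }
}
```

Odd lengths pay 4 bytes of padding and even lengths pay none, which is why array1 and array2 report the same size.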
String
In JDK 8, the version used here, the relevant part of the String source is shown below. (JDK 9 and later store the characters in a byte[] together with a coder field, so the numbers in this section do not carry over to them.) The source declares 4 fields. The static ones belong to the class rather than to any instance, so they are not part of an instance's size. Only the instance fields count: a char[] holding the characters and an int hash code. The section on custom objects below verifies that static fields do not add to an object's size.
public final class String
        implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];
    /** Cache the hash code for the string */
    private int hash; // Default to 0
    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;
    private static final ObjectStreamField[] serialPersistentFields =
            new ObjectStreamField[0];
}
A String instance by itself therefore needs 12(Header) + 4(char[] reference) + 4(int) + 4(Padding) = 24 bytes.
In addition, the char[] takes 16(Array Header) + length * 2 bytes, rounded up to a multiple of 8, where length is the number of characters. As Figure 2 shows, a String therefore occupies:
40 + length * 2 bytes + Padding

The example below measures String objects.
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;

public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        String s0 = "";
        String s1 = "a";
        String s2 = "aa";
        String s4 = "aaaa";
        String s5 = "aaaaa";
        System.out.printf("sizeOf(s0) = %s bytes\n", RamUsageEstimator.sizeOf(s0));
        System.out.printf("sizeOf(s1) = %s bytes\n", RamUsageEstimator.sizeOf(s1));
        System.out.printf("sizeOf(s2) = %s bytes\n", RamUsageEstimator.sizeOf(s2));
        System.out.printf("sizeOf(s4) = %s bytes\n", RamUsageEstimator.sizeOf(s4));
        System.out.printf("sizeOf(s5) = %s bytes\n", RamUsageEstimator.sizeOf(s5));
    }
}
Output:
sizeOf(s0) = 40 bytes
sizeOf(s1) = 48 bytes
sizeOf(s2) = 48 bytes
sizeOf(s4) = 48 bytes
sizeOf(s5) = 56 bytes
Breaking down the String results:
sizeOf(s0)=40 + 0 * 2 = 40 bytes
sizeOf(s1)=40 + 1 * 2 + 6(Padding) = 48 bytes
sizeOf(s2)=40 + 2 * 2 + 4(Padding) = 48 bytes
sizeOf(s4)=40 + 4 * 2 = 48 bytes
sizeOf(s5)=40 + 5 * 2 + 6(Padding) = 56 bytes
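The String formula can be written as a small function: 24 bytes for the String instance plus an aligned char[] of 16 + 2 * length bytes. This is a sketch for the JDK 8 layout described above only; StringSizeModel is an illustrative name, and the model does not apply to JDK 9 and later, where the backing array is a byte[].

```java
class StringSizeModel {
    static final int STRING_INSTANCE = 24; // 12 header + 4 char[] reference + 4 hash + 4 padding
    static final int ARRAY_HEADER = 16;
    static final int CHAR_BYTES = 2;

    // Rounds size up to the next multiple of 8.
    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    // Estimated deep size of a JDK 8 String with the given number of chars.
    static long stringSize(int length) {
        long charArray = alignUp(ARRAY_HEADER + (long) CHAR_BYTES * length);
        return STRING_INSTANCE + charArray;
    }

    public static void main(String[] args) {
        int[] lengths = {0, 1, 2, 4, 5};
        for (int n : lengths) {
            System.out.printf("String of length %d -> %d bytes%n", n, stringSize(n));
        }
    }
}
```

Because of the alignment step, strings of 1 to 4 characters all cost 48 bytes, and the size then grows by 8 bytes for every 4 additional characters.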
Custom objects
The example below measures a custom object.
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;

public class Employee {
    private long id;
    private int age;

    public Employee(long id, int age) {
        this.id = id;
        this.age = age;
    }

    public static void main(String[] args) {
        System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
    }
}
Output:
sizeOf(Employee) = 24 bytes
As Figure 3 shows, the object layout explains the size of the custom object:
sizeOf(Employee) = 12(Header) + 8(long) + 4(int) = 24 bytes

Now add a static field to Employee:
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;

public class Employee {
    private long id;
    private int age;
    // A static variable belongs to the class, not to an instance,
    // so it adds nothing to the size of each Employee.
    private static int staticField = 88;

    public Employee(long id, int age) {
        this.id = id;
        this.age = age;
    }

    public static void main(String[] args) {
        System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
    }
}
The result below is unchanged, which confirms that a static variable belongs to the class rather than the instance. Only instance fields count toward an object's size.
sizeOf(Employee) = 24 bytes
Next, let Employee reference other Java objects, here a Long and an Integer:
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;

public class Employee {
    private Long id;
    private Integer age;

    public Employee(long id, int age) {
        this.id = id;
        this.age = age;
    }

    public static void main(String[] args) {
        System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
    }
}
Output:
sizeOf(Employee) = 64 bytes
As Figure 4 shows, the total is the Employee itself plus the two objects it references:
sizeOf(Employee) = 24(Employee Object) + 24(Long Object) + 16(Integer Object) = 64 bytes
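The two Employee variants can be compared with a small model: the shallow size of an object is the aligned sum of the header and its field widths, and the deep size adds the shallow size of each referenced object. This is a simplified sketch that assumes 4-byte compressed references and ignores any gaps HotSpot might leave between fields (there are none in these two cases); EmployeeGraphModel is an illustrative name.

```java
class EmployeeGraphModel {
    static final int HEADER = 12; // object header with compressed pointers
    static final int REF = 4;     // compressed reference

    // Rounds size up to the next multiple of 8.
    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    // Shallow size of an object whose instance fields have the given widths in bytes.
    static long shallow(int... fieldBytes) {
        long sum = HEADER;
        for (int f : fieldBytes) {
            sum += f;
        }
        return alignUp(sum);
    }

    // Employee with "long id" and "int age": one object, no references.
    static long primitiveEmployee() {
        return shallow(8, 4);
    }

    // Employee with "Long id" and "Integer age": the Employee plus two boxed objects.
    static long boxedEmployee() {
        long employee = shallow(REF, REF); // two references
        long boxedLong = shallow(8);       // Long wraps a long
        long boxedInteger = shallow(4);    // Integer wraps an int
        return employee + boxedLong + boxedInteger;
    }

    public static void main(String[] args) {
        System.out.println("primitive fields: " + primitiveEmployee() + " bytes");
        System.out.println("boxed fields:     " + boxedEmployee() + " bytes");
    }
}
```

Under this model, switching the two fields from primitives to wrappers raises the cost per Employee from 24 to 64 bytes, matching the measurements.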

ArrayList
In JDK 8, the relevant part of the ArrayList source is shown below. It declares 6 fields. The static ones belong to the class rather than the instance, so only the instance fields count: an Object[] holding the elements and an int size, plus the int modCount that ArrayList inherits from AbstractList.
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    private static final long serialVersionUID = 8683452581122892189L;
    /**
     * Default initial capacity.
     */
    private static final int DEFAULT_CAPACITY = 10;
    /**
     * Shared empty array instance used for empty instances.
     */
    private static final Object[] EMPTY_ELEMENTDATA = {};
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
    transient Object[] elementData; // non-private to simplify nested class access
    /**
     * The size of the ArrayList (the number of elements it contains).
     */
    private int size;
}
An ArrayList instance by itself therefore needs 12(Header) + 4(inherited int modCount) + 4(Object[] reference) + 4(int size) = 24 bytes.
In addition, the Object[] takes 16(Array Header) + length * 4(Object reference) bytes, rounded up to a multiple of 8. Here length is the length of the Object[], that is, the ArrayList's capacity, and size is the number of elements stored, with length >= size. The elements that have actually been added occupy memory of their own. As Figure 5 shows, an ArrayList therefore occupies:
((40 + length * 4)(8-byte aligned) + size * n bytes)(8-byte aligned), where each element object is assumed to take n bytes. The size * n term reflects that only the slots that actually hold an element have an object allocated for them.

The example below measures ArrayList objects.
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
import java.util.ArrayList;
import java.util.List;

public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        System.out.printf("sizeOf(ArrayList with 0 capacity) = %s bytes\n", RamUsageEstimator.sizeOf(new ArrayList<>(0)));
        System.out.printf("sizeOf(ArrayList with default capacity) = %s bytes\n", RamUsageEstimator.sizeOf(new ArrayList<>()));
        List<Integer> list1 = new ArrayList<>(1);
        list1.add(1);
        System.out.printf("sizeOf(list1 with 1 capacity) = %s bytes\n", RamUsageEstimator.sizeOf(list1));
        list1 = new ArrayList<>();
        list1.add(1);
        System.out.printf("sizeOf(list1 with default capacity) = %s bytes\n", RamUsageEstimator.sizeOf(list1));
    }
}
Output:
sizeOf(ArrayList with 0 capacity) = 40 bytes
sizeOf(ArrayList with default capacity) = 40 bytes
sizeOf(list1 with 1 capacity) = 64 bytes
sizeOf(list1 with default capacity) = 96 bytes
sizeOf(ArrayList with 0 capacity) = 40 bytes: the constructor is given initialCapacity=0, so sizeOf(ArrayList with 0 capacity) = 40 + 0 * 4(Integer reference) + 0 * 16(Integer) = 40 bytes
sizeOf(ArrayList with default capacity) = 40 bytes: new ArrayList() starts with an empty elementData, so sizeOf(ArrayList with default capacity) = 40 + 0 * 4(Integer reference) + 0 * 16(Integer) = 40 bytes
sizeOf(list1 with 1 capacity) = 64 bytes: the constructor is given initialCapacity=1, so sizeOf(list1 with 1 capacity) = 40 + 1 * 4(Integer reference) + 1 * 16(Integer) + 4(Padding) = 64 bytes
sizeOf(list1 with default capacity) = 96 bytes: new ArrayList() starts with an empty elementData; the first call to add() allocates elementData with the default minimum capacity of 10, and size=1. So sizeOf(list1 with default capacity) = 40 + 10 * 4(Integer reference) + 1 * 16(Integer) = 96 bytes
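The four cases above all follow from one formula with three inputs: capacity, size, and the size of one element. A minimal sketch of that formula, assuming the 24-byte ArrayList instance and 16-byte array header derived above; ArrayListSizeModel is an illustrative name.

```java
class ArrayListSizeModel {
    static final int LIST_INSTANCE = 24; // the ArrayList object itself
    static final int ARRAY_HEADER = 16;  // header of the backing Object[]
    static final int REF = 4;            // compressed reference per slot

    // Rounds size up to the next multiple of 8.
    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    // Estimated deep size of an ArrayList with the given capacity that holds
    // `size` distinct elements of `elementBytes` bytes each.
    static long arrayListSize(int capacity, int size, long elementBytes) {
        long backingArray = alignUp(ARRAY_HEADER + (long) REF * capacity);
        return LIST_INSTANCE + backingArray + size * elementBytes;
    }

    public static void main(String[] args) {
        long integerBytes = 16; // size of one Integer, measured earlier
        System.out.println(arrayListSize(0, 0, integerBytes));
        System.out.println(arrayListSize(1, 1, integerBytes));
        System.out.println(arrayListSize(10, 1, integerBytes));
    }
}
```

The third call shows the cost of the default capacity: nine unused slots add 32 bytes to a list that holds a single Integer.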
Memory used by a Java list of 100,000 entries
The example below shows how to estimate the memory used by a Java list that holds 100,000 entries.
package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
import java.util.ArrayList;
import java.util.List;

public class Employee {
    private long id;
    private int age;

    public Employee(long id, int age) {
        this.id = id;
        this.age = age;
    }

    public static void main(String[] args) {
        System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
        // Case 1: capacity fixed up front, so elementData never grows.
        List<Employee> employeeList = new ArrayList<>(100000);
        for (int i = 0; i < 100000; i++) {
            employeeList.add(new Employee(123456789L, 28));
        }
        System.out.printf("sizeOf(List<Employee> contains 100000 Employee object with 100000 capacity) = %s bytes\n", RamUsageEstimator.sizeOf(employeeList));
        // Case 2: default capacity, so elementData grows as elements are added.
        employeeList = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            employeeList.add(new Employee(123456789L, 28));
        }
        System.out.printf("sizeOf(List<Employee> contains 100000 Employee object) = %s bytes\n", RamUsageEstimator.sizeOf(employeeList));
    }
}
Output:
sizeOf(Employee) = 24 bytes
sizeOf(List<Employee> contains 100000 Employee object with 100000 capacity) = 2800040 bytes
sizeOf(List<Employee> contains 100000 Employee object) = 2826880 bytes
Using the ArrayList analysis from the previous section:
sizeOf(List<Employee> contains 100000 Employee object with 100000 capacity) = 2800040 bytes = 40 + 100000 * 4(Employee Reference) + 100000 * 24(Employee Object)
If new ArrayList is given no capacity, or the list grows beyond its capacity, elementData is expanded: a new array is allocated and the old elements are copied into it. Each expansion grows the capacity to 1.5 times the old value, rounded down. To hold 100,000 entries starting from the default capacity of 10, the capacity therefore ends up at 106710, computed as follows:
package study;

public class StudyTest {
    public static void main(String[] args) {
        int capacity = 10;
        while (capacity < 100000) {
            // Same growth step as ArrayList: old capacity plus half of it.
            capacity += capacity >> 1;
        }
        System.out.println(capacity); // prints 106710
    }
}
The last line of the output above therefore breaks down as:
sizeOf(List<Employee> contains 100000 Employee object) = 2826880 bytes = 40 + 106710 * 4(Employee Reference) + 100000 * 24(Employee Object) ≈ 2.696MB
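Putting the growth rule and the size formula together gives an estimator for any element count, which also shows how much memory the unused slots cost when the capacity is not set up front. This is a sketch under the assumptions of this article (compressed pointers, 8-byte alignment, 24-byte Employee, default initial capacity of 10); EmployeeListEstimate and its methods are illustrative names.

```java
class EmployeeListEstimate {
    // Rounds size up to the next multiple of 8.
    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    // Capacity reached by a default-constructed ArrayList after `elements` adds,
    // assuming it starts at 10 and grows by half each time it is full.
    static int grownCapacity(int elements) {
        int capacity = 10;
        while (capacity < elements) {
            capacity += capacity >> 1;
        }
        return capacity;
    }

    // 24-byte ArrayList + aligned Object[] (16-byte header, 4 bytes per slot) + elements.
    static long listBytes(int capacity, int size, long elementBytes) {
        return 24 + alignUp(16 + 4L * capacity) + size * elementBytes;
    }

    public static void main(String[] args) {
        int n = 100000;
        long employeeBytes = 24;
        long presized = listBytes(n, n, employeeBytes);
        long grown = listBytes(grownCapacity(n), n, employeeBytes);
        System.out.printf("presized: %d bytes (%.3f MB)%n", presized, presized / (1024.0 * 1024.0));
        System.out.printf("grown:    %d bytes (%.3f MB)%n", grown, grown / (1024.0 * 1024.0));
        System.out.printf("unused slots cost %d bytes%n", grown - presized);
    }
}
```

For this workload the elements themselves dominate, so presizing saves only about 1% of the retained memory. Its main benefit is avoiding the repeated array copies during growth.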
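Further practice
- Try applying the same analysis to HashMap, enum classes, or your own objects.
- Using the code above, disable compressed pointers with -XX:-UseCompressedOops and rerun it to see how the change in header size affects the memory that Java objects occupy.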
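Before running the second exercise, it can help to write down a prediction to compare against. The sketch below parameterizes the size model by header and reference width. The uncompressed figures (16-byte object header, 24-byte array header, 8-byte references) are assumptions about what JDK 8 uses with -XX:-UseCompressedOops, not measurements from this article, so treat the second result as a hypothesis to verify. OopsModeComparison is an illustrative name.

```java
class OopsModeComparison {
    final int objectHeader;
    final int arrayHeader;
    final int ref;

    OopsModeComparison(int objectHeader, int arrayHeader, int ref) {
        this.objectHeader = objectHeader;
        this.arrayHeader = arrayHeader;
        this.ref = ref;
    }

    // Values used throughout this article.
    static final OopsModeComparison COMPRESSED = new OopsModeComparison(12, 16, 4);
    // Assumed values without compressed pointers; verify by measurement.
    static final OopsModeComparison UNCOMPRESSED = new OopsModeComparison(16, 24, 8);

    // Rounds size up to the next multiple of 8.
    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    // Employee with a long id and an int age.
    long employee() {
        return alignUp(objectHeader + 8 + 4);
    }

    // ArrayList<Employee> with the given capacity holding `size` distinct employees.
    long employeeList(int capacity, int size) {
        long list = alignUp(objectHeader + 4 + 4 + ref); // modCount, size, elementData reference
        long array = alignUp(arrayHeader + (long) ref * capacity);
        return list + array + size * employee();
    }

    public static void main(String[] args) {
        System.out.println("compressed:   " + COMPRESSED.employeeList(100000, 100000) + " bytes");
        System.out.println("uncompressed: " + UNCOMPRESSED.employeeList(100000, 100000) + " bytes (predicted)");
    }
}
```

If the measured value with -XX:-UseCompressedOops differs from the prediction, the assumed header or reference sizes are the first thing to re-check.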