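This article is a selective, free translation of the LLVM 7.0.1 document "MCJIT Design and Implementation", with the corresponding source code attached at key points.
Introduction
This document describes the internal workings of the MCJIT execution engine and the RuntimeDyld component. It is a fairly high-level overview that mainly shows the flow of code generation and dynamic linking, and how the objects involved interact along the way.
Engine Creation
In most cases, an instance of the MCJIT execution engine is created with EngineBuilder. The EngineBuilder constructor takes an llvm::Module object as its argument.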
// lib/ExecutionEngine/ExecutionEngine.cpp
ExecutionEngine::ExecutionEngine(std::unique_ptr<Module> M)
    : DL(M->getDataLayout()), LazyFunctionCreator(nullptr) {
  Init(std::move(M));
}
In addition, the client can set the various options the MCJIT engine needs, including whether to use MCJIT at all (the engine kind).
// tools/lli/lli.cpp
int main(int argc, char **argv, char * const *envp) {
  ...
  builder.setMArch(MArch);
  builder.setMCPU(getCPUStr());
  builder.setMAttrs(getFeatureList());
  if (RelocModel.getNumOccurrences())
    builder.setRelocationModel(RelocModel);
  if (CMModel.getNumOccurrences())
    builder.setCodeModel(CMModel);
  builder.setErrorStr(&ErrorMsg);
  builder.setEngineKind(ForceInterpreter
                        ? EngineKind::Interpreter
                        : EngineKind::JIT);
  builder.setUseOrcMCJITReplacement(UseJITKind == JITKind::OrcMCJITReplacement);
  ...
}
Of particular note is EngineBuilder::setMCJITMemoryManager. If no memory manager has been created explicitly by this point, a default one, SectionMemoryManager, is created automatically when the MCJIT engine is initialized.
Once the options are set, EngineBuilder::create starts creating the MCJIT engine instance. If no TargetMachine argument is passed in, a suitable TargetMachine is created automatically from the target triple and the module that was used to create the EngineBuilder.
// include/llvm/ExecutionEngine/ExecutionEngine.h
class EngineBuilder {
  ...
  ExecutionEngine *create() {
    return create(selectTarget());
  }
  ExecutionEngine *create(TargetMachine *TM);
  ...
};

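EngineBuilder::create calls MCJIT::createJIT (strictly speaking, it goes through ExecutionEngine::MCJITCtor, a function pointer that points to MCJIT::createJIT) and passes it pointers to the module, the memory manager, and the TargetMachine. From then on the MCJIT object manages them.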
// lib/ExecutionEngine/ExecutionEngine.cpp
ExecutionEngine *EngineBuilder::create(TargetMachine *TM) {
  ...
  ExecutionEngine *EE = nullptr;
  if (ExecutionEngine::OrcMCJITReplacementCtor && UseOrcMCJITReplacement) {
    EE = ExecutionEngine::OrcMCJITReplacementCtor(ErrorStr, std::move(MemMgr),
                                                  std::move(Resolver),
                                                  std::move(TheTM));
    EE->addModule(std::move(M));
  } else if (ExecutionEngine::MCJITCtor)
    EE = ExecutionEngine::MCJITCtor(std::move(M), ErrorStr, std::move(MemMgr),
                                    std::move(Resolver), std::move(TheTM));
  ...
}
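The indirection through a constructor function pointer can be illustrated with a small standalone sketch. This is not LLVM code: Engine, EngineFactory, and ToyJIT are hypothetical names chosen for illustration. The only point it makes is that the factory holds a static function pointer that stays null until the JIT registers itself, so the factory has to check the pointer before calling through it.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical engine interface (stand-in for an execution engine).
struct Engine {
  virtual ~Engine() = default;
  virtual std::string kind() const = 0;
};

// Hypothetical factory: it knows only a function pointer, not the JIT class.
struct EngineFactory {
  using Ctor = std::unique_ptr<Engine> (*)();
  static Ctor JITCtor;  // null until some JIT registers itself

  static std::unique_ptr<Engine> create() {
    if (JITCtor)
      return JITCtor();
    return nullptr;     // no JIT has been linked in / registered
  }
};
EngineFactory::Ctor EngineFactory::JITCtor = nullptr;

// Hypothetical JIT that plugs itself into the factory.
struct ToyJIT : Engine {
  std::string kind() const override { return "jit"; }
  static std::unique_ptr<Engine> createJIT() {
    return std::unique_ptr<Engine>(new ToyJIT());
  }
  static void Register() { EngineFactory::JITCtor = createJIT; }
};
```

Calling EngineFactory::create() before ToyJIT::Register() yields a null pointer, and afterwards it yields a ToyJIT. With this arrangement the factory never names the JIT class directly, which is presumably why EngineBuilder::create tests ExecutionEngine::MCJITCtor before using it.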

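MCJIT has a member variable, RuntimeDyld Dyld, which handles communication between MCJIT and the RuntimeDyldImpl object. The RuntimeDyldImpl object is created when an object is loaded.
MCJIT takes over the Module pointer from EngineBuilder at creation time, but it does not generate code for the module right away. Code generation is deferred until MCJIT::finalizeObject or MCJIT::getPointerToFunction is called (lli.cpp calls these two functions one after the other).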
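The deferred behaviour described above can be sketched in isolation. The class below is a toy, not MCJIT: ToyLazyEngine, requestModule, and generateCount are hypothetical names, modules are represented only by their names, and "compilation" is just a counter. It shows that handing a module over costs nothing, and that the work happens once, on the first finalize or lookup.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy engine; a module is represented by its name only.
class ToyLazyEngine {
  std::set<std::string> Added;   // handed over, but not yet compiled
  std::set<std::string> Loaded;  // code has been generated
  int GenerateCount = 0;

  void generateCodeForModule(const std::string &M) {
    if (Loaded.count(M))
      return;                    // already compiled, nothing to do
    Added.erase(M);
    Loaded.insert(M);
    ++GenerateCount;             // stands in for the real compilation work
  }

public:
  // Taking ownership is cheap: no code is generated here.
  void addModule(const std::string &M) { Added.insert(M); }

  // Compile everything that is still pending.
  void finalizeObject() {
    // Snapshot first, because generateCodeForModule mutates Added.
    std::vector<std::string> Pending(Added.begin(), Added.end());
    for (const std::string &M : Pending)
      generateCodeForModule(M);
  }

  // Compile a single module on demand; returns whether it is now loaded.
  bool requestModule(const std::string &M) {
    if (Added.count(M))
      generateCodeForModule(M);
    return Loaded.count(M) != 0;
  }

  int generateCount() const { return GenerateCount; }
};
```

A second call to finalizeObject after everything is loaded does no further work, because the pending set is already empty.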
Code Generation
Once code generation begins, MCJIT first tries to fetch an object image from ObjCache, its member variable of type ObjectCache*. If no image can be fetched, it calls MCJIT::emitObject.
// lib/ExecutionEngine/MCJIT/MCJIT.cpp
void MCJIT::generateCodeForModule(Module *M) {
  ...
  std::unique_ptr<MemoryBuffer> ObjectToLoad;
  // Try to load the pre-compiled object from cache if possible
  if (ObjCache)
    ObjectToLoad = ObjCache->getObject(M);
  assert(M->getDataLayout() == getDataLayout() && "DataLayout Mismatch");
  // If the cache did not contain a suitable object, compile the object
  if (!ObjectToLoad) {
    ObjectToLoad = emitObject(M);
    assert(ObjectToLoad && "Compilation did not produce an object.");
  }
  ...
}
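The "try the cache, otherwise compile" flow in the excerpt above can be reduced to a few lines of standalone C++. Everything here is an illustrative assumption: ToyObjectCache and ToyCompiler are hypothetical, the cache is keyed by a module name rather than by a Module pointer, and an object image is just a string.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an object image: just bytes in a string.
using ObjectImage = std::string;

// Toy cache keyed by module name (an assumption made for this sketch).
class ToyObjectCache {
  std::map<std::string, ObjectImage> Store;

public:
  // Returns null when nothing suitable is cached.
  std::unique_ptr<ObjectImage> getObject(const std::string &ModuleName) const {
    auto It = Store.find(ModuleName);
    if (It == Store.end())
      return nullptr;
    return std::unique_ptr<ObjectImage>(new ObjectImage(It->second));
  }

  void notifyObjectCompiled(const std::string &ModuleName,
                            const ObjectImage &Obj) {
    Store[ModuleName] = Obj;
  }
};

struct ToyCompiler {
  ToyObjectCache *Cache = nullptr;  // optional, may stay null
  int CompileCount = 0;

  // Stands in for real compilation; also feeds the cache if there is one.
  ObjectImage emitObject(const std::string &ModuleName) {
    ++CompileCount;
    ObjectImage Obj = "obj:" + ModuleName;
    if (Cache)
      Cache->notifyObjectCompiled(ModuleName, Obj);
    return Obj;
  }

  ObjectImage generateCodeForModule(const std::string &ModuleName) {
    std::unique_ptr<ObjectImage> ObjectToLoad;
    if (Cache)                      // try the cache first
      ObjectToLoad = Cache->getObject(ModuleName);
    if (!ObjectToLoad)              // cache miss: compile
      ObjectToLoad.reset(new ObjectImage(emitObject(ModuleName)));
    return *ObjectToLoad;
  }
};
```

With a cache attached, asking for the same module twice compiles it only once. Without a cache, every request compiles again.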
MCJIT::emitObject creates a legacy::PassManager instance and an ObjectBufferStream (a raw_svector_ostream) instance, and passes both to TargetMachine::addPassesToEmitMC before calling PassManager::run.
// lib/ExecutionEngine/MCJIT/MCJIT.cpp
std::unique_ptr<MemoryBuffer> MCJIT::emitObject(Module *M) {
  ...
  legacy::PassManager PM;
  // The RuntimeDyld will take ownership of this shortly
  SmallVector<char, 4096> ObjBufferSV;
  raw_svector_ostream ObjStream(ObjBufferSV);
  // Turn the machine code intermediate representation into bytes in memory
  // that may be executed.
  if (TM->addPassesToEmitMC(PM, Ctx, ObjStream, !getVerifyModules()))
    report_fatal_error("Target does not support MC emission!");
  // Initialize passes.
  PM.run(*M);
  ...
}

PassManager delegates the actual work to its member variable PassManagerImpl *PM, so PassManager::run is in essence PassManagerImpl::run.
// lib/IR/LegacyPassManager.cpp
/// run - Execute all of the passes scheduled for execution. Keep track of
/// whether any of the passes modifies the module, and if so, return true.
bool PassManager::run(Module &M) {
  return PM->run(M);
}
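PassManagerImpl::run generates a complete, relocatable binary object image in the ObjectBufferStream object (ELF or MachO, depending on the platform). If object caching is enabled, the image is also passed to the ObjectCache at this point.
At this point ObjectBufferStream holds the raw object image. Before the code can run, the code and data sections in the image must be loaded into suitable memory, relocations must be applied, and memory permissions must be set and the code cache invalidated (if required).
Object Loading
An object image is now available, whether it was fetched from the ObjectCache or generated directly, and it is passed to RuntimeDyld to be loaded.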
// lib/ExecutionEngine/MCJIT/MCJIT.cpp
void MCJIT::generateCodeForModule(Module *M) {
  ...
  // Load the object into the dynamic linker.
  // MCJIT now owns the ObjectImage pointer (via its LoadedObjects list).
  Expected<std::unique_ptr<object::ObjectFile>> LoadedObject =
      object::ObjectFile::createObjectFile(ObjectToLoad->getMemBufferRef());
  if (!LoadedObject) {
    std::string Buf;
    raw_string_ostream OS(Buf);
    logAllUnhandledErrors(LoadedObject.takeError(), OS, "");
    OS.flush();
    report_fatal_error(Buf);
  }
  std::unique_ptr<RuntimeDyld::LoadedObjectInfo> L =
      Dyld.loadObject(*LoadedObject.get());
  ...
}
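Depending on the object's file type, RuntimeDyld creates an instance of RuntimeDyldELF or RuntimeDyldMachO (both are subclasses of RuntimeDyldImpl; the excerpt below also handles COFF). It then calls RuntimeDyldImpl::loadObject to perform the actual loading.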
// lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
std::unique_ptr<RuntimeDyld::LoadedObjectInfo>
RuntimeDyld::loadObject(const ObjectFile &Obj) {
  if (!Dyld) {
    if (Obj.isELF())
      Dyld =
          createRuntimeDyldELF(static_cast<Triple::ArchType>(Obj.getArch()),
                               MemMgr, Resolver, ProcessAllSections, Checker);
    else if (Obj.isMachO())
      Dyld = createRuntimeDyldMachO(
          static_cast<Triple::ArchType>(Obj.getArch()), MemMgr, Resolver,
          ProcessAllSections, Checker);
    else if (Obj.isCOFF())
      Dyld = createRuntimeDyldCOFF(
          static_cast<Triple::ArchType>(Obj.getArch()), MemMgr, Resolver,
          ProcessAllSections, Checker);
    else
      report_fatal_error("Incompatible object format!");
  }
  if (!Dyld->isCompatibleFile(Obj))
    report_fatal_error("Incompatible object format!");
  auto LoadedObjInfo = Dyld->loadObject(Obj);
  MemMgr.notifyObjectLoaded(*this, Obj);
  return LoadedObjInfo;
}
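The pattern in the excerpt above (create a format-specific backend lazily on the first object, then reject objects of any other format) can be sketched without LLVM. The names ObjFormat, detectFormat, and ToyDyld are hypothetical. As a simplifying assumption the sketch recognises formats by their leading magic bytes and covers only ELF and little-endian Mach-O. It also returns false where the real code reports a fatal error.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

enum class ObjFormat { ELF, MachO, Unknown };

// True if the buffer starts with the given four magic bytes.
bool hasMagic(const std::vector<uint8_t> &Bytes, const uint8_t (&Magic)[4]) {
  return Bytes.size() >= 4 && std::memcmp(Bytes.data(), Magic, 4) == 0;
}

// Classify a buffer by its first four bytes.
ObjFormat detectFormat(const std::vector<uint8_t> &Bytes) {
  static const uint8_t Elf[4] = {0x7f, 'E', 'L', 'F'};
  // 0xFEEDFACE (32-bit) and 0xFEEDFACF (64-bit), stored little-endian.
  static const uint8_t MachO32LE[4] = {0xCE, 0xFA, 0xED, 0xFE};
  static const uint8_t MachO64LE[4] = {0xCF, 0xFA, 0xED, 0xFE};
  if (hasMagic(Bytes, Elf))
    return ObjFormat::ELF;
  if (hasMagic(Bytes, MachO32LE) || hasMagic(Bytes, MachO64LE))
    return ObjFormat::MachO;
  return ObjFormat::Unknown;
}

// Hypothetical linker front end: the backend is chosen by the first object.
class ToyDyld {
  ObjFormat Impl = ObjFormat::Unknown;  // Unknown means "no backend yet"

public:
  bool loadObject(const std::vector<uint8_t> &Bytes) {
    ObjFormat F = detectFormat(Bytes);
    if (F == ObjFormat::Unknown)
      return false;                     // unsupported format
    if (Impl == ObjFormat::Unknown)
      Impl = F;                         // lazily pick the backend
    return Impl == F;                   // later objects must be compatible
  }
};
```

Once the first ELF buffer has been loaded, a Mach-O buffer is rejected by the same ToyDyld instance, mirroring the isCompatibleFile check.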

The platform used here is Linux, so we look at RuntimeDyldELF::loadObject, which calls RuntimeDyldImpl::loadObjectImpl:
// lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
std::unique_ptr<RuntimeDyld::LoadedObjectInfo>
RuntimeDyldELF::loadObject(const object::ObjectFile &O) {
  if (auto ObjSectionToIDOrErr = loadObjectImpl(O))
    return llvm::make_unique<LoadedELFObjectInfo>(*this, *ObjSectionToIDOrErr);
  else {
    HasError = true;
    raw_string_ostream ErrStream(ErrorStr);
    logAllUnhandledErrors(ObjSectionToIDOrErr.takeError(), ErrStream, "");
    return nullptr;
  }
}
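RuntimeDyldImpl::loadObjectImpl iterates over the symbols in the object::ObjectFile object, collects them in a JITSymbolResolver::LookupSet Symbols structure, and resolves them one by one. The section that holds each function or data item is loaded into memory, after which RuntimeDyldImpl::emitCommonSymbols is called to build a section for COMMON symbols.
Next, RuntimeDyldImpl::loadObjectImpl iterates over the sections in the object::ObjectFile object and uses RuntimeDyldELF::processRelocationRef to process every relocation entry in each section.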

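At this point the code and data sections are in place in memory and the relocation information has been prepared, but no relocations have been applied yet, so the code cannot run.
Address Remapping
If the code is meant to be used by an external program, then after code generation and before finalizeObject is called, MCJIT::mapSectionAddress is used to remap the address of each section into the address space of the external process. This step must be completed before the section memory is copied to its new address.
MCJIT::mapSectionAddress calls RuntimeDyldImpl::mapSectionAddress to do the actual work.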
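Why the remapping has to happen before relocations are resolved can be seen in a small sketch. It is a toy under stated assumptions: ToySection, ToyReloc, and ToyLinker are hypothetical, there is only one relocation kind (store a 64-bit absolute address, in host byte order), and a section is found by comparing the address of its local buffer. Each section has a local buffer that is written to, and a load address that is used when computing the values to write.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct ToySection {
  std::vector<uint8_t> Bytes;  // local copy of the section contents
  bool Remapped = false;
  uint64_t TargetAddress = 0;  // address in the other process, if remapped

  // Address the section will have where it finally executes.
  uint64_t loadAddress() const {
    if (Remapped)
      return TargetAddress;
    return static_cast<uint64_t>(reinterpret_cast<uintptr_t>(Bytes.data()));
  }
};

// "Store the address of (SymSection + SymOffset) at Section[Offset]."
struct ToyReloc {
  size_t Section, Offset, SymSection;
  uint64_t SymOffset;
};

class ToyLinker {
  std::vector<ToySection> Sections;
  std::vector<ToyReloc> Relocs;

public:
  size_t addSection(size_t Size) {
    Sections.emplace_back();
    Sections.back().Bytes.resize(Size);
    return Sections.size() - 1;
  }
  void addReloc(const ToyReloc &R) { Relocs.push_back(R); }
  const void *localAddress(size_t ID) const { return Sections[ID].Bytes.data(); }

  // Find the section by its local buffer and record where it will live.
  bool mapSectionAddress(const void *LocalAddress, uint64_t TargetAddress) {
    for (ToySection &S : Sections)
      if (S.Bytes.data() == LocalAddress) {
        S.Remapped = true;
        S.TargetAddress = TargetAddress;
        return true;
      }
    return false;
  }

  // Values come from load addresses; writes go to the local buffers.
  void resolveRelocations() {
    for (const ToyReloc &R : Relocs) {
      uint64_t Value = Sections[R.SymSection].loadAddress() + R.SymOffset;
      std::memcpy(Sections[R.Section].Bytes.data() + R.Offset, &Value,
                  sizeof(Value));
    }
    Relocs.clear();
  }

  uint64_t read64(size_t ID, size_t Offset) const {
    uint64_t V;
    std::memcpy(&V, Sections[ID].Bytes.data() + Offset, sizeof(V));
    return V;
  }
};
```

If a data section is remapped to 0x7000 before resolveRelocations runs, a reference to offset 4 in it is written as 0x7004 into the local text buffer, so the buffer is correct once it has been copied to the other process. Remapping after resolution would leave the local address baked in.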
Final Preparations (Relocation)
MCJIT::finalizeObject uses RuntimeDyld::resolveRelocations to resolve external symbols and apply the outstanding relocations for the loaded objects.
// lib/ExecutionEngine/MCJIT/MCJIT.cpp
void MCJIT::finalizeObject() {
  MutexGuard locked(lock);
  // Generate code for module is going to move objects out of the 'added' list,
  // so we need to copy that out before using it:
  SmallVector<Module*, 16> ModsToAdd;
  for (auto M : OwnedModules.added())
    ModsToAdd.push_back(M);
  for (auto M : ModsToAdd)
    generateCodeForModule(M);
  finalizeLoadedModules();
}

void MCJIT::finalizeLoadedModules() {
  MutexGuard locked(lock);
  // Resolve any outstanding relocations.
  Dyld.resolveRelocations();
  OwnedModules.markAllLoadedModulesAsFinalized();
  // Register EH frame data for any module we own which has been loaded
  Dyld.registerEHFrames();
  // Set page permissions.
  MemMgr->finalizeMemory();
}

// lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
// Resolve the relocations for all symbols we currently know about.
void RuntimeDyldImpl::resolveRelocations() {
  MutexGuard locked(lock);
  // Print out the sections prior to relocation.
  LLVM_DEBUG(for (int i = 0, e = Sections.size(); i != e; ++i)
                 dumpSectionMemory(Sections[i], "before relocations"););
  // First, resolve relocations associated with external symbols.
  if (auto Err = resolveExternalSymbols()) {
    HasError = true;
    ErrorStr = toString(std::move(Err));
  }
  // Iterate over all outstanding relocations
  for (auto it = Relocations.begin(), e = Relocations.end(); it != e; ++it) {
    // The Section here (Sections[i]) refers to the section in which the
    // symbol for the relocation is located. The SectionID in the relocation
    // entry provides the section to which the relocation will be applied.
    int Idx = it->first;
    uint64_t Addr = Sections[Idx].getLoadAddress();
    LLVM_DEBUG(dbgs() << "Resolving relocations Section #" << Idx << "\t"
                      << format("%p", (uintptr_t)Addr) << "\n");
    resolveRelocationList(it->second, Addr);
  }
  Relocations.clear();
  // Print out sections after relocation.
  LLVM_DEBUG(for (int i = 0, e = Sections.size(); i != e; ++i)
                 dumpSectionMemory(Sections[i], "after relocations"););
}

External symbol resolution (resolveExternalSymbols) is carried out by RTDyldMemoryManager::getPointerToNamedFunction (I have not yet traced the connection between the two):
// lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
Error RuntimeDyldImpl::resolveExternalSymbols() {
  ...
  while (!ExternalSymbolRelocations.empty()) {
    ...
    if (Name.size() == 0) {
      ...
      resolveRelocationList(Relocs, 0);
    } else {
      ...
      resolveRelocationList(Relocs, Addr);
      ...
    }
    ...
  }
  ...
}
RuntimeDyld then walks each relocation list (resolveRelocationList) and calls RuntimeDyldELF::resolveRelocation on every entry, which applies the relocation in a target-specific way.
// lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
void RuntimeDyldImpl::resolveRelocationList(const RelocationList &Relocs,
                                            uint64_t Value) {
  for (unsigned i = 0, e = Relocs.size(); i != e; ++i) {
    const RelocationEntry &RE = Relocs[i];
    // Ignore relocations for sections that were not loaded
    if (Sections[RE.SectionID].getAddress() == nullptr)
      continue;
    resolveRelocation(RE, Value);
  }
}

// lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
void RuntimeDyldELF::resolveRelocation(const SectionEntry &Section,
                                       uint64_t Offset, uint64_t Value,
                                       uint32_t Type, int64_t Addend,
                                       uint64_t SymOffset, SID SectionID) {
  switch (Arch) {
  case Triple::x86_64:
    resolveX86_64Relocation(Section, Offset, Value, Type, Addend, SymOffset);
    break;
  case Triple::x86:
    resolveX86Relocation(Section, Offset, (uint32_t)(Value & 0xffffffffL), Type,
                         (uint32_t)(Addend & 0xffffffffL));
    break;
  case Triple::aarch64:
  case Triple::aarch64_be:
    resolveAArch64Relocation(Section, Offset, Value, Type, Addend);
    break;
  case Triple::arm: // Fall through.
  case Triple::armeb:
  case Triple::thumb:
  case Triple::thumbeb:
    resolveARMRelocation(Section, Offset, (uint32_t)(Value & 0xffffffffL), Type,
                         (uint32_t)(Addend & 0xffffffffL));
    break;
  case Triple::ppc:
    resolvePPC32Relocation(Section, Offset, Value, Type, Addend);
    break;
  case Triple::ppc64: // Fall through.
  case Triple::ppc64le:
    resolvePPC64Relocation(Section, Offset, Value, Type, Addend);
    break;
  case Triple::systemz:
    resolveSystemZRelocation(Section, Offset, Value, Type, Addend);
    break;
  case Triple::bpfel:
  case Triple::bpfeb:
    resolveBPFRelocation(Section, Offset, Value, Type, Addend);
    break;
  default:
    llvm_unreachable("Unsupported CPU type!");
  }
}
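To make the target-specific step concrete, here is a standalone sketch of what applying two common x86-64 ELF relocation types amounts to. It is not the LLVM implementation: applyX86_64Reloc, writeLE, and readLE are hypothetical helpers. The relocation numbers and formulas follow the x86-64 psABI, where S is the symbol value, A the addend, and P the address of the patched location: R_X86_64_64 stores S + A as 64 bits, and R_X86_64_PC32 stores S + A - P as a signed 32-bit value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

constexpr uint32_t R_X86_64_64 = 1;    // S + A, 64 bits
constexpr uint32_t R_X86_64_PC32 = 2;  // S + A - P, signed 32 bits

// Write the low NumBytes of V in little-endian order.
void writeLE(uint8_t *P, uint64_t V, unsigned NumBytes) {
  for (unsigned I = 0; I < NumBytes; ++I)
    P[I] = static_cast<uint8_t>(V >> (8 * I));
}

// Read NumBytes in little-endian order.
uint64_t readLE(const uint8_t *P, unsigned NumBytes) {
  uint64_t V = 0;
  for (unsigned I = 0; I < NumBytes; ++I)
    V |= static_cast<uint64_t>(P[I]) << (8 * I);
  return V;
}

// Patch Section[Offset]. SectionLoadAddr is where the section will execute,
// Value is the resolved symbol address (S), Addend is A.
void applyX86_64Reloc(std::vector<uint8_t> &Section, uint64_t SectionLoadAddr,
                      uint64_t Offset, uint64_t Value, uint32_t Type,
                      int64_t Addend) {
  switch (Type) {
  case R_X86_64_64: {
    if (Offset + 8 > Section.size())
      throw std::out_of_range("relocation outside section");
    writeLE(Section.data() + Offset, Value + static_cast<uint64_t>(Addend), 8);
    break;
  }
  case R_X86_64_PC32: {
    if (Offset + 4 > Section.size())
      throw std::out_of_range("relocation outside section");
    uint64_t P = SectionLoadAddr + Offset;  // address being patched
    int64_t Rel = static_cast<int64_t>(Value + static_cast<uint64_t>(Addend) - P);
    if (Rel < std::numeric_limits<int32_t>::min() ||
        Rel > std::numeric_limits<int32_t>::max())
      throw std::range_error("PC-relative offset does not fit in 32 bits");
    writeLE(Section.data() + Offset, static_cast<uint32_t>(Rel), 4);
    break;
  }
  default:
    throw std::invalid_argument("relocation type not covered by this sketch");
  }
}
```

For a section loaded at 0x1000, a PC32 relocation at offset 8 against a symbol at 0x2000 with addend -4 stores 0x2000 - 4 - 0x1008 = 0xFF4. Note that P is computed from the load address, not from the address of the local buffer.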
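Once relocation is complete, the section data is passed to RTDyldMemoryManager::registerEHFrames, which gives the memory manager the chance to call whatever functions it needs (for example, to register the EH frame information).
Finally, MemMgr->finalizeMemory() (SectionMemoryManager::finalizeMemory) applies the final access permissions to the memory pages.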
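The idea behind the final permission step can be sketched with plain POSIX calls. This is an assumption-laden illustration, not SectionMemoryManager: ToyPage is a hypothetical class, it manages a single page, and it assumes a POSIX system with mmap and mprotect. Memory stays writable while sections are being filled in and patched, and is switched to its final protection only afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical single-page allocation with a two-phase life cycle.
class ToyPage {
  void *Base = nullptr;
  size_t Size = 0;
  bool Finalized = false;

public:
  ToyPage() = default;
  ToyPage(const ToyPage &) = delete;
  ToyPage &operator=(const ToyPage &) = delete;
  ~ToyPage() {
    if (Base)
      munmap(Base, Size);
  }

  // Phase 1: map one page readable and writable so it can be filled in.
  bool allocate() {
    Size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    void *P = mmap(nullptr, Size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (P == MAP_FAILED)
      return false;
    Base = P;
    return true;
  }

  // Copy section contents in; refused once the page has been finalized.
  bool write(const void *Src, size_t N) {
    if (!Base || Finalized || N > Size)
      return false;
    std::memcpy(Base, Src, N);
    return true;
  }

  // Phase 2: apply the final protection. A code section would typically use
  // PROT_READ | PROT_EXEC, read-only data just PROT_READ.
  bool finalizeMemory(int Prot) {
    if (!Base || mprotect(Base, Size, Prot) != 0)
      return false;
    Finalized = true;
    return true;
  }

  const void *data() const { return Base; }
};
```

After finalizeMemory(PROT_READ) the contents can still be read back, but further writes through the class are refused. Writing through a raw pointer at that point would fault, which is the protection the final step is meant to provide.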
October 22-23, 2020