Sunday, October 19, 2008

debut JSON

I've checked in some JSON functionality, specifically a lightly modified version of Crockford's json2.js. Crockford's code had global effects, and I dislike that, so I made some changes. The parse and stringify functions appear within extensions.org.icongarden.json (big surprise). As well, the extension doesn't modify the prototype of Date, which means stringify doesn't convert instances of Date, which isn't part of JSON anyway. If I end up doing anything special with Date, it won't involve altering any prototype.

There was a discussion on the V8 list recently about JSON performance, with various folk making all kinds of claims unsupported by data, so I thought I'd generate some. Note this is the farthest thing from scientific or comprehensive, but it's a start. I wrote a test script with collects a hierarchy of objects describing some sub-directories on my disk, converts it to JSON, and finally converts it back to the hierarchy it started with. Each object contains a couple of strings and a number in addition to zero or more child nodes. Here's the output of my test script:
  • milliseconds to enumerate: 203
  • milliseconds to stringify: 2053
  • milliseconds to parse: 1060

Surprisingly, the enumerate operation is by far the fastest. It's surprising because enumerating the sub-directories involves several hundred system calls, whereas I presume the stringify and parse operations involve none (aside, perhaps, from the occasional brk). Worse, at least some of these system calls touch the disk.

Let's ignore the stringify time for a moment and focus on the parse time. It's over five times the enumerate time, which is surprising enough, but when you consider how parse works, it's even more surprising. This function is basically a validating front end for eval involving a regular expression, which makes me wonder if V8's eval or regular expression engine could be a lot faster.

Finally, let's ponder stringify. It's all JavaScript, so it makes sense that it's slower than parse, but an order of magnitude slower than the enumerate operation? That seems crazy.

I'm going to leave things the way they are now because the important thing is to establish an interface for these functions so I can build other things atop it, but I have a feeling I'll be coming back to this.

Update: I renamed parse and stringify as decode and encode, respectively. Now I can refer to the JSON extension as a "codec" and stick out my chest in some kind of geek macho display.

3 comments:

Anonymous said...

You may be able to use the builtin V8 profiler to get information on where time is spent when running the stringify code. On Linux, I would try something ala:

$ ./shell -prof test.js
$ tools/linux-tick-processor.py \
v8.log

The profiler is not very well documented (see issue 83 on the V8 issue tracker), but I hope you'll find it somewhat useful anyway.

Pete said...

Well, OK, the first thing to do here is time the production build instead of the debug build. Duh.

milliseconds to enumerate: 14
milliseconds to stringify: 67
milliseconds to parse: 107

Although the absolute numbers are completely different, the ratios are still not making sense. stringify still takes nearly 5 times as long as hitting the disk, and parse is talking more than 7.5 times as long as hitting the disk.

Here is the output of the Python script you suggested:

Statistical profiling result from /v8.log, (11 ticks, 0 unaccounted, 0 excluded).

[Shared libraries]:
total nonlib name

[JavaScript]:
total nonlib name
9.1% 9.1% LazyCompile: Join

[C++]:
total nonlib name
9.1% 9.1% v8::internal::Scanner::Next()
9.1% 9.1% v8::internal::NewSpace::Size()
9.1% 9.1% .L958
9.1% 9.1% v8::internal::JSObject::SetFastElement(unsigned int, v8::internal::Object*)
9.1% 9.1% v8::internal::JSObject::GetPropertyAttributeWithReceiver(v8::internal::JSObject*, v8::internal::String*)
9.1% 9.1% v8::internal::Heap::AllocateStringFromAscii(v8::internal::Vector<char const>, v8::internal::PretenureFlag)
9.1% 9.1% v8::internal::SweepSpace(v8::internal::PagedSpace*, void (*)(unsigned char*, int))
9.1% 9.1% kjs_pcre_xclass(int, unsigned char const*)
9.1% 9.1% jsRegExpExecute(JSRegExp const*, unsigned short const*, int, int, int*, int)
9.1% 9.1% .L19

This looks like pretty coarse data, so let's try a bigger sub-directory tree, say, I don't know... root? Next comment...

Pete said...

Running the test script against root produces numbers that make a lot more sense.

milliseconds to enumerate: 21357
milliseconds to stringify: 2682
milliseconds to parse: 3995

I suspect what we just learned here is that there's a lot of overhead associated with JSON transformations, but once they get started, they're not so bad.

The profile data looks a lot more digestible, too, but I don't know if I'm even going to look at it very carefully, since the timing data already makes a lot more sense.

Statistical profiling result from /v8.log, (450 ticks, 76 unaccounted, 0 excluded).

[Shared libraries]:
total nonlib name
2.9% 0.0% /lib/tls/i686/cmov/libc-2.7.so

[JavaScript]:
total nonlib name
1.1% 1.1% Function: str
0.5% 0.6% LazyCompile: exec
0.5% 0.6% LazyCompile: ADD
0.5% 0.6% Builtin: KeyedLoadIC_Generic
0.5% 0.6% Builtin: LoadIC_Megamorphic
0.5% 0.6% Stub: CEntryStub
0.3% 0.3% LazyCompile: StringReplaceRegExp
0.3% 0.3% LazyCompile: ConvertToString
0.3% 0.3% LazyCompile: hasOwnProperty
0.3% 0.3% LazyCompile: ToString
0.3% 0.3% Stub: CallFunction
0.3% 0.3% Stub: CallFunction
0.3% 0.3% Stub: ToBoolean
0.3% 0.3% Builtin: FunctionCall
0.3% 0.3% Builtin: KeyedStoreIC_Generic
0.3% 0.3% CallIC: test
0.3% 0.3% LoadIC: global
0.3% 0.3% CallIC: push
0.3% 0.3% LoadIC: global
0.3% 0.3% Eval: $lt;anonymous>

[C++]:
total nonlib name
15.2% 15.7% v8::internal::String::Flatten(v8::internal::String*, v8::internal::String*, int, int, int)
9.6% 9.9% kjs_pcre_xclass(int, unsigned char const*)
3.7% 3.9% __i686.get_pc_thunk.bx
2.7% 2.8% match(unsigned short const*, unsigned char const*, int, MatchData&)
2.4% 2.5% v8::internal::Parser::ParseObjectLiteral(bool*)
2.4% 2.5% v8::String::WriteUtf8(char*, int)
2.4% 2.5% _init
2.1% 2.2% v8::internal::MarkingVisitor::VisitPointers(v8::internal::Object**, v8::internal::Object**)
1.9% 1.9% jsRegExpExecute(JSRegExp const*, unsigned short const*, int, int, int*, int)
1.6% 1.7% v8::internal::String::ReadBlock(v8::internal::String*, unsigned char*, unsigned int, unsigned int*, unsigned int*)
1.6% 1.7% v8::internal::CopyVisitor::VisitPointers(v8::internal::Object**, v8::internal::Object**)
1.3% 1.4% v8::internal::Heap::IterateRSetRange(unsigned char*, unsigned char*, unsigned char*, void (*)(v8::internal::HeapObject**))
1.3% 1.4% v8::internal::RelocIterator::next()
1.3% 1.4% .L37
1.3% 1.4% .L57
1.1% 1.1% v8::internal::MarkCompactCollector::GetForwardingAddressInOldSpace(v8::internal::HeapObject*)
1.1% 1.1% v8::internal::PagedSpace::FindObject(unsigned char*)
1.1% 1.1% v8::internal::HeapObject::SlowSizeFromMap(v8::internal::Map*)
1.1% 1.1% v8::internal::SweepSpace(v8::internal::PagedSpace*, void (*)(unsigned char*, int))
1.1% 1.1% v8::internal::Heap::AllocateConsString(v8::internal::String*, v8::internal::String*)
1.1% 1.1% v8::HandleScope::CreateHandle(void*)
0.8% 0.8% v8::internal::OldSpace::PageAllocationTop(v8::internal::Page*)
0.8% 0.8% v8::internal::MemoryAllocator::InitializePagesInChunk(int, int, v8::internal::PagedSpace*)
0.8% 0.8% v8::internal::UTF8Buffer::AddChar(int)
0.8% 0.8% v8::internal::MarkCompactCollector::EmptyMarkingStack(v8::internal::MarkingVisitor*)
0.8% 0.8% v8::internal::RegExpImpl::StringToTwoByte(v8::internal::Handle<v8::internal::String>)
0.8% 0.8% v8::internal::Heap::AllocateFixedArray(int, v8::internal::PretenureFlag)
0.8% 0.8% v8::internal::Heap::CopyObject(v8::internal::HeapObject**)
0.8% 0.8% v8::internal::Heap::MigrateObject(v8::internal::HeapObject**, v8::internal::HeapObject*, int)
0.8% 0.8% v8::internal::ObjectLiteral::Property::Property(v8::internal::Literal*, v8::internal::Expression*)
0.5% 0.6% v8::internal::Runtime_NumberToString(v8::internal::Arguments)
0.5% 0.6% v8::internal::UTF16Buffer::Advance()
0.5% 0.6% v8::internal::Utf8SymbolKey::IsMatch(v8::internal::Object*)
0.5% 0.6% v8::internal::String::ReadBlock(v8::internal::String*, v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int)
0.5% 0.6% v8::internal::String::IsEqualTo(v8::internal::Vector<char const>)
0.5% 0.6% v8::internal::JSObject::SetFastElement(unsigned int, v8::internal::Object*)
0.5% 0.6% v8::internal::JSObject::GetPropertyAttribute(v8::internal::JSObject*, v8::internal::LookupResult*, v8::internal::String*, bool)
0.5% 0.6% v8::internal::JSObject::LocalLookupRealNamedProperty(v8::internal::String*, v8::internal::LookupResult*)
0.5% 0.6% v8::internal::JSObject::LookupInDescriptor(v8::internal::String*, v8::internal::LookupResult*)
0.5% 0.6% v8::internal::String::ReadBlockIntoBuffer(v8::internal::String*, v8::internal::String::ReadBlockBuffer*, unsigned int*, unsigned int)
0.5% 0.6% v8::internal::HeapObject::IterateBody(v8::internal::InstanceType, int, v8::internal::ObjectVisitor*)
0.5% 0.6% v8::internal::MarkCompactCollector::MarkUnmarkedObject(v8::internal::HeapObject*)
0.5% 0.6% v8::internal::RegExpImpl::JsreExecGlobal(v8::internal::Handle<v8::internal::JSRegExp>, v8::internal::Handle<v8::internal::String>)
0.5% 0.6% v8::internal::Heap::AllocateFixedArrayWithHoles(int)
0.5% 0.6% v8::internal::Heap::InitializeJSObjectFromMap(v8::internal::JSObject*, v8::internal::FixedArray*, v8::internal::Map*)
0.5% 0.6% v8::internal::Context::global_context()
0.5% 0.6% v8::internal::Context::Lookup(v8::internal::Handle<v8::internal::String>, v8::internal::ContextLookupFlags, int*, PropertyAttributes*)
0.5% 0.6% .L19
0.5% 0.6% .L48
0.3% 0.3% v8::internal::OldSpace::MCAdjustRelocationEnd(unsigned char*, int)
0.3% 0.3% v8::internal::UsageComputer::ReadList(v8::internal::ZoneList<v8::internal::ObjectLiteral::Property*>*)
0.3% 0.3% unibrow::Utf8::ReadBlock(unibrow::Buffer<char const*>, unsigned char*, unsigned int, unsigned int*, unsigned int*)
0.3% 0.3% v8::internal::LargeObjectSpace::AllocateRawInternal(int, int, v8::internal::Executability)
0.3% 0.3% v8::internal::OldSpace::SlowAllocateRaw(int)
0.3% 0.3% void v8::internal::MarkCompactCollector::EncodeForwardingAddressesInPagedSpace<&(v8::internal::MCAllocateFromOldPointerSpace(v8::internal::HeapObject*, int)), &(v8::internal::IgnoreNonLiveObject(v8::internal::HeapObject*))>(v8::internal::PagedSpace*)
0.3% 0.3% v8::internal::Scanner::ScanString()
0.3% 0.3% v8::internal::Scanner::Next()
0.3% 0.3% v8::internal::LargeObjectSpace::Contains(v8::internal::HeapObject*)
0.3% 0.3% v8::internal::Scope::LookupRecursive(v8::internal::Handle<v8::internal::String>, bool)
0.3% 0.3% v8::internal::Runtime_KeyedGetProperty(v8::internal::Arguments)
0.3% 0.3% v8::internal::Runtime::StringMatch(v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::String>, int)
0.3% 0.3% v8::internal::String::ComputeArrayIndex(unibrow::CharacterStream*, unsigned int*, int)
0.3% 0.3% v8::internal::Runtime_StringBuilderConcat(v8::internal::Arguments)
0.3% 0.3% v8::internal::LoadContextSlotHelper(v8::internal::Arguments, bool)
0.3% 0.3% v8::internal::SymbolsKey::IsMatch(v8::internal::Object*)
0.3% 0.3% v8::internal::Parser::ParseConditionalExpression(bool, bool*)
0.3% 0.3% .L958
0.3% 0.3% v8::internal::Parser::NewNumberLiteral(double)
0.3% 0.3% v8::internal::Parser::GetLiteralUndefined()
0.3% 0.3% v8::internal::Parser::Expect(v8::internal::Token::Value, bool*)
0.3% 0.3% v8::internal::JSObject::IsSimpleEnum()
0.3% 0.3% v8::internal::JSObject::LocalLookup(v8::internal::String*, v8::internal::LookupResult*)
0.3% 0.3% v8::internal::JSObject::Lookup(v8::internal::String*, v8::internal::LookupResult*)
0.3% 0.3% v8::internal::JSObject::GetLocalPropertyAttribute(v8::internal::String*)
0.3% 0.3% v8::internal::Object::GetProperty(v8::internal::Object*, v8::internal::LookupResult*, v8::internal::String*, PropertyAttributes*)
0.3% 0.3% v8::internal::DescriptorArray::BinarySearch(v8::internal::String*, int, int)
0.3% 0.3% v8::internal::Object::GetPrototype()
0.3% 0.3% v8::internal::MarkingVisitor::VisitCodeTarget(v8::internal::RelocInfo*)
0.3% 0.3% v8::internal::SymbolTableCleaner::VisitPointers(v8::internal::Object**, v8::internal::Object**)
0.3% 0.3% v8::internal::OverflowObjectSize(v8::internal::HeapObject*)
0.3% 0.3% v8::internal::MarkCompactCollector::DeallocateOldDataBlock(unsigned char*, int)
0.3% 0.3% v8::internal::LoadIC::Load(v8::internal::InlineCacheState, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::String>)
0.3% 0.3% v8::internal::StoreIC::UpdateCaches(v8::internal::LookupResult*, v8::internal::InlineCacheState, v8::internal::Handle<v8::internal::JSObject>, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::Object>)
0.3% 0.3% v8::internal::Heap::AllocateRawAsciiString(int, v8::internal::PretenureFlag)
0.3% 0.3% v8::internal::Heap::AllocateStringFromUtf8(v8::internal::Vector<char const>, v8::internal::PretenureFlag)
0.3% 0.3% v8::internal::Heap::IterateRSet(v8::internal::PagedSpace*, void (*)(v8::internal::HeapObject**))
0.3% 0.3% v8::internal::Heap::AllocateRawTwoByteString(int, v8::internal::PretenureFlag)
0.3% 0.3% v8::internal::Heap::RebuildRSets(v8::internal::PagedSpace*)
0.3% 0.3% v8::internal::Heap::UpdateRSet(v8::internal::HeapObject*)
0.3% 0.3% v8::internal::Heap::GetNumberStringCache(v8::internal::Object*)
0.3% 0.3% v8::internal::Operand::Operand(v8::internal::Register, int, v8::internal::RelocInfo::Mode)
0.3% 0.3% v8::internal::Assembler::j(v8::internal::Condition, v8::internal::Label*, v8::internal::Hint)
0.3% 0.3% v8::internal::FlattenString(v8::internal::Handle<v8::internal::String>)
0.3% 0.3% v8::internal::Factory::NewJSArrayWithElements(v8::internal::Handle<v8::internal::FixedArray>, v8::internal::PretenureFlag)
0.3% 0.3% v8::internal::StackFrameIterator::Advance()
0.3% 0.3% v8::internal::Bootstrapper::IsActive()
0.3% 0.3% v8::internal::FunctionLiteral::Accept(v8::internal::Visitor*)
0.3% 0.3% v8::internal::Assembler::and_(v8::internal::Register, int)
0.3% 0.3% v8::internal::Assembler::call(v8::internal::Handle<v8::internal::Code>, v8::internal::RelocInfo::Mode)
0.3% 0.3% v8::internal::Assembler::push(v8::internal::Immediate const&)
0.3% 0.3% v8::String::New(char const*, int)
0.3% 0.3% .L644
0.3% 0.3% .L51
0.3% 0.3% .L28