Performance regression in 2.2.3 #148
Comments
Can you verify if 2.3.3 has the same issue? |
Just tried it, looks like 2.3.3 is slow too. |
Btw just to be totally clear-- timeJacksonJava runs about the same speed in 2.2.2, 2.2.3, & 2.3.3 (~170MB/s on my machine). timeJacksonScala runs at about 170MB/s as well on 2.2.2, but ~1MB/s on 2.2.3+ |
I noticed a similar issue and the cause seems to be the changes for #89. In particular a profile of our application shows the time is being spent calling
val testJson = {
  val str = new StringBuilder
  str.append("{\"values\":[")
  (0 until 100000).foreach { i =>
    str.append(s"[${i.toDouble}],")
  }
  str.append(s"[${0.0}]")
  str.append("]}")
  str.toString
}
def main(args: Array[String]): Unit = {
  (0 until 10).foreach { i =>
    val s = System.nanoTime
    val m = deserialize[Map[String,AnyRef]](testJson)
    val e = System.nanoTime
    println(s"${m.size}, ${(e - s) / 1e6}ms")
  }
}
If I run with
If I run it without that flag I get:
|
I also noticed significant performance degradation with the jackson scala module library starting from 2.2.3. Here are my tests. Recently I ran performance tests on one of our server-side components and saw a very significant impact on performance. After some investigation I managed to isolate the culprit of the degradation: the Jackson library (for JSON parsing). Here are the results. In my tests I use a relatively big JSON string, ~11 KB.
def measure(repeat : Int, times : Int) = {
  var start = 0L
  var end = 0L
  var diff = 0L
  var sum = 0L
  (1 to repeat).foreach { rep =>
    start = System.currentTimeMillis()
    (1 to times).par.foreach { i =>
      val hh = JsonLayer.JSON_MAPPER.readValue(householdString("hh" + i * rep), classOf[Household])
      //val st = JsonLayer.JSON_MAPPER.writeValueAsString(hh)
    }
    end = System.currentTimeMillis()
    diff = (end - start)
    sum += diff
    logger.debug("Round " + rep + " took " + diff + " millis")
  }
  sum
}
@Test
def householdJsonPerformance() =
{
  val repeat = 10
  val times = 10000
  var start = 0L
  var end = 0L
  var sum = 0L
  var diff = 0L
  // warmup
  diff = measure(repeat, times)
  logger.debug("Round warm up took " + diff + " millis")
  Measure.time(this.logger, "performance household deserialize/serialize") {
    sum = measure(repeat, times)
  }
  logger.debug("Average per run " + (sum/repeat) + " millis")
}
Jackson libraries version: 2.4.3 (jackson-core-2.4.3.jar, jackson-annotations-2.4.3.jar, jackson-databind-2.4.3.jar).
The only difference between the tests is that the following jar was updated each time: jackson-module-scala_2.10-{version}.jar
The results of the test above with different versions of the jar:
2.2.2
2.2.3
2.3.3
2.4.3
To summarize: there is a performance degradation in the jackson-module-scala_2.10 libraries starting from version 2.2.3 (version 2.2.2 works fine). In my tests above, jackson-module-scala_2.10-2.2.2 runs ~7.5 times faster than jackson-module-scala_2.10-2.2.3. When will this bug be fixed? |
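For anyone wanting to reproduce numbers like these, the warm-up-then-measure pattern in the test above can be reduced to a small standalone harness. This is only a sketch under stated assumptions: `TimingSketch`, `measureMs`, and `median` are hypothetical names, the workload is a placeholder rather than a Jackson call, and it uses `System.nanoTime` with a median instead of the `currentTimeMillis` sum used in the original test.

```scala
object TimingSketch {
  // Runs `body` `warmup` times untimed (to let the JIT settle), then `runs`
  // times timed, and returns each timed run's wall-clock duration in ms.
  def measureMs[A](warmup: Int, runs: Int)(body: => A): Seq[Double] = {
    (0 until warmup).foreach(_ => body)
    (0 until runs).map { _ =>
      val start = System.nanoTime
      body
      (System.nanoTime - start) / 1e6
    }
  }

  // Median is less sensitive to a single slow outlier run than the mean.
  def median(xs: Seq[Double]): Double = {
    require(xs.nonEmpty, "median of empty sequence")
    val sorted = xs.sorted
    val mid = sorted.length / 2
    if (sorted.length % 2 == 1) sorted(mid)
    else (sorted(mid - 1) + sorted(mid)) / 2.0
  }

  def main(args: Array[String]): Unit = {
    // Placeholder workload (assumption); substitute the readValue call under test.
    val times = measureMs(warmup = 5, runs = 10) {
      (0 until 100000).map(_.toDouble).sum
    }
    println(f"median ${median(times)}%.3f ms over ${times.size} runs")
  }
}
```

Swapping only the jackson-module-scala jar between runs, as described above, and comparing the medians should make a regression of this size obvious.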
@deromka Can you re-run your test using the jar from 2.4.3 but with this module instead of the DefaultScalaModule:
new JacksonModule
  with IteratorModule
  with EnumerationModule
  with OptionModule
  with SeqModule
  with IterableModule
  with TupleModule
  with MapModule
  with SetModule
  with ScalaClassIntrospectorModule
  with UntypedObjectDeserializerModule {
  override def getModuleName = "DefaultScalaModule"
}
Version 2.2.3 added |
@deromka Nevermind. I just did something similar to your test and that module is definitely not to blame. |
No problem, I can run some additional tests On Tue, Nov 11, 2014 at 5:28 PM, Nate Bauernfeind notifications@github.com
|
I would strongly recommend taking a profiler snapshot (with sampling). Given the magnitude of the deviation, it should be relatively easy to see what is taking time here. |
Also: one other thing that may or may not be useful: this benchmark project: https://github.com/FasterXML/jackson-benchmarks is something I wrote for 2.4, and have used it for regular Java benchmarking, over most data formats Jackson supports. Perhaps it'd be possible to build a simplified version of problematic data type(s), and add a Scala test; so once we figure out and solve the current issue, we could perhaps guard against regression. |
Hi, I have done profiling on both versions, 2.2.2 and 2.4.3. [2.2.2 screenshot] [2.4.3 screenshot] As seen in the hotspots, reflection methods consume about 50% of CPU: java.lang.reflect.Method.getDeclaredAnnotations. These are called from com.fasterxml.jackson.module.scala.deser.UntypedObjectDeserializer$$anonfun$deserializeAbstractType$1.apply. Please check that.
On Tue, Nov 11, 2014 at 7:26 PM, Tatu Saloranta notifications@github.com
|
Thank you. I think that there are a few things in Version 2.4.4 of If you can build Scala module locally, defining:
@Override
public boolean isCachable() { return true; }
(in suitable Scala version) should help; so if it'd be easy enough to do, you may want to test to see if and how it would change performance. |
@cowtowncoder, with respect to per-deser-call construction of types, how immutable can I expect the results of In general, I've been assuming that since DeserializationContext is passed in to |
@christophercurrie Good question. The answer should be considered static, and may be fully cached. This is related to The main limitation is that deserializers themselves must be single-threaded, so most commonly deserializers resolve dependencies by implementing |
Follow up question: can I rely upon ObjectMapper to synchronize access to my UntypedObjectDeserializer, or do I need to synchronize my caching logic? |
Depends on what you mean; since deserializers need to be thread-safe after construction (usually by keeping them stateless), no synchronization is needed. The only places where state can be changed are the constructor and Maybe it would be easier to consider a specific case. Are you thinking of a specific implementation here? |
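As an illustration of the kind of caching being discussed, here is a minimal sketch of a thread-safe per-class cache for an expensive reflective lookup. It is not the module's actual fix: `PerClassCache`, `PerClassCacheDemo`, and the annotation-counting function are hypothetical, and the only library pieces assumed are `scala.collection.concurrent.TrieMap` and the JDK reflection calls. Under contention the compute function may run more than once for the same key, so it should be deterministic and free of side effects that matter.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.concurrent.TrieMap

// Hypothetical helper: remembers the result of a deterministic per-class
// computation so repeated calls for the same class skip the computation.
final class PerClassCache[V](compute: Class[_] => V) {
  // TrieMap is a lock-free concurrent map, so no explicit synchronization.
  private val cache = TrieMap.empty[Class[_], V]

  def get(cls: Class[_]): V = cache.getOrElseUpdate(cls, compute(cls))
}

object PerClassCacheDemo {
  // Counts how often the reflective lookup actually runs (demo only).
  val lookups = new AtomicInteger(0)

  // Stand-in for expensive work: counts annotations declared on a class's
  // methods, via the same reflection call that showed up in the profile.
  val annotationCount = new PerClassCache[Int](cls => {
    lookups.incrementAndGet()
    cls.getDeclaredMethods.map(_.getDeclaredAnnotations.length).sum
  })

  def main(args: Array[String]): Unit = {
    val first = annotationCount.get(classOf[String])
    val second = annotationCount.get(classOf[String])
    println(s"same result: ${first == second}, reflective lookups: ${lookups.get}")
  }
}
```

Because the cached values are immutable once stored, a deserializer holding such a cache stays safe to share across threads without locking, which matches the "stateless after construction" guidance above.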
So, That said, I tried to override The default implementation for this works because it shortcuts contextualization if the list serializer isn't overridden. The unit test for overriding the list serializer don't fail, because the unit test I don't know if this is a bug, or just a fact of life of the UntypedObjectSerializer. But it seems problematic that it can be contextualized before it's resolved. You can clearly see this behavior by applying this patch and running the tests. The real reason I'm concerned is that I suspect that this shortcutting in contextualization means that overridden deserializers won't be consistently applied; some copies of the contextualized deserializer won't have the overrides, and some will. I don't have a test developed for this yet, but it's on my to-do list. |
Some work has been done on this issue in the 2.4 branch, which I need to release in the next few days; if any of you have the chance to run your local perf tests on 2.4.4-SNAPSHOT, your feedback would be appreciated. |
FYI, version 2.4.4 was released today. Hopefully it will improve things for you. |
@christophercurrie 2.4.4 runs well in our benchmarks, thanks! |
Thank you! My apologies that this took so long to resolve. |
Version 2.2.3 does untyped deserialization dramatically slower (about 150x in the benchmark I am using) than 2.2.2. I'm testing with a json object that has a few levels of nesting (maps within maps; some keys are lists too) with a loop that looks like this:
bytess is a Seq[Array[Byte]] where each byte array is a JSON object on the order of a few KB. jackson-module-scala 2.2.2 runs roughly the same speed as jackson itself, but 2.2.3 is much slower.