-
-
Notifications
You must be signed in to change notification settings - Fork 798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add StreamReadConstraints
limit for longest textual value to allow (default: 5M)
#863
Comments
cc @pjfanning Just creating placeholder, based on our discussions. Even at the risk of security mongerers reading this :) |
this one may not be that useful - users would be better off setting a max length on the JSON stream that they will accept |
@pjfanning I am not sure -- I guess I am considering bigger streaming use cases (where total content length cannot be easily limited), although granted those would probably not accept external untrusted input. |
StreamReadConstraints
limit for longest textual value to allowStreamReadConstraints
limit for longest textual value to allow (default: 1M)
StreamReadConstraints
limit for longest textual value to allow (default: 1M)StreamReadConstraints
limit for longest textual value to allow (default: 5M)
…Client. Needed because this is now enforced in Jackson 2.15. See https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.15 and FasterXML/jackson-core#863
I upgraded to Jackson 2.15 and ran into the error:
Searching for that error message led me here. It's clear this is an intentional change. It would be helpful to add instructions here on how to fix it. Am I able to configure that limit to be higher? Do I need to switch to a more efficient API? |
Thank you @pjfanning. That comment shows that we're able to set the max length higher with code like: objectMapper.getFactory()
.setStreamReadConstraints(StreamReadConstraints.builder().maxStringLength(10_000_000).build()) I'm able to effectively disable the feature by setting the max length to ObjectMapper objectMapper = ...;
StreamReadConstraints streamReadConstraints = StreamReadConstraints
.builder()
.maxStringLength(Integer.MAX_VALUE)
.build();
objectMapper.getFactory().setStreamReadConstraints(streamReadConstraints); I'm going to set the max length high to unblock the upgrade. It's not clear to me if this is a good idea or if this indicates a real performance problem in my code. Are there other options I ought to consider? |
@motlin this is your decision - Integer.MAX_VALUE is the right value if you don't want to apply a limit. |
The json I'm parsing has huge string values inside, so I think this is the correct thing to do. What confused me is that the error mentions StreamConstraintsException which made me think of Jackson's streaming apis. I'm using |
Ah. No, it refers to where the constraints are enforced. @pjfanning Remember when I mentioned it matters what exception says, wrt users trying to figure out how to change settings? :) |
Exceptions should not have essay length messages. Could we create a FAQ entry that describes what to do when you get this exception? We could even put a link in the exception message. The text of how to set these constraints is going to take 10-20 lines. |
Just to make clear: no one is asking for essay-length messages but rather useful pointer(s): for that FAQ + link sounds like a good idea (for example). Keywords in message help too. @motlin What do you think? What was the exact message if you still have access? (I know, it's in code too but just for convenience for readers here). |
Anonymized error below. I don't have strong opinions on what the error message ought to look like. As long as users can search for the error and find a pointer to the advice above about
|
Ok so message has no reference to any class or such. We could add a reference to |
Got surprised by this. Would have appreciated more info in Exception, yes! I am not scared of essay-length exceptions - it is better with a bit on the verbose side, than leaving users (devs) completely dumbfounded. Situation: I have a use case in https://mats3.io/ where I stick one JSON serialized DTO inside another DTO, the containing envelope "MatsTrace". Although this might not be seen as ideal from a performance perspective, I want these two to be completely separate (they are separately deser'ed, and the "outer" container doesn't know the "inner" classes, and the inner might even use another deser entirely), and it works just fine. The 5M limit bit pretty hard, and is quite small, and I really don't quite see the need for a default limit at all - when the previous version had no limit. One can also envision large XML, other markup documents, in addition to JSON (as with me) being carried inside strings, and 5 meg is not much. Should this not be a configurable constraint which default is non-constrained? Or at least have a much higher limit as default? I assume this is meant as a security measure. However, this gives the possibly unintended consequence: Jackson serializes an object just fine, but then refuses to deserialize its own output. This is not a good situation. I cannot be the only one being bitten by this - and it can come at rather inopportune times, where the data can slide through in staging, but then hit this .. rather arbitrary limit .. in prod. Cc @cowtowncoder |
@stolsvik We had long discussions on this, and were really hoping that more developers had tried out release candidates so we'd get feedback on possible issues. For what it is worth, the original setting was lower -- 1 MB -- but based on feedback by one of developers increased it to 5 MB (was considering 10 MB but his tests found maximum size below 4MB or so for production usage). As to why introduce any limits: this is a hot topic in security community and I think that in general it would make sense to start with some limits, even if high -- many would consider 5 meg Strings rather large fwtw, but of course if your use cases require higher it's not high. Given that 2.15.0 release is out, I am not sure there is much point in trying to tune the defaults. Problem exists for those who do need to process humongous Strings. But just in case: what would you think would be limit that works for your use cases? How big are biggest string values seen in production (if you have such data). |
@cowtowncoder Thanks. That is what I find the most problematic here: Jackson serializes an object just fine, and then you turn around, and it crashes on deserialization. I do not agree to you being too late with adjusting that limit. Judging by mvnrepository: https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core, it seems like rather few have gotten around to update yet, and I bet it is much worse inside organizations. A quick 2.15.1 would most probably hinder many heavy in-prod gotchas! PS: Wrt. to security concerns, does it not arguably make more sense to limit the total memory usage when deserializing? Have some kind of "global counter" for the current object? I mean, as a bad guy, I can possibly bypass this individual limit if I have access to a list or array of strings? Just include a ton of 4.99M strings? |
@stolsvik My point on "too late" is that there seem to be many devs not picking up the latest, but somehow get 2.15.0 and so on. But perhaps you are right. As to limit, 100 MB is bit high but then again since this is NOT the limit that has been specifically requested (max number limit and nesting have specific vulns disclosed) it'd be doable. As to total memory: things get complicated and I don't think keeping track of separate pieces is worth it. Conceptually I think combination of max-doc-length (not yet added; quite possible to be done outside Jackson until then) and longest-individual value are suitable building blocks to add service protection. So yeah, focus only on single values would be misplaced. On serialization vs deserialization, I don't know -- I don't that as a strong argument mostly because there are so many tools & common case being one where library/framework reading is different from one doing writing. So Jackson applying same limits for both is probably not all that important. YMMV. Now: could file a new issue for raising String value max length for 2.15.1 to.. whatever you think more reasonable? |
We have been hit by this error too and it's quite frustrating to have to override the limit separately for every objectMapper we create, it should at least be possible to override the limit at a global level. Aside from that, the 5MB limit seems too low, as mentioned elsewhere Strings of this size do not pose much security risk compared to numbers so the limit could safely be set higher. A large config file can easily be over 10MB in the problem domain I'm working in. |
@darrenoc3 Just to make sure: when you say config files, do you have config files with individual String values that big? I can see that Document limit, if imposed, should be much bigger, but individual String value of 10 MB+ would seem unusual. Having said that, I am still looking for suggestions on default limit to set. As I mentioned, 5 megs was based on feedback. Hence my suggestions of going with 20 megs (characters). |
The config files are not JSON themselves (legacy format), but they still get marshalled in and out of JSON as HTTP requests/responses. So in essence the string limit becomes the document limit for this kind of use case. |
Ah. Sort of like config files embedded in K8s YAML specs (or XML in JSON or legacy format) etc. As per my other comment, I am considering raising the default for ergonomic reasons. But would like to find a sweet spot fully knowing that no limit will work perfectly. |
I'd prefer to see the default stream limit being unlimited. Nice to be able to set it but should be the user's choice. My argument for this is that this change will break a lot of systems at runtime in production, including the Australian Government Search and Rescue system that I work on. I was lucky to notice it in a lower environment so we will be ok but this will hurt a lot of production systems out there. Note that I am happy to see that it was well described in the 2.15 release notes, that was on me to miss it as I review these things normally. |
And it is unlimited, as you wish! 2.15 does not limit maximum content length at all. Limits are only for JSON values of type Text/String (this issue) and Numbers. 2.16 will add this via #1046 but will default to "unlimited". |
Ah I needed be more precise. I was referring to JSON text values. We encounter them in JSONified emails from O365. Pretty easy to encounter >5MB fields in that context. |
@davidmoten As per #1014 the default was raised to 20 megs in 2.15.1 where it is planned to stay. For what that's worth. |
So, similar to #827 there should be some upper limit for length of
String
values. We can start with something rather long -- 5 megs -- since the handling of textual values is not nearly as sensitive to input length as that of numeric (in particular floating-point). And also because there are likely use cases that rely on ability to process sizable Strings (different from numbers once again).Different from number constraints we can and should apply limit during accumulation of textual value, into
TextBuffer
(either within it, or maybe more likely, when caller fetches new localchar[]
buffer). The idea being that we do want to avoid processing of the whole String before failing.This would probably mean that the limit applies to
char
length after entity resolution. That should be fine too; the goal is not to impose byte-accurate limits but to prevent overloading.Note: this is also distinct from total document length limit that we likely want to impose too, eventually.
EDIT: (21-Mar-2023) 2.15.0-rc1 had limit of 1M, rc2 planned to use 5M. Updated description and title accordingly.
The text was updated successfully, but these errors were encountered: