Universal Binary JSON Java Library
http://ubjson.org
About this project...
---------------------
This code base is actively under development and implements the latest
specification of Universal Binary JSON (Draft 8).
I/O is handled through the following core classes:
* UBJOutputStream
* UBJInputStream
* UBJInputStreamParser
Additionally, if you are working with Java's NIO and need byte[]-based
results, you can wrap any of the above I/O classes around one of the highly
optimized custom byte[]-stream impls:
* ByteArrayInputStream (optimized for reuse, not from JDK)
* ByteArrayOutputStream (optimized for reuse, not from JDK)
If you are working with NIO and want maximum performance by using (and reusing)
direct ByteBuffers along with the UBJSON stream impls, take a look at the:
* ByteBufferInputStream
* ByteBufferOutputStream
classes. You can wrap any ByteBuffer source or destination with this stream type,
then wrap that stream type with a UBJSON stream impl and utilize the full
potential of Java's NIO with Universal Binary JSON without giving yourself an
ulcer.
This allows you to re-use the streams over and over again in a pool of reusable
streams for high-performance I/O with no object creation and garbage collection
overhead; a perfect match for high frequency NIO-based communication.
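As a rough illustration of the wrapping idea described above, the sketch below
adapts a ByteBuffer to java.io.InputStream using only JDK classes. It is not
the library's ByteBufferInputStream; the class name ByteBufferInputStreamSketch
and its rebind method are hypothetical and only show how one stream object
might be pointed at a new buffer and reused without allocating a new stream.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Hypothetical sketch; NOT the library's ByteBufferInputStream. */
final class ByteBufferInputStreamSketch extends InputStream {
    private ByteBuffer source;

    ByteBufferInputStreamSketch(ByteBuffer source) {
        this.source = source;
    }

    /** Points this stream at another buffer so the stream object can be reused. */
    void rebind(ByteBuffer next) {
        this.source = next;
    }

    @Override
    public int read() {
        // Mask to 0..255 so bytes >= 0x80 are not confused with end of stream (-1).
        return source.hasRemaining() ? (source.get() & 0xFF) : -1;
    }

    @Override
    public int read(byte[] dest, int offset, int length) {
        if (length == 0) {
            return 0;
        }
        if (!source.hasRemaining()) {
            return -1;
        }
        // Bulk get works for both heap and direct buffers.
        int count = Math.min(length, source.remaining());
        source.get(dest, offset, count);
        return count;
    }

    @Override
    public int available() {
        return source.remaining();
    }
}
```

A stream like this could then be handed to whatever UBJSON input stream
constructor accepts a java.io.InputStream; the exact constructor is not shown
here because it is not documented in this README.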
All of the core I/O classes have been stable for a while, with tweaks to make the
performance tighter and the error messages more informative over the last few
months.
More Java-convenient reflection-based I/O classes are available in the
org.ubjson.io.reflect package, but they are under active development.
There are other efforts (like utilities) in other sub portions of the source
tree. This project intends to eventually contain a multitude of UBJSON
abstraction layers, I/O methods and utilities.
Changelog
---------
02-10-12
* Added ByteBuffer input and output stream impls as complements to the
re-usable byte[] stream impls.
Provides a fast translation layer between standard Java stream-IO and the
new Buffer-based I/O in NIO (including transparent support for using
ultra-fast direct ByteBuffers).
* Optimized some of the read/write methods by removing unnecessary bounds
checks that are done by subsequent Input or Output stream impls themselves.
02-09-12
* Fixed bug with readHugeAsBigInteger returning an invalid value and not
treating the underlying bytes as a string-encoded value.
* Removed implicit buffer.flip() at the end of StreamDecoder; there is no
way to know what the caller had planned for the buffer before reading all
the data back out. Also the flip was in the wrong place and in the case of
an empty decode request (length=0) the flip would not have been performed,
providing the caller with a "full" buffer of nothing.
* Rewrote all readHugeXXX method impls; now HUGE values can be read in as a
simple Number (BigInteger or BigDecimal) as well as raw bytes and even
decoded chars. Additionally the methods can even accept and use existing
buffers to write into to allow for tighter optimizations.
* Rewrote all readStringXXX methods using the same optimizations and
flexibility that readHuge methods now use.
02-07-12
More Memory and CPU optimizations across all the I/O impls.
* StreamDecoder was rewritten to no longer create a ByteBuffer on every
invocation and instead re-use the same one to decode from on every single call.
* StreamDecoder now requires the caller to pass in a CharBuffer instance to hold
the result of the decode operation. This avoids the creation of a CharBuffer
and allows for large-scale optimization by re-using existing buffers between
calls.
* StreamEncoder was rewritten to no longer create a ByteBuffer on every
invocation either and now re-uses the same single instance over and over
again.
* UBJOutputStream writeHuge and writeString series of methods were all
rewritten to accept a CharBuffer in the rawest form (no longer char[]) to stop
hiding the fact that the underlying encode operation required one.
This gives the caller an opportunity to cache and re-use CharBuffers over
and over again if they can; otherwise this just pushes the CharBuffer.wrap()
call up to the caller instead of hiding it secretly in the method impl under
the guise of accepting a raw char[] (that it couldn't use directly).
For callers that can re-use buffers, this will lead to big performance gains
now that were previously impossible.
* UBJInputStream added readHuge and readString methods that accept an existing
CharBuffer argument to make use of the optimizations made in the Stream encoder
and decoder impls.
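The buffer-reuse pattern described in the StreamDecoder entries can be
sketched with the JDK's CharsetDecoder alone. This is an illustration under
assumptions, not the library's StreamDecoder: the class ReusableDecodeSketch
and its decodeInto method are hypothetical, and UTF-8 is assumed as the
charset. The caller owns the CharBuffer, and the method deliberately does not
flip it, leaving that decision to the caller.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; NOT the library's StreamDecoder. */
final class ReusableDecodeSketch {
    // One decoder instance, reset and reused on every call.
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();

    /**
     * Decodes the remaining bytes of utf8 into dest, starting at dest's
     * current position. dest is NOT flipped; the caller does that when ready.
     */
    void decodeInto(ByteBuffer utf8, CharBuffer dest) {
        decoder.reset();
        try {
            CoderResult result = decoder.decode(utf8, dest, true);
            if (result.isUnderflow()) {
                result = decoder.flush(dest);
            }
            if (!result.isUnderflow()) {
                // Overflow (dest too small) or malformed input.
                result.throwException();
            }
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would typically clear() the CharBuffer, call decodeInto, flip() the
buffer, consume the chars, and then repeat with the same two objects.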
01-15-12
Huge performance boost for deserialization!
StreamDecoder previously used separate read and write buffers for decoding
bytes to chars including the resulting char[] that was returned to the caller.
This design required at least 1 full array copy before returning a result in
the best case and 2x full array copies before returning the result in the
worst case.
The rewrite removed the need for a write buffer entirely as well as ALL array
copies; in the best OR worst case they never occur anymore.
Raw performance boost of roughly 25% in all UBJ I/O classes as a result.
12-01-11 through 01-24-12
A large amount of work has continued on the core I/O classes (stream impls)
to help make them not only faster and more robust, but also more helpful.
When errors are encountered in the streams, they are reported along with the
stream positions. This is critical for debugging problems with corrupt
formats.
Also provided ByteArray I/O stream classes that have the potential to provide
HUGE performance boosts for high frequency systems.
Both these classes (ByteArrayInputStream and ByteArrayOutputStream) are
reusable and when wrapped by a UBJInputStream or UBJOutputStream, the top
level UBJ streams implicitly become reusable as well.
Reusing the streams not only saves on object creation/GC cleanup but also
allows the caller to re-use the temporary byte[] used to translate to and
from the UBJ format, avoiding object churn entirely!
This optimized design was chosen to be intentionally performant when combined
with NIO implementations as the ByteBuffers can be used to wrap() existing
outbound buffers (avoiding the most expensive part of a buffer) or use
array() to get access to the underlying buffer that needs to be written to
the stream.
In the case of direct ByteBuffers, there is no additional overhead added
because the calls to get or put are required anyway to pull or push the
values from the native memory location.
This approach allows the fastest implementation of Universal Binary JSON
I/O possible in the JVM whether you are using the standard IO (stream)
classes or the NIO (ByteBuffer) classes in the JDK.
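To make the reuse idea concrete, here is a minimal sketch of a resettable
byte[]-backed output stream whose contents can be exposed to NIO through
ByteBuffer.wrap() without copying. It is an assumption-laden illustration,
not the library's ByteArrayOutputStream: the class name and the asByteBuffer
and reset methods are hypothetical.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical sketch; NOT the library's ByteArrayOutputStream. */
final class ReusableByteArrayOutputStreamSketch extends OutputStream {
    private byte[] data;
    private int length;

    ReusableByteArrayOutputStreamSketch(int initialCapacity) {
        data = new byte[Math.max(1, initialCapacity)];
    }

    @Override
    public void write(int b) {
        ensureCapacity(length + 1);
        data[length++] = (byte) b;
    }

    @Override
    public void write(byte[] src, int offset, int count) {
        ensureCapacity(length + count);
        System.arraycopy(src, offset, data, length, count);
        length += count;
    }

    /** Forgets the written bytes but keeps the backing array for the next use. */
    void reset() {
        length = 0;
    }

    /** Wraps the written region without copying; valid until the next write or reset. */
    ByteBuffer asByteBuffer() {
        return ByteBuffer.wrap(data, 0, length);
    }

    private void ensureCapacity(int required) {
        // Grow only when the existing array really is too small.
        if (required > data.length) {
            data = Arrays.copyOf(data, Math.max(required, data.length * 2));
        }
    }
}
```

The wrapped ByteBuffer can be passed straight to a channel write; after the
write completes, reset() makes the same stream and array ready for the next
message.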
Some ancillary work on UBJ-based command line utilities (viewers, converters,
etc.) has begun as well.
11-28-11
* Fixed UBJInputStreamParser implementation; nextType correctly implements
logic to skip existing element (if called back to back) as well as validate
the marker type it encounters before returning it to the caller.
* Modified IObjectReader contract; a Parser implementation is required to
make traversing the UBJ stream possible without knowing what type of element
is next.
11-27-11
* Streamlined ByteArrayOutputStream implementation to ensure the capacity
of the underlying byte[] is never grown unless absolutely necessary.
* Rewrote class Javadoc for ByteArrayOutputStream to include a code snippet
on how to use it.
11-26-11
* Fixed major bug in how 16, 32 and 64-bit integers are re-assembled when
read back from binary representations.
* Added a numeric test to specifically catch this error if it ever pops up
again.
* Optimized reading and writing of numeric values in Input and Output
stream implementations.
* Optimized ObjectWriter implementation by streamlining the numeric read/write
logic and removing the sub-optimal force-compression of on-disk storage.
* Fixed ObjectWriter to match exactly with the output written by
UBJOutputStream.
* Normalized all testing between I/O classes so they use the same classes
to ensure parity.
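The changelog does not say what the integer reassembly bug was. One common
way such code goes wrong in Java is forgetting that byte is signed, so each
byte must be masked before it is shifted and combined. The sketch below shows
that technique under the assumption of big-endian byte order; the class and
method names are hypothetical and are not taken from the library.

```java
/** Hypothetical sketch of big-endian integer reassembly; not library code. */
final class BigEndianSketch {
    private BigEndianSketch() {
    }

    static short readInt16(byte[] b, int off) {
        // Masking with 0xFF stops sign extension from corrupting the high bits.
        return (short) (((b[off] & 0xFF) << 8) | (b[off + 1] & 0xFF));
    }

    static int readInt32(byte[] b, int off) {
        return ((b[off] & 0xFF) << 24)
                | ((b[off + 1] & 0xFF) << 16)
                | ((b[off + 2] & 0xFF) << 8)
                | (b[off + 3] & 0xFF);
    }

    static long readInt64(byte[] b, int off) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            // 0xFFL keeps the arithmetic in 64 bits.
            value = (value << 8) | (b[off + i] & 0xFFL);
        }
        return value;
    }
}
```

Values whose low-order bytes are 0x80 or above (for example 128) are the
useful test cases, because they are exactly the ones an unmasked
implementation gets wrong.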
11-10-11
* DRAFT 8 Spec Support Added.
* Added support for the END marker (readEnd) to the InputStreams which is
required for proper unbounded-container support.
* Added support for the END marker (writeEnd) to UBJOutputStream.
UBJInputStreamParser must be used to properly support unbounded containers
because you never know when the 'E' will be encountered marking the end;
so the caller needs to pay attention to the type marker that nextType()
returns and respond accordingly.
* Added readHugeAsBytes to InputStream implementations, allowing the bytes
used to represent a HUGE to be read in their raw form with no decoding.
This can save on processing as BigInteger and BigDecimal do their own decoding
of byte[] directly.
* Added readHugeAsChars to InputStream implementations, allowing a HUGE
value to be read in as a raw char[] without trying to decode it to a
BigInteger or BigDecimal.
* Added writeHuge(char[]) to support writing out HUGE values directly from
their raw char[] form without trying to decode from a BigInteger or
BigDecimal.
* readArrayLength and readObjectLength were modified to return -1 when an
unbounded container length is encountered (255).
* Fixed UBJInputStreamParser.nextType to correctly skip past any NOOP
markers found in the underlying stream before returning a valid next type.
* Normalized all reading of "next byte" to the singular
UBJInputStream.nextMarker method -- correctly skips over NOOP until end of
stream OR until the next valid marker byte, then returns it.
* Modified readNull behavior to have no return type. It is already designed
to throw an exception when 'NULL' isn't found, no need for the additional
return type.
* Began work on a simple abstract representation of the UBJ data types as
objects that can be assembled into maps and lists and written/read easily
using the IO package.
This is intended to be a higher level of abstraction than the IO streams,
but lower level (and faster) than the reflection-based work that inspects
user-provided classes.
* Refactored StreamDecoder and StreamEncoder into the core IO package,
because they are part of core IO.
* Refactored StreamParser into the io.parser package to more clearly denote
its relationship to the core IO classes. It is a slightly higher level
abstraction on top of the core IO; having it alongside the core IO classes
while .reflect was a subpackage was confusing and suggested that
StreamParser was somehow intended as a swappable replacement for
UBJInputStream, which is not how it is intended to be used.
* Refactored org.ubjson.reflect to org.ubjson.io.reflect to more correctly
communicate the relationship -- the reflection-based classes are built on
the core IO classes and are just a higher abstraction to interact with UBJSON
with.
* Renamed IDataType to IMarkerType to follow the naming convention for the
marker bytes set forth by the spec doc.
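The NOOP-skipping behavior described for nextMarker in this entry can be
sketched as a simple loop. This is a self-contained illustration, not the
library's UBJInputStream.nextMarker: the class name is hypothetical, and the
sketch assumes the NOOP marker is the single ASCII byte 'N' and that end of
stream is reported as -1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch of NOOP skipping; NOT the library's nextMarker. */
final class NextMarkerSketch {
    // Assumption: NOOP is encoded as the ASCII byte 'N'.
    static final int NOOP = 'N';

    private NextMarkerSketch() {
    }

    /** Returns the next non-NOOP byte, or -1 if the stream ends first. */
    static int nextMarker(InputStream in) {
        try {
            int b;
            do {
                b = in.read();
            } while (b == NOOP);
            return b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real implementation would also validate the returned byte against the set
of known marker types before handing it to the caller, as the nextType fix
in the 11-28-11 entry describes.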
10-14-11
* ObjectWriter rewritten and works correctly. Tested with the example test
data and wrote out the compressed and uncompressed format files correctly
from their original object representation.
* Added support for reading and writing huge values as BigInteger as well
as BigDecimal.
* Added automatic numeric storage compression support to ObjectWriter - based
on the numeric value (regardless of type) the value will be stored as the
smallest possible representation in the UBJ format if requested.
* Added mapping support for BigDecimal, BigInteger, AtomicInteger and
AtomicLong to ObjectWriter.
* Added readNull and readBoolean to the UBJInputStream and
UBJInputStreamParser implementations to make the API feel complete and feel
more natural to use.
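The numeric storage compression mentioned in this entry amounts to picking the
narrowest integer width that can hold a value, regardless of the Java type it
arrived in. A minimal sketch follows; the Width enum and smallestWidth method
are hypothetical names, and the mapping from a width to an actual UBJ marker
byte is intentionally left out.

```java
/** Hypothetical sketch of smallest-representation selection; not ObjectWriter code. */
final class NumericCompressionSketch {
    enum Width { INT8, INT16, INT32, INT64 }

    private NumericCompressionSketch() {
    }

    /** Returns the narrowest signed integer width that can represent value. */
    static Width smallestWidth(long value) {
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return Width.INT8;
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return Width.INT16;
        }
        if (value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE) {
            return Width.INT32;
        }
        return Width.INT64;
    }
}
```

With a check like this, a Long field that happens to hold 12 can be written
in one payload byte instead of eight, at the cost of a few comparisons per
value.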
10-10-11
* com.ubjson.io AND com.ubjson.io.charset are finalized against the
Draft 8 specification.
* The lowest level UBJInput/OutputStream classes were tightened up to run as
fast as possible showing an 800ns-per-op improvement in speed.
* Profiled core UBJInput/OutputStream classes using HPROF for a few million
iterations and found no memory leaks and no performance traps; everything at
that low level is as tight as it can be.
* Stream-parsing facilities were moved out of the overloaded UBJInputStream
class and into their own subclass called UBJInputStreamParser which operates
exactly like a pull-parsing scenario (calling nextType then switching on the
value and pulling the appropriate value out of the stream).
* More example testing/benchmarking data checked into /test/java/com/ubjson/data.
Will begin reworking the Reflection based Object mapping in the
org.ubjson.reflect package now that the core IO classes are finalized.
* Removed all old scratch test files from the org.ubjson package; they were
confusing.
09-27-11
* Initial check-in of core IO classes to read/write spec.
Status
------
Using the standard UBJInputStream, UBJInputStreamParser and UBJOutputStream
implementations to manually read/write UBJ objects is stable and tuned for
optimal performance.
Automatic mapping of objects to/from UBJ format via the reflection-based
implementation is not tuned yet. Writing is implemented, but not tuned for
optimal performance and reading still has to be written.
* org.ubjson.io - STABLE
* org.ubjson.io.parser - STABLE
* org.ubjson.io.reflect - ALPHA
* org.ubjson.model - BETA
License
-------
This library is released under the Apache 2 License. See LICENSE.
Description
-----------
This project represents (the official?) Java implementations of the
Universal Binary JSON specification: http://ubjson.org
Example
-------
Coming soon...
Performance
-----------
Coming soon...
Reference
---------
Universal Binary JSON Specification - http://ubjson.org
JSON Specification - http://json.org