Boon JSON parser seems to be the fastest

IT News 2014. 1. 10. 14:13

I'll publish object serialization numbers later. Last I checked, Boon's serialization was quite a bit faster than the rest.

Here are some benchmark numbers parsing various sample JSON files from json.org and a sample that a user of Boon JSON sent in.






Going from a 2K JSON String to a Map, Boon is roughly 2x faster than the other parsers. Yes, you say, but how does Boon do with larger files?


The above graph shows Boon, GSON and Jackson parsing a 1.7 MB string. Boon is up to twice as fast. 

Yes, you say, but how does Boon do at parsing byte[], Reader, InputStream, etc.? Boon wins in pretty much every category for all the files I have tested, and I tested quite a few.

It took a bit of doing. Boon has an I/O lib, which I used to speed up the InputStream and Reader support. Boon also has a relaxed-mode JSON parser that allows unquoted strings and the like. It is just as fast as the strict parser.

The above is not a complete list of tests. 

You don't have to take my word for it. The benchmarks are online: https://github.com/RichardHightower/json-parsers-benchmark

Here are some numbers to go with the graphs.

12/25/13
1.7 MB JSON String

Benchmark                                      Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.s.BoonBenchmark.citmCatalog             thrpt   8         5    1      873.970       94.240    ops/s
i.g.j.s.GSONBenchmark.citmCatalog             thrpt   8         5    1      410.783      217.476    ops/s
i.g.j.s.JacksonASTBenchmark.citmCatalog       thrpt   8         5    1      294.690       47.593    ops/s
i.g.j.s.JacksonObjectBenchmark.citmCatalog    thrpt   8         5    1      305.787       29.107    ops/s
i.g.j.s.JsonSmartBenchmark.citmCatalog        thrpt   8         5    1      311.063       29.646    ops/s

2K JSON String
Benchmark                                 Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.s.BoonBenchmark.medium             thrpt   8         5    1   816416.973    13231.453    ops/s
i.g.j.s.GSONBenchmark.medium             thrpt   8         5    1   341148.250    18117.075    ops/s
i.g.j.s.JacksonASTBenchmark.medium       thrpt   8         5    1   263167.610   147495.795    ops/s
i.g.j.s.JacksonObjectBenchmark.medium    thrpt   8         5    1   282024.617     6922.138    ops/s
i.g.j.s.JsonSmartBenchmark.medium        thrpt   8         5    1   296944.993     7852.929    ops/s

1.7 MB JSON byte[]
Benchmark                                      Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.b.BoonBenchmark.citmCatalog             thrpt   8         5    1      628.710       91.286    ops/s
i.g.j.b.GSONBenchmark.citmCatalog             thrpt   8         5    1      439.203      120.003    ops/s
i.g.j.b.JacksonASTBenchmark.citmCatalog       thrpt   8         5    1      381.350       97.841    ops/s
i.g.j.b.JacksonObjectBenchmark.citmCatalog    thrpt   8         5    1      402.537        3.634    ops/s
i.g.j.b.JsonSmartBenchmark.citmCatalog        thrpt   8         5    1      341.940       18.847    ops/s

2K JSON byte[]
Benchmark                                 Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.b.BoonBenchmark.medium             thrpt   8         5    1   648162.887    18697.319    ops/s
i.g.j.b.GSONBenchmark.medium             thrpt   8         5    1   260145.827     5934.588    ops/s
i.g.j.b.JacksonASTBenchmark.medium       thrpt   8         5    1   289863.140    48969.875    ops/s
i.g.j.b.JacksonObjectBenchmark.medium    thrpt   8         5    1   289010.543    11205.881    ops/s
i.g.j.b.JsonSmartBenchmark.medium        thrpt   8         5    1   262873.957     3901.193    ops/s

1.7 MB JSON InputStream
Benchmark                                                Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.inputStream.BoonBenchmark.citmCatalog             thrpt   8         5    1      626.907       31.450    ops/s
i.g.j.inputStream.GSONBenchmark.citmCatalog             thrpt   8         5    1      426.120       13.946    ops/s
i.g.j.inputStream.JacksonASTBenchmark.citmCatalog       thrpt   8         5    1      376.820      115.502    ops/s
i.g.j.inputStream.JacksonObjectBenchmark.citmCatalog    thrpt   8         5    1      360.850       89.648    ops/s


2K file JSON Inputstream
Benchmark                                           Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.inputStream.BoonBenchmark.medium             thrpt   8         5    1   218730.830     5262.596    ops/s
i.g.j.inputStream.GSONBenchmark.medium             thrpt   8         5    1   151255.407     4486.414    ops/s
i.g.j.inputStream.JacksonASTBenchmark.medium       thrpt   8         5    1   156512.527   107512.401    ops/s
i.g.j.inputStream.JacksonObjectBenchmark.medium    thrpt   8         5    1   160793.407     4056.790    ops/s

1.7 MB JSON Reader
Benchmark                                      Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.r.BoonBenchmark.citmCatalog             thrpt   8         5    1      615.313       63.716    ops/s
i.g.j.r.GSONBenchmark.citmCatalog             thrpt   8         5    1      411.847       18.978    ops/s
i.g.j.r.JacksonASTBenchmark.citmCatalog       thrpt   8         5    1      264.727      118.541    ops/s
i.g.j.r.JacksonObjectBenchmark.citmCatalog    thrpt   8         5    1      246.783       93.409    ops/s
i.g.j.r.JsonSmartBenchmark.citmCatalog        thrpt   8         5    1      151.097        3.502    ops/s

2K JSON Reader
Benchmark                                 Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.r.BoonBenchmark.medium             thrpt   8         5    1   185075.093     6528.567    ops/s
i.g.j.r.GSONBenchmark.medium             thrpt   8         5    1   134025.760     3385.134    ops/s
i.g.j.r.JacksonASTBenchmark.medium       thrpt   8         5    1   107676.323    60674.421    ops/s
i.g.j.r.JacksonObjectBenchmark.medium    thrpt   8         5    1   116903.500     3206.994    ops/s
i.g.j.r.JsonSmartBenchmark.medium        thrpt   8         5    1    77898.710     2434.773    ops/s

Other JSON.org examples:

webxml json.org example
Benchmark                                 Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.s.BoonBenchmark.webxml             thrpt   8         5    1   421016.033    13428.790    ops/s
i.g.j.s.GSONBenchmark.webxml             thrpt   8         5    1   143801.263     7870.384    ops/s
i.g.j.s.JacksonASTBenchmark.webxml       thrpt   8         5    1   125981.563    36753.717    ops/s
i.g.j.s.JacksonObjectBenchmark.webxml    thrpt   8         5    1   130069.577    25055.300    ops/s
i.g.j.s.JsonSmartBenchmark.webxml        thrpt   8         5    1   132422.153    10254.167    ops/s
Boon 3x faster

sgml json.org example
Benchmark                               Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.s.BoonBenchmark.sgml             thrpt   8         5    1  1846015.410   101291.991    ops/s
i.g.j.s.GSONBenchmark.sgml             thrpt   8         5    1   988186.433    35337.393    ops/s
i.g.j.s.JacksonASTBenchmark.sgml       thrpt   8         5    1   680502.597   289591.197    ops/s
i.g.j.s.JacksonObjectBenchmark.sgml    thrpt   8         5    1   709969.980    29621.959    ops/s
i.g.j.s.JsonSmartBenchmark.sgml        thrpt   8         5    1   796387.753    22697.397    ops/s
Boon 2x faster

actionLabel json.org example
Benchmark                                      Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.s.BoonBenchmark.actionLabel             thrpt   8         5    1  1109285.703    78440.576    ops/s
i.g.j.s.GSONBenchmark.actionLabel             thrpt   8         5    1   429742.283    10097.416    ops/s
i.g.j.s.JacksonASTBenchmark.actionLabel       thrpt   8         5    1   421132.630    10514.598    ops/s
i.g.j.s.JacksonObjectBenchmark.actionLabel    thrpt   8         5    1   403535.453    16382.734    ops/s
i.g.j.s.JsonSmartBenchmark.actionLabel        thrpt   8         5    1   453847.673    25607.331    ops/s
Boon over 2x faster

menu json.org example
Benchmark                               Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.s.BoonBenchmark.menu             thrpt   8         5    1  2582429.350   700873.986    ops/s
i.g.j.s.GSONBenchmark.menu             thrpt   8         5    1  1240234.083    22312.822    ops/s
i.g.j.s.JacksonASTBenchmark.menu       thrpt   8         5    1  1242132.793    19273.775    ops/s
i.g.j.s.JacksonObjectBenchmark.menu    thrpt   8         5    1  1141071.207    36489.605    ops/s
i.g.j.s.JsonSmartBenchmark.menu        thrpt   8         5    1  1463778.480    57490.408    ops/s
Boon 2x faster

widget json.org example
Benchmark                                 Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.s.BoonBenchmark.widget             thrpt   8         5    1  1485476.970    79222.003    ops/s
i.g.j.s.GSONBenchmark.widget             thrpt   8         5    1   810153.490    20079.953    ops/s
i.g.j.s.JacksonASTBenchmark.widget       thrpt   8         5    1   724349.650   284735.196    ops/s
i.g.j.s.JacksonObjectBenchmark.widget    thrpt   8         5    1   705271.907    42304.730    ops/s
i.g.j.s.JsonSmartBenchmark.widget        thrpt   8         5    1   728506.560    29680.028    ops/s
Boon is damn fast.
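
The "2x" and "3x" notes under the json.org tables can be checked directly from the Mean column. The snippet below is only a sketch in Python, not part of the benchmark project: the dictionary layout and the function name speedup_over_runner_up are made up for illustration, and the figures are copied from the tables above. It ignores the Mean error column, which is large for some rows.

```python
# Mean throughput (ops/s) copied from the json.org String tables above.
MEAN_OPS = {
    "webxml": {"Boon": 421016.033, "GSON": 143801.263, "JacksonAST": 125981.563,
               "JacksonObject": 130069.577, "JsonSmart": 132422.153},
    "sgml": {"Boon": 1846015.410, "GSON": 988186.433, "JacksonAST": 680502.597,
             "JacksonObject": 709969.980, "JsonSmart": 796387.753},
    "actionLabel": {"Boon": 1109285.703, "GSON": 429742.283, "JacksonAST": 421132.630,
                    "JacksonObject": 403535.453, "JsonSmart": 453847.673},
    "menu": {"Boon": 2582429.350, "GSON": 1240234.083, "JacksonAST": 1242132.793,
             "JacksonObject": 1141071.207, "JsonSmart": 1463778.480},
    "widget": {"Boon": 1485476.970, "GSON": 810153.490, "JacksonAST": 724349.650,
               "JacksonObject": 705271.907, "JsonSmart": 728506.560},
}


def speedup_over_runner_up(results, winner="Boon"):
    """Divide the winner's throughput by the best throughput among the others."""
    best_other = max(ops for name, ops in results.items() if name != winner)
    return results[winner] / best_other


if __name__ == "__main__":
    for sample, results in MEAN_OPS.items():
        print(f"{sample}: {speedup_over_runner_up(results):.2f}x")
```

Measured against the runner-up in each table, the ratios land between roughly 1.8x and 2.9x, so the "2x" notes are best read as approximations.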

Boon has many modes to fit different needs, depending on your goals (small footprint, direct byte parse, etc.). Don't worry, Boon is not hard to use. It just works.

Benchmark                                      Mode Thr     Count  Sec         Mean   Mean error    Units
i.g.j.b.BoonAsciiBytes.actionLabel            thrpt   8         5    1   302902.677    21981.467    ops/s
i.g.j.b.BoonAsciiBytes.citmCatalog            thrpt   8         5    1      628.150       26.607    ops/s
i.g.j.b.BoonAsciiBytes.medium                 thrpt   8         5    1   320658.760    38751.800    ops/s
i.g.j.b.BoonAsciiBytes.menu                   thrpt   8         5    1  2081501.213   113660.611    ops/s
i.g.j.b.BoonAsciiBytes.sgml                   thrpt   8         5    1   998463.200    31916.216    ops/s
i.g.j.b.BoonAsciiBytes.small                  thrpt   8         5    1 11095898.987   534428.831    ops/s
i.g.j.b.BoonAsciiBytes.webxml                 thrpt   8         5    1   148348.463     5512.808    ops/s
i.g.j.b.BoonAsciiBytes.widget                 thrpt   8         5    1   879580.747    14598.011    ops/s
i.g.j.b.BoonBenchMarkLax.actionLabel          thrpt   8         5    1   806689.270    28745.917    ops/s
i.g.j.b.BoonBenchMarkLax.citmCatalog          thrpt   8         5    1      633.087       77.455    ops/s
i.g.j.b.BoonBenchMarkLax.medium               thrpt   8         5    1   569042.093    61404.916    ops/s
i.g.j.b.BoonBenchMarkLax.menu                 thrpt   8         5    1  2600248.763   105320.234    ops/s
i.g.j.b.BoonBenchMarkLax.sgml                 thrpt   8         5    1  1476412.973   284184.058    ops/s
i.g.j.b.BoonBenchMarkLax.small                thrpt   8         5    1 13336195.790  1442531.930    ops/s
i.g.j.b.BoonBenchMarkLax.webxml               thrpt   8         5    1   270060.157     6539.573    ops/s
i.g.j.b.BoonBenchMarkLax.widget               thrpt   8         5    1  1262768.937    51676.215    ops/s
i.g.j.b.BoonBenchMarkUTF8Bytes.actionLabel    thrpt   8         5    1   185209.077   670100.163    ops/s
i.g.j.b.BoonBenchMarkUTF8Bytes.citmCatalog    thrpt   8         5    1      379.917       30.037    ops/s
i.g.j.b.BoonBenchMarkUTF8Bytes.medium         thrpt   8         5    1   217107.220     5247.417    ops/s
i.g.j.b.BoonBenchMarkUTF8Bytes.menu           thrpt   8         5    1  1319969.417    79745.189    ops/s
i.g.j.b.BoonBenchMarkUTF8Bytes.sgml           thrpt   8         5    1   688184.650    34033.100    ops/s
i.g.j.b.BoonBenchMarkUTF8Bytes.small          thrpt   8         5    1  7486431.520  1228519.698    ops/s
i.g.j.b.BoonBenchMarkUTF8Bytes.webxml         thrpt   8         5    1   104078.393    15332.908    ops/s
i.g.j.b.BoonBenchMarkUTF8Bytes.widget         thrpt   8         5    1   526663.853   214399.644    ops/s
i.g.j.b.BoonCharArray.actionLabel             thrpt   8         5    1   407056.423   149970.346    ops/s
i.g.j.b.BoonCharArray.citmCatalog             thrpt   8         5    1      391.130       55.374    ops/s
i.g.j.b.BoonCharArray.medium                  thrpt   8         5    1   320601.040    83669.815    ops/s
i.g.j.b.BoonCharArray.menu                    thrpt   8         5    1  1686792.320   112046.346    ops/s
i.g.j.b.BoonCharArray.sgml                    thrpt   8         5    1  1052574.220    44541.919    ops/s
i.g.j.b.BoonCharArray.small                   thrpt   8         5    1  8071292.173   663678.327    ops/s
i.g.j.b.BoonCharArray.webxml                  thrpt   8         5    1   181207.910    32126.919    ops/s
i.g.j.b.BoonCharArray.widget                  thrpt   8         5    1   878541.030   137067.187    ops/s
i.g.j.b.BoonFastParser.actionLabel            thrpt   8         5    1   601141.330    77361.337    ops/s
i.g.j.b.BoonFastParser.citmCatalog            thrpt   8         5    1      429.987      198.559    ops/s
i.g.j.b.BoonFastParser.medium                 thrpt   8         5    1   462712.293   118751.410    ops/s
i.g.j.b.BoonFastParser.menu                   thrpt   8         5    1  1981728.817   239514.140    ops/s
i.g.j.b.BoonFastParser.sgml                   thrpt   8         5    1  1117030.450   209863.168    ops/s
i.g.j.b.BoonFastParser.small                  thrpt   8         5    1 10197156.600   169372.770    ops/s
i.g.j.b.BoonFastParser.webxml                 thrpt   8         5    1   230100.983    62048.894    ops/s
i.g.j.b.BoonFastParser.widget                 thrpt   8         5    1  1242538.033   169654.975    ops/s
i.g.j.b.BoonStringDirect.actionLabel          thrpt   8         5    1   461358.763    45184.611    ops/s
i.g.j.b.BoonStringDirect.citmCatalog          thrpt   8         5    1      332.883       25.544    ops/s
i.g.j.b.BoonStringDirect.medium               thrpt   8         5    1   323354.063    18819.168    ops/s
i.g.j.b.BoonStringDirect.menu                 thrpt   8         5    1  1668149.967    52797.831    ops/s
i.g.j.b.BoonStringDirect.sgml                 thrpt   8         5    1   933777.700    77093.442    ops/s
i.g.j.b.BoonStringDirect.small                thrpt   8         5    1  7111685.283   205942.968    ops/s
i.g.j.b.BoonStringDirect.webxml               thrpt   8         5    1   154376.677    50416.916    ops/s
i.g.j.b.BoonStringDirect.widget               thrpt   8         5    1   575450.757    45103.058    ops/s

Source: http://rick-hightower.blogspot.kr/2013/12/boon-json-parser-seems-to-be-fastest.html


Oracle Trim Function

Oracle 2013. 11. 15. 18:39

TRIM

Syntax

[Syntax diagram: trim.gif]

Purpose

TRIM enables you to trim leading or trailing characters (or both) from a character string. If trim_character or trim_source is a character literal, then you must enclose it in single quotation marks.

  • If you specify LEADING, then Oracle Database removes any leading characters equal to trim_character.

  • If you specify TRAILING, then Oracle removes any trailing characters equal to trim_character.

  • If you specify BOTH or none of the three, then Oracle removes leading and trailing characters equal to trim_character.

  • If you do not specify trim_character, then the default value is a blank space.

  • If you specify only trim_source, then Oracle removes leading and trailing blank spaces.

  • The function returns a value with datatype VARCHAR2. The maximum length of the value is the length of trim_source.

  • If either trim_source or trim_character is null, then the TRIM function returns null.

Both trim_character and trim_source can be VARCHAR2 or any datatype that can be implicitly converted to VARCHAR2. The string returned is of VARCHAR2 datatype if trim_source is a character datatype and a LOB if trim_source is a LOB datatype. The return string is in the same character set as trim_source.
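
The rules above can be sketched outside the database. The following Python function is an illustration only, not Oracle code: the name oracle_trim and its parameters are invented here, None stands in for NULL, and the single-character check reflects Oracle's restriction on trim_character, which the text above does not spell out.

```python
def oracle_trim(trim_source, trim_character=" ", where="BOTH"):
    """Emulate the documented TRIM rules on Python strings (None plays the role of NULL)."""
    # If either argument is null, the result is null.
    if trim_source is None or trim_character is None:
        return None
    # Assumption: TRIM accepts exactly one trim character.
    if len(trim_character) != 1:
        raise ValueError("trim_character must be exactly one character")
    where = where.upper()
    if where == "LEADING":
        return trim_source.lstrip(trim_character)
    if where == "TRAILING":
        return trim_source.rstrip(trim_character)
    if where == "BOTH":  # also the default when none of the three is given
        return trim_source.strip(trim_character)
    raise ValueError("where must be LEADING, TRAILING, or BOTH")


if __name__ == "__main__":
    print(oracle_trim("03-JAN-90", "0", "LEADING"))  # leading zero removed
    print(repr(oracle_trim("  padded  ")))           # default: blanks on both ends
```

The first call corresponds to TRIM(LEADING 0 FROM ...) applied to a date that has already been converted to text.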

Examples

This example trims leading zeros from the hire date of the employees in the hr schema:

SELECT employee_id,
      TO_CHAR(TRIM(LEADING 0 FROM hire_date))
      FROM employees
      WHERE department_id = 60
      ORDER BY employee_id;

EMPLOYEE_ID TO_CHAR(T
----------- ---------
        103 3-JAN-90
        104 21-MAY-91
        105 25-JUN-97
        106 5-FEB-98
        107 7-FEB-99


LTRIM

Syntax

[Syntax diagram: ltrim.gif]

Purpose

LTRIM removes from the left end of char all of the characters contained in set. If you do not specify set, then it defaults to a single blank. If char is a character literal, then you must enclose it in single quotation marks. Oracle Database begins scanning char from its first character and removes all characters that appear in set until reaching a character not in set and then returns the result.

Both char and set can be any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. The string returned is of VARCHAR2 datatype if char is a character datatype, NVARCHAR2 if char is a national character datatype, and a LOB if char is a LOB datatype.


Examples

The following example trims the redundant first word from a group of product names in the oe.products table:

SELECT product_name, LTRIM(product_name, 'Monitor ') "Short Name"
   FROM products
   WHERE product_name LIKE 'Monitor%';

PRODUCT_NAME         Short Name
-------------------- ---------------
Monitor 17/HR        17/HR
Monitor 17/HR/F      17/HR/F
Monitor 17/SD        17/SD
Monitor 19/SD        19/SD
Monitor 19/SD/M      19/SD/M
Monitor 21/D         21/D
Monitor 21/HR        21/HR
Monitor 21/HR/M      21/HR/M
Monitor 21/SD        21/SD
Monitor Hinge - HD   Hinge - HD
Monitor Hinge - STD  Hinge - STD
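
One thing the example can hide is that set is a set of characters, not a prefix string. Python's str.lstrip scans the same way, so it makes a convenient stand-in for a quick sketch. The helper name ltrim and the product name "Monitor rotating arm" are hypothetical and do not come from the oe.products table.

```python
def ltrim(char, trim_set=" "):
    """Remove leading characters that appear in trim_set, stopping at the first one that does not."""
    if char is None:  # None stands in for NULL
        return None
    return char.lstrip(trim_set)


if __name__ == "__main__":
    # Same result as the query above: 'M', 'o', 'n', 'i', 't', 'r' and ' ' are stripped.
    print(ltrim("Monitor 17/HR", "Monitor "))
    # Hypothetical name: the scan keeps eating 'r', 'o', 't' because they are in the set.
    print(ltrim("Monitor rotating arm", "Monitor "))
```

The second call returns "ating arm" rather than "rotating arm", which is why LTRIM only works as a prefix remover when the text after the prefix starts with a character outside the set.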

RTRIM

Syntax

[Syntax diagram: rtrim.gif]

Purpose

RTRIM removes from the right end of char all of the characters that appear in set. This function is useful for formatting the output of a query.

If you do not specify set, then it defaults to a single blank. If char is a character literal, then you must enclose it in single quotation marks. RTRIM works similarly to LTRIM.

Both char and set can be any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. The string returned is of VARCHAR2 datatype if char is a character datatype, NVARCHAR2 if char is a national character datatype, and a LOB if char is a LOB datatype.

Examples

The following example trims all the right-most occurrences of period, slash, and equal sign from a string:

SELECT RTRIM('BROWNING: ./=./=./=./=./=.=','/=.') "RTRIM example" FROM DUAL;
 
RTRIM exam
----------
BROWNING:
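
The same scan can be reproduced with Python's str.rstrip, which also treats its argument as a set of characters. This is a sketch for illustration, and the helper name rtrim is made up here. Printing with repr shows a detail the query output hides: the blank after the colon is not in the set, so it survives.

```python
def rtrim(char, trim_set=" "):
    """Remove trailing characters that appear in trim_set, stopping at the first one that does not."""
    if char is None:  # None stands in for NULL
        return None
    return char.rstrip(trim_set)


if __name__ == "__main__":
    result = rtrim("BROWNING: ./=./=./=./=./=.=", "/=.")
    print(repr(result))  # 'BROWNING: ' (note the trailing blank)
```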

reference : Oracle® Database SQL Language Reference - Oracle11gR1


It seems I have been relying only on RTRIM and LTRIM all this time...


Splitting a large file into smaller pieces

IT News 2013. 11. 14. 13:37

If you have a large file and want to break it into smaller pieces, you can use the Unix split command. You tell it what the prefix of each split file should be, and it appends an alphabetic (or numeric) suffix to the end of each name.

In the example below, I split a file containing 100,000 lines. I instruct split to use numeric suffixes (-d), put 10,000 lines in each split file (-l 10000) and use suffixes of length 3 (-a 3). As a result, ten split files are created, each with 10,000 lines.

$ ls
hugefile

$ wc -l hugefile
100000 hugefile

$ split -d -l 10000 -a 3 hugefile hugefile.split.

$ ls
hugefile                hugefile.split.005
hugefile.split.000      hugefile.split.006
hugefile.split.001      hugefile.split.007  
hugefile.split.002      hugefile.split.008
hugefile.split.003      hugefile.split.009
hugefile.split.004

$ wc -l *split*
 10000 hugefile.split.000
 10000 hugefile.split.001
 10000 hugefile.split.002
 10000 hugefile.split.003
 10000 hugefile.split.004
 10000 hugefile.split.005
 10000 hugefile.split.006
 10000 hugefile.split.007
 10000 hugefile.split.008
 10000 hugefile.split.009
100000 total
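
For systems without the split command, the same naming scheme can be approximated in a few lines of Python. This is a rough sketch, not a replacement for split: the function name split_by_lines and its parameters are invented for illustration, and it covers only the line-count, numeric-suffix case used above.

```python
import itertools
from pathlib import Path


def split_by_lines(source, prefix, lines_per_file=10000, suffix_len=3):
    """Write consecutive groups of lines to prefix000, prefix001, ... and return the paths."""
    created = []
    with open(source, "rb") as src:
        for index in itertools.count():
            # Pull the next group of lines; an empty group means end of file.
            chunk = list(itertools.islice(src, lines_per_file))
            if not chunk:
                break
            piece = Path(f"{prefix}{index:0{suffix_len}d}")
            piece.write_bytes(b"".join(chunk))
            created.append(piece)
    return created


if __name__ == "__main__":
    # Roughly what `split -d -l 10000 -a 3 hugefile hugefile.split.` does.
    for path in split_by_lines("hugefile", "hugefile.split."):
        print(path)
```

Reading in binary mode keeps line endings untouched, so concatenating the pieces in order should give back the original file.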


reference : http://fahdshariff.blogspot.kr/2011/10/splitting-large-file-into-smaller.html


On UNIX, a large file can be split this easily, as shown above. It also seems to be much faster than doing it in an application.