Wednesday, November 28, 2012

Use GraphicsMagick to Convert Text To Image

I am not sure this is the best way to use GraphicsMagick, but with the help of a dot.png file, the command below generates an image with text on it. (Type the command on one line.)

gm convert dot.png -resize 300x500! -font arial -pointsize 38 -fill gray -draw "text 50 100 'NO IMAGE'" -draw "text 50 150 'AVAILABLE'" output.png

And you get this image:

[Image: output.png]

Some further reading:

http://www.imagemagick.org/Usage/text/#mixed_font_lines

Saturday, November 17, 2012

Sharding Algorithm

Distributing traffic evenly and consistently across all the servers is not difficult if the number of servers in the cluster is constant. But in the real world, you always need to take servers out of service for maintenance. The challenge for a good sharding algorithm is to avoid a complete redistribution of requests when that happens.

The table below uses a simple modulo algorithm. It divides the key by the number of servers in service; the remainder is the server that takes the request.

Key      396562  673665  115181  650428  804339  394035  280572  108093  938266  125314
5 nodes       2       0       1       3       4       0       2       3       1       4
4 nodes       2       1       1       0       3       3       0       1       2       2

Notice what happens if we have a 5-server cluster (servers 0-4) and take server 4 out of service. The requests are almost completely redistributed across the remaining 4 servers: 8 of the 10 keys above land on a different server, even though only 2 of them were served by server 4. We are aware of two different algorithms that provide consistency upon node change.
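The redistribution can be reproduced with a few lines of Python. This is only a sketch of the modulo scheme described above; the function names mod_shard and moved_keys are made up for the illustration.

```python
# Test keys taken from the table above.
KEYS = [396562, 673665, 115181, 650428, 804339,
        394035, 280572, 108093, 938266, 125314]

def mod_shard(key, node_count):
    """Return the index of the server that takes the request."""
    return key % node_count

def moved_keys(keys, before, after):
    """Keys whose server changes when the cluster goes from `before` to `after` nodes."""
    return [k for k in keys if mod_shard(k, before) != mod_shard(k, after)]

five = [mod_shard(k, 5) for k in KEYS]
four = [mod_shard(k, 4) for k in KEYS]
moved = moved_keys(KEYS, 5, 4)

print("5 nodes:", five)
print("4 nodes:", four)
print(len(moved), "of", len(KEYS), "keys changed server")
```

Only two of the test keys were served by server 4, yet most of the others change server as well, which is exactly what a consistent algorithm has to avoid.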

Look-up Ring Algorithm

Form a ring using an array that has significantly more elements than the number of server nodes. For illustration purposes we use 25 slots for 5 nodes, but the real-world ratio should be much higher; the exact number can be determined by running simulations. Then place the server node numbers randomly in this array. To distribute the load evenly in normal mode, the algorithm that populates the ring needs to make sure every node gets the same share of the slots.

Slot  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Node  3  1  4  2  0  3  0  2  4  1  2  3  4  0  1  3  2  1  0  4  3  1  4  0  2
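One possible way to populate such a ring is sketched below. It gives every node the same number of slots and then shuffles them. The function name build_ring, its parameters and the seed are assumptions for the illustration, and the ring it produces will not match the one shown above.

```python
import random
from collections import Counter

def build_ring(node_count, slots_per_node, seed=None):
    """Build a look-up ring in which every node owns the same number of slots."""
    # Start with an even allocation: node 0 repeated, then node 1, and so on.
    ring = [node for node in range(node_count) for _ in range(slots_per_node)]
    # Shuffle so that each node's slots are scattered around the ring.
    random.Random(seed).shuffle(ring)
    return ring

ring = build_ring(node_count=5, slots_per_node=5, seed=42)
print(ring)
print(Counter(ring))  # every node should appear exactly 5 times
```

A fixed seed keeps the ring reproducible, which matters if several front-end machines must build the same ring independently.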

To determine which node gets which request, we divide the key (650428) by the number of slots (25) and take the remainder (3). We use the remainder as an index to get the server node number (2) from the array above. That server (2) is designated to serve the request. If the designated server node is out of service (OOS), we use the server node (0) designated by the next slot (4) in the array. The process continues until a server that is in service is found. The table below illustrates the process by showing which server node is selected to serve the request for a set of test keys.

You can see in the last row that when node 2 is out of service, its load is distributed between nodes 0, 1 and 3. In the meantime, the other requests continue to be served by the same server node as in the normal situation. That eliminates the need to completely redistribute the cache.

Key                         396562  673665  115181  650428  804339  394035  280572  108093  938266  125314
MOD 25                          12      15       6       3      14      10      22      18      16      14
Node selection, normal           4       3       0       2       1       2       4       0       2       1
Node selection, node 2 OOS       4       3       0       0       1       3       4       0       1       1

The advantage of this algorithm is that the look-up is fast and its speed is consistent regardless of the number of server nodes we have. The disadvantage is the need to maintain the look-up ring, especially when new nodes are added to the cluster.
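The look-up itself can be sketched in Python as follows, using the 25-slot ring and the test keys from this post. The function name ring_lookup is made up, and wrapping from the last slot back to slot 0 is an assumption implied by calling the array a ring.

```python
# The 25-slot ring and the test keys from the tables above.
RING = [3, 1, 4, 2, 0, 3, 0, 2, 4, 1, 2, 3, 4,
        0, 1, 3, 2, 1, 0, 4, 3, 1, 4, 0, 2]
KEYS = [396562, 673665, 115181, 650428, 804339,
        394035, 280572, 108093, 938266, 125314]

def ring_lookup(key, ring, out_of_service=frozenset()):
    """Return the first in-service node at or after the key's slot."""
    start = key % len(ring)
    for step in range(len(ring)):
        # Walk forward slot by slot, wrapping around the end of the array.
        node = ring[(start + step) % len(ring)]
        if node not in out_of_service:
            return node
    raise RuntimeError("no server node is in service")

normal = [ring_lookup(k, RING) for k in KEYS]
node2_oos = [ring_lookup(k, RING, {2}) for k in KEYS]

print("normal    :", normal)
print("node 2 OOS:", node2_oos)
```

Only the keys that were designated to node 2 get a different node in the second list; every other key keeps its server.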

Key+Node Hash Algorithm

This algorithm uses a good hash function, commonly MD5 or SHA-1. For each request, compute a value for each active node. The value is the hash of a string consisting of the key and the node (node number, node name or anything that uniquely identifies the node). The server that yields the largest hash value takes the request. The table below demonstrates the node selection process for a set of test keys. The hash function used here is for illustration purposes only; it is neither MD5 nor SHA-1.

In the last row, you can see that when node 2 is out of service, its load is distributed between nodes 0, 1 and 4. In the meantime, the other requests continue to be served by the same server node as in the normal situation. That eliminates the need to completely redistribute the cache.

Node \ Key                  396562  673665  115181  650428  804339  394035  280572  108093  938266  125314
Hash of key+node, node 0     81526   40031   29723   53735   23911   34931   96088   43852   56076   38777
Hash of key+node, node 1      5425   19393   93416   53022   51364   84920   51352   70016   26255   30336
Hash of key+node, node 2     93129   26422   83633   65930   81901   87666   50754   32221   29866    7363
Hash of key+node, node 3     40372   44005   22422   32105   80448   39727   33887   31331   82034   93235
Hash of key+node, node 4      4337   89463   87164   64973   90511   14499   88153   11442   63305   29493
Node selection, normal           2       4       1       2       4       2       0       1       3       3
Node selection, node 2 OOS       0       4       1       4       4       1       0       1       3       3

The advantage of this algorithm is that it is simple and low-maintenance. Nodes can easily be added to and removed from the cluster without any issue. The disadvantage is the overhead of calculating the hash values for each request, and that overhead increases as the number of nodes in the cluster increases.
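A minimal sketch of this scheme, which is widely known as rendezvous (or highest random weight) hashing, could look like the code below. It uses MD5 from Python's hashlib, so the hash values differ from the illustrative ones in the table. The names hash_score and pick_node and the "key:node" string format are assumptions made for the example.

```python
import hashlib

KEYS = [396562, 673665, 115181, 650428, 804339,
        394035, 280572, 108093, 938266, 125314]

def hash_score(key, node):
    """Hash the key together with the node identifier and return it as an integer."""
    digest = hashlib.md5(f"{key}:{node}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def pick_node(key, active_nodes):
    """The active node with the largest hash value takes the request."""
    return max(active_nodes, key=lambda node: hash_score(key, node))

all_nodes = [0, 1, 2, 3, 4]
without_node_2 = [0, 1, 3, 4]

for key in KEYS:
    print(key, pick_node(key, all_nodes), pick_node(key, without_node_2))
```

Because each node's score depends only on the key and that node, removing node 2 cannot change the winner for any key that node 2 was not already winning.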

Saturday, October 27, 2012

Integrate Java and GraphicsMagick – im4java Performance

Introduction

In the post Integrate Java and GraphicsMagick – Conception, I expressed a performance concern about im4java’s integration approach. To verify whether the issue actually exists, we implemented our first beta application using im4java and ran a baseline test.

Application

This is a web application that dynamically resizes source images of 700x700 or less to a smaller dimension. An HTTP request comes in specifying an individual image and the target size; the web application locates the source image, resizes it to the target size, then streams the image back to the requester, typically a browser.

The application simply uses “gm convert <source> -scale <width>x<height> <target>”, which is probably the fastest way to scale down an image. We throttled it to run at most 32 gm processes at any given time because we got the best throughput with that setting.
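The approach of spawning one gm process per request under a throttle can be sketched as below. This is a Python illustration of the idea, not the Java/im4java code the application actually uses; the names scale_command and scale_image are made up, and gm is assumed to be on the PATH.

```python
import subprocess
import threading

# At most 32 gm processes at any given time, as in the setup described above.
MAX_GM_PROCESSES = 32
_gm_slots = threading.BoundedSemaphore(MAX_GM_PROCESSES)

def scale_command(source, width, height, target):
    """Build the argument list for: gm convert <source> -scale <width>x<height> <target>."""
    return ["gm", "convert", source, "-scale", f"{width}x{height}", target]

def scale_image(source, width, height, target):
    """Spawn one gm process for this request, waiting if 32 are already running."""
    with _gm_slots:
        subprocess.run(scale_command(source, width, height, target), check=True)

print(" ".join(scale_command("source.jpg", 114, 166, "target.jpg")))
```

Every call to scale_image pays the cost of starting and tearing down a process, which is the overhead the test below sets out to measure.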

Test Environment

Hardware

RAM: 32GB

CPU: 32 cores (4 x AMD Opteron(TM) Processor 6274, 8 cores each)

Test Tool

JMeter is used to execute the test. The machine that runs JMeter has hardware identical to the test machine, and there is a high-speed network connection between them.

Test Setup

We used 200K unique sample images, and each image is requested twice: one request converts it to 260x420 and the other converts it to 114x166.

One thread group is used to run multiple threads in parallel. Each thread represents one concurrent user. We ran the test multiple times with different numbers of concurrent users. Each thread requests images one after another, with no delay whatsoever.

We measure the throughput in terms of total number of images resized within a second by the server.

Test Result

The table below lists the total throughput under different loads.

Concurrent Users Total Throughput (images/second)
50 278
500 109
5000 19
20000 4

The throughput degrades significantly as the number of concurrent users increases; it performs worst when we need it most, under high load. The throughput isn’t ideal even at 50 concurrent users, considering it is running on such a powerful server.

Conclusion

It is clear that we need a better solution than this. We also want to make it clear that the bottleneck is not im4java, as it simply constructs the command line and invokes the GM process. The major overhead is spawning the new process. The test simply shows that the overhead of spawning a new process from the JVM is huge, especially when the JVM has a large number of open file handles for socket connections.

Finally, we set out to implement our proposed solution.

Other Posts in the “Integrate Java and GraphicsMagick” Series

  1. Conception
  2. im4java Performance
  3. Interactive or Batch Mode
  4. gm4java
  5. gm4java Performance

Monday, October 01, 2012

Integrate Java and GraphicsMagick – Conception

Introduction

We are working on a new Java application that will use GraphicsMagick (hereafter GM) to perform image transformation, both when preparing master images and when dynamically resizing images on the fly for web serving on a large ecommerce website.

Technical Challenge

GM is a C/C++ application. It doesn’t come with a native Java implementation. Two types of integration solution are currently available for Java.

One type of integration uses JNI (Java Native Interface). JMagick is the most mature of this kind. But the stability of the application server is the main concern with JNI integration. If you search Google or the GM support forum, it is not uncommon to see discussions about GM crashing on certain malformed image files and/or leaking memory in one way or another. Because JNI runs the native code inside the JVM process, such a crash or leak hits the application server itself. This makes it a poor choice for a large website that needs 24/7 uptime.

Another type of integration uses a command line wrapper, represented by im4java. im4java is a very nicely implemented command line wrapper for both ImageMagick and GM. The only concern with this approach is performance: when every command line operation requires starting a new process, it introduces significant overhead.

The challenge is how to integrate GM with Java in the most reliable and scalable way.

Proposed Solution

Due to the reliability issue of the JNI approach, we decided to focus on solutions that run GM outside the JVM process. If we can run GM in some sort of batch or interactive mode, then the Java application can continuously send GM commands without starting a new GM process and shutting it down every time.

Further, multiple GM processes can be started and pooled, and recycled when necessary for concurrency, performance and reliability.

This solution requires changes to the GM command line utility and development of a new Java integration library. We have set out to start the work and will contribute it back to the community when it’s done.
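The pooling idea can be illustrated with the rough Python sketch below. It is not the proposed Java library, and it does not start GM: a small Python echo script stands in for a long-lived GM process that reads one command per line and answers with one line. The class name ProcessPool, the WORKER_CMD stand-in and the line-based protocol are all assumptions for the illustration.

```python
import queue
import subprocess
import sys

# Stand-in worker: echoes "OK <command>" for every line it reads.
# A real pool would start GM in a batch or interactive mode here instead.
WORKER_CMD = [sys.executable, "-u", "-c",
              "import sys\n"
              "for line in sys.stdin:\n"
              "    print('OK ' + line.strip(), flush=True)"]

class ProcessPool:
    """Keep a fixed number of long-lived worker processes and reuse them."""

    def __init__(self, size):
        self._idle = queue.Queue()
        self._workers = []
        for _ in range(size):
            proc = subprocess.Popen(WORKER_CMD, stdin=subprocess.PIPE,
                                    stdout=subprocess.PIPE, text=True, bufsize=1)
            self._workers.append(proc)
            self._idle.put(proc)

    def execute(self, command):
        """Send one command to an idle worker and return its one-line reply."""
        proc = self._idle.get()  # blocks until a worker is free
        try:
            proc.stdin.write(command + "\n")
            proc.stdin.flush()
            return proc.stdout.readline().strip()
        finally:
            self._idle.put(proc)  # hand the worker back for reuse

    def close(self):
        """Shut the workers down by closing their input."""
        for proc in self._workers:
            proc.stdin.close()
            proc.wait()
            proc.stdout.close()

if __name__ == "__main__":
    pool = ProcessPool(size=2)
    print(pool.execute("convert source.jpg -scale 114x166 target.jpg"))
    pool.close()
```

The process start-up cost is paid once per worker rather than once per request. Recycling a worker that has crashed or grown too large is left out of the sketch.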

Other Posts in the “Integrate Java and GraphicsMagick” Series

  1. Conception
  2. im4java Performance
  3. Interactive or Batch Mode
  4. gm4java
  5. gm4java Performance

Saturday, September 29, 2012

Basic AD User and Group Queries

Find User DN

Many commands take a full user DN as an option. The command below finds the DNs of users whose names start with “Smith”. ‘*’ is the wildcard character and can be used anywhere in the string.
dsquery user -name "Smith*"

Find Group DN

Similar to user DN:
dsquery group -name "MyGroup*"

List Users In Group

Use the command below to find all users in a given AD security group. Replace <groupDN> with the actual group DN.
dsget group <groupDN> -members | find /i "cn=users"

List All Groups a User or Group Is a Member Of

Use the command below to find all the groups a given AD user belongs to. Replace <DN> with the actual user DN. For a group DN, use dsget group <DN> -memberof instead.
dsget user <DN> -memberof

Monday, September 24, 2012

International Country Codes

Shamelessly copied from http://www.indiandata.com/international_country_codes.html. It is hard to find it by just googling.

Country 2-Letter 3-Letter Numeric
Afghanistan AF AFG 4
Albania AL ALB 8
Algeria DZ DZA 12
American Samoa AS ASM 16
Andorra AD AND 20
Angola AO AGO 24
Anguilla AI AIA 660
Antarctica AQ ATA 10
Antigua And Barbuda AG ATG 28
Argentina AR ARG 32
Armenia AM ARM 51
Aruba AW ABW 533
Australia AU AUS 36
Austria AT AUT 40
Azerbaijan AZ AZE 31
Bahamas BS BHS 44
Bahrain BH BHR 48
Bangladesh BD BGD 50
Barbados BB BRB 52
Belarus BY BLR 112
Belgium BE BEL 56
Belize BZ BLZ 84
Benin BJ BEN 204
Bermuda BM BMU 60
Bhutan BT BTN 64
Bolivia BO BOL 68
Bosnia And Herzegowina BA BIH 70
Botswana BW BWA 72
Bouvet Island BV BVT 74
Brazil BR BRA 76
British Indian Ocean Territory IO IOT 86
Brunei Darussalam BN BRN 96
Bulgaria BG BGR 100
Burkina Faso BF BFA 854
Burundi BI BDI 108
Cambodia KH KHM 116
Cameroon CM CMR 120
Canada CA CAN 124
Cape Verde CV CPV 132
Cayman Islands KY CYM 136
Central African Republic CF CAF 140
Chad TD TCD 148
Chile CL CHL 152
China CN CHN 156
Christmas Island CX CXR 162
Cocos (Keeling) Islands CC CCK 166
Colombia CO COL 170
Comoros KM COM 174
Congo, Democratic Republic Of CD COD 180
Congo, People's Republic Of CG COG 178
Cook Islands CK COK 184
Costa Rica CR CRI 188
Cote D'ivoire CI CIV 384
Croatia HR HRV 191
Cuba CU CUB 192
Cyprus CY CYP 196
Czech Republic CZ CZE 203
Denmark DK DNK 208
Djibouti DJ DJI 262
Dominica DM DMA 212
Dominican Republic DO DOM 214
East Timor TL TLS 626
Ecuador EC ECU 218
Egypt EG EGY 818
El Salvador SV SLV 222
Equatorial Guinea GQ GNQ 226
Eritrea ER ERI 232
Estonia EE EST 233
Ethiopia ET ETH 231
Falkland Islands (Malvinas) FK FLK 238
Faroe Islands FO FRO 234
Fiji FJ FJI 242
Finland FI FIN 246
France FR FRA 250
France, Metropolitan FX FXX 249
French Guiana GF GUF 254
French Polynesia PF PYF 258
French Southern Territories TF ATF 260
Gabon GA GAB 266
Gambia GM GMB 270
Georgia GE GEO 268
Germany DE DEU 276
Ghana GH GHA 288
Gibraltar GI GIB 292
Greece GR GRC 300
Greenland GL GRL 304
Grenada GD GRD 308
Guadeloupe GP GLP 312
Guam GU GUM 316
Guatemala GT GTM 320
Guinea GN GIN 324
Guinea-Bissau GW GNB 624
Guyana GY GUY 328
Haiti HT HTI 332
Heard And Mc Donald Islands HM HMD 334
Honduras HN HND 340
Hong Kong HK HKG 344
Hungary HU HUN 348
Iceland IS ISL 352
India IN IND 356
Indonesia ID IDN 360
Iran IR IRN 364
Iraq IQ IRQ 368
Ireland IE IRL 372
Israel IL ISR 376
Italy IT ITA 380
Jamaica JM JAM 388
Japan JP JPN 392
Jordan JO JOR 400
Kazakhstan KZ KAZ 398
Kenya KE KEN 404
Kiribati KI KIR 296
Korea, North KP PRK 408
Korea, South KR KOR 410
Kuwait KW KWT 414
Kyrgyzstan KG KGZ 417
Lao People's Democratic Republic LA LAO 418
Latvia LV LVA 428
Lebanon LB LBN 422
Lesotho LS LSO 426
Liberia LR LBR 430
Libyan Arab Jamahiriya LY LBY 434
Liechtenstein LI LIE 438
Lithuania LT LTU 440
Luxembourg LU LUX 442
Macau MO MAC 446
Macedonia, The Former Yugoslav Republic Of MK MKD 807
Madagascar MG MDG 450
Malawi MW MWI 454
Malaysia MY MYS 458
Maldives MV MDV 462
Mali ML MLI 466
Malta MT MLT 470
Marshall Islands MH MHL 584
Martinique MQ MTQ 474
Mauritania MR MRT 478
Mauritius MU MUS 480
Mayotte YT MYT 175
Mexico MX MEX 484
Micronesia, Federated States Of FM FSM 583
Moldova, Republic Of MD MDA 498
Monaco MC MCO 492
Mongolia MN MNG 496
Montserrat MS MSR 500
Morocco MA MAR 504
Mozambique MZ MOZ 508
Myanmar MM MMR 104
Namibia NA NAM 516
Nauru NR NRU 520
Nepal NP NPL 524
Netherlands NL NLD 528
Netherlands Antilles AN ANT 530
New Caledonia NC NCL 540
New Zealand NZ NZL 554
Nicaragua NI NIC 558
Niger NE NER 562
Nigeria NG NGA 566
Niue NU NIU 570
Norfolk Island NF NFK 574
Northern Mariana Islands MP MNP 580
Norway NO NOR 578
Oman OM OMN 512
Pakistan PK PAK 586
Palau PW PLW 585
Palestinian Territory, Occupied PS PSE 275
Panama PA PAN 591
Papua New Guinea PG PNG 598
Paraguay PY PRY 600
Peru PE PER 604
Philippines PH PHL 608
Pitcairn PN PCN 612
Poland PL POL 616
Portugal PT PRT 620
Puerto Rico PR PRI 630
Qatar QA QAT 634
Reunion RE REU 638
Romania RO ROU 642
Russian Federation RU RUS 643
Rwanda RW RWA 646
Saint Kitts And Nevis KN KNA 659
Saint Lucia LC LCA 662
Saint Vincent And The Grenadines VC VCT 670
Samoa WS WSM 882
San Marino SM SMR 674
Sao Tome And Principe ST STP 678
Saudi Arabia SA SAU 682
Senegal SN SEN 686
Seychelles SC SYC 690
Sierra Leone SL SLE 694
Singapore SG SGP 702
Slovakia SK SVK 703
Slovenia SI SVN 705
Solomon Islands SB SLB 90
Somalia SO SOM 706
South Africa ZA ZAF 710
South Georgia & South Sandwich Islands GS SGS 239
Spain ES ESP 724
Sri Lanka LK LKA 144
St. Helena SH SHN 654
St. Pierre And Miquelon PM SPM 666
Sudan SD SDN 736
Suriname SR SUR 740
Svalbard And Jan Mayen Islands SJ SJM 744
Swaziland SZ SWZ 748
Sweden SE SWE 752
Switzerland CH CHE 756
Syrian Arab Republic SY SYR 760
Taiwan TW TWN 158
Tajikistan TJ TJK 762
Tanzania, United Republic Of TZ TZA 834
Thailand TH THA 764
Togo TG TGO 768
Tokelau TK TKL 772
Tonga TO TON 776
Trinidad And Tobago TT TTO 780
Tunisia TN TUN 788
Turkey TR TUR 792
Turkmenistan TM TKM 795
Turks & Caicos Islands TC TCA 796
Tuvalu TV TUV 798
Uganda UG UGA 800
Ukraine UA UKR 804
United Arab Emirates AE ARE 784
United Kingdom GB GBR 826
United States US USA 840
United States, Minor Islands UM UMI 581
Uruguay UY URY 858
Uzbekistan UZ UZB 860
Vanuatu VU VUT 548
Vatican City State VA VAT 336
Venezuela VE VEN 862
Vietnam VN VNM 704
Virgin Islands, British VG VGB 92
Virgin Islands, U.S. VI VIR 850
Wallis & Futuna Islands WF WLF 876
Western Sahara EH ESH 732
Yemen YE YEM 887
Yugoslavia YU YUG 891
Zambia ZM ZMB 894
Zimbabwe ZW ZWE 716