Sunday, April 07, 2013

Integrate Java and GraphicsMagick – gm4java Performance

Introduction

In the last two posts of this series, we enhanced GraphicsMagick (GM hereafter) to support interactive or batch mode, and we developed the Java library, gm4java, to integrate with GM. Now it’s time to give it a run!

Application

This is the same web application we used to test the performance of the pure im4java implementation. We also use the identical test environment and test setup for an apples-to-apples comparison. For completeness, I’ll repeat the test application, environment and setup here.

The web application dynamically resizes source images that are 700x700 or smaller to a smaller dimension. An HTTP request comes in specifying an individual image and the target size; the web application locates the source image, resizes it to the target size, then streams the image back to the requester, typically a browser.

The application still uses the same GM convert command, “convert <source> -scale <width>x<height> <target>”, to scale down an image. The difference this time is that, instead of depending completely on im4java, we use gm4java to start 32 GM processes in interactive mode. gm4java pools them to execute the incoming convert commands. That effectively eliminates the overhead of starting, and then shutting down, a new GM process for every request.

To avoid problems caused by any potential memory leak, each GM process was recycled after executing 1000 commands. (Info: we also tested with recycling disabled; that improved performance by 1-2%, and we did not observe any memory leak. In the end we decided to keep recycling enabled.)
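The recycling rule above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not how gm4java implements it: the class name RecyclingHolder and its factory/disposer callbacks are hypothetical stand-ins for starting and stopping a GM process.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch: hands out a worker and replaces it after a fixed number of uses. */
final class RecyclingHolder<T> {
    private final Supplier<T> factory;   // stand-in for "start a GM process"
    private final Consumer<T> disposer;  // stand-in for "stop a GM process"
    private final int maxUses;           // e.g. 1000, as in the test described above
    private T current;
    private int uses;

    RecyclingHolder(Supplier<T> factory, Consumer<T> disposer, int maxUses) {
        if (maxUses < 1) {
            throw new IllegalArgumentException("maxUses must be >= 1");
        }
        this.factory = factory;
        this.disposer = disposer;
        this.maxUses = maxUses;
    }

    /** Returns the worker for the next command, replacing it first if it is used up. */
    synchronized T acquire() {
        if (current != null && uses >= maxUses) {
            disposer.accept(current);
            current = null;
        }
        if (current == null) {
            current = factory.get();
            uses = 0;
        }
        uses++;
        return current;
    }
}
```

With maxUses set to 1000, the 1001st call to acquire() disposes the first worker and starts a second one, so a slow leak in any single process is bounded by 1000 commands.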

Test Environment

We also use exactly the same test environment that we used to test the im4java integration. Again, it is repeated below:

Hardware

RAM: 32GB

CPU: 32 cores - 4 AMD Opteron(TM) Processor 6274 (8 cores)

Test Tool

JMeter is used to execute the test. The machine that runs JMeter has hardware identical to the test machine. There is a high-speed network connection between them, and we have confirmed that no other factor is a bottleneck.

Test Setup

We used 200K unique sample images, and each image is requested twice: one request converts it to 260x420 and the other to 114x166.

One thread group is used to run multiple threads in parallel. Each thread represents one concurrent user. We ran the test multiple times with different numbers of concurrent users. Each thread requests images one after another, with no delay in between.

We measure the throughput as the total number of images resized per second by the server.

Test Result

The table below lists the total throughput under different loads, side by side with the results from our im4java implementation test.

    Total throughput (images/second)

    Concurrent users    im4java    gm4java
    50                  278        Didn’t test
    500                 109        1447
    5000                19         1490
    20000               4          1452

The throughput is significantly higher compared to the im4java implementation, and it stays high under heavy load, as expected.
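To put numbers on “significantly higher”, the ratios can be worked out from the measured throughputs. The small sketch below does just that arithmetic; the class and method names are made up for illustration, and the only inputs are the figures reported in this post. The ratios come out at roughly 13x, 78x and 363x for 500, 5000 and 20000 concurrent users.

```java
/** Hypothetical helper: simple arithmetic on the throughput figures reported in this post. */
final class ThroughputMath {
    /** How many times higher the improved throughput is than the baseline. */
    static double speedup(double baseline, double improved) {
        return improved / baseline;
    }

    /** Seconds needed to serve the given number of requests at a steady rate. */
    static double secondsFor(long requests, double imagesPerSecond) {
        return requests / imagesPerSecond;
    }

    public static void main(String[] args) {
        int[] users = {500, 5000, 20000};
        int[] im4java = {109, 19, 4};
        int[] gm4java = {1447, 1490, 1452};
        for (int i = 0; i < users.length; i++) {
            System.out.printf("%d users: %.1fx%n", users[i], speedup(im4java[i], gm4java[i]));
        }
        // 200K images, each requested at two sizes, gives 400K requests per run.
        long requests = 200_000L * 2;
        System.out.printf("Full run at 1447 images/second: about %.0f seconds%n",
                secondsFor(requests, 1447));
    }
}
```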

Disclaimer: this is NOT a comparison between im4java and gm4java. It is a comparison between the traditional integration mechanism that im4java uses to integrate with GM and the new interactive mechanism implemented by gm4java. gm4java does not compete with im4java; rather, they complement each other.

Conclusion

The test results demonstrate that using gm4java with GM’s new interactive/batch mode feature provides a reliable and scalable means to integrate Java and GM. We have successfully implemented our proposed solution and concluded that it is the solution we need.

Other Posts in the “Integrate Java and GraphicsMagick” Series

  1. Conception
  2. im4java Performance
  3. Interactive or Batch Mode
  4. gm4java
  5. gm4java Performance

Saturday, March 23, 2013

Integrate Java and GraphicsMagick – gm4java

Introduction

Now that we have added interactive or batch mode support to GraphicsMagick (GM hereafter), we need to enable the Java side of the integration to complete the implementation of the proposed solution. Hence the gm4java library was born.

In this post, we’ll briefly discuss how gm4java is implemented, then put more focus on its features and API.

Update (3/30): gm4java is available from the Maven Central repository http://search.maven.org/#browse|692293610

Implementation

The implementation of gm4java that we are going to discuss may sound complicated, but the good news is that you don’t have to deal with it: gm4java encapsulates the complexity and does the heavy lifting for you. If you are not interested in the details, you can skip right to the “Features and API” section. But I believe having some understanding of what’s under the hood always makes me a better driver.

Source code of gm4java can be found at http://code.google.com/p/kennethxublogsource/source/browse/trunk/gm4java/ (update 4/15: moved to https://github.com/kennethxu/gm4java)

Interacting with GM Process

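gm4java uses Java’s ProcessBuilder to spawn a GM process in interactive mode. The code below illustrates how the GM process is started:

    // merge stderr into stdout so all feedback arrives on a single stream
    process = new ProcessBuilder().command(command).redirectErrorStream(true).start();
    // commands are written to the process through this stream
    outputStream = process.getOutputStream();
    // command output and feedback are read from this stream
    inputStream = process.getInputStream();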

Note that a pair of input and output streams is obtained from the process. gm4java uses them to communicate with the GM process, sending commands to it and receiving feedback from it.

gm4java uses the GM batch options below to start the GM batch process.

        gm batch -escape windows -feedback on -pass OK -fail NG -prompt off

  • gm4java always uses Windows style escape for compatibility across platforms.
  • prompt is turned off because it is just noise in machine communication.
  • feedback is turned on because that is how gm4java knows
    • whether GM has completed the execution of the command.
    • the result of the command, whether it failed or succeeded.
  • gm4java chooses OK/NG as the pass/fail feedback text, as it is unlikely that the output of any command will produce such text alone on one line.
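To make the role of the feedback text concrete, here is a minimal sketch of how a controlling process could read one command’s result under these options. It is not gm4java’s actual code: the class name, the method names, and the use of IllegalStateException as a stand-in for a command failure are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of reading GM batch output delimited by feedback lines. */
final class FeedbackProtocolSketch {
    static final String PASS = "OK";
    static final String FAIL = "NG";

    /** The start-up command quoted above, split into arguments for ProcessBuilder. */
    static List<String> batchCommand(String gmPath) {
        return Arrays.asList(gmPath, "batch", "-escape", "windows", "-feedback", "on",
                "-pass", PASS, "-fail", FAIL, "-prompt", "off");
    }

    /** Collects output lines until a feedback line; throws if GM reported a failure. */
    static String readResult(BufferedReader fromGm) {
        StringBuilder output = new StringBuilder();
        boolean first = true;
        try {
            String line;
            while ((line = fromGm.readLine()) != null) {
                if (line.equals(PASS)) {
                    return output.toString();
                }
                if (line.equals(FAIL)) {
                    // stand-in for a command-level failure; carries the GM output
                    throw new IllegalStateException(output.toString());
                }
                if (!first) {
                    output.append('\n');
                }
                output.append(line);
                first = false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // end of stream without feedback: the process most likely went away
        throw new UncheckedIOException(new IOException("stream ended before a feedback line"));
    }
}
```

The sketch also shows why the feedback text must not collide with regular output: any output line that is exactly OK or NG would be mistaken for the end of the command.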

Now you understand that gm4java relies on the proper batch options to operate correctly. Hence, you should never use the “set” command directly in gm4java.

GM Process Pooling

Using a single GM batch process in a highly concurrent environment is obviously not enough. Managing multiple GM batch processes is a difficult task, to say the least. gm4java solves this problem by maintaining a pool of GM batch processes for you. It internally uses Apache Commons Pool but hides the complex details, so you don’t need to know Commons Pool in order to use gm4java’s GM process pooling service.
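For readers who have not worked with object pools, the borrow-and-return idea can be sketched with a plain blocking queue. This is deliberately much simpler than Apache Commons Pool and is not what gm4java does internally; the class name TinyPool and its methods are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool: callers wait until a worker is free. */
final class TinyPool<T> {
    private final BlockingQueue<T> idle;

    TinyPool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Borrows a worker, runs the task, and always puts the worker back. */
    <R> R withWorker(Function<T, R> task) {
        T worker;
        try {
            worker = idle.take(); // blocks while all workers are busy
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a worker", e);
        }
        try {
            return task.apply(worker);
        } finally {
            idle.add(worker);
        }
    }

    /** Number of workers currently not in use. */
    int idleCount() {
        return idle.size();
    }
}
```

A real pool also has to validate workers, replace broken ones, and evict idle ones, which is exactly the kind of detail gm4java leaves to Commons Pool.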

Code Quality

gm4java has nearly 100% code coverage excluding simple getter and setters.

Features and API

The API of gm4java was carefully designed to be simple. The public interface is well documented with Javadoc. I’ll cover the basics of the API, but I refer you to the Javadoc for further reading.

Using gm4java is very much like using a JDBC connection and connection pooling, which most Java developers are very familiar with.

The primary interface of gm4java is GMConnection.

GMConnection Interface

Just like each JDBC Connection object represents a physical connection to a database server, each GMConnection instance represents an interactive session with a GM process running in interactive mode, until the connection is closed by invoking its close() method.

    public interface GMConnection {
        String execute(String command, String... arguments) throws GMException, GMServiceException;
        String execute(List<String> command) throws GMException, GMServiceException;
        void close() throws GMServiceException;
    }

GMConnection has two overloaded execute methods; those are the methods you use to execute GM commands. Each command passed to the execute methods is sent to the interactive GM process for execution. The output of the command is then returned as a String if it was executed successfully. Otherwise a GMException is thrown, and the exception message contains the output of the GM command.

The two execute methods are essentially identical; having both gives you the convenience of passing the command one way or the other.

In simple use cases, you can use the varargs version for simplicity, e.g.

    String result = connection.execute("identify", "a.jpg");

But if you need to construct the command conditionally and dynamically, you’ll find the List version more convenient.
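As an illustration of the List style, the sketch below builds a convert command with an optional -quality setting. The helper class and its parameters are hypothetical, and it assumes that the first element of the list is the GM command name, which is how I read the execute(List<String>) signature.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles a GM convert command piece by piece. */
final class ConvertCommandBuilder {
    /** Builds the command; pass null for quality to leave the option out. */
    static List<String> scaleCommand(String source, int width, int height,
                                     Integer quality, String target) {
        List<String> command = new ArrayList<>();
        command.add("convert");
        command.add(source);
        command.add("-scale");
        command.add(width + "x" + height);
        if (quality != null) {
            // only added when the caller asked for a specific quality
            command.add("-quality");
            command.add(String.valueOf(quality));
        }
        command.add(target);
        return command;
    }
}
```

The resulting list can then be handed over in one call, for example connection.execute(ConvertCommandBuilder.scaleCommand("a.jpg", 260, 420, null, "b.jpg")).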

You can use the execute methods to run any GM command supported in interactive mode except the “set” command. Currently gm4java doesn’t prevent you from doing so, but the result is undefined: most likely all subsequent commands fail, but the connection can also hang. In the future, gm4java will try to prevent you from sending the “set” command.

You must call the close() method when you are done with a GMConnection. This is again similar to a database connection: failing to do so can cause connection leaks. Depending on the implementation of GMConnection, the close() method either ends the associated GM process or just returns the GM process to the pool.

Below is the typical usage pattern to ensure the connection is closed.

    final GMConnection connection = ...
    try {
        // use connection to run one or many GM commands
    } finally {
        connection.close();
    }

All three methods may throw GMServiceException. Unlike GMException, which is only relevant to the specific GM command being executed, a GMServiceException from these methods usually indicates a permanent condition. It signals a communication error between gm4java and the interactive GM process; for example, you get this exception if the GM process has crashed.

Lastly, please be warned that like JDBC Connection, GMConnection is NOT thread safe!

So far it all sounds simple and easy, but GMConnection is an interface, so where can we get a real instance of it? That is the job of GMService.

GMService Interface

Continuing the JDBC analogy, GMService is like DataSource. GMService defines three methods, but the important one is the getConnection() method:

    public interface GMService {
        String execute(String command, String... arguments) throws GMException, GMServiceException;
        String execute(List<String> command) throws GMException, GMServiceException;
        GMConnection getConnection() throws GMServiceException;
    }

The getConnection() method returns a GMConnection object associated with either a newly started GM interactive process or one retrieved from a pool. Whether it is the former or the latter depends on the actual implementation, which we’ll cover in the following sections.

The two execute methods are convenience methods for executing a single GM command. Both internally call the getConnection() method to obtain a connection, execute the given GM command and immediately close the connection. Please see the Java API documentation for more detail.

All methods in GMService are guaranteed to be thread safe.

GMService methods also throw GMServiceException, but for them it is not necessarily, and usually is not, a permanent condition. This is because each call deals with a different instance of GMConnection, hence a different GM process.

GMService is an interface; gm4java provides two concrete implementations of it.

SimpleGMService

SimpleGMService, as its name suggests, is a simple implementation of GMService. Its getConnection() method always starts a new GM interactive process and returns an instance of GMConnection associated with that GM process. Closing that GMConnection effectively stops the associated GM process. Because of this, applications that use SimpleGMService to create a GMConnection should make maximum use of the connection before closing it. Below is a typical usage pattern.

    GMService service = new SimpleGMService();

    public void bar() {
        GMConnection connection = service.getConnection();
        try {
            // use connection to run a lot of GM commands
        } finally {
            connection.close();
        }
    }

The two execute methods in SimpleGMService are provided for completeness and should generally not be used. The implementation starts a new GM interactive process, runs the given command and stops the GM process. That is actually slower than simply running GM in the old single-command mode.

SimpleGMService also has a property called gmPath. You’ll need to set this to the full path of the GM executable if it is not already in the command search path.

PooledGMService

PooledGMService is the real star of gm4java. Analogous to a JDBC connection pool, PooledGMService manages a pool of GM interactive processes and distributes GM commands across them. Hence it is capable of delivering high performance and scalability in a heavily concurrent environment.

Since it internally uses Apache Commons Pool, a highly configurable object pool implementation, it can support all the pooling features that Commons Pool brings. PooledGMService must be constructed with an instance of GMConnectionPoolConfig, which contains a list of properties to configure the pool.

The getConnection() method in PooledGMService returns an instance of GMConnection that is associated with a pooled GM interactive process. The close() method of the returned GMConnection returns the GM interactive process to the pool for reuse.

For applications that run many concurrent threads, where each thread runs just one GM command, PooledGMService is especially helpful. Its two execute() methods become very useful now. They not only simplify the code to be written but are also optimized internally to perform better. For example, the code below

    final GMConnection connection = service.getConnection();
    try {
        return connection.execute(command, arguments);
    } finally {
        connection.close();
    }

can be simplified to

    return service.execute(command, arguments);

and the latter is more efficient.

im4java Integration

So far, gm4java gives you a new way of communicating with the GM process to get the work done efficiently. It requires you to know the GM commands very well and to pass each command and its parameters to the execute methods. For people who are not familiar with native GM commands, there can be a steep learning curve.

Thankfully, im4java did a great job of providing a Java-friendly interface for constructing GM commands, so gm4java doesn’t need to reinvent the wheel. We let you use the familiar im4java interface to construct operations, then hand them to gm4java to execute efficiently.

GMBatchCommand is the bridge between im4java and gm4java. The usage of it is best illustrated by a few sample programs.

Execute a convert command

    GMService service = ...;

    public void foo() {
        GMBatchCommand command = new GMBatchCommand(service, "convert");

        // create the operation, add images and operators/options
        IMOperation op = new IMOperation();
        op.addImage("input.jpg");
        op.resize(800, 600);
        op.addImage("output.jpg");

        // execute the operation
        command.run(op);
    }

Execute identify command

    GMService service = ...;

    public List<String> foo() {
        GMBatchCommand command = new GMBatchCommand(service, "identify");

        IMOperation op = new IMOperation();
        op.ping();
        final String format = "%m\n%W\n%H\n%g\n%z";
        op.format(format);
        op.addImage();
        ArrayListOutputConsumer output = new ArrayListOutputConsumer();
        command.setOutputConsumer(output);

        command.run(op, SOURCE_IMAGE);

        ArrayList<String> cmdOutput = output.getOutput();
        return cmdOutput;
    }

Please note that GMBatchCommand has limitations: at this time it doesn’t support im4java’s asynchronous mode or BufferedImage.

We recognize that the integration between im4java and gm4java is still weak. But we also believe there is strong synergy between im4java and gm4java. Together, they can provide the next level of integration between Java and GM.

Future

While gm4java works fine as an independent library, it makes perfect sense to merge it with the im4java project for the reasons mentioned above. We have started the conversation, and hopefully we can make it happen.

In the next post of this series, I’ll present new test results for the same web application that we tested before. This time it uses gm4java.


Sunday, March 03, 2013

Resize Nook Tablet Partition

Disclaimer: The instructions here require advanced knowledge of Android systems. Any mistake can render your Nook useless, and I take no responsibility if that happens. You are on your own, and you have been warned.

I have a 16GB Nook Tablet and wish the media partition could be much bigger for my own audio and video. I found that tools and tutorials are available, but they are scattered, so I have organized them here.

Preparation

You need two microSD cards. One must be bigger than 2GB (Note: some 2GB cards are not big enough, so your best bet is to get a 4GB card), and all data on that card will be erased. Let’s call it the boot card. Download the bootable Jellybean (Android 4.2.2) SD card image from here. Unzip it and use Win32 Image Writer to write it to your boot card.

The other microSD card should be big enough to back up all your data. Let’s call it the backup card.

Boot Nook Tablet into Recovery Mode

Insert the boot card into the Nook Tablet and restart it. Hold the n button when you see the prompt “Hold n for menu” at the lower part of the screen. Release it when you see “Boot Menu” in the middle of the screen. Use the volume up/down buttons to select SDC Recovery and press the n button to start recovery.

Backup

Remove the boot card and insert your backup card. Follow the instructions in the CWM recovery utility to back up your Nook. When the backup is finished, remove the backup card.

Run Nook Tablet in Debug Mode

Insert the boot card into the Nook Tablet and restart it. The Nook Tablet will boot into Jellybean (Android 4.2.2). Go to Settings->About tablet and tap the build number repeatedly until you see “Developer mode is enabled”. Go back to Settings, where you can now access Developer options. Enable Android debugging.

Install ADB

I use Windows. You can install the ADT bundle, but since I already have Eclipse installed, I downloaded the SDK Tools for Windows found under the “USE AN EXISTING IDE” section of the Get the Android SDK page.

Start SDK Manager and install Android SDK Platform-tools. By default it also selects the latest Android version, but you don’t have to install that for this purpose.

Install ADB Driver

Download the USB driver for the Nook and connect the Nook to the computer with a USB cable. You’ll get a message that the driver was not found. Manually install the downloaded driver, selecting the Composite ADB driver.

Start ADB

Follow the instructions here to start the ADB tool. Make sure that you can run “adb devices” and see your device. If your device is shown as offline, check your Nook; if you see a prompt for permission, allow the computer to connect.

Make sure you can run “adb shell”.

Resize Partition

Follow the instructions in the post below:

http://forum.xda-developers.com/showpost.php?p=22157605&postcount=25

Restore

Use the boot card to boot into recovery mode again. Then replace it with the backup card and restore your Nook Tablet. Once the restore is finished, reboot your Nook and your mission is accomplished.

Sunday, February 03, 2013

Integrate Java and GraphicsMagick – Interactive or Batch Mode

Introduction

To overcome the performance problem with the im4java integration approach, we set out to implement our proposed solution. The first step is to develop a new GraphicsMagick (GM hereafter) feature: a batch mode (or interactive mode) for GM.

After going through our internal prototyping, beta testing and production release, we finally submitted a patch to the official GM code base. (Update 3/19: Thanks to Bob Friesenhahn, GraphicsMagick 1.3.18 was officially released with batch support! This post was updated with the specs of the final 1.3.18 release.)

The Batch Command

Simply invoke GM by running the “gm” command, and you will notice that the new “batch” command is now listed at the top of the command list.

C:\>gm
GraphicsMagick 1.3.18 2013-03-10 Q8 http://www.GraphicsMagick.org/
Copyright (C) 2002-2013 GraphicsMagick Group.
Additional copyrights and licenses apply to this software.
See http://www.GraphicsMagick.org/www/Copyright.html for details.
Usage: gm command [options ...]

Where commands include:
      batch - issue multiple commands in interactive or batch mode
  benchmark - benchmark one of the other commands
    compare - compare two images
  composite - composite images together
    conjure - execute a Magick Scripting Language (MSL) XML script
    convert - convert an image or sequence of images
       help - obtain usage message for named command
   identify - describe an image or image sequence
    mogrify - transform an image or sequence of images
    montage - create a composite image (in a grid) from separate images
       time - time one of the other commands
    version - obtain release version
   register - register this application as the source of messages

The new batch command is used to put GM into interactive mode or to execute a GM batch command file. If a file is supplied to the batch command as a parameter, GM executes the commands in the file in batch mode. Otherwise, GM enters interactive mode.

Any GM command you can run in a POSIX shell or Windows command prompt can be executed in interactive or batch mode.

Interactive Mode

GM Interactive mode serves two purposes:

  1. You can test and get quick feedback of the commands that you plan to use in a GM batch file.
  2. Another process can control GM via interactive mode to run multiple commands without restarting GM every time. This was our most important motivation for developing this feature. It is also a topic big enough for a separate post.

In the future, GM may also support using the result of the previous command as the input of the next command.

Enter Interactive Mode

To start GM in interactive mode, just enter “gm batch” on the command line. You get the default GM prompt: “GM>”.

C:\>gm batch
GraphicsMagick 1.3.18 2013-03-10 Q8 http://www.GraphicsMagick.org/
Copyright (C) 2002-2013 GraphicsMagick Group.
Additional copyrights and licenses apply to this software.
See http://www.GraphicsMagick.org/www/Copyright.html for details.
GM>

In interactive or batch mode, you can execute any GM command except the “batch” command itself. You can use the same help command to get a list of supported commands. Please note that you simply enter the command itself; for example, to execute the help command, enter “help” instead of “gm help”.

You may also notice the new “set” command in batch mode. This command is used to change the settings of the batch mode. We’ll cover the batch mode options and settings in the following sections.

GM> help
Usage: help command [options ...]

Where commands include:
  benchmark - benchmark one of the other commands
    compare - compare two images
  composite - composite images together
    conjure - execute a Magick Scripting Language (MSL) XML script
    convert - convert an image or sequence of images
       help - obtain usage message for named command
   identify - describe an image or image sequence
    mogrify - transform an image or sequence of images
    montage - create a composite image (in a grid) from separate images
        set - change batch mode option
       time - time one of the other commands
    version - obtain release version
   register - register this application as the source of messages

Invoke Commands

You can invoke any GM command in interactive mode as if you were running it in a POSIX shell or Windows command prompt. The difference is that you don’t start the command with “gm”; you start with the command itself. For example, the convert command that is typically invoked like this

C:\>gm convert a.jpg -scale 400x300 b.jpg

can be executed in GM interactive mode without the preceding “gm”.

GM>convert a.jpg -scale 400x300 b.jpg

Just as when you enter GM command parameters in a shell, in interactive or batch mode parameters are separated by one or more SPACE or TAB characters. If a parameter itself contains a space or tab, you need to use escape characters. I devote a separate section to GM command escaping later in this post.

Exit Interactive Mode

To exit interactive mode, you just need to enter the EOF character. On Mac, Linux and other Unix-like systems, the EOF character is Ctrl-D. On Windows, the EOF character is Ctrl-Z. Ctrl-Z must be the only character on the line and must be immediately followed by a newline character.

You can also use Ctrl-C to exit the interactive mode.

Options

GM batch mode can be customized with various optional parameters. These options can be specified as shell command line options when entering batch mode. They can also be changed with the “set” command after entering batch mode.

To get a list of available options, you can use “gm batch -help” or “gm help batch” from a POSIX shell or Windows command prompt, or use “set -help” or “help set” in GM batch mode.

C:\>gm batch -help
GraphicsMagick 1.3.18 2013-03-10 Q8 http://www.GraphicsMagick.org/
Copyright (C) 2002-2013 GraphicsMagick Group.
Additional copyrights and licenses apply to this software.
See http://www.GraphicsMagick.org/www/Copyright.html for details.
Usage: gm batch [options ...] [file|-]

Where options include:
  -echo on|off         echo command back to standard out, default is off
  -escape unix|windows force use Unix or Windows escape format for command line
                       argument parsing, default is platform dependent
  -fail text           when feedback is on, output the designated text if the
                       command returns error, default is 'FAIL'
  -feedback on|off     print text (see -pass and -fail options) feedback after
                       each command to indicate the result, default is off
  -help                print program options
  -pass text           when feedback is on, output the designated text if the
                       command executed successfully, default is 'PASS'
  -prompt text         use the given text as command prompt. use text 'off' or
                       empty string to turn off prompt. default to 'GM> ' if
                       and only if batch mode was entered with no file argument
  -stop-on-error on|off
                       when turned on, batch execution quits prematurely when
                       any command returns error

Unix escape allows the use backslash(\), single quote(') and double quote(") in
the command line. Windows escape only uses double quote(").  For example,

    Orignal             Unix escape              Windows escape
    [a\b\c\d]           [a\\b\\c\\d]             [a\b\c\d]
    [Text with space]   [Text\ with\ space]      ["Text with space"]
    [Text with (")]     ['Text with (")']        ["Text with ("")"]
    [Mix: "It's a (\)"] ["Mix: \"It's a (\\)\""] ["Mix: ""It's a (\)"""]

Use '-' to read command from standard input without default prompt.

You can also check the current option settings by using the “set” command without parameters:

GM> set
escape        : windows
fail          : FAIL
feedback      : off
stop-on-error : off
pass          : PASS
prompt        : GM>

Each option is discussed in detail in the following sections.

Echo Option

By default, echo is turned off. Enabling the -echo option asks GM to echo back every command entered in interactive or batch mode. This is mainly for debugging, to see exactly what GM receives as the command and its parameters.

Feedback Options

This is a group of options: -feedback, -pass and -fail. The latter two options are only effective when the feedback option is turned on. By default, feedback is turned off. When feedback is turned on, GM outputs the text defined by the -pass or -fail option, followed by a newline, after executing each GM command in interactive or batch mode. If the command was executed successfully, GM outputs the text defined by the -pass option; otherwise it outputs the text defined by the -fail option.

The default text is “PASS” for the -pass option and “FAIL” for the -fail option.

The feedback options don’t seem very useful when the GM interactive session is controlled by a human, but they are essential if another process needs to interact with the GM process.

Prompt Option

The -prompt option allows you to define a different GM prompt in interactive mode, or to turn the prompt off completely. By default, the prompt is off if a batch file (or -) is passed to the GM batch command; otherwise “GM>” is the GM prompt text.

Stop-on-error Option

By default, the -stop-on-error option is off. GM makes its best effort to execute every command. When a given command fails, GM outputs the error message and continues to accept and execute the next command. Turning the -stop-on-error option on forces GM to exit whenever a command line syntax error occurs or a command fails to execute. This option is more useful in batch mode than in interactive mode.

Escape Option

By default, GM uses the Unix escape style unless it runs on a Windows system, where it uses the Windows escape style. The -escape option lets you tell GM to use the specified escape style. We’ll discuss the details of command line escaping in the following sections.

Command Line Escape

GM supports two different command line escape styles. By default, it recognizes the Windows escape style on Windows systems and the Unix escape style on all other platforms.

The bottom section of the help output (repeated below) gives a brief description of the two escape styles and a few examples.

Unix escape allows the use backslash(\), single quote(') and double quote(") in
the command line. Windows escape only uses double quote(").  For example,

    Orignal             Unix escape              Windows escape
    [a\b\c\d]           [a\\b\\c\\d]             [a\b\c\d]
    [Text with space]   [Text\ with\ space]      ["Text with space"]
    [Text with (")]     ['Text with (")']        ["Text with ("")"]
    [Mix: "It's a (\)"] ["Mix: \"It's a (\\)\""] ["Mix: ""It's a (\)"""]

Unix Escape Style

The Unix escape style recognizes three special characters: backslash (\), single quote (') and double quote ("). This style gives you many conveniences but can sometimes also cause confusion.

In all use cases, you can use a backslash to escape SPACE, TAB, newline, single quote, double quote and the backslash character itself. But if a parameter contains many of those characters, using backslashes can become very verbose and difficult to read. In this case quotes come in handy.

A parameter that contains special characters can be entered by surrounding it with single quotes, as long as the parameter doesn’t contain the single quote itself. Text within single quotes is taken by GM as is.

The double quote differs from the single quote in that its content can be further escaped. Within double quotes, you no longer need to escape SPACE, TAB, newline and single quote, but you still need to use a backslash to escape the double quote and the backslash itself if your parameter happens to contain either of them.

Some valid uses of the Unix escape style are listed below.

Original              Backslash                   Single Quotes             Double Quotes
a\b\c\d               a\\b\\c\\d                  'a\b\c\d'                 "a\\b\\c\\d"
Text with space       Text\ with\ space           'Text with space'         "Text with space"
Text with (")         Text\ with\ (\")            'Text with (")'           "Text with (\")"
Mix: "It's a (\)"     Mix:\ \"It\'s\ a\ (\\)\"    'Mix: "It'\''s a (\)"'    "Mix: \"It's a (\\)\""
Multi                 Multi\                      'Multi                    "Multi
lines                 lines                       lines'                    lines"
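
Python's standard shlex module follows POSIX shell quoting rules, which closely resemble the Unix escape style described above. As a rough sanity check, and assuming GM's parser agrees with POSIX on these cases (the examples here suggest it, but that is not guaranteed), the sketch below confirms that every escaped spelling in the single-line rows of the table decodes to the same original text. The decode helper is a hypothetical name used only for this illustration.

```python
import shlex

# Original parameter -> its backslash, single-quote and double-quote spellings,
# copied from the single-line rows of the table above.
CASES = {
    r'a\b\c\d': [r'a\\b\\c\\d', r"'a\b\c\d'", r'"a\\b\\c\\d"'],
    'Text with space': [r'Text\ with\ space', "'Text with space'", '"Text with space"'],
    'Text with (")': [r'Text\ with\ (\")', "'Text with (\")'", r'"Text with (\")"'],
    'Mix: "It\'s a (\\)"': [
        r'''Mix:\ \"It\'s\ a\ (\\)\"''',
        r"""'Mix: "It'\''s a (\)"'""",
        r'''"Mix: \"It's a (\\)\""''',
    ],
}


def decode(escaped):
    """Decode one escaped parameter using POSIX shell quoting rules."""
    parts = shlex.split(escaped)
    if len(parts) != 1:
        raise ValueError(f"expected exactly one parameter, got {parts!r}")
    return parts[0]


for original, spellings in CASES.items():
    for spelling in spellings:
        # Every spelling must decode back to the same original text.
        assert decode(spelling) == original, (spelling, original)
```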

Windows Escape Style

Believe it or not, the Windows escape style is much simpler. It uses only the double quote (") character. All you need to do is surround your parameter with double quotes. If the parameter itself contains double quote characters, use two consecutive double quotes to represent each double quote in the parameter. The batch help already provides good examples of the Windows escape style.
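
Because the rule is so small, a program that generates GM commands can apply it in one line. The sketch below is a minimal Python illustration; windows_escape is a hypothetical helper name, not part of GM or gm4java. Its results match the Windows escape column of the help output.

```python
def windows_escape(param: str) -> str:
    """Quote a parameter in the Windows escape style.

    The parameter is wrapped in double quotes and every embedded
    double quote is doubled.
    """
    return '"' + param.replace('"', '""') + '"'


print(windows_escape('Text with space'))
print(windows_escape('Text with (")'))
print(windows_escape('Mix: "It\'s a (\\)"'))
```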

Batch Mode

A text file that contains GM commands is called a GM batch file. GM batch mode is activated by passing a GM batch file to the batch command. The example below illustrates how to use batch mode. Batch mode is essentially the same as the interactive mode that we have discussed so far, except that batch mode uses different default options. The file path parameter signals batch mode and activates the batch mode defaults.

C:\>del a-small.jpg b-tiny.jpg

C:\>type test.gm
# Test is a test GraphicsMagick batch command file.
convert a.jpg -scale 400x300 a-small.jpg
convert b.jpg -scale 200x150 b-tiny.jpg

C:\>gm batch test.gm

C:\>dir /b a-small.jpg b-tiny.jpg
a-small.jpg
b-tiny.jpg

Comments

GM ignores any characters between a # character and the next newline, unless the # character is escaped or quoted. Empty lines and lines that contain only comments are simply skipped by GM.

# Test is a test GraphicsMagick batch command file.
convert a.jpg -scale 400x300 a-small.jpg # this is also a comment
convert b.jpg -scale 200x150 b-tiny.jpg "#this is NOT a comment"

Associate .gm Extension in Windows

It is convenient to associate the .gm file extension with GM. Copy and paste the text below into a file named “gm.reg”, correct the GM installation location, then save it. Double-click the gm.reg file to import the registry settings. You are all set.

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\gm_auto_file]
@="GraphicsMagick Batch File"

[HKEY_CLASSES_ROOT\gm_auto_file\DefaultIcon]
@="C:\\Program Files (x86)\\GraphicsMagick-Q8\\gm.exe,0"

[HKEY_CLASSES_ROOT\gm_auto_file\shell]

[HKEY_CLASSES_ROOT\gm_auto_file\shell\open]

[HKEY_CLASSES_ROOT\gm_auto_file\shell\open\command]
@="\"C:\\Program Files (x86)\\GraphicsMagick-Q8\\gm.exe\" batch \"%1\""

(Note: Ideally this should be included as part of the Windows installer.)

Now you can double-click the test.gm file and it will run all the commands. You can also invoke test.gm from the command line, as shown below.

C:\>del a-small.jpg b-tiny.jpg

C:\>type test.gm
# Test is a test GraphicsMagick batch command file.
convert a.jpg -scale 400x300 a-small.jpg
convert b.jpg -scale 200x150 b-tiny.jpg

C:\>test.gm

C:\>dir /b a-small.jpg b-tiny.jpg
a-small.jpg
b-tiny.jpg

Executable GM Batch Script in Unix-like Systems

You can also use a shebang to make a GM batch script file executable on Unix-like systems.

$ rm a-small.jpg b-tiny.jpg

$ cat test.gm
#!/usr/bin/gm batch
# Test is a test GraphicsMagick batch command file.
convert a.jpg -scale 400x300 a-small.jpg
convert b.jpg -scale 200x150 b-tiny.jpg

$ chmod a+x test.gm

$ ./test.gm

$ ls a-small.jpg b-tiny.jpg
a-small.jpg
b-tiny.jpg

Cross Platform Script

For a GM batch script that is intended to be executed on multiple platforms, the shebang interpreter directive should always be included in the script.

Also, you must explicitly specify the escape style with the set command unless your script is compatible with both escape styles. As a good practice, always set the escape style at the beginning of the script.

#!/usr/bin/gm batch
# Test is a test GraphicsMagick batch command file.
set -escape windows
convert dot.png -resize 300x500! -font arial -pointsize 38 -draw "text 50 100 'NO IMAGE'" out.png
convert a.jpg -scale 400x300 a-small.jpg
convert b.jpg -scale 200x150 b-tiny.jpg

Pipeline Command

If you want to run GM batch commands that are output by another program through a pipeline, you’ll need to use “-” in place of the file parameter. Otherwise the interactive mode options will be used and you’ll see prompts echoed back to the screen.

C:\>type test.gm | gm batch
GraphicsMagick 1.3.18 2013-03-10 Q8 http://www.GraphicsMagick.org/
Copyright (C) 2002-2013 GraphicsMagick Group.
Additional copyrights and licenses apply to this software.
See http://www.GraphicsMagick.org/www/Copyright.html for details.
GM> GM> GM> GM> GM> GM>

C:\>type test.gm | gm batch -

C:\>
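
A program can play the role of the upstream command in the pipeline. The sketch below is a hedged Python illustration of that idea: it builds the batch text and feeds it to "gm batch -" on standard input. The function names are hypothetical, the file names are assumed to contain no spaces or other special characters (so no escaping is applied), and running it requires a GM build that supports the batch command.

```python
import subprocess


def build_batch_script(jobs, escape="windows"):
    """Return GM batch text with one 'convert ... -scale WxH ...' line per job.

    Each job is a (source, width, height, target) tuple. File names are
    assumed to need no escaping.
    """
    lines = [f"set -escape {escape}"]
    for source, width, height, target in jobs:
        lines.append(f"convert {source} -scale {width}x{height} {target}")
    return "\n".join(lines) + "\n"


def run_batch(script, gm="gm"):
    """Pipe the script into 'gm batch -' and raise if GM exits with an error."""
    return subprocess.run([gm, "batch", "-"], input=script, text=True, check=True)


script = build_batch_script([
    ("a.jpg", 400, 300, "a-small.jpg"),
    ("b.jpg", 200, 150, "b-tiny.jpg"),
])
print(script, end="")
# run_batch(script)  # uncomment on a machine where gm is installed
```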

Summary

That is all about GM batch! In the next few posts of this series, I’ll unleash the full power of GM interactive or batch mode.

Other Posts in the “Integrate Java and GraphicsMagick” Series

  1. Conception
  2. im4java Performance
  3. Interactive or Batch Mode
  4. gm4java
  5. gm4java Performance

Wednesday, November 28, 2012

Use GraphicsMagick to Convert Text To Image

I am not sure this is the best way to use GraphicsMagick, but with the help of dot.png, the command below generates an image with text on it (type the command on one line).

gm convert dot.png -resize 300x500! -font arial -pointsize 38 -fill gray -draw "text 50 100 'NO IMAGE'" -draw "text 50 150 'AVAILABLE'" output.png

And you get this image:

[Image: output]

Some further reading:

http://www.imagemagick.org/Usage/text/#mixed_font_lines

Saturday, November 17, 2012

Sharding Algorithm

Distributing the traffic evenly and consistently across all the servers is not difficult if the number of servers in the cluster is constant. But in the real world, you always need to take servers out of service for maintenance. The challenge of a good sharding algorithm is to avoid complete redistribution of requests.

The table below uses a simple modulo algorithm. It divides the key by the number of servers in service; the remainder is the server that takes the request.

Key         396562  673665  115181  650428  804339  394035  280572  108093  938266  125314
5 nodes          2       0       1       3       4       0       2       3       1       4
4 nodes          2       1       1       0       3       3       0       1       2       2

Notice that if we have a 5-server cluster (0-4) and take server 4 out of service, most of the requests are redistributed across the remaining 4 servers: in this sample, 8 of the 10 keys move to a different server. We are aware of two different algorithms that provide consistency when the set of nodes changes.
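
The effect is easy to reproduce. The short Python sketch below recomputes both rows of the table for the same test keys and counts how many keys land on a different server once the cluster shrinks from 5 to 4 nodes. The function name is only illustrative.

```python
KEYS = [396562, 673665, 115181, 650428, 804339,
        394035, 280572, 108093, 938266, 125314]


def modulo_shard(key, node_count):
    """Pick a server by taking the key modulo the number of servers in service."""
    return key % node_count


five_nodes = [modulo_shard(key, 5) for key in KEYS]
four_nodes = [modulo_shard(key, 4) for key in KEYS]

# Count the keys that are served by a different node after the change.
moved = sum(before != after for before, after in zip(five_nodes, four_nodes))

print(five_nodes)  # [2, 0, 1, 3, 4, 0, 2, 3, 1, 4]
print(four_nodes)  # [2, 1, 1, 0, 3, 3, 0, 1, 2, 2]
print(moved)       # 8
```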

Look-up Ring Algorithm

Form a ring using an array that has significantly more elements than the number of server nodes. For illustration purposes, we use 25 slots for 5 nodes, but the real-world ratio should be much higher. The exact number can be determined by running a simulation. Then randomly place the server node numbers in this array. In order to distribute the load evenly in normal mode, the algorithm that populates the ring needs to make sure every node gets the same share of the slots.

Slot   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Node   3  1  4  2  0  3  0  2  4  1  2  3  4  0  1  3  2  1  0  4  3  1  4  0  2
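
One simple way to populate such a ring, sketched below in Python, is to give every node the same number of slots and then shuffle the array. This is only an assumption about how a ring like the one above could be produced, not necessarily how it was; build_ring and its seed parameter are hypothetical names. A fixed seed keeps the ring identical on every server that builds it.

```python
import random
from collections import Counter


def build_ring(node_count, slots_per_node, seed=None):
    """Return a shuffled ring in which every node owns the same number of slots."""
    ring = [node for node in range(node_count) for _ in range(slots_per_node)]
    # A seeded generator makes the shuffle reproducible across servers.
    random.Random(seed).shuffle(ring)
    return ring


ring = build_ring(node_count=5, slots_per_node=5, seed=2012)
print(len(ring))                      # 25
print(sorted(Counter(ring).items()))  # every node appears 5 times
```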

To determine which node gets which request, we divide the key (650428) by the number of slots (25) and take the remainder (3). Use the remainder as the index into the array above to get the server node number (2). That server (2) is designated to serve the request. If the designated server node is out of service (OOS), use the server node (0) designated by the next slot (4) in the array. The process continues until a server that is in service is found. The table below illustrates the process by showing which server node is selected to serve the request for a set of test keys.

In the last row you can see that when node 2 is out of service, its load is distributed among nodes 0, 1 and 3. In the meantime, all other requests continue to be served by the same server node as in the normal situation. That eliminates the need to completely redistribute the cache.

Key             396562  673665  115181  650428  804339  394035  280572  108093  938266  125314
MOD 25              12      15       6       3      14      10      22      18      16      14

Node selection
Normal               4       3       0       2       1       2       4       0       2       1
Node 2 OOS           4       3       0       0       1       3       4       0       1       1

The advantage of this algorithm is that the look-up is fast and takes consistent time regardless of the number of server nodes. The disadvantage is the need to maintain the look-up ring, especially when new nodes are added to the cluster.
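
The look-up itself fits in a few lines. The Python sketch below uses the 25-slot ring from the illustration and reproduces both node selection rows of the table; ring_lookup and its out_of_service parameter are illustrative names.

```python
RING = [3, 1, 4, 2, 0, 3, 0, 2, 4, 1, 2, 3, 4,
        0, 1, 3, 2, 1, 0, 4, 3, 1, 4, 0, 2]

KEYS = [396562, 673665, 115181, 650428, 804339,
        394035, 280572, 108093, 938266, 125314]


def ring_lookup(key, ring, out_of_service=frozenset()):
    """Return the node for a key, walking forward past out-of-service nodes."""
    start = key % len(ring)
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)]  # wrap around the ring
        if node not in out_of_service:
            return node
    raise RuntimeError("no server node is in service")


normal = [ring_lookup(key, RING) for key in KEYS]
node2_oos = [ring_lookup(key, RING, out_of_service={2}) for key in KEYS]

print(normal)     # [4, 3, 0, 2, 1, 2, 4, 0, 2, 1]
print(node2_oos)  # [4, 3, 0, 0, 1, 3, 4, 0, 1, 1]
```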

Key+Node Hash Algorithm

This algorithm relies on a good hash function, commonly MD5 or SHA-1. For each request, compute a value for each active node. The value is the hash of a string consisting of the key and the node (node number, node name or anything that uniquely identifies the node). The server that yields the largest hash value takes the request. The table below demonstrates the node selection process for a set of test keys. The hash function used here is for illustration purposes only; it is neither MD5 nor SHA-1.

In the last row, you can see that when node 2 is out of service, its load is distributed among nodes 0, 1 and 4. In the meantime, all other requests continue to be served by the same server node as in the normal situation. That eliminates the need to completely redistribute the cache.

Node \ Key      396562  673665  115181  650428  804339  394035  280572  108093  938266  125314

Hash of Key concatenating Node
Node 0           81526   40031   29723   53735   23911   34931   96088   43852   56076   38777
Node 1            5425   19393   93416   53022   51364   84920   51352   70016   26255   30336
Node 2           93129   26422   83633   65930   81901   87666   50754   32221   29866    7363
Node 3           40372   44005   22422   32105   80448   39727   33887   31331   82034   93235
Node 4            4337   89463   87164   64973   90511   14499   88153   11442   63305   29493

Node selection
Normal               2       4       1       2       4       2       0       1       3       3
Node 2 OOS           0       4       1       4       4       1       0       1       3       3

The advantage of this algorithm is that it is simple and low maintenance. Nodes can easily be added to and removed from the cluster. The disadvantage is the overhead of calculating the hash values for each request, and that overhead grows as the number of nodes in the cluster increases.
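
This scheme is commonly known as rendezvous hashing, or highest-random-weight hashing. The Python sketch below is one possible rendering using MD5 from the standard library. The "key:node" string layout is an assumption made for this example, and because the table above uses a different illustrative hash, the selected nodes will not match it. What the sketch does show is the consistency property: taking node 2 out of service only moves the keys that node 2 was serving.

```python
import hashlib

KEYS = [396562, 673665, 115181, 650428, 804339,
        394035, 280572, 108093, 938266, 125314]


def score(key, node):
    """Hash the key concatenated with the node id and return it as an integer."""
    digest = hashlib.md5(f"{key}:{node}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def pick_node(key, active_nodes):
    """Return the active node with the largest hash value for this key."""
    return max(active_nodes, key=lambda node: score(key, node))


normal = [pick_node(key, [0, 1, 2, 3, 4]) for key in KEYS]
node2_oos = [pick_node(key, [0, 1, 3, 4]) for key in KEYS]

for key, before, after in zip(KEYS, normal, node2_oos):
    # Only keys that were on node 2 are allowed to move.
    assert before == after or before == 2
    print(key, before, after)
```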

Saturday, October 27, 2012

Integrate Java and GraphicsMagick – im4java Performance

Introduction

In the post Integrate Java and GraphicsMagick – Conception, I expressed a concern about the performance of im4java’s integration approach. To verify whether the issue actually exists, we implemented our first beta application using im4java and ran a baseline test.

Application

This is a web application that dynamically resizes source images that are 700x700 or smaller to a smaller dimension. An HTTP request comes in with the individual image and the target size specified; the web application locates the source image, resizes it to the target size, then streams the image back to the requester, typically a browser.

The application simply uses “gm convert <source> -scale <width>x<height> <target>”, which is probably the fastest way to scale down an image. We throttled the application to run at most 32 gm processes at any given time because we got the best throughput with that setting.
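
The throttling idea can be sketched with a counting semaphore. The example below is illustrative Python rather than the application's actual Java and im4java code: each request spawns one "gm convert" process, and a semaphore with 32 permits keeps any further requests waiting until a process finishes. The names scale_image and MAX_GM_PROCESSES are hypothetical, and the run parameter exists only so the process launcher can be replaced in tests.

```python
import subprocess
import threading

MAX_GM_PROCESSES = 32  # the setting that gave the best throughput in our test
_gm_slots = threading.BoundedSemaphore(MAX_GM_PROCESSES)


def scale_image(source, width, height, target, run=subprocess.run):
    """Spawn one 'gm convert' process, never more than 32 at the same time."""
    command = ["gm", "convert", source, "-scale", f"{width}x{height}", target]
    with _gm_slots:  # blocks while 32 gm processes are already running
        return run(command, check=True)


# Example call from a request-handling thread (requires gm on the PATH):
# scale_image("a.jpg", 260, 420, "a-260x420.jpg")
```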

Test Environment

Hardware

RAM: 32GB

CPU: 32 cores - 4 AMD Opteron(TM) Processor 6274 (8 cores)

Test Tool

JMeter is used to execute the test. The machine that runs JMeter has the same hardware as the test machine. There is a high-speed network connection between them.

Test Setup

We used 200K unique sample images, and each image is requested twice: once converted to 260x420 and once converted to 114x166.

One thread group is used to run multiple threads in parallel. Each thread represents one concurrent user. We ran the test multiple times with different numbers of concurrent users. Each thread requests images one after another, with no delay in between.

We measure the throughput in terms of total number of images resized within a second by the server.

Test Result

The table below lists the total throughput under different loads.

Concurrent Users    Total Throughput (images/second)
50                  278
500                 109
5000                19
20000               4

The throughput degrades significantly as the number of concurrent users increases; it performs worst when we need it most, under high load. The throughput isn’t ideal even at 50 concurrent users, considering it is running on such a powerful server.

Conclusion

It is clear that we need a better solution than this. We also want to make it clear that the bottleneck is not im4java, as it simply constructs the command line and invokes the GM process. The major overhead is spawning the new process. The test shows that spawning a new process is very expensive for the JVM, especially when it holds a large number of open file handles for socket connections.

Finally, we set out to implement our proposed solution.
