[Durham] Calling C code from Java

Sun Sep 9 10:46:18 UTC 2012

Hi Martin,

On 04/09/2012 22:32, Oliver Burnett-Hall wrote:
> On 4 September 2012 15:03, Martin Ward <martin at gkc.org.uk> wrote:
>>
>> I am looking into converting Assembler and COBOL code into Java:
>> for which there are two main problems:
>
> Martin,
>
> From the problems you've got, and the possible solutions you are
> suggesting, it sounds to me that you're still trying to program in a
> procedural language, rather than getting into the Java mindset.
>
>> (1) Data types: in assembler, any data can be used in any operation
>> regardless of type. In COBOL, all data is typed, but an area of memory
>> can be "redefined" as two or more different types (similar to C's union).
>> In Java, all data has a single type and cannot be recast or unioned.
>> The workaround for this is to determine the "main" type for each data item,
>> and declare it as that type, then develop conversion functions
>> for when the program wants to apply a different operation on a type
>> (eg convert a number to a string of bytes, or vice-versa).
>
> A more OO solution to this would be to create different classes for
> each of the possible types, and then either define polymorphic
> functions that accept arguments of each type, or make the classes
> implement the same interface so that they have the same methods and
> accessor functions.
>
> There will be an increased overhead from using classes instead of
> primitive types but, unless you're in a highly performance critical
> section of code, it usually doesn't make a great deal of difference.
>
There's not much I can add to Olly's suggestions. Some of your decisions 
will undoubtedly come down to how much interoperability with legacy 
assembler/COBOL you need to retain.

I don't think you would get away with a byte-for-byte mapping between
the Java and legacy in-memory representations of types. You could
certainly manage it at the boundary, when calling between languages
using JNA. If you can live with converting only when calling between
the two, then a polymorphic library that does the type conversions as
required might work.
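
A minimal sketch of what such a conversion layer could look like,
combining it with Olly's common-interface idea. The names LegacyField,
BinaryField and CharField are made up for illustration, and ISO-8859-1
stands in for whatever code page the legacy side really uses (most
likely an EBCDIC one), so treat this as an assumption-laden outline
rather than a tested design:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical common interface: every legacy data item can hand over
// its raw byte representation for calls across the language boundary.
interface LegacyField {
    byte[] toBytes();
}

// A 32-bit binary number, serialised big-endian (ByteBuffer's default order).
final class BinaryField implements LegacyField {
    private final int value;

    BinaryField(int value) { this.value = value; }

    // Reinterpret the first four raw bytes as a number.
    static BinaryField fromBytes(byte[] raw) {
        return new BinaryField(ByteBuffer.wrap(raw).getInt());
    }

    int value() { return value; }

    @Override
    public byte[] toBytes() {
        return ByteBuffer.allocate(4).putInt(value).array();
    }
}

// A character field. ISO-8859-1 maps every byte to exactly one char,
// so the round trip is lossless; a real port would pick the legacy code page.
final class CharField implements LegacyField {
    private final String text;

    CharField(String text) { this.text = text; }

    static CharField fromBytes(byte[] raw) {
        return new CharField(new String(raw, StandardCharsets.ISO_8859_1));
    }

    String text() { return text; }

    @Override
    public byte[] toBytes() {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }
}

final class ConversionDemo {
    public static void main(String[] args) {
        // "Redefine" a four-character field as a binary number.
        LegacyField asText = new CharField("ABCD");
        BinaryField asNumber = BinaryField.fromBytes(asText.toBytes());
        System.out.println(Integer.toHexString(asNumber.value())); // prints 41424344
    }
}
```

The "main" type stays an ordinary Java value, and the bytes are only
produced when some other view of the data is actually needed.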

If you need to interoperate on the in-memory byte representation, then
the other option to consider would be developing a JNI interface that
represents your datatypes. From that you could convert the business
logic into Java. Note that I haven't tried this, so I'm going from
theory here, not practical experience.

>> (2) Pointers: specifically pointer arithmetic. Java only has a limited
>> form of pointers (references) with no pointer arithmetic or way to get
>> the address of a data item, or convert a number to an address.
>> The workaround here would be to pass a reference to the data to
>> a C function which returns an integer. The integer being simply
>> the address that the function was passed. Dereferencing a pointer
>> stored in an integer would work in the same way: pass an integer
>> to a C function which returns a pointer to the data (with the pointer
>> being simply the integer which was passed).
>> One problem here is that Java may not lay out data in memory
>> in an exact sequence: for assembler to C conversion we had
>> to put all the data into one large struct which was defined as packed.
>> This ensured that an offset from the address of one data item
>> would pick up the right data. In Java this may be more difficult.
>> Another problem is garbage collection: the Java garbage collector
>> might move data around and invalidate all the "fake pointers".
>> We might have to define any data we want to take the address of
>> on the "C side" of the boundary.
>
> Java certainly doesn't do pointer arithmetic and what you're
> suggesting sounds very scary and error prone.
>
> There are a couple of ways of accessing memory, such as the
> ByteBuffer, but I know almost nothing about these.  Why are you
> needing to do this type of memory manipulation?  Is there any way you
> could avoid getting so close to the bare metal?
>

Truly scary to try to get that right.

Given that I have some past insight into what you (Martin) may be
trying to do...

Assuming that you can get rid of the legacy code altogether, why not
do an initial simple conversion from legacy to new using a virtual
machine/memory class that basically uses an array as its backing store?
Your translated code would then access that backing store at runtime.
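
Roughly, and only as a sketch: the class name VirtualMemory and its
method names are invented here, and big-endian word order is an
assumption about the legacy platform. A heap ByteBuffer is used because
it is backed by a plain byte array and already provides the indexed
accessors:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical "virtual memory" for translated code: one flat,
// array-backed buffer, with plain ints standing in for legacy addresses.
final class VirtualMemory {
    private final ByteBuffer store;

    VirtualMemory(int sizeInBytes) {
        // Assumption: the legacy platform stores words big-endian.
        store = ByteBuffer.allocate(sizeInBytes).order(ByteOrder.BIG_ENDIAN);
    }

    int loadWord(int address)              { return store.getInt(address); }
    void storeWord(int address, int value) { store.putInt(address, value); }
    byte loadByte(int address)             { return store.get(address); }
    void storeByte(int address, byte b)    { store.put(address, b); }
}

final class VirtualMemoryDemo {
    public static void main(String[] args) {
        VirtualMemory mem = new VirtualMemory(64);

        int base = 16;                    // "address" of some record
        mem.storeWord(base + 4, 42);      // field at offset 4 in that record

        // Pointer arithmetic is ordinary int arithmetic, and a pointer
        // can itself be stored in memory like any other word.
        mem.storeWord(0, base + 4);
        int pointer = mem.loadWord(0);
        System.out.println(mem.loadWord(pointer)); // prints 42

        // The same bytes can be read back under a different "type".
        mem.storeWord(8, 0x01020304);
        System.out.println(mem.loadByte(8));       // prints 1
    }
}
```

Because the addresses are just indexes into the store, the garbage
collector moving the buffer around does not invalidate them, and the
layout is exactly what the translated code asks for.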

If you needed interop with legacy code you might even be able to develop 
a JNI version that uses the real memory as backing store. But I'd do 
that later once the concept is proven.

Then when you start refactoring the code you can develop high-level
object abstractions that hold their own state but also reference the
original backing store to (a) check that no-one else has changed the
data, i.e. your refactoring was a bit wrong; and (b) provide interop
until you can safely drop the old representation.
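
As an illustration of (a) and (b), here is a small sketch of one such
object. TrackedIntField is a hypothetical name, the backing store is
assumed to be a ByteBuffer, and the field is assumed to be a single
32-bit word at a known offset:

```java
import java.nio.ByteBuffer;

// Hypothetical refactored object: it holds its own copy of the value,
// but keeps the legacy backing store in step and checks it on every access.
final class TrackedIntField {
    private final ByteBuffer store; // the original backing store
    private final int address;      // offset of this field in the store
    private int value;              // the object's own state

    TrackedIntField(ByteBuffer store, int address) {
        this.store = store;
        this.address = address;
        this.value = store.getInt(address);
    }

    int get() {
        checkInSync();
        return value;
    }

    void set(int newValue) {
        checkInSync();
        value = newValue;
        // (b) keep the old representation current for unconverted code
        store.putInt(address, newValue);
    }

    // (a) detect writes that bypassed this object
    private void checkInSync() {
        int legacy = store.getInt(address);
        if (legacy != value) {
            throw new IllegalStateException("backing store changed at " + address
                    + ": expected " + value + " but found " + legacy);
        }
    }
}

final class TrackedFieldDemo {
    public static void main(String[] args) {
        ByteBuffer store = ByteBuffer.allocate(16);
        store.putInt(4, 100);

        TrackedIntField field = new TrackedIntField(store, 4);
        field.set(250);
        System.out.println(store.getInt(4)); // prints 250

        store.putInt(4, 7);                  // some unconverted code writes directly
        try {
            field.get();
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Once the check has stopped firing and nothing else reads the store, the
store reference and the check can be deleted, leaving a plain object.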

A lot of this assumes that performance/memory usage isn't an issue.

Those are some ideas anyway. A lot of it really does depend on how your
new code needs to interoperate.

Regards

Richard