2015-08-07

JyNI (final) status update

Because I was accepted as a speaker at the JVM Language Summit (talking about the JyNI project, see http://openjdk.java.net/projects/mlvm/jvmlangsummit/agenda.html), I had to finish my GSoC project a bit earlier, i.e. already today. I took the last two weeks off to work full time on the project, and garbage collection is finally working as proposed. Unfortunately I could not investigate the optional goal of ctypes support, but I will follow up on this as soon as possible.


Modifications to Jython


To make this work, some modifications to Jython were required.

New GC-flags


1) FORCE_DELAYED_FINALIZATION
If activated, all Jython-style finalizers are not directly processed, but cached for a moment and processed right after the ordinary finalization process.

2) FORCE_DELAYED_WEAKREF_CALLBACKS
Rather similar to FORCE_DELAYED_FINALIZATION, but delays callbacks of Jython-style weak references.

JyNI always activates FORCE_DELAYED_WEAKREF_CALLBACKS and if there are native objects that can potentially cause a PyObject resurrection (JyNI-GC needs this sometimes), JyNI also activates FORCE_DELAYED_FINALIZATION.

FORCE_DELAYED_WEAKREF_CALLBACKS makes it possible to restore weak references that point to the resurrected object. This is done in a thread-safe manner: if someone calls a weak reference's get-method while the weakref is in a pending state, the call blocks until the reference has been restored or finally cleared.

FORCE_DELAYED_FINALIZATION allows JyNI to prevent Jython-style finalizers from running if their objects were resurrected as a consequence of an object-resurrection by JyNI.

This way the object-resurrection can be performed without any notable impact on Jython-level. (Raw Java-weak references would still break and also Java-finalizers would run too early, which is why PyObjects must not implement raw Java-finalizers.)
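
The idea behind FORCE_DELAYED_FINALIZATION can be illustrated with a small toy model. This is only a sketch in plain Python, not Jython's or JyNI's actual code; the class DelayedFinalizerQueue and its methods are hypothetical names chosen for this example. Finalizers are cached first and only run for objects that were not resurrected in the meantime.

```python
class DelayedFinalizerQueue(object):
    """Toy model of delayed finalization (all names are hypothetical)."""

    def __init__(self):
        self._pending = []          # (object id, finalizer) pairs awaiting a decision
        self._resurrected = set()   # ids of objects that were brought back to life

    def schedule(self, obj_id, finalizer):
        # Instead of running the finalizer right away, remember it.
        self._pending.append((obj_id, finalizer))

    def resurrect(self, obj_id):
        # Called when it turns out that the object is still needed.
        self._resurrected.add(obj_id)

    def process(self):
        # Runs after the ordinary finalization phase and skips resurrected objects.
        ran = []
        for obj_id, finalizer in self._pending:
            if obj_id not in self._resurrected:
                finalizer()
                ran.append(obj_id)
        self._pending = []
        self._resurrected.clear()
        return ran


log = []
queue = DelayedFinalizerQueue()
queue.schedule("a", lambda: log.append("finalized a"))
queue.schedule("b", lambda: log.append("finalized b"))
queue.resurrect("b")     # "b" was resurrected, so its finalizer must not run
ran = queue.process()
```

In this model only the finalizer of "a" runs, while "b" stays untouched and could be finalized in a later cycle.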


Empty PyTuple


While working on Jython I took the opportunity to unify Py.EmptyTuple and PyTuple.EMPTY_TUPLE. These were two singleton-constants for the same purpose. JyNI also has a native counterpart constant for empty tuples, but until now it was not clear to which of the named Jython constants it should be bound.


JyNI's dependence on these features implies that JyNI requires Jython >2.7.0 from now on. I aim to sync JyNI 2.7-alpha3-release with Jython 2.7.1, so that JyNI's >2.7.0-requirement is fulfillable.


Garbage collection


The more advanced GC-behavior is tested in test_JyNI_gc.py.
test_gc_list_modify_update demonstrates the case where the native reference graph is modified by an object that properly reports the modification to JyNI.
To test the edge case of gc with a silently modified native reference graph I added the listSetIndex-method to DemoExtension. This method modifies the native reference graph without reporting it to JyNI. test_gc_list_modify_silent verifies that JyNI properly detects this issue and performs resurrection as needed.
It further tests that Jython-style weak references that point to the resurrected object stay valid.


JyNI-support for weak references


Support for weak references is implemented, but not yet fully stable. It still needs tests and more debugging work. For now the code is included in JyNI, but not active. I will follow up on this, like on ctypes, as soon as possible.

2015-07-25

JyNI status update

For the midterm evaluation the milestone focused on building the mirrored reference graph, detecting native reference leaks and cleaning them up. Since then I have focused on keeping the reference graph up to date. I also turned the GC-demo script into gc-unittests, see test_JyNI_gc.py.

32 bit (Linux) JNI issue


For some reason test_JyNI_gc fails on 32 bit Linux, seemingly (?) due to a JNI bug. JNI does not properly pass some debug info to the Java side and causes a JVM crash. I spent over a day desperately trying several workarounds and double and triple checked for correct JNI usage (the issue would also occur on 64 bit Linux if something was wrong here). The issue persists for Java 7 and 8, and when building JyNI with either gcc or clang. The only way to avoid it seems to be passing less debug info to the Java side in JyRefMonitor.c. Strangely, the issue also persists when the debug info is passed via a separate method call or object. However, it would be hard or impossible to turn this into a reasonably reproducible JNI bug report. For now I decided not to spend more time on this issue and to remove the debug info right before the alpha3 release. Until that release the gc-unittests are not usable on 32 bit Linux. Maybe I will investigate this issue further after GSoC and try to file an appropriate bug report.

Keeping the gc reference-graph up to date


I went through the C-source code of various CPython builtin objects and identified all places where the gc-reference graph might be modified. I inserted update-code into all these places, but so far it has only been tested explicitly for PyList. All unittests and also the Tkinter demo still run fine with this major change.

Currently I am implementing detection of silent modification of the reference graph. While the update code covers all JyNI-internal calls that modify the graph, there might be modifications performed by extension code via macros. The place to detect these is JyGC_clearNativeReferences in gcmodule. It is being enhanced by code that checks the objects-to-be-deleted for consistent native reference counts. All counts should be explainable within this subgraph. If there are unexplainable reference counts, this indicates unknown external links, probably created by an extension via some macro, e.g. PyList_SET_ITEM. In this case we'll update the graph accordingly. Depending on the object type we might have to resurrect the corresponding Java object. I hope to get this done over the weekend.
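
The consistency check can be sketched in a few lines of plain Python. This is a simplified model and not the code in gcmodule; the function find_unexplained and the dictionary layout are assumptions made for this illustration. Every object in the to-be-deleted subgraph has a native ref-count, and each count must be covered by references known to the mirrored graph.

```python
def find_unexplained(refcounts, edges):
    """Return the objects whose ref-count is not explained by known edges.

    refcounts: object name -> native ref-count
    edges:     object name -> list of objects it references, as known to the
               mirrored graph (hypothetical data layout for this sketch)
    """
    explained = dict((obj, 0) for obj in refcounts)
    for targets in edges.values():
        for dst in targets:
            if dst in explained:
                explained[dst] += 1
    return sorted(obj for obj in refcounts if refcounts[obj] > explained[obj])


counts = {"tuple": 1, "list": 1, "item": 1}

# The mirrored graph knows about every reference: nothing is suspicious.
consistent = find_unexplained(counts, {"tuple": ["list"], "list": ["tuple", "item"]})

# "item" was stored by a macro without a report, so its count is unexplained.
silent = find_unexplained(counts, {"tuple": ["list"], "list": ["tuple"]})
```

An unexplained count is the signal to update the graph and, depending on the type, to resurrect the Java counterpart instead of freeing the object.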

2015-06-28

Midterm evaluation

The midterm-evaluation milestone is as follows:
Have JyNI detect and break reference-cycles in native objects backed by Java-GC. This must be done by Java-GC in order to deal with interfering non-native PyObjects. Further this functionality must be monitorable, so that it can transparently be observed and confirmed.

Sketch of some issues

The issues to overcome for this milestone were manifold:
  • The ordinary reference-counting for scenarios that should actually work without GC contained a lot of bugs in JyNI's C-code. When I initially wrote this code, the GC-concept was still an early draft and in many scenarios it was unclear whether and how reference-counting should be applied. All of this had to be fixed now (and there are probably still remaining issues of this type).
  • JNI defines a clear policy on how to deal with provided jobject-pointers. Some of them must be freed explicitly. On the other hand, some might be freed implicitly by the JVM, without your intention, if you don't get it right. On this front, too, a vast clean-up of JyNI-code was needed, not least to avoid immortal trash.
  • JyNI used to keep alive Java-side-PyObjects that were needed by native objects indefinitely.
    Now these must be kept alive by the Java-copy of the native reference-graph instead. It was hard to make this mechanism sufficiently robust. Several bugs caused reference-loss and had to be found to make the entire construct work. On the other hand some bugs also caused hard references to persist, which kept Java-GC from collecting the right objects and triggering JyNI's GC-mechanism.
  • Issues with converting self-containing PyObjects between native side and Java-side had to be solved. These were actually bugs unrelated to GC, but still had to be solved to achieve the milestone.
  • A mechanism to monitor native references from Java-side, especially their malloc/free actions had to be established.
    Macros to report these actions to Java/JyNI were inserted into JyNI's native code directly before the actual calls to malloc or free. What made this tricky is the fact that some objects are not freed by native code (which was largely inherited from CPython 2.7), but cached for future use (e.g. one-letter strings, small numbers, short tuples, short lists). Acquiring/returning an object from/to such a cache is now also reported as malloc/free, but specially flagged. For all these actions JyNI records timestamps and maintains a native object-log where one can transparently see the life-cycle of each native object.
  • The original plan to explore native object's connectivity in the GC_Track-method is not feasible because for tuples and lists this method is usually called before the object is populated.
    JyNI will have a mechanism to make it robust against invalid exploration-attempts, but this mechanism should not be used for normal basic operation (e.g. tuple-allocation happens for every method-call) but only for edge cases, e.g. if an extension defines its own types, registers instances of them in JyNI-GC and then does odd stuff with them.
    So now GC_track saves objects in a todo-list regarding exploration and actual exploration is performed at some critical JyNI-operations like on object sync-on-init or just before releasing the GIL. It is likely that this strategy will have to be fine-tuned later.
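
The last point, deferring exploration instead of doing it at tracking time, can be modeled roughly as follows. This is a plain-Python sketch and not JyNI code; DeferredExplorer and its methods are hypothetical names. Objects are only put on a todo-list when they are tracked, and their references are read later at a safe point.

```python
class DeferredExplorer(object):
    """Toy model of a todo-list for reference exploration (hypothetical)."""

    def __init__(self):
        self._todo = []   # objects registered but not yet explored
        self.graph = {}   # id(object) -> ids of directly referenced containers

    def track(self, obj):
        # Analogous to the tracking call: the object may not be populated yet.
        self._todo.append(obj)

    def explore_pending(self):
        # Called at a safe point, when tracked objects are expected to be filled.
        while self._todo:
            obj = self._todo.pop()
            self.graph[id(obj)] = [id(c) for c in obj if isinstance(c, (list, tuple))]


explorer = DeferredExplorer()
inner = ["test"]
outer = []
explorer.track(outer)               # registered while still empty
seen_at_track_time = list(outer)    # what an immediate exploration would have seen
outer.append(inner)                 # populated only afterwards
explorer.explore_pending()          # e.g. just before releasing the GIL
```

Exploring at tracking time would have recorded no references for outer, while the deferred exploration sees the link to inner.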

Proof of the milestone

To prove the achievement of the explained milestone I wrote a script that creates a reference-cycle of a tuple and a list such that naive reference-counting would not be sufficient to break it. CPython would have to make use of its garbage collector to free the corresponding references.
  1. I pass the self-containing tuple/list to a native method-call to let JyNI create native counterparts of the objects.
  2. I demonstrate that JyNI's reference monitor can display the corresponding native objects ("leaks" in some sense).
  3. The script runs Java-GC and confirms that it collects the Jython-side objects (using a weak reference).
  4. JyNI's GC-mechanism reports native references to clear. It found them, because the corresponding JyNI GC-heads were collected by Java-GC.
  5. Using JyNI's reference monitor again, I confirm that all native objects were freed. Also those in the cycle.
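
The claim that CPython needs its garbage collector for such a cycle can be checked independently of JyNI. The following short script is meant for plain CPython, not for Jython, and uses only the standard gc module. It builds the same self-containing tuple and asks the collector how many unreachable objects it found.

```python
import gc

gc.disable()               # rule out an automatic collection during the demo
gc.collect()               # start from a clean state

l = (123, [0, "test"])
l[1][0] = l                # the tuple now contains itself via the list
del l                      # ref-counts stay non-zero because of the cycle

found = gc.collect()       # number of unreachable objects the collector found
gc.enable()
```

On CPython, found should account for at least the tuple and the list, since neither can be freed by reference-counting alone.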

The GC demonstration-script


import time
from JyNI import JyNI
from JyNI import JyReferenceMonitor as monitor
from JyNI.gc import JyWeakReferenceGC
from java.lang import System
from java.lang.ref import WeakReference
import DemoExtension

#Note:
# For now we attempt to verify JyNI's GC-functionality independently from
# Jython concepts like Jython weak references or Jython GC-module.
# So we use java.lang.ref.WeakReference and java.lang.System.gc
# to monitor and control Java-gc.

JyNI.JyRefMonitor_setMemDebugFlags(1)
JyWeakReferenceGC.monitorNativeCollection = True

l = (123, [0, "test"])
l[1][0] = l
#We create weak reference to l to monitor collection by Java-GC:
wkl = WeakReference(l)
print "weak(l): "+str(wkl.get())

# We pass down l to some native method. We don't care for the method itself,
# but conversion to native side causes creation of native PyObjects that
# correspond to l and its elements. We will then track the life-cycle of these.
print "make l native..."
DemoExtension.argCountToString(l)

print "Delete l... (but GC not yet ran)"
del l
print "weak(l) after del: "+str(wkl.get())
print ""
# monitor.list-methods display the following format:
# [native pointer]{'' | '_GC_J' | '_J'} ([type]) #[native ref-count]: [repr] *[creation time]
# _GC_J means that JyNI tracks the object
# _J means that a JyNI-GC-head exists, but the object is not actually treated by GC
# This can serve monitoring purposes or soft-keep-alive (c.f. java.lang.ref.SoftReference)
# for caching.
print "Leaks before GC:"
monitor.listLeaks()
print ""

# By inserting this line you can confirm that native
# leaks would persist if JyNI-GC is not working:
#JyWeakReferenceGC.nativecollectionEnabled = False

print "calling Java-GC..."
System.gc()
time.sleep(2)
print "weak(l) after GC: "+str(wkl.get())
print ""
monitor.listWouldDeleteNative()
print ""
print "leaks after GC:"
monitor.listLeaks()

print ""
print "===="
print "exit"
print "===="


It is contained in JyNI in the file JyNI-Demo/src/JyNIRefMonitor.py

Instructions to reproduce this evaluation

  1. You can get the JyNI-sources by calling
    git clone https://github.com/Stewori/JyNI
    Switch to JyNI-folder:
    cd JyNI
  2. (On Linux with gcc) edit the makefile (for OSX with llvm/clang makefile.osx) to contain the right paths for JAVA_HOME etc. You can place a symlink to jython.jar (2.7.0 or newer!) in the JyNI-folder or adjust the Jython-path in makefile.
  3. Run make (Linux with gcc)
    (for OSX with clang use make -f makefile.osx)
  4. To build the DemoExtension enter its folder:
    cd DemoExtension
    and run setup.py:
    python setup.py build
    cd ..
  5. Confirm that JyNI works:
    ./JyNI_unittest.sh
  6. Run the GC demonstration:
    ./JyNI_GCDemo.sh

Discussion of the output


Running JyNI_GCDemo.sh:

JyNI: memDebug enabled!
weak(l): (123, [(123, [...]), 'test'])
make l native...
Delete l... (but GC not yet ran)
weak(l) after del: (123, [(123, [...]), 'test'])

Leaks before GC:
Current native leaks:
139971370108712_GC_J (list) #2: "[(123, [...]), 'test']" *28
139971370123336_J (str) #2: "test" *28
139971370119272_GC_J (tuple) #1: "((123, [(123, [...]), 'test']),)" *28
139971370108616_GC_J (tuple) #3: "(123, [(123, [...]), 'test'])" *28

calling Java-GC...
weak(l) after GC: None

Native delete-attempts:
139971370108712_GC_J (list) #0: -jfreed- *28
139971370123336_J (str) #0: -jfreed- *28
139971370119272_GC_J (tuple) #0: -jfreed- *28
139971370108616_GC_J (tuple) #0: -jfreed- *28

leaks after GC:
no leaks recorded

====
exit
====

Let's briefly discuss this output. We created a self-containing tuple called l. To allow it to contain itself we must put a list in between. Using a Java-WeakReference, we confirm that Java-GC collects our tuple. Before that we let JyNI's reference monitor print a list of native objects that are currently allocated. We refer to them as "leaks", because all native calls are over and there is no obvious need for natively allocated objects now. #x denotes their current native ref-count. The counts are explained as follows (observe that the objects form a cycle):
139971370108712_GC_J (list) #2: "[(123, [...]), 'test']"

This is l[1]. One reference is from JyNI to keep it alive, the second one is from l.

139971370123336_J (str) #2: "test"

This is l[1][1]. One reference is from JyNI to keep it alive, the second one is from l[1].


139971370119272_GC_J (tuple) #1: "((123, [(123, [...]), 'test']),)"
This is the argument-tuple that was used to pass l to the native method. The reference is from JyNI to keep it alive.
139971370108616_GC_J (tuple) #3: "(123, [(123, [...]), 'test'])" 
This is l. One reference is from JyNI to keep it alive, the second one is from the argument-tuple (139971370119272) and the third one is from l[1]. Thus it builds a reference-cycle with l[1].

After running Java-GC (and giving it some time to finish) we confirm that our weak reference to l was cleared. And indeed, JyNI's GC-mechanism reported some references to clear, all reported leaks among them. Finally another call to JyNI's reference monitor does not list any leaks any more.

Check that this behavior is not self-evident

In JyNI-Demo/src/JyNIRefMonitor.py go to the section:

# By inserting this line you can confirm that native
# leaks would persist if JyNI-GC is not working:
#JyWeakReferenceGC.nativecollectionEnabled = False

Change it to
# By inserting this line you can confirm that native
# leaks would persist if JyNI-GC is not working:

JyWeakReferenceGC.nativecollectionEnabled = False

Run JyNI_GCDemo.sh again. You will notice that the native leaks persist.

Next steps

The mechanism currently does not cover all native types. While many should already work, I expect that some bugfixing and clean-up will be required to make this actually work. With the demonstrated reference-monitor-mechanism the necessary tools to make this debugging straightforward are now available.

After fixing the remaining types and providing some tests for this, I will implement an improvement to the GC-mechanism that makes it robust against silent modification of native PyObjects (e.g. via macros), and provide tests for this as well.

Finally I will add support for the PyWeakReference builtin type. As far as time allows after that I'll try to get ctypes working.

2015-06-19

GSoC status-update for 2015-06-19

I finally completed the core-GC routine that explores the native PyObject reference-connectivity graph and reproduces it on Java-side. Why mirror it on Java side? Let me summarize the reasoning here. Java performs a mark-and-sweep GC on its Java-objects, but there is no way to extend this to native objects. On the other hand, using CPython's reference-counting approach for native objects is not always feasible, because there are cases where a native object must keep its Java-counterpart alive (JNI provides a mechanism for this), allowing it to participate in an untraceable reference cycle. So we go the other way round here, and let Java-GC track a reproduction of the native reference connectivity-graph. Whenever we observe that it deletes a node, we can discard the underlying native object. Keeping the graph up to date is still a tricky task, which we will deal with in the second half of GSoC.
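
The principle can be imitated in plain CPython 3, with CPython's cycle collector standing in for Java-GC. This is only an analogy and not how JyNI is implemented; GCHead, native_heap and make_native are hypothetical names, and weakref.finalize plays the role of the notification that a node was collected.

```python
import gc
import weakref

native_heap = {}            # hypothetical stand-in for natively allocated objects


class GCHead(object):
    """Managed-side node that mirrors one native object (hypothetical name)."""

    def __init__(self, ptr):
        self.ptr = ptr
        self.links = []     # mirrors the references held by the native object


def make_native(ptr, payload):
    native_heap[ptr] = payload
    head = GCHead(ptr)
    # When the managed collector drops the head, release the native object.
    weakref.finalize(head, native_heap.pop, ptr, None)
    return head


a = make_native(1, "tuple")
b = make_native(2, "list")
a.links.append(b)           # native tuple -> list
b.links.append(a)           # native list -> tuple, which closes the cycle

del a, b                    # from now on only the cycle keeps the heads alive
gc.collect()                # the managed GC detects the cyclic garbage
```

After the collection both entries are gone from native_heap, although plain reference-counting on the native side could never have freed the cycle.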

The native reference-graph is explored using the CPython-style traverseproc mechanism, which is also implemented by extensions that expect to use GC at all. To mirror the graph on Java-side I am distinguishing 8 regular cases displayed in the following sketch. These cases deal with representing the connection between the native and the managed side of the JVM.

In the sketch you can see that native objects have a so-called Java-GC-head assigned that keeps alive the native object (non-dashed arrow), but is only weakly reachable from it (dashed arrow). The two left-most cases deal with objects that only exist natively. The non-GC-case usually needs no Java-GC-head as it cannot cause reference cycles. Only in GIL-free mode we would still track it as a replacement for reference-counting. However GIL-free mode is currently a vague consideration and out of scope for this GSoC-project. Case 3 and 4 from left deal with objects where Jython has no corresponding type and JyNI uses a generic PyCPeer-object - a PyObject-subclass forwarding the magic methods to native side. PyCPeer in both variants serves also as a Java-GC-head. CStub-cases refer to situations where the native object needs a Java-object as backend. In these cases the Java-GC-head must not only keep alive other GC-heads, but also the Java-backend. Finally in mirror-mode both native and managed representations can be discarded independently from each other at any time, but for performance reasons we try to softly keep alive the counterparts for a while. On Java-side we can use a soft reference for this.

PyList is a special case in several ways. It is a mutable type that can be modified by C-macros at any time. Usually we move the java.util.List backend to native side. For this it is replaced by JyList - a List-implementation that is backed by native memory, thus allowing C-macros to work on it. The following sketch illustrates how we deal with this case.


It works roughly equivalent to mirror mode but with the difference that the Jython-PyList must keep alive its Java-GC-head. For a most compact solution we build the GC-head functionality into JyList.

Today I finished the implementation of the regular cases, but testing and debugging still needs to be done. I can hopefully round this up for midterm evaluation and also include the PyList-case.

2015-06-05

Establish a proper referencing-paradigm (as preparation for GC-implementation)

In the last days I tidied up a lot of reference-related stuff and clarified what reference-types the type-conversion methods deliver. Before starting with the actual GC-work, much clean-up regarding ordinary reference-counting needed to be done, especially related to singletons and pre-allocated small integers and length-one strings/unicode-objects (characters). Both CPython and Jython pre-allocate such frequently used small objects and I felt these caches should be linked to improve conversion-performance. Now the integer-conversion methods check whether the integers are small pre-allocated objects and, if so, establish a permanent linkage between the CPython- and Jython-occurrence of the converted object.
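
The cache linkage for small integers could look roughly like the following toy model in plain Python. It is not JyNI's conversion code; IntConverter and the tuple used as a native handle are hypothetical, and the range -5..256 is the one CPython 2.7 pre-allocates.

```python
SMALL_INT_MIN, SMALL_INT_MAX = -5, 256   # CPython 2.7's pre-allocated range


class IntConverter(object):
    """Toy model of an integer conversion with linked caches (hypothetical)."""

    def __init__(self):
        self._linked = {}        # value -> permanently linked native handle
        self.allocations = 0

    def _allocate(self, value):
        # Stand-in for creating a fresh native integer object.
        self.allocations += 1
        return ("native-int", value, self.allocations)

    def to_native(self, value):
        if SMALL_INT_MIN <= value <= SMALL_INT_MAX:
            # Small values are linked once and then reused forever.
            if value not in self._linked:
                self._linked[value] = self._allocate(value)
            return self._linked[value]
        return self._allocate(value)


conv = IntConverter()
small_is_shared = conv.to_native(7) is conv.to_native(7)
large_is_shared = conv.to_native(100000) is conv.to_native(100000)
```

Converting 7 twice costs a single allocation, while each conversion of 100000 creates a new native handle.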

More importantly, I established some rules regarding referencing and worked out how to apply them JyNI-wide.

- JyObjects (i.e. JyNI-specific pre-headers of native PyObjects) store a link to a corresponding jobject (as of the very beginning of JyNI-development). It is now specified that this is always of JNI-type WeakGlobalReference. Accordingly I re-declared the JyObject-struct to use jweak-type instead of jobject.

- The conversion-method
jobject JyNI_JythonPyObject_FromPyObject(PyObject* op)
now is specified to return a JNI-LocalReference. If this is obtained from the jweak stored in a JyObject, it is checked to be still alive. If so, a LocalReference is created to keep it alive until the caller is done with it. If the caller wants to store it, he must create a GlobalReference from it, see
http://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html#weak_global_references
for more details on JNI-references. The macros JyNIClearRef and JyNIToGlobalRef are now provided as efficient helpers in this context.

- The conversion-method
PyObject* JyNI_PyObject_FromJythonPyObject(jobject jythonPyObject)
now consistently always returns a *new* reference, i.e. the caller must decref it after using it. (GIL-free mode will be another story, but will most likely just ignore ref-counting, so the decref command won't harm this mode.)

These rules clarify the usage of the conversion-functions. Fixing all calls to these methods JyNI-wide to comply with the new semantics (having to use decref or JyNIClearRef etc.) is still in progress. However, having this referencing-paradigm specified and cleaned up is a crucial initial step towards implementing proper garbage collection in JyNI.
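
The first two rules have a close analogue in Python's weakref module, which may help to picture them. The sketch below is plain CPython and not JNI code; Backend, JyObjectHeader and to_managed are hypothetical names. The header stores only a weak link, and a conversion hands out a strong reference that keeps the counterpart alive while the caller holds it.

```python
import weakref


class Backend(object):
    """Stand-in for a Jython-side object (hypothetical)."""


class JyObjectHeader(object):
    def __init__(self, backend):
        self.jy = weakref.ref(backend)   # weak link, comparable to the jweak field


def to_managed(header):
    """Return a strong reference to the counterpart, or None if it is gone."""
    return header.jy()


backend = Backend()
header = JyObjectHeader(backend)
local = to_managed(header)      # comparable to obtaining a LocalReference
del backend                     # the original owner lets go
alive_while_held = to_managed(header) is local
del local                       # the caller is done with it
gone_after_release = to_managed(header) is None   # relies on CPython ref-counting
```

A caller that wants to keep the object beyond its own scope has to store a strong reference explicitly, which corresponds to creating a GlobalReference in JNI.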

2015-05-22

Getting started...

So I am at GSoC now and really happy that everything worked out so well. Huge thanks to Jim Baker for making this possible!


Now let me introduce you to my project. I am the creator of the JyNI-project, see www.jyni.org for more details. GSoC sponsors the development of the next important milestone: GC-support and hopefully also support for the ctypes-extension. But what am I talking about here... just take a look at my GSoC-abstract:


JyNI is a compatibility layer with the goal to enable Jython to use native CPython extensions like NumPy or SciPy. It already supports a fair part of Python's C-API and is for instance capable of running basic Tkinter code (currently Linux-only). However, a main show-stopper for a production-release is its lack of garbage collection. The gap between Jython- (i.e. Java-)gc and CPython-gc goes far beyond the difference between reference-counting- and mark-and-sweep-based approaches. Even more important is the philosophy regarding gc in native interfaces. While CPython exposes its gc-mechanism in native API, allowing native extensions to leverage it by following some instructions (i.e. perform reference counting and gc-registration), Java's native interface (JNI) leaves memory management completely to the native code. As a preparation for this proposal I worked out a concept of how native gc can be emulated for JyNI in a way that is almost fully compatible with CPython's native gc. That means a native extension written for CPython would run without modification on JyNI, having gc-emulation behave consistently to Jython gc. This includes weak referencing, finalization and object resurrection (aspects that are famously known to make clean gc support so hard). JyNI-support for weak references includes providing the PyWeakReference builtin-type, which is currently the main show-stopper to support the original native ctypes-extension. Thus, as an optional/secondary goal I propose to complete JyNI-support of the ctypes-extension. This is proposed softly, under the assumption that no further, still unrecognized hard show-stoppers for that extension come up.



In a few days the bonding period will be over and things will get serious. Since I did not have to learn much about Jython- and JyNI-internals any more, I could use this period to tie up some loose ends in JyNI which were actually unrelated to the GSoC project; having this done feels much better now. I also fixed two related Jython-issues. Finally I reasoned about the GSoC-project and think it would be a crucial debugging-feature if JyNI could dump a memory-allocation history of all its native references at any time (with Java GC-runs in the time-line please!). To achieve this, I am currently writing the ReferenceMonitor-class, which will also expose this information in a way that allows writing gc-related unittests. (As this monitoring functionality was not considered in the timeline, it would be good to get it done before coding officially starts.)
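
To give an idea of what such a monitor has to record, here is a minimal sketch in plain Python. It is not the ReferenceMonitor-class itself, which is still being written; the class name is reused only for illustration and the method names on_alloc, on_free, on_gc_run and leaks are assumptions.

```python
import itertools


class ReferenceMonitor(object):
    """Toy allocation history with GC-runs in the time-line (hypothetical API)."""

    def __init__(self):
        self._clock = itertools.count(1)   # logical timestamps
        self.history = []                  # (time, action, pointer, type name)
        self._live = {}                    # pointer -> (type name, creation time)

    def on_alloc(self, ptr, type_name):
        t = next(self._clock)
        self.history.append((t, "alloc", ptr, type_name))
        self._live[ptr] = (type_name, t)

    def on_free(self, ptr):
        t = next(self._clock)
        type_name, _ = self._live.pop(ptr)
        self.history.append((t, "free", ptr, type_name))

    def on_gc_run(self):
        self.history.append((next(self._clock), "gc", None, None))

    def leaks(self):
        # Everything that was allocated but not freed so far.
        return sorted(self._live)


monitor = ReferenceMonitor()
monitor.on_alloc(0x10, "tuple")
monitor.on_alloc(0x20, "list")
monitor.on_gc_run()
monitor.on_free(0x10)
actions = [entry[1] for entry in monitor.history]
```

A gc-related unittest could then simply assert that leaks() is empty after a collection, or inspect the history to see which objects were freed before and after a GC-run.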

Recently Jim Baker - mentor of this project - managed to get two JRuby developers into an email-discussion with us and we gathered some interesting details how JRuby handles/plans to handle C-extension API. Thanks for this interesting thread!

2015-03-23

Welcome

Hello developers, friends and other curious visitors,

welcome to my blog!


I am currently getting the GSoC proposal in line. Keep your fingers crossed that it will work out nicely in time...