I finally completed the core-GC routine that explores the native PyObject reference-connectivity graph and reproduces it on Java-side. Why mirror it on Java side? Let me comprehend the reasoning here. Java performs a mark-and-sweep GC on its Java-objects, but there is no way to extend this to native objects. On the other hand using CPython's reference-counting approach for native objects is not always feasible, because there are cases where a native object must keep its Java-counterpart alive (JNI provides a mechanism for this), allowing it to participate in an untracable reference cycle. So we go the other way round here, and let Java-GC track a reproduction of the native reference connectivity-graph. Whenever we observe that it deletes a node, we can discard the underlying native object. Keeping the graph up to date is still a tricky task, which we will deal with in the second half of GSoC.
The native reference-graph is explored using CPython-style traverseproc mechanism, which is also implemented by extensions that expect to use GC at all. To mirror the graph on Java-side I am distinguishing 8 regular cases displayed in the following sketch. These cases deal with representing the connection between native and managed side of th JVM.
In the sketch you can see that native objects have a so-called Java-GC-head assigned that keeps alive the native object (non-dashed arrow), but is only weakly reachable from it (dashed arrow). The two left-most cases deal with objects that only exist natively. The non-GC-case usually needs no Java-GC-head as it cannot cause reference cycles. Only in GIL-free mode we would still track it as a replacement for reference-counting. However GIL-free mode is currently a vague consideration and out of scope for this GSoC-project. Case 3 and 4 from left deal with objects where Jython has no corresponding type and JyNI uses a generic PyCPeer-object - a PyObject-subclass forwarding the magic methods to native side. PyCPeer in both variants serves also as a Java-GC-head. CStub-cases refer to situations where the native object needs a Java-object as backend. In these cases the Java-GC-head must not only keep alive other GC-heads, but also the Java-backend. Finally in mirror-mode both native and managed representations can be discarded independently from each other at any time, but for performance reasons we try to softly keep alive the counterparts for a while. On Java-side we can use a soft reference for this.
PyList is a special case in several ways. It is a mutable type that can be modified by C-macros at any time. Usually we move the java.util.List backend to native side. For this it is replacing by JyList - a List-implementation that is backed by native memory, thus allowing C-macros to work on it. The following sketch illustrates how we deal with this case.
It works roughly equivalent to mirror mode but with the difference that the Jython-PyList must keep alive its Java-GC-head. For a most compact solution we build the GC-head functionality into JyList.
Today I finished the implementation of the regular cases, but testing and debugging still needs to be done. I can hopefully round this up for midterm evaluation and also include the PyList-case.
No comments:
Post a Comment