PEP: | 579 |
---|---|
Title: | Refactoring C functions and methods |
Author: | Jeroen Demeyer <J.Demeyer at UGent.be> |
BDFL-Delegate: | Petr Viktorin |
Status: | Final |
Type: | Informational |
Created: | 04-Jun-2018 |
Post-History: | 20-Jun-2018 |
Contents
- Issues
This PEP describes design issues addressed in PEP 575, PEP 580, PEP 590 (and possibly later proposals).
As noted in PEP 1:
Informational PEPs do not necessarily represent a Python community consensus or recommendation, so users and implementers are free to ignore Informational PEPs or follow their advice.
While there is no consensus on whether the issues or the solutions in this PEP are valid, the list is still useful to guide further design.
This meta-PEP collects various issues with CPython's existing implementation of built-in functions (functions implemented in C) and methods.
Fixing all these issues is too much for one PEP, so that will be delegated to other standards track PEPs. However, this PEP does give some brief ideas of possible fixes. This is mainly meant to coordinate an overall strategy. For example, a proposed solution may sound too complicated for fixing any one single issue, but it may be the best overall solution for multiple issues.
This PEP is purely informational: it does not imply that all issues will eventually be fixed, nor that they will be fixed using the solution proposed here.
It also serves as a check-list of possible requested features to verify that a given fix does not make those other features harder to implement.
The major proposed change is replacing PyMethodDef by a new structure PyCCallDef which collects everything needed for calling the function/method. In the PyTypeObject structure, a new field tp_ccalloffset is added giving an offset to a PyCCallDef * in the object structure.
NOTE: This PEP deals only with CPython implementation details; it does not affect the Python language or standard library.
Issues
This lists various issues with built-in functions and methods, together with a plan for a solution and (if applicable) pointers to standards track PEPs discussing the details.
1. Naming
The word 'built-in' is overused in Python. From a quick skim of the Python documentation, it mostly refers to things from the builtins module. In other words: things which are available in the global namespace without a need for importing them. This conflicts with the use of the word 'built-in' to mean 'implemented in C'.
Solution: since the C structure for built-in functions and methods is already called PyCFunctionObject, let's use the name 'cfunction' and 'cmethod' instead of 'built-in function' and 'built-in method'.
2. Not extendable
The various classes involved (such as builtin_function_or_method) cannot be subclassed:
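A minimal sketch of the restriction, assuming a CPython interpreter where types.BuiltinFunctionType is the builtin_function_or_method class. The class name X is purely illustrative.

```python
from types import BuiltinFunctionType

# Attempting to subclass the class of built-in functions fails when the
# class statement runs, because the type does not allow being a base.
try:
    class X(BuiltinFunctionType):
        pass
except TypeError as exc:
    subclass_error = str(exc)
else:
    subclass_error = None

print(subclass_error)
```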
This is a problem because it makes it impossible to add features such as introspection support to these classes.
If one wants to implement a function in C with additional functionality, an entirely new class must be implemented from scratch. The problem with this is that the existing classes like builtin_function_or_method are special-cased in the Python interpreter to allow faster calling (for example, by using METH_FASTCALL). It is currently impossible to have a custom class with the same optimizations.
Solution: make the existing optimizations available to arbitrary classes. This is done by adding a new PyTypeObject field tp_ccalloffset (or can we re-use tp_print for that?) specifying the offset of a PyCCallDef pointer. This is a new structure holding all information needed to call a cfunction and it would be used instead of PyMethodDef. This implements the new 'C call' protocol.
For constructing cfunctions and cmethods, PyMethodDef arrays will still be used (for example, in tp_methods) but that will be the only remaining purpose of the PyMethodDef structure.
Additionally, we can also make some function classes subclassable. However, this seems less important once we have tp_ccalloffset.
Reference: PEP 580
3. cfunctions do not become methods
A cfunction like repr does not implement __get__ to bind as a method:
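A short sketch of this behavior, using the names X, meth and x from the surrounding text. The py_meth attribute is an added, hypothetical counterpart that shows how a Python function binds in the same position.

```python
class X:
    # repr is a cfunction; storing it in a class body does not turn it
    # into a method, because its type has no __get__ to bind an instance.
    meth = repr

    # A Python function stored the same way does bind to the instance.
    py_meth = lambda self: repr(self)

x = X()

try:
    x.meth()  # repr receives zero arguments; x is not passed along
except TypeError as exc:
    cfunction_error = str(exc)
else:
    cfunction_error = None

bound_result = x.py_meth()  # x is passed as self, so this is repr(x)
print(cfunction_error)
```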
In this example, one would have expected that x.meth() returns repr(x) by applying the normal rules of methods.
This is surprising and a needless difference between cfunctions and Python functions. For the standard built-in functions, this is not really a problem since those are not meant to be used as methods. But it does become a problem when one wants to implement a new cfunction with the goal of being usable as a method.
Again, a solution could be to create a new class behaving just like cfunctions but which bind as methods. However, that would lose some existing optimizations for methods, such as the LOAD_METHOD/CALL_METHOD opcodes.
Solution: the same as the previous issue. It just shows that handling self and __get__ should be part of the new C call protocol.
For backwards compatibility, we would keep the existing non-binding behavior of cfunctions. We would just allow it in custom classes.
Reference: PEP 580
4. Semantics of inspect.isfunction
Currently, inspect.isfunction returns True only for instances of types.FunctionType. That is, true Python functions.
A common use case for inspect.isfunction is checking for introspection: it guarantees for example that inspect.getfile() will work. Ideally, it should be possible for other classes to be treated as functions too.
Solution: introduce a new InspectFunction abstract base class and use that to implement inspect.isfunction. Alternatively, use duck typing for inspect.isfunction (as proposed in [2]):
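One possible form of such a duck-typed check is sketched below, next to the current behavior of inspect.isfunction. The names duck_isfunction, py_func and FunctionLike are hypothetical and exist only for this illustration.

```python
import inspect

def duck_isfunction(obj):
    # Hypothetical duck-typed check: any object whose type exposes
    # __code__ is treated as a function.
    return hasattr(type(obj), "__code__")

def py_func():
    pass

class FunctionLike:
    # Illustrative class that mimics a function by exposing __code__.
    __code__ = py_func.__code__

    def __call__(self):
        return None

candidates = {"py_func": py_func, "len": len, "FunctionLike()": FunctionLike()}
for label, obj in candidates.items():
    # Compare the strict type check with the duck-typed variant.
    print(label, inspect.isfunction(obj), duck_isfunction(obj))
```

The strict check accepts only true Python functions, while the duck-typed variant also accepts the function-like instance.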
5. C functions should have access to the function object
The underlying C function of a cfunction currently takes a self argument (for bound methods) and then possibly a number of arguments. There is no way for the C function to actually access the Python cfunction object (the self in __call__ or tp_call). This would for example allow implementing the C call protocol for Python functions (types.FunctionType): the C function which implements calling Python functions needs access to the __code__ attribute of the function.
This is also needed for PEP 573 where all cfunctions require access to their 'parent' (the module for functions of a module or the defining class for methods).
Solution: add a new PyMethodDef flag to specify that the C function takes an additional argument (as first argument), namely the function object.
References: PEP 580, PEP 573
6. METH_FASTCALL is private and undocumented
The METH_FASTCALL mechanism allows calling cfunctions and cmethods using a C array of Python objects instead of a tuple. This was introduced in Python 3.6 for positional arguments only and extended in Python 3.7 with support for keyword arguments.
However, given that it is undocumented, it is presumably only supposed to be used by CPython itself.
Solution: since this is an important optimization, everybody should be encouraged to use it. Now that the implementation of METH_FASTCALL is stable, document it!
As part of the C call protocol, we should also add a C API function
Reference: PEP 580
7. Allowing native C arguments
A cfunction always takes its arguments as Python objects (say, an array of PyObject pointers). In cases where the cfunction is really wrapping a native C function (for example, coming from ctypes or some compiler like Cython), this is inefficient: calls from C code to C code are forced to use Python objects to pass arguments.
Analogous to the buffer protocol which allows access to C data, we should also allow access to the underlying C callable.
Solution: when wrapping a C function with native arguments (for example, a C long) inside a cfunction, we should also store a function pointer to the underlying C function, together with its C signature.
Argument Clinic could automatically do this by storing a pointer to the 'impl' function.
8. Complexity
There are a huge number of classes involved to implement all variations of methods. This is not a problem by itself, but a compounding issue.
For ordinary Python classes, the table below gives the classes for various kinds of methods. The columns refer to the class in the class __dict__, the class for unbound methods (bound to the class) and the class for bound methods (bound to the instance):
kind | __dict__ | unbound | bound |
---|---|---|---|
Normal method | function | function | method |
Static method | staticmethod | function | function |
Class method | classmethod | method | method |
Slot method | function | function | method |
This is the analogous table for extension types (C classes):
kind | __dict__ | unbound | bound |
---|---|---|---|
Normal method | method_descriptor | method_descriptor | builtin_function_or_method |
Static method | staticmethod | builtin_function_or_method | builtin_function_or_method |
Class method | classmethod_descriptor | builtin_function_or_method | builtin_function_or_method |
Slot method | wrapper_descriptor | wrapper_descriptor | method-wrapper |
There are a lot of classes involved and these two tables look very different. There is no good reason why Python methods should be treated fundamentally differently from C methods. Also the features are slightly different: for example, method supports __func__ but builtin_function_or_method does not.
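The two tables can be checked from Python. The sketch below is illustrative only: PyClass and the helper kinds are hypothetical names, and the choice of list.append, str.maketrans, dict.fromkeys and list.__len__ as representatives of each row is an assumption about CPython's built-in types.

```python
class PyClass:
    def normal(self):
        pass

    @staticmethod
    def static():
        pass

    @classmethod
    def klass(cls):
        pass

    def __len__(self):
        return 0

def kinds(cls, instance, name):
    """Return the class names for the __dict__, unbound and bound columns."""
    return (
        type(cls.__dict__[name]).__name__,
        type(getattr(cls, name)).__name__,
        type(getattr(instance, name)).__name__,
    )

# Rows for an ordinary Python class.
python_rows = {
    name: kinds(PyClass, PyClass(), name)
    for name in ("normal", "static", "klass", "__len__")
}

# Rows for extension types, using built-in types as examples.
c_rows = {
    "normal": kinds(list, [], "append"),
    "static": kinds(str, "", "maketrans"),
    "class": kinds(dict, {}, "fromkeys"),
    "slot": kinds(list, [], "__len__"),
}

for label, row in {**python_rows, **c_rows}.items():
    print(label, row)

# The feature difference mentioned above: only method exposes __func__.
print(hasattr(PyClass().normal, "__func__"), hasattr([].append, "__func__"))
```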
Since CPython has optimizations for calls to most of these objects, the code for dealing with them can also become complex. A good example of this is the call_function function in Python/ceval.c.
Solution: all these classes should implement the C call protocol. Then the complexity in the code can mostly be fixed by checking for the C call protocol (tp_ccalloffset != 0) instead of doing type checks.
Furthermore, it should be investigated whether some of these classes can be merged and whether method can be re-used also for bound methods of extension types (see PEP 576 for the latter, keeping in mind that this may have some minor backwards compatibility issues). This is not a goal by itself but just something to keep in mind when working on these classes.
9. PyMethodDef is too limited
The typical way to create a cfunction or cmethod in an extension module is by using a PyMethodDef to define it. These are then stored in an array PyModuleDef.m_methods (for cfunctions) or PyTypeObject.tp_methods (for cmethods). However, because of the stable ABI (PEP 384), we cannot change the PyMethodDef structure.
So, this means that we cannot add new fields for creating cfunctions/cmethods this way. This is probably the reason for the hack that __doc__ and __text_signature__ are stored in the same C string (with the __doc__ and __text_signature__ descriptors extracting the relevant part).
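The effect of this split is visible from Python. The sketch below uses len as an example cfunction; that choice, and the assumption that its docstring carries a text signature, apply to CPython and are not guaranteed elsewhere.

```python
import inspect

# Both attributes are derived from one C string attached to the cfunction;
# each descriptor returns only its own part of that string.
signature_part = len.__text_signature__
doc_part = len.__doc__

# inspect.signature relies on the extracted text signature for cfunctions.
parsed = inspect.signature(len)

print(signature_part)
print(doc_part)
print(parsed)
```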
Solution: stop assuming that a single PyMethodDef entry is sufficient to describe a cfunction/cmethod. Instead, we could add some flag which means that one of the PyMethodDef fields is instead a pointer to an additional structure. Or, we could add a flag to use two or more consecutive PyMethodDef entries in the array to store more data. Then the PyMethodDef array would be used only to construct cfunctions/cmethods but it would no longer be used after that.
10. Slot wrappers have no custom documentation
Right now, slot wrappers like __init__ or __lt__ only have very generic documentation, not at all specific to the class:
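A small illustration, assuming CPython: the slot wrappers of unrelated built-in types report one and the same docstring. The particular types chosen here are only examples.

```python
# Slot wrappers for the same slot share their docstring across classes,
# so nothing in the text is specific to list, dict, set, int or str.
init_docs = {cls.__name__: cls.__init__.__doc__ for cls in (list, dict, set)}
lt_docs = {cls.__name__: cls.__lt__.__doc__ for cls in (int, str, list)}

for name, doc in {**init_docs, **lt_docs}.items():
    print(name, repr(doc))
```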
The same happens for the signature:
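The corresponding check for signatures, again assuming CPython and using arbitrarily chosen built-in types:

```python
# The text signature also belongs to the slot, not to the class, so it is
# identical for every type that uses the slot wrapper.
init_sigs = {cls.__name__: cls.__init__.__text_signature__ for cls in (list, dict, set)}
eq_sigs = {cls.__name__: cls.__eq__.__text_signature__ for cls in (str, int)}

for name, sig in {**init_sigs, **eq_sigs}.items():
    print(name, sig)
```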
As you can see, slot wrappers do support __doc__ and __text_signature__. The problem is that these are stored in struct wrapperbase, which is common for all wrappers of a specific slot (for example, the same wrapperbase is used for str.__eq__ and int.__eq__).
Solution: rethink the slot wrapper class to allow docstrings (and text signatures) for each instance separately.
This still leaves the question of how extension modules should specify the documentation. The PyTypeObject entries like tp_init are just function pointers, we cannot do anything with those. One solution would be to add entries to the tp_methods array just for adding docstrings. Such an entry could look like
11. Static methods and class methods should be callable
Instances of staticmethod and classmethod should be callable. Admittedly, there is no strong use case for this, but it has occasionally been requested (see for example [1]).
Making static/class methods callable would increase consistency. First of all, function decorators typically add functionality or modify a function, but the result remains callable. This is not true for @staticmethod and @classmethod.
Second, class methods of extension types are already callable:
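For instance, float.fromhex is a class method of an extension type. A minimal sketch, assuming CPython, of calling its descriptor directly with the class passed explicitly:

```python
# Fetch the raw descriptor from the class __dict__, bypassing binding.
fromhex = float.__dict__["fromhex"]
descriptor_kind = type(fromhex).__name__

# The descriptor itself is callable when the class is given explicitly.
value = fromhex(float, "0xff")

print(descriptor_kind, value)
```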
Third, one can see function, staticmethod and classmethod as different kinds of unbound methods: they all become method when bound, but the implementation of __get__ is slightly different. From this point of view, it looks strange that function is callable but the others are not.
Solution: when changing the implementation of staticmethod, classmethod, we should consider making instances callable. Even if this is not a goal by itself, it may happen naturally because of the implementation.
[1] | Not all method descriptors are callable (https://bugs.python.org/issue20309) |
[2] | Duck-typing inspect.isfunction() (https://bugs.python.org/issue30071) |
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/master/pep-0579.rst
There are a large number of structures which are used in the definition of object types for Python. This section describes these structures and how they are used.
All Python objects ultimately share a small number of fields at the beginning of the object's representation in memory. These are represented by the PyObject and PyVarObject types, which are defined, in turn, by the expansions of some macros also used, whether directly or indirectly, in the definition of all other Python objects.
PyObject
All object types are extensions of this type. This is a type which contains the information Python needs to treat a pointer to an object as an object. In a normal "release" build, it contains only the object's reference count and a pointer to the corresponding type object. Nothing is actually declared to be a PyObject, but every pointer to a Python object can be cast to a PyObject*. Access to the members must be done by using the macros Py_REFCNT and Py_TYPE.
PyVarObject
This is an extension of PyObject that adds the ob_size field. This is only used for objects that have some notion of length. This type does not often appear in the Python/C API. Access to the members must be done by using the macros Py_REFCNT, Py_TYPE, and Py_SIZE.
PyObject_HEAD
This is a macro used when declaring new types which represent objects without a varying length. The PyObject_HEAD macro expands to:
See documentation of PyObject above.
PyObject_VAR_HEAD
This is a macro used when declaring new types which represent objects with a length that varies from instance to instance. The PyObject_VAR_HEAD macro expands to:
See documentation of PyVarObject above.
Py_TYPE(o)
This macro is used to access the ob_type member of a Python object. It expands to:
Py_REFCNT(o)
This macro is used to access the ob_refcnt member of a Python object. It expands to:
Py_SIZE(o)
This macro is used to access the ob_size member of a Python object. It expands to:
PyObject_HEAD_INIT(type)
This is a macro which expands to initialization values for a new PyObject type. This macro expands to:
PyVarObject_HEAD_INIT(type, size)
This is a macro which expands to initialization values for a new PyVarObject type, including the ob_size field. This macro expands to:
PyCFunction
Type of the functions used to implement most Python callables in C. Functions of this type take two PyObject* parameters and return one such value. If the return value is NULL, an exception shall have been set. If not NULL, the return value is interpreted as the return value of the function as exposed in Python. The function must return a new reference.
PyCFunctionWithKeywords
Type of the functions used to implement Python callables in C that take keyword arguments: they take three PyObject* parameters and return one such value. See PyCFunction above for the meaning of the return value.
PyMethodDef
Structure used to describe a method of an extension type. This structure has four fields:
Field | C Type | Meaning |
---|---|---|
ml_name | char * | name of the method |
ml_meth | PyCFunction | pointer to the C implementation |
ml_flags | int | flag bits indicating how the call should be constructed |
ml_doc | char * | points to the contents of the docstring |
The ml_meth is a C function pointer. The functions may be of different types, but they always return PyObject*. If the function is not of the PyCFunction type, the compiler will require a cast in the method table. Even though PyCFunction defines the first parameter as PyObject*, it is common that the method implementation uses the specific C type of the self object.
The ml_flags field is a bitfield which can include the following flags. The individual flags indicate either a calling convention or a binding convention. Of the calling convention flags, only METH_VARARGS and METH_KEYWORDS can be combined. Any of the calling convention flags can be combined with a binding flag.
METH_VARARGS
This is the typical calling convention, where the methods have the type PyCFunction. The function expects two PyObject* values. The first one is the self object for methods; for module functions, it is the module object. The second parameter (often called args) is a tuple object representing all arguments. This parameter is typically processed using PyArg_ParseTuple() or PyArg_UnpackTuple().
METH_KEYWORDS
Methods with these flags must be of type PyCFunctionWithKeywords. The function expects three parameters: self, args, and a dictionary of all the keyword arguments. The flag is typically combined with METH_VARARGS, and the parameters are typically processed using PyArg_ParseTupleAndKeywords().
METH_NOARGS
Methods without parameters don't need to check whether arguments are given if they are listed with the METH_NOARGS flag. They need to be of type PyCFunction. The first parameter is typically named self and will hold a reference to the module or object instance. In all cases the second parameter will be NULL.
METH_O
Methods with a single object argument can be listed with the METH_O flag, instead of invoking PyArg_ParseTuple() with a 'O' argument. They have the type PyCFunction, with the self parameter, and a PyObject* parameter representing the single argument.
These two constants are not used to indicate the calling convention but the binding when used with methods of classes. These may not be used for functions defined for modules. At most one of these flags may be set for any given method.
METH_CLASS
The method will be passed the type object as the first parameter rather than an instance of the type. This is used to create class methods, similar to what is created when using the classmethod() built-in function.
METH_STATIC
The method will be passed NULL as the first parameter rather than an instance of the type. This is used to create static methods, similar to what is created when using the staticmethod() built-in function.
One other constant controls whether a method is loaded in place of anotherdefinition with the same method name.
METH_COEXIST
The method will be loaded in place of existing definitions. Without METH_COEXIST, the default is to skip repeated definitions. Since slot wrappers are loaded before the method table, the existence of a sq_contains slot, for example, would generate a wrapped method named __contains__() and preclude the loading of a corresponding PyCFunction with the same name. With the flag defined, the PyCFunction will be loaded in place of the wrapper object and will co-exist with the slot. This is helpful because calls to PyCFunctions are optimized more than wrapper object calls.
PyMemberDef
Structure which describes an attribute of a type which corresponds to a C struct member. Its fields are:
Field | C Type | Meaning |
---|---|---|
name | char * | name of the member |
type | int | the type of the member in the C struct |
offset | Py_ssize_t | the offset in bytes that the member is located on the type's object struct |
flags | int | flag bits indicating if the field should be read-only or writable |
doc | char * | points to the contents of the docstring |
type can be one of many T_ macros corresponding to various C types. When the member is accessed in Python, it will be converted to the equivalent Python type.
Macro name | C type |
---|---|
T_SHORT | short |
T_INT | int |
T_LONG | long |
T_FLOAT | float |
T_DOUBLE | double |
T_STRING | char * |
T_OBJECT | PyObject * |
T_OBJECT_EX | PyObject * |
T_CHAR | char |
T_BYTE | char |
T_UBYTE | unsigned char |
T_UINT | unsigned int |
T_USHORT | unsigned short |
T_ULONG | unsigned long |
T_BOOL | char |
T_LONGLONG | long long |
T_ULONGLONG | unsigned long long |
T_PYSSIZET | Py_ssize_t |
T_OBJECT and T_OBJECT_EX differ in that T_OBJECT returns None if the member is NULL and T_OBJECT_EX raises an AttributeError. Try to use T_OBJECT_EX over T_OBJECT because T_OBJECT_EX handles use of the del statement on that attribute more correctly than T_OBJECT.
flags can be 0 for write and read access or READONLY for read-only access. Using T_STRING for type implies READONLY. Only T_OBJECT and T_OBJECT_EX members can be deleted. (They are set to NULL).