PECL_Gen - the PHP extension generator


Table of Contents
1. Introduction
1.1. What is it?
1.2. Features
1.3. How to use it
2. The XML description
2.1. Basics
2.2. Release information
2.3. Dependencies
2.3.1. --with...
2.3.2. Libraries
2.3.3. Header files
2.4. Custom code
2.5. Functions
2.5.1. Public functions
2.5.2. Internal functions
2.6. Constants
2.7. php.ini parameters and internal variables
2.8. Resources
2.8.1. Resource creation and destruction
2.9. Classes
2.10. Streams
2.11. config.m4 fragments
2.12. Makefile fragments
2.13. Tests
3. XML input parsing
3.1. Includes
3.1.1. External entities
3.1.2. XInclude
3.1.3. code tags
List of Examples
2-1. Extension basics
2-2. Release information
2-3. License
2-4. A logo image
2-5. Dependencies
2-6. --with...
2-7. Library dependencies
2-8. Header file dependencies
2-9. MINIT()
2-10. MINFO()
2-11. PHP Constants
2-12. Globals
2-13. Globals
2-14. Resources
2-15. Resource creation
2-16. Resource destruction
2-17. config.m4 additions
2-18. Makefile fragments

Chapter 1. Introduction

1.1. What is it?

PECL_Gen is a tool that can automatically create the basic framework for a PHP extension from a rather simple XML specification file.

It also supports the simpler (but less powerful) prototype file format as used by the shell script ext_skel that is distributed togehter with the PHP source code.

PECL_Gen, unlike the older ext_skel solution, is a 100% PHP 5 bases solution and does not require any external tools like awk or sed. It only uses PHP functions that are always enabled in a default build so it should be usable on any platform that PHP itself runs on.

The code generated by PECL_Gen is designed to work with both the PHP 4 and PHP 5 extension APIs, PHP 5 is only required for the conversion of the XML spec file to C/C++ code.


1.2. Features

pecl-gen tries to support as many extension writing aspects as possible. This currently includes code and documentation generation for:

functions
constants
php.ini configuration directives
resource types
per-thread global variables

pecl-gen also generates config.m4 configuration files for Unix-like build environments, VisualStudio *.dsp project files for Windows and the package.xml files needed for PEAR/PECL packaging.

DocBook XML documentation templates suitable for inclusion in the PHP or PEAR manual are generated in the hope that it will ease the task of documenting extension.

Test script templates for automatic regression testing are already created but due to some missing features within the "make test" infrastructure it is not possible to use the generated tests with standalone PECL extensions right now.


1.3. How to use it

There are currently three different modes of operation for pecl-gen. In its default mode it can create a complete ready-to-compile extension from an XML description (documented in the next chapter). In ext_skel compatibility mode it generates the extension from some command line parameters and an optional function prototype file and in immediate mode it just takes a function prototype from command line and writes a C code skeleton for just that function to standard output.

ext_skel compatibility and immediate mode are not documented here, please refer to the original ext_skel documentation instead.

Below you find a hardcopy of the pecl-gen --help output:


pecl-gen [-h] [--extname=name] [--proto=file] [--skel=dir] [--stubs=file] 
  [--no-help] [--xml[=file]] [--full-xml] [--function=proto] [specfile.xml]

  -h|--help          this message
  --force            overwrite existing directories
	--function         create a function skeleton from a proto right away
  --version          show version info

  the following options are inherited from ext_skel:

  --extname=module   module is the name of your extension 
  --proto=file       file contains prototypes of functions to create
  --xml              generate xml documentation to be added to phpdoc-cvs

	these wait for functionality to be implemented and are ignored for now ...
  --stubs=file       generate only function stubs in file
  --no-help          don't try to be nice and create comments in the code
                     and helper functions to test if the module compiled

  these are accepted for backwards compatibility reasons but not used ...
  --full-xml         generate xml documentation for a self-contained extension
                     (this was also a no-op in ext_skel) 
  --skel=dir         path to the skeleton directory
                     (skeleton stuff is now self-contained)
   

Chapter 2. The XML description

2.1. Basics

The top level container tag describing an extension is the extension tag. The name of the extension is given in the name attribute. The extension name has to be a valid C name as it is used as both the extensions directory name and the base name for several C symbols within the generated C code.

The tags summary and description should be added at the very top of your extensions. The summary should be a short one-line description of the extension while the actually description can be as detailed as you like. Both are later used to generate the package.xml file and the documentation for your extension. The summary line is also put into the phpinfo() output of your extension.

Example 2-1. Extension basics


<extension name="sample">
 <summary>A sample PHP extension</summary>
 <description>
  This is a sample extension specification
  showing how to use PECL_Gen for
  extension generation.
 </description>
 ...

    

2.2. Release information

The release information for your extension should include the extension authors and maintainers, the version number, state and release date, the chosen license and maybe a change log describing previous releases. It is also possible to specify an image file to be used as a product logo with the phpinfo() output block for the extension.

The maintainers, release and changelog tags specifications are identical to those in the PEAR package.xml specification so please refer to the PEAR documentation here.

Example 2-2. Release information


...
  <maintainers>
    <maintainer>
      <user>hholzgra</user>
      <name>Hartmut Holzgraefe</name>
      <email>hartmut@php.net</email>
      <role>lead</role>
    </maintainer>
  </maintainers>

  <release>
    <version>1.0</version>
    <date>2002-07-09</date>
    <state>stable</state>
    <notes>
     The sample extension is now stable
    </notes>
  </release>

  <changelog>
    <release>
      <version>0.5</version>
      <date>2002-07-05</date>
      <state>beta</state>
      <notes>First beta version</notes>
    <release>
    <release>
      <version>0.1</version>
      <date>2002-07-01</date>
      <state>alpha</state>
      <notes>First alpha version</notes>
    <release>
  </changelog>
...

    

The license tag is a little more restrictive as its package.xml counterpart as it is used to decide which license text should actually be written to the LICENSE. For now you have to specify either PHP, BSD or LGPL, any other value is taken as 'unknown'.

Example 2-3. License


...
  <license>PHP</license>
...

    

A logo to be used within the extensions phpinfo() block can be specified using the logo tag. The actual file name of the logo image has to be given in the scr=... attribute, Its MIME type defaults to image/gif unless a different type is specified using the mimetype=... attribute. Automatic MIME type detection is planned for a future release.

Example 2-4. A logo image


...
  <logo src="sample_logo.gif" mimetype="image/gif" />
...

    

2.3. Dependencies

Dependencies are specified within the deps environment. Within the deps itself it is possible to set the programming language and target platforms using the language=... and platform=... attributes.

Supported languages are C (lang="c") and C++ (language="cpp"). The language selection does not influence code generation itself (pecl-gen always generates C code) but the way extensions are compiled and linked. C++ should only be selected to interface to external C++ libraries.

Supported platforms are currently Unix-like systems (platform="unix"), Microsoft Windows (platform="win32") or both (platform="all").

with, lib and header tags may be used within the deps section to add configure switches and library and header file dependencies.

Example 2-5. Dependencies


...
  <deps language="cpp" platform="win32">
  ...
  </deps>
...

    

2.3.1. --with...

When building an extension on Unix-like systems or within the Cygwin environment under Windows the configure script will try to figure out where external libraries and header files needed by an extension are installed on the build system. Using a "with" option it is possible to specify where to actually look for libraries and headers. This way it is possible to override search paths if things are not installed in the default system paths or to specify the exact version of a package to be used if multiple versions are installed on the target system.

The with tag takes three attributes: name=... for the actual name of the "with" option, testfile for the relative path of a file to check for while running the configure script and a list of default paths to check if no path is given as an argument to the "with" option in defaults.

Name and defaults are set to the extension base name and "/usr:/usr/local" if no values are given. The testfile attribute is mandatory.

Textual data enclosed by the with is used to describe the "with" option in the output of configure --help calls.

Example 2-6. --with...

 
 
... 
    <with defaults='/usr:/usr/local' testfile='include/sample.h'>sample install</with> 
... 

      

2.3.2. Libraries

Needed external libraries are specified using the lib tag. The name=... attribute is mandatory and takes the library base name. A library dependency by the name "sample" is actually referring to a library file named libsample.a for a static or libsample.so for a dynamic library on Unix-like systems or to sample.DLL on Windows.

It is possible to specify the name of a function symbol expected to be provided by the library using the function=... attribute. This function symbol is being looked for when configure is run for the extension. This way it is possible to verify that the right version of a library was found. With VisualStudio on windows it is not possible to perform this check, in this case the library is just added to the project file.

Example 2-7. Library dependencies


...
    <lib name="sample_u" platform="unix"  function="sample_v2" />
    <lib name="sample_w" platform="win32" />
    <lib name="sample"   platform="all" />
...

     

2.3.3. Header files

It is possible to specify header files needed by the extension using the header. Any headers specified have to exist in the include path set for compiling (see also the section on --with above). #include statements for the specified headers are the last ones to be put into the generated code unless you set the prepend="yes" attribute to have it put in front of the other #includes.

By default header files are searched for in the include subdirectory of the path given in with. If a different relative path needs to be used it can be defined using the path attribute.

Example 2-8. Header file dependencies


...
    <header name="include_me_first.h" prepend="yes" />
    <header name="sample.h" />
    <header name="foobar.h" path="include/foo/bar" />    
...

     

2.4. Custom code

Custom code may be added to your extension source files using the code tags. The role=... and position=... tags specify the actual place in there generated source files where your code should be inserted.

Possible roles are 'code' (default) for the generated C or C++ code file and 'header' header file. Possible positions are 'top' and 'bottom' (default) for insertion near the beginning or end of the generated file.


2.5. Functions

Three different kinds of functions may be defined using the function tag: public, internal and private functions. Public functions are functions you want to make available at the PHP code level, internal functions are C functions to be used by the PHP extension API and private functions are static C helper functions to be used within your extension.

Public function names should by convention be prefixed with the extension name followed by an underscore, internal functions are one of MINIT, MSHUTDOWN, RINIT, RSHUTDOWN or MINFO, and private functions may have any legal C function name (unless you experience duplicate symbol errors while compiling or linking the extension).


2.5.1. Public functions

The definition of a public PHP function requires the attributes role="public" and name=... and at least the proto tag to be set.

The function name may be any valid C name. To comply to PHP coding conventions a public function provided by an extension should always be prefixed by the extension name though.

The function prototype specified using the proto tag is parsed to extract the return type, the function name and the argument list. The function name in the prototype has to match the name attribute given in the function.

Valid types to be used for arguments and the return type are:

bool
int
float
string
array
object
mixed
callback
resource [typename]
stream

Argument names in prototypes are not prepended by a $ sign by convention.

Function documentation should be given using the summary tag for a one line description and the description tag for a more detailed description. Both are copied to the generated DocBook XML documentation for that function. Within description DocBook tags may be used. Be aware though that while pecl-gen accepts this validating XML parsers may complain when reading/validating an extension specification file.

Skeleton code for parameter parsing and result passing is generated if no code fragment is specified for a function. A code section is inserted right after the generated parameter parsing code. Setting a return value is up to the code fragment if any is given, adding a template doesn't make sense in this case.

Note

Maybe some stuff regarding actual coding should be added here?


2.5.2. Internal functions

The definition of an internal function requires just the role="internal" and name=... attributes. The name can only be one of the following:

MINIT

The module initialization function. This is called once at startup of a PHP server module or standalone (CLI or CGI) binary.

Example 2-9. MINIT()


...
    <function role="internal" name="MINIT">
      <code>
<![CDATA[
  int dummy = 42;
    
  dummy = dummy;
]]>
      </code>
    </function>
...
         
MSHUTDOWN

The module shutdown function. This is called once when the PHP server module or standalone binary is properly terminated. It may not be called on program crashes or other critical errors.

RINIT

The request shutdown function. This is called by PHP server modules before actually executing a PHP script request or once right after MINIT() for standalone binaries (CGI or CLI).

RSHUTDOWN

The request shutdown function. This is called by PHP server modules after execution of PHP code has been finished or terminated. Is called even if critical PHP errors occurred but you can not rely on it being called on critical errors or crashes on the C level.

MINFO

The phpinfo() handler for this extension. It will be called whenever phpinfo() is invoked or when a standalone PHP binary is called with the -i command line option.

The default code generated when no code section is given includes the extension name, summary line and release version and date, the optional logo image if specified, and the global and actual values of all php.ini directives specified.

Example 2-10. MINFO()


...
    <function role='internal' name='MINFO'>
      <code>
<![CDATA[
  php_info_print_table_start();
  php_info_print_table_header(2, "test", "table");
  php_info_print_table_end();
]]>
      </code>
    </function>
...
         

code sections for the internal functions may be written as if they were C function bodies, including local variable definitions.


2.6. Constants

PHP constants are defined using constant tags within the constants environment.

The actual constant name, type and value are specified using the name=..., type=... and value=... attributes. The constant name has to be a valid C name. PHP constant names should use uppercase letters only by convention. Possible types are "string", "int" and "float", the possible values depend on the type. For "int" and "float" you may use either numeric strings or the names of C constants (either true ANSI C/C++ constants or values #defined using the C preprocessor. "string" values are always used "as is", no constants may be used here.

It is sufficient to specify a constant name only if a C integer constant should be available under the same name in PHP, too.

A descriptive text may be given as content of the constant tag. This text will be used when generation the DocBook XML documentation.

Example 2-11. PHP Constants


...
 <constants>
  <constant name="SAMPLE_INT"   type="int"    value="42">
   A sample integer constant.
  </constant>
  <constant name="SAMPLE_FLOAT" type="float"  value="3.14">
   A sample floating point constant.
  </constant>
  <constant name="SAMPLE_FLOAT" type="float"  value="M_PI">
   A sample floating point constant using a #defined constant
  </constant>
  <constant name="SAMPLE_STRING" type="string" value="Hello World!">
   A sample string constant.
  </constant>
  <constant name="MY_CONST">
    A shortcat for #defined integer constants
  </constant>
 </constants>
...

    

2.7. php.ini parameters and internal variables

An extension may define variables that are global to either the complete extension or to a specific request. True globals that are global to the complete extensions do not need any registration so they can be defined using C code within the global code tag.

Module globals that are only global to a single request need to be managed to ensure thread safety and initialization on request initialization. php.ini directive values are also stored as module globals but need some additional definitions.

All global definitions have to be put into a globals environment. Simple module globals are defined using the global tag. php.ini directives are defined using the phpini tag.

A global definition requires the name=... and type=... attributes to be set as valid C names and types. Which C types are allowed depends on what type definitions have been included from header files. The available types are not known when pecl-gen parses the XML specification so that types are only checked for valid name format here. Specifying a type that is not a basic C type or defined in any included file will lead to error messages when compiling the generated extension code later.

Initial values may be specified using the value=... attribute. This feature should only be used for simple numeric values, anything more complex should better be initialized within the extensions RINIT() function.

php.ini directives may be defined using the phpini within a globals environment. To define a php.ini directive you have to specify its name, type and default value using the name=..., type=... and value=... attributes.

Valid directive names are C variable names. The actual directive name is the extension name followed by a single dot '.' and the specified name. Valid directive types are bool, int, float and string.

Directive default values are passed to the engine as strings, so you may not use any C constants or preprocessor macros here. The default value strings are parsed by the OnUpdate handler registered for that directive. No value checking takes place during extension code generation or compilation, this is done by the registered OnUpdate handler at runtime during request initialization. The OnUpdate handler defaults to the appropriate internal OnUpdatetype handler unless you specify a different handler using the onupdate=... attribute.

The directive value may be changed at any time unless you specify an access=... attribute. Possible values are:

system

may only be set globally in php.ini or the web server configuration

perdir

may be changed in local .htaccess files

user

may be changed by PHP code

all

may be changed by anyone at any time

The content data of phpini tags is used to generate documentation for the defined directive. global definitions may also include content data but it is for internal documentation only, it is not used in DocBook XML generation (yet).

Example 2-12. Globals


...
 <globals>
  <global name="sample_int"    type="int"    value="42" />
  <global name="sample_float"  type="float"  value="3.14" />
  <global name="SAMPLE_STRING" type="char *" />

  <phpini name="my_int" type="int" value="42" onupdate="OnUpdateLong" access="all">
   Definition for directive "sample.my_int"
  </phpini>
 </globals>
...

    

Access to the modul globals and ini parameters is provided in a thread safe manner through the EXTNAME_G() macro (replace EXTNAME with the upper cased name of your extension).

Example 2-13. Globals


<extension name="foobar">
 ...
 <globals>
  <global name="sample_int"    type="int"    value="42" />
 </globals>
 ...
 <function ...>
  ...
  <code>
   ...
   int foo = FOOBAR_G(sample_int); // get global value 
   ...
   FOOBAR_G(sample_init) = 42; // set global value
   ...
  </code>
 </function>

    

2.8. Resources

You may define PHP resource types within a resources environment. For each resource you have to specify the name=... and payload=... attributes. The name has to be a valid C name and the payload has to be a valid C type specifier. The payload type can only be checked for the correctness of its form as the actual type definitions from included header files are not known to the extension generator when it generates the extension code.

The actual resource data structure carries a pointer to the payload type. You may specify that PHP shall allocate and free the actual payload by setting the alloc=... attribute to "yes". If the payload is allocated by a library function or by yourself you should set alloc=... to "no" (the default value).

Resources are destructed when the last variable reference refering to them is unset or at request shutdown. If your resource payload needs to be cleaned up as well you have to add an appropriate C code snippet that takes care of this using the destruct tag. Within the destructor snippet you may refer to the allocated payload using the resource pointer variable.

You don't need to take care of destruction yourself if your resource payload is allocated by PHP (alloc="yes") and needs no further cleanup work besides releasing the allocated memory.

Example 2-14. Resources


...
  <resources>
    <resource name="sample_resource" payload="float" alloc="yes">
      <description>
        A simple floating point resource
      </description>
      <!-- no <destruct> needed due to the alloc attribute -->
    </resource>

    <resource name="sample_struct" payload="struct foobar" alloc="no">
      <description>
        A foobar resource managed by an external foobar lib.
      </description>
      <destruct>
        foobar_release(resource);
      </destruct>
    </resource>
  </resources>
...

    

2.8.1. Resource creation and destruction

The creation of resource instances is not defined within resource. This is a task to be handled by public PHP functions instead.

Example 2-15. Resource creation


  <function name="foo_open">
   <proto>resource foo foo_open(string path)</proto>
   <code>
    return_res = foo_open(path);

    if (!return_res) RETURN_FALSE;
   </code>
  </function>

     

Resources are freed by simply removing them from the resource list that they are in. The resources destructor function is automaticly called on list removal.

Example 2-16. Resource destruction


  <function name="foo_close">
   <proto>void foo_close(resource foo foores)</proto>
   <code>
     zend_list_delete(Z_LVAL_P(foores));
   </code>
  </function>

     

2.9. Classes

OO support is planned for a future release but not implemented yet. OO code generation may be limited to PHP 5 extensions as unlike the rest of the extension API the OO implementation changed a lot between PHP 4 and 5.


2.10. Streams

Stream filter and wrapper support is experimental and not yet added to the released code base.


2.11. config.m4 fragments

Additional configure checks can be added to the generated config.m4 file used by Unix/Cygwin builds using the configm4 tag. Using the 'position' attribute it is possible to specify whether the additional code is to be added at the top or bottom of the config.m4 file.

Example 2-17. config.m4 additions


<configm4>
  AC_CHECK_PROG(RE2C, re2c, re2c)
  PHP_SUBST(RE2C)
</configm4>

    

2.12. Makefile fragments

Makefile rules may be added using the makefile for Unix/Cygwin builds. Using this it is possible to add dependencies or build rules in addition to the default and auto generated rules.

Example 2-18. Makefile fragments


<makefile>
$(builddir)/scanner.c: $(srcdir)/scanner.re
  $(RE2C) $(srcdir)/scanner.re > $@
</makefile>

    

2.13. Tests

For now PECL_Gen generates one test case skeleton for every function prototype.

Support for customizable test script generation is planned for a future release.


Chapter 3. XML input parsing

3.1. Includes

The XML parser used by PECL_Gen supports inclusion of additional source files using three different ways:

  • external entities

  • a subset of XInclude

  • the source attribute of code tags